There are a ton of things that get me excited. From a practical perspective, one initiative that we are participating in is called MedPerf. The real-world impact and real-world improvements that we're going to see from this can be very profound, because it's about medical AI, and getting better performance estimates in medical AI is actually a very fundamental challenge. So that's something I'm quite keen on contributing to. Big thanks to our partners, Linode, Fastly, and LaunchDarkly.
We love Linode. They keep it fast and simple. Check them out at linode.com/changelog. Our bandwidth is provided by Fastly.
Learn more at fastly.com. And get your feature flags powered by LaunchDarkly. Get a demo at launchdarkly.com. This episode is brought to you by our friends at RudderStack, and we're calling all data engineers to check out RudderStack Cloud and start building smart customer data pipelines.
RudderStack is warehouse first, no more silos. RudderStack builds your customer data lake on your data warehouse, not theirs, enabling all the functionality of a CDP with more security while you retain full ownership of your data. It's open source and API first. RudderStack can be easily integrated into your existing development processes.
And because they're open source, you can see all their code, so you don't have to worry about vendor lock-in or black boxes. And best of all, they have transparent pricing. Stop paying your CDP a premium to store your data. RudderStack is free up to 500,000 events, and pricing scales transparently from there. Learn more and get started at rudderstack.com. Again, rudderstack.com.
That's R-U-D-D-E-R-S-T-A-C-K.com. Welcome to Practical AI, a weekly podcast that makes artificial intelligence practical, productive, and accessible to everyone. This is where conversations around AI, machine learning, and data science happen. Join the community and Slack with us around various topics of the show at changelog.com/community, and follow us on Twitter. We're @PracticalAIFM.
Well, welcome to another episode of Practical AI. This is Daniel Whitenack. I'm a data scientist with SIL International. And I'm joined as always by my co-host, Chris Benson, who is a tech strategist at Lockheed Martin.
How was your Thanksgiving, Chris? It was US Thanksgiving for those listeners that aren't in the US, might not be aware. It was very good. Nice family stuff, flew around the plane, things like that.
And now we're into the holiday season and looking forward to see what kind of machine learning gifts are under the tree this year. Yes. Well, in the spirit of distributing machine learning to all the boys and girls, maybe not by Santa. But a couple weeks ago, you and I had a conversation about federated learning.
Now, neither you nor I is an expert in that area or a practitioner in that area, although I think it was a good conversation. But today we're privileged to have Daniel Beutel with us, who is one of the creators of Flower, which is one of the open source federated learning frameworks that we talked about. He's a co-founder at Adap and a visiting researcher at the University of Cambridge. Welcome, Daniel.
Thanks for having me. Yeah. Well, as you heard, Chris and I were talking about federated learning without being experts in federated learning. So maybe to follow up on that conversation and maybe for people that didn't hear that conversation, could you just give us a sketch of what federated learning is and then we can take it from there?
Yeah, of course. So federated learning is a way to train models across multiple datasets. That's a very easy take on it. So you might be wondering how does this work?
The way you do it in federated learning, and let's just start off by giving an example. Let's say we have, for example, a group of hospitals. They have some in-house data, but due to regulations, they cannot share this data and then they cannot put this data in a cloud and they can't use the usual machine learning workflow where you basically collect all of the data in a central repository and then train your model on it. So that's not an option for them.
So they might be interested in using federated learning. How would a federated learning setup then work in such a scenario? The way it works is that you have your plain old machine learning model, say a neural network, for example a CNN that does some kind of image classification. Maybe you want to look at radiology images, for example, and you would initialize this model in a central place. Let's call this a central server.
And the central server would, after initializing the model, send this model out to all of the participating hospitals. So it would send the initialized model. There are other variants of it, just to say this for the sake of completeness. But in our initial example, just to explain the very basic version of it, it would send out the initialized model, so a model that hasn't been trained on anything yet. The model would then be trained locally within each hospital on the data that is available locally.
So each hospital obviously has a different data set. They would train the model, not until convergence, but they would only train it for a little while. So let's say they would train it for one or two epochs. And after they train the model for one or two epochs, they would send the updated model parameters or the gradients that are accumulated back to the central server.
So that way they don't have to share the data. The data stays where it originated. The data always stays within each participating hospital. And the central server would only get the refined model parameters.
So the model parameters that have been trained for one or two epochs. It would get those from all of the participating hospitals. And what the central server then does is it aggregates those parameters. In the simplest version, it just does a weighted average over these parameters.
What I just described, initializing a model, sending it out, training it locally, collecting the updated parameters, and then aggregating the parameters, is one single round of federated learning. And then what you usually do is perform these rounds over and over again until the model converges. And the interesting part about it, the reason organizations actually do this, is that they get access to a lot more data than they had before.
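The round described here (broadcast the global model, train briefly on each client, then take a weighted average of the results) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the linear model, the synthetic "hospital" datasets, and the names `local_train` and `federated_round` are all hypothetical, each local "epoch" is a single full-batch gradient step, and the averaging weights are the client dataset sizes.

```python
import numpy as np


def local_train(weights, features, labels, epochs=2, lr=0.1):
    """Train a linear model on one client's data for a few full-batch steps."""
    w = weights.copy()
    for _ in range(epochs):
        grad = features.T @ (features @ w - labels) / len(labels)
        w -= lr * grad
    return w


def federated_round(global_weights, client_datasets):
    """One round: send weights out, train locally, average weighted by data size."""
    updates, sizes = [], []
    for features, labels in client_datasets:
        updates.append(local_train(global_weights, features, labels))
        sizes.append(len(labels))
    return np.average(np.stack(updates), axis=0, weights=np.asarray(sizes, dtype=float))


# Three hypothetical clients with different amounts of data; raw data never leaves a client.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
client_datasets = []
for n in (20, 50, 200):
    x = rng.normal(size=(n, 2))
    client_datasets.append((x, x @ true_w))

global_weights = np.zeros(2)  # the server initializes the model
for _ in range(100):          # repeat rounds until the model converges
    global_weights = federated_round(global_weights, client_datasets)

print(global_weights)
```

Only the parameter vectors cross the client boundary in this sketch; the `(features, labels)` pairs are read solely inside `local_train`.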
So we've probably all had this experience, especially in practical AI projects, that oftentimes there is just not enough data, and having more data beats any fancy model architecture. So in this case, federated learning solves this data access problem. They can collaborate on the model training without having to share the underlying training data. Yeah, that's a good explanation.
It was much better than the one we were trying a few weeks ago. Yeah, we should link this episode to that one because it took us half an hour to get there. We just need to voice over what he just said onto what we said. Yeah, I mean, I left out a ton of detail, right?
I get it. But we can ask you questions and find out what some of that is and looking forward to that. So as a starter, it's very clear, given the data is distributed in terms of where it's located and given laws and regulations and other such things that may constrain the training process with privacy concerns and stuff. It's very clear what the advantage is in federated learning.
What also might be considered some disadvantages or maybe another way of asking it is when you do consolidate the model after you've done the federated learning and stuff, what is the delta in a trained model versus if you had not done that, if you had been able to aggregate kind of in the traditional way, all the data into one spot and train it in the traditional way we've done before federated learning, what's the difference in what you get as an output, you know, or is there much of one? Yeah, there is a difference. The biggest difference, I shall say, is obviously in convergence time because you have these rounds of communication and also the averaging process has some impact there. Often as researchers, we make these comparisons between centralized learning and federated aspects of it.
The interesting bit is that this comparison is somewhat artificial because it's not something that one would face in reality very often. It's either federated learning or nothing. We've seen this in the past, right? If you look a little bit at the journey that machine learning and deep learning now took is that somewhere around 2012, we realized that by making these models bigger, we suddenly get better accuracy.
So there was this ImageNet moment and then a couple of other moments like this afterwards. And we thought we can achieve ever greater accuracy and other performance metrics with these models. And the thing is, when we read a research paper, for example, and when we look at these recent advances, it's often quite fascinating, and it's often in the context of web-scale companies like Google or Facebook, who have these massive amounts of data in-house. But then often in practice there is this realization that, okay, I read about this cool technique.
I'm trying to apply it to my problem and suddenly I don't get amazing results that I expected to have. So the question is what happened, right? And in many cases, the answer is really that the amount of data and the diversity that you have in your local data set is just not enough. And the interesting thing and the thing that got us very interested in federated learning was this realization that for many of those cases, you might not have a large data set on your own, but there are a lot of others just like you who are facing the same challenge and who might want to train the same model.
But they also have some data but not enough data for a very good model. I mean, we could obviously solve this if we could put all of this data in a single destination, in a single cloud account and then train a model on it. But that's something that just doesn't happen. It doesn't happen for regulatory reasons.
It doesn't happen for confidentiality reasons. For example, corporations have a lot of financial data, and they might want to have models that predict certain aspects of these data. But again, it's a matter of confidentiality. It's something they would never share.
And the types of use cases that federated learning gets used for, sometimes we are surprised ourselves and where exactly these companies are hesitant to share data. For example, there was one case where a couple of manufacturing companies, they are all operating the same manufacturing machine and they want to train a model that does predictive maintenance basically for this machine to predict whether this machine is likely going to fail. So whether they need to do some manual maintenance or something like that. And one would think that this is a case where they could just collaborate and they could just put all of their machine sensory data in a cloud account and train a predictive maintenance model.
No, they don't. Why don't they collaborate on this? Well, the reason is that the data that they have from running these machines could allow others to see how often they run these machines, which could allow others to draw some conclusions about how many parts they are producing, which is highly confidential. So even in those seemingly easy cases, in reality it's not that easy.
So that's almost a perfect lead-in for what I wanted to ask next. Federated learning, it sounds like, offers different business models from some of the things we've done in the past, where even competitors are directly cooperating. So have you seen this start to happen, where maybe consortiums come into being, and they may include direct competitors who are all in the same line of business? They want to protect their data so that they don't give away competitive intelligence, and federated learning through a consortium or some other similar structure might be a way for everyone to benefit and get the new model without giving away the secret sauce, so to speak. Do you expect to see more of that kind of thing? We are sort of seeing this.
The way it usually starts out is that organizations who are maybe not direct competitors, maybe they are somewhat in the same space, but they're not like the toughest competitors. They start to get together. Yeah. In some cases, it can even be the sub-organizations of a larger enterprise, for example, because they are also often facing these restrictions for sharing data.
But then we also see that sometimes even really, really strong competitors get together because they see something else as a threat to their business model, and they see that this is a way to collaborate without sharing, as you called it, the secret sauce. The interesting thing is that the way I described federated learning at the beginning is really end-to-end federated learning, where you initialize the model globally and then you train the model end-to-end with all participating parties. This is not the only model that's possible. I want to describe one which I think is quite interesting, especially for this case where you have competing organizations collaborating.
It's one where you train a certain part of the model in a federated fashion across multiple datasets, and then other parts of the model you just train yourself on your local data. This is pretty interesting because in such a federation you can, for example, train the entire backbone of a model. But the last few layers, the head of the model, you don't train in a federated fashion. You leave that up to each of the participating organizations to do itself.
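A minimal sketch of that partial setup, assuming each participant keeps its parameters in a dictionary and that only keys listed as shared are ever sent to the server. The key names (`backbone.w`, `head.w`), the example values, and the helper `aggregate_shared` are hypothetical and only illustrate the idea of averaging the backbone while every head stays local.

```python
import numpy as np

SHARED_KEYS = ("backbone.w",)  # only these parameters take part in the federation


def aggregate_shared(client_params, num_examples, shared_keys):
    """Weighted average over the shared parameters only; heads are never touched."""
    total = float(sum(num_examples))
    return {
        key: sum(params[key] * (n / total) for params, n in zip(client_params, num_examples))
        for key in shared_keys
    }


# Two hypothetical organizations after a round of local training.
client_a = {"backbone.w": np.array([1.0, 1.0]), "head.w": np.array([5.0])}
client_b = {"backbone.w": np.array([3.0, 5.0]), "head.w": np.array([-2.0])}

new_backbone = aggregate_shared([client_a, client_b], [100, 300], SHARED_KEYS)

# Each organization adopts the federated backbone and keeps its own head.
for client in (client_a, client_b):
    client.update({key: value.copy() for key, value in new_backbone.items()})

print(client_a)
print(client_b)
```

After the update both dictionaries hold the same backbone but different heads, which is the similar-but-not-identical outcome described above.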
So everyone ends up with a similar, yet different, model, and everyone has something where they say, okay, we benefit from this federation but we are not giving away everything. One important thing to mention though is that there are different types of federated learning. You can roughly categorize it into two different types. One is this cross-silo type that we just talked about, where different organizations collaborate with each other.
The other type that we often see, also in the scientific literature, is the cross-device setting, where you would typically have one organization. Think about Google or Apple, for example. This organization would have access to a large number of devices, for example mobile devices like Android phones or iOS phones. And the goal in this case is also to train a model across all of these devices, and these devices hold data that you wouldn't want to upload to the cloud. So this is the cross-device setting, where a single organization trains these models without access to the underlying training data.
This episode is brought to you by Me, Myself, and AI. It's a podcast on artificial intelligence and business, and it's produced by our friends at MIT Sloan Management Review and Boston Consulting Group. The question is, why do only 10% of companies succeed with artificial intelligence? That's the question they aim to answer with this podcast.
Here's Google Cloud's Will Grannis on an unusual AI challenge. When I think about what AI is, I find the algorithms mathematically fascinating, but I find the use of the algorithms far more fascinating because, from a technical perspective, we're finding correlations in extremely high dimensional nonlinear spaces. It's statistics at scale in some sense, right? We're finding these correlations between A and B and those algorithms are really interesting and I'm still teaching those now and they're fun.
But what's more interesting to me is what do those correlations mean for the people? All right, Me, Myself, and AI is a collaboration between MIT Sloan Management Review and Boston Consulting Group. It's available wherever you get your podcasts. Just search Me, Myself, and AI. So Daniel, I think we've mostly talked about some of the data-centric motivations for federated learning, or maybe privacy-focused ones, or whatever it is, competitive types of advantages.
But I'm also thinking of the devices on which the actual training is happening. So if I'm thinking of the centralized model, I'm thinking of, oh, I'm going to spin up a pod of GPUs and a really expensive pod of GPUs and do all my training there and dip my data there somehow. So am I correct that you could have some sort of infrastructure savings with this where the actual computation is happening on those edge devices and you're doing a smaller amount of aggregation and updating of the model centrally? Could you talk to that a little bit and what people have seen and how they look at infrastructure in that way?
Yes, that's a very interesting question. The answer is, as almost always in engineering, it depends. As you noted correctly, in the centralized setting you have a pretty well-defined stack, and there aren't a lot of changes from one setup to another. You usually have some kind of x86 processor, and usually you have a GPU attached to that. You have Linux running on that machine, and then the biggest choice you have is whether to use TensorFlow or PyTorch or JAX nowadays.
In the federated setting, that's quite different. In the federated setting, you can have anything as a client, starting from a tiny embedded device. There's research going on in that direction. Then you can have something like an Apple Watch or a mobile phone, or you can have some bigger device like a tablet or laptop.
You can have your standard x86 server that I just described, or you can even have a much larger compute cluster if you're in the cross-silo setting, where you have a ton of data and one of these organizations has massive in-house infrastructure. You can have an HPC cluster as a client. So this is obviously quite interesting and also challenging from an infrastructure and a software perspective. There is some recent research, for example from a group in Cambridge that I'm involved with, about the CO2 impact of these workloads, and this is obviously quite related to your question. It compares the CO2 impact of federated workloads with that of central workloads.
And the interesting bit is that you can't say it in general. It's actually quite an interesting thing, because I originally expected federated learning to do much worse, because you have these communication rounds and it takes long to converge. So obviously it must have a higher CO2 impact. It turned out that that's not necessarily the case in some situations. The reason, once you hear it, is quite obvious, but it was surprising to me. In the central setting, the major impact on CO2 emissions is the cooling.
So you have active cooling of your GPU clusters. In the federated setting, you don't necessarily have cooling. You have additional cost for communication, but then if you have a mobile edge device, these edge devices are usually passively cooled. So they are running the workload and they produce a result without ever needing energy for cooling.
So that can be quite good, but obviously it depends a lot on the workload, the type of model you're trying, the number of communication rounds you do and other aspects. In terms of infrastructure cost, this sort of answers this question as well because you can have, in some cases, if you have, for example, the cross-device setting, then obviously, if you're not the one operating these devices, then you don't have to pay for the energy that goes into training. Usually when companies do this, they're very careful about it. They do it in a very careful way.
So they wait until the device is plugged in, connected to Wi-Fi, fully charged, and idle. And only then do they do the federated learning, so as not to impact user experience or drain the battery or things like that. In the cross-silo setting, I wouldn't say that there's much of a difference in terms of infrastructure. Each company needs infrastructure that it would need anyway, and then you need one additional server, so that's pretty similar.
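The cross-device conditions mentioned here (plugged in, on Wi-Fi, fully charged, idle) amount to a simple gate that a client could check before joining a round. The sketch below is purely illustrative: `DeviceState`, its fields, and `eligible_for_training` are hypothetical names, and real mobile platforms expose this information through their own APIs.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    plugged_in: bool
    on_wifi: bool
    battery_level: float  # 0.0 (empty) to 1.0 (fully charged)
    idle: bool


def eligible_for_training(state, min_battery=1.0):
    """Join a training round only when doing so cannot hurt the user experience."""
    return (
        state.plugged_in
        and state.on_wifi
        and state.idle
        and state.battery_level >= min_battery
    )


print(eligible_for_training(DeviceState(True, True, 1.0, True)))
print(eligible_for_training(DeviceState(True, False, 1.0, True)))
```

The `min_battery` threshold is exposed as a parameter because "fully charged" is a policy choice that an operator might relax.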
Especially in the cross-silo setting, where you often have large models, you do need a lot of network bandwidth, so that's something that you should consider. You talked a little bit about the training time, you talked a little bit about what's happening on the device. I think what's happening in the back of my mind is I'm thinking, okay, I've got all of these devices and there are various axes along which things could change. I could have the computational power of that edge device or the client.
And then I've got also the number of samples that are available for training on that device. I'm thinking if, and maybe you could speak to this, I'm thinking in the scenario of a low-power edge device or a phone, I'm going to have very few samples which might be a quick update on that device of the model and communicate the parameters back, whereas as the amount of data that you have on the client is larger, you need more computational power at least more time to do the update. Is that how that trade-off happens in practice? Yes, obviously, if you have more data on the device, you need more time to train the model on the data.
This is actually also a very interesting aspect, not just in terms of practical things like communication bandwidth and so on, but it's also quite interesting from a more fundamental perspective. Namely, in both the cross-device setting and the cross-silo setting, the data in these partitions, as we like to call them, usually comes from different distributions. It's what we like to call non-IID data. And this actually has an impact on the learning process. There are certain scenarios, which are very rare in practical settings, where the data distribution within each partition can be so different that it's just not possible for these workloads to converge.
And this is something where a lot of research is going on, on how to make federated learning more robust towards such scenarios. And yeah, the practical aspects of it are also quite interesting, because you can have multiple clients in the same workload where one of these clients has very few data examples and another client has tons of data examples. For example, we all know that one type of person who takes very few photos when they're on vacation, and we all know that other type of person who takes a ton of photos when they're on vacation. So this is a very practical example of different amounts of data on each device.
In such a scenario, when you instruct a client to do, for example, one epoch on their data, then obviously the update from the client with few examples will be coming back much, much faster. So what you want to have in your entire system is some robustness towards clients who either take a very long time because they have so much data, or towards real stragglers where, I don't know, maybe the device is suddenly getting busy with other things, which delays the update coming back. So this is something where your software infrastructure needs to be able to handle these kinds of cases. And it's also something where you need appropriate ways of handling it on the server side.
So the easiest thing to do is obviously to discard those clients that are taking a long time, the stragglers. But then there are more clever ways to approach this, for example, to let this client know, hey, your time is running out, we are about to close the round. Why don't you submit your partial update? But then your server side and the aggregation logic need to be able to handle those partial updates coming from clients.
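The two server-side policies mentioned here, dropping stragglers or accepting their partial updates, can be compared in a small sketch. It assumes each client reports how many examples it processed before the round closed and that partial updates are simply weighted by that count, which is one possible choice rather than an established rule. `ClientUpdate` and `aggregate` are hypothetical names.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ClientUpdate:
    weights: np.ndarray
    examples_processed: int  # examples seen before the update was submitted
    finished: bool           # False if the client was cut off at the round deadline


def aggregate(updates, accept_partial):
    """Weighted average of client updates, optionally including partial ones."""
    kept = [u for u in updates if u.finished or accept_partial]
    if not kept:
        raise ValueError("no usable client updates in this round")
    sizes = np.array([u.examples_processed for u in kept], dtype=float)
    return np.average(np.stack([u.weights for u in kept]), axis=0, weights=sizes)


updates = [
    ClientUpdate(np.array([1.0]), 100, finished=True),
    ClientUpdate(np.array([3.0]), 100, finished=True),
    ClientUpdate(np.array([10.0]), 50, finished=False),  # straggler, partial update
]

discard_stragglers = aggregate(updates, accept_partial=False)
with_partial = aggregate(updates, accept_partial=True)
print(discard_stragglers, with_partial)
```

Weighting by examples processed means a client that was cut off early contributes proportionally less, so the aggregation logic does not need a separate code path for partial updates.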
So Daniel, you've talked a little bit about like certain client devices being stragglers from one perspective, but I'm curious in terms of how the federated learning community is thinking about things like bias in data. So if I am a data scientist in a central location, I'm seeing maybe updates to my model, but I'm not seeing the data that is producing those updates to the weights and biases of my model. So if there's bias in terms of those in client devices, maybe 97% of my client devices are being operated by males and I have some gender bias in the data that's coming back. Are there ways that the community is thinking about that and ways to address that?
I guess maybe there's a term for it. I'm thinking of it like client bias. Yeah, any thoughts there? Yes, absolutely.
It's a very good question and it's a very important question. There are different ways to think about it. One topic that the community thinks about a lot is how to address that from an algorithmic perspective. So there are approaches in federated learning that tackle this from an algorithmic perspective.
So when you collect updates, you can do this in a certain way. There are many different approaches, but one thing you could do is try to address that through the averaging process, through the way you weight the averaging. So there are ways to influence this. Another perspective is a more intuitive and more practical one, in the sense that you can think of federated learning, compared to centralized learning, as a way to actually overcome bias. Not that you can overcome it completely.
That's not what I mean. But it can help to overcome it, in the sense that you can suddenly get access to more training data and hopefully more representative training data. And then you can make better decisions about how to train your model, what kinds of data to include in your training process, how to sample the data examples that you have on the clients, and a lot of related questions. Earlier in the show, you heard a teaser from our friends behind the podcast Me, Myself, and AI. MIT Sloan Management Review and Boston Consulting Group came together to produce this awesome podcast, and in every episode Sam and Shervin talk to leaders who are engaged in the theory and the practice of AI. I remember one project we had, we were training a chatbot and it turned out we used raw logs, all privacy assured and everything, but we used these logs that the customer had provided because they wanted to see if we could build a better model.
And it turns out that the chat agent wasn't exactly speaking the way we'd want another human being to speak to us. And why? Because people get pretty upset when they're talking to customer support. And the language that they use isn't necessarily the language I think we would use with each other on this podcast.
All right, Me, Myself, and AI is a collaboration between MIT Sloan Management Review and Boston Consulting Group. It's available wherever you get your podcasts. Just search Me, Myself, and AI. Well Daniel, this is Practical AI. So we definitely should get into the practicalities of how federated learning can be implemented.
And I think you're probably one of the better people to speak to that, because you've been heavily involved as one of the creators of the Flower framework. So maybe just to start out our discussion around that, could you talk about the backstory of Flower, the motivation behind it, and what it is? Yeah, absolutely. I'll try to do my best.
So when we started out on this journey, we obviously got excited about federated learning for the reasons I was stating earlier. We were actually in real industry projects, and we were facing these challenges where we saw that the data these organizations had in house was just not enough. But we saw that there were other organizations who had similar challenges, and we saw the potential to build collaborative approaches there. And the only way these kinds of collaborations would be feasible would be if the underlying training data would not have to be shared.
This was sort of the setting that we were in when we first looked at federated learning. And at the time, we obviously looked at other solutions, but there wasn't really a solution that was a good fit for our requirements. One of our requirements, obviously, because we were looking at these practical problems, was that we could build a system that we could then, at a later point, move into production. So obviously you would start out with some prototyping and see if you could get such a workload to converge, but then, if you cannot move this into production at a later point, why would you invest in this?
So this was one of our hard requirements. And then for moving a federated learning or federated analytics workload into production, there are a ton of associated challenges. I was hinting at this large heterogeneity that we see on the client side. So being able to integrate with an embedded device, mobile device, server, or HPC cluster is something that was high on our priority list.
So at the time, we didn't really see any solution that was a fit for the requirements that we had. We had to shift our focus a little bit away from building this one particular system that we had in mind, and towards first building the infrastructure for it. We built a prototype for that. And then out of that prototype we gathered a lot of learnings, obviously, and eventually, at the beginning of last year, my co-founder and I said, okay, let's start a company and build this infrastructure, because we see this huge potential, and make this really accessible for others to use as well.
It's probably obvious by now that one of the reasons the Flower framework is there is that we want to enable everyone to build such workloads, because there are a lot of details going on under the hood that are not easy to implement. And if you just want to do federated learning, it would obviously be a huge hurdle to first build this infrastructure before you then build your actual workload. We wanted to make this easy. We wanted to make it easy to start in research and then gradually enhance these workloads and move them to production eventually, and then to operate them in production.
This is also something we haven't quite seen in other frameworks. Other frameworks that we've seen are usually focused on one thing, for example on being a good simulation engine, but then you can't take these workloads and move them into production. Another opportunity that we saw, and this is part of this user journey of making it easy to start and prototype something, is to be compatible with all of the machine learning frameworks that we're seeing out there. So we see huge excitement about TensorFlow and PyTorch.
Obviously, there are a lot of exciting frameworks. I should say there's now a lot of excitement about JAX from many people, and there are these other frameworks which are also relevant, sometimes for very specific cases. And the opportunity that we saw is based around this story. You have an existing machine learning project. What's the minimal amount of code changes that you have to make in order to federate this thing?
And then you can take an existing workload and federate it in less than 20 lines of code, which is actually, I still find it amazing given the amount of things that are going on under the hood. Yeah. And you mentioned supporting all of these different frameworks, which does seem like a big task. And I'm kind of looking through the flower usage examples and the documentation.
And I also love just, you know, I mean, you explicitly say it's a friendly framework, which I think is great. You talked about accessibility. You've got a very friendly flower logo. And so, yeah, I think it, you know, it puts up an inviting front for people, which I think is cool because it is a, it can be, like you said, a very overwhelming, complicated thing to get into.
You were talking about supporting these different frameworks, and it seems like a big task to support all of those in this way. I see that the main way in which you wrap things with Flower is by creating this Python class that wraps certain methods. Within those, you can define your own TensorFlow or PyTorch or whatever ways to fit, or get the parameters of a model, or whatever it is. Did you purposely create that structure because you had this vision of supporting multiple frameworks?
And am I representing that accurately?
Yes, absolutely. We call Flower the friendly federated learning framework exactly for that reason. We want to be friendly in many different dimensions, actually.
We want to be friendly when it comes to different machine learning frameworks. We want to be friendly when it comes to different device types. We want to be friendly when it comes to different transport mechanisms. That's something that isn't up front on the website, but we have different transport mechanisms built in, and you can actually swap these out.
So building in support for different frameworks was something that we intended to do from the very beginning. There are different layers to this that are important, or at least interesting, to understand. One layer is the client class that you just described. When you build your client in Python, you create a subclass of Client.
So flwr.client.Client, as it is called, or a subclass of flwr.client.NumPyClient, which is easier to implement. You basically just need to add these few lines of code that then call into your existing machine learning pipelines, which is on the one hand a simple concept, but on the other hand a very powerful one, because when you implement these classes you can call arbitrary Python libraries. One good and important example of that is the support of differential privacy. We sometimes get requests: hey, does Flower come with differential privacy built in?
And actually the answer is we don't have to, because for a PyTorch-based workload, for example, you can use the library called Opacus, which gives you a sort of differentially private SGD optimizer that you can plug into your workload. And then you can just use it. The amazing thing about this is that the Flower framework doesn't even have to change. If there's a new library or a new approach coming out for what you can do on the client side, you can just integrate it, because the client can call arbitrary code.
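To make the client-class idea concrete, here is a minimal sketch in plain Python. The class name SketchClient, the helper functions, and the toy linear model are hypothetical stand-ins. The three methods follow the get-parameters / fit / evaluate shape described in the conversation, but this is not Flower's actual base class or its exact method signatures; a real client would subclass flwr.client.NumPyClient and call into an existing TensorFlow or PyTorch pipeline instead.

```python
from typing import List, Tuple

# Hypothetical toy "pipeline": a 1-D linear model y = w * x + b trained with
# plain gradient descent. In a real project this would be your existing
# TensorFlow or PyTorch training code.
Params = List[float]
Data = List[Tuple[float, float]]


def train_one_epoch(params: Params, data: Data, lr: float = 0.05) -> Params:
    w, b = params
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return [w, b]


def mean_squared_error(params: Params, data: Data) -> float:
    w, b = params
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


class SketchClient:
    """Stand-in for a federated client: thin methods around the pipeline."""

    def __init__(self, data: Data) -> None:
        self.data = data
        self.params: Params = [0.0, 0.0]

    def get_parameters(self) -> Params:
        return list(self.params)

    def fit(self, parameters: Params) -> Tuple[Params, int]:
        # Start from the global parameters, train locally, and return the
        # result together with the number of local examples.
        self.params = train_one_epoch(list(parameters), self.data)
        return list(self.params), len(self.data)

    def evaluate(self, parameters: Params) -> Tuple[float, int]:
        return mean_squared_error(parameters, self.data), len(self.data)


client = SketchClient(data=[(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
loss_before, _ = client.evaluate([0.0, 0.0])
new_params, num_examples = client.fit([0.0, 0.0])
loss_after, _ = client.evaluate(new_params)
```

The point of the design is that everything inside train_one_epoch is ordinary local code, so swapping in a different optimizer (for example a differentially private one) would not require any change on the framework side.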
The other layer that is maybe interesting to understand, maybe not so much for researchers who do most of their day-to-day work in Python, but for others who want to integrate this more deeply, in an automotive setting or something similar: they wouldn't want to use Python for their on-device processing. They would want to use a different language, for example C. In the automotive world, there's this C dialect called MISRA C that you have to use for safety purposes. For example, it prevents you from using recursion and other things that are considered unsafe in the automotive world.
In those scenarios, you can still integrate your device with Flower by directly handling the events that are coming from the server. In the end, Flower is designed in a way where the client side is actually rather easy to implement. If you have something that is running on C or C++, all you have to do is establish a connection to the server. The server then occasionally selects this client, and when it selects the client, it sends it a message.
On the client side, you have to handle this message. You can do your processing. It doesn't have to use any of the well-known machine learning frameworks. You can hand-code the type of model that you have, and then you send back a message containing your update, for example the gradients you collected.
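The message-handling flow described here can be sketched as follows. This is an illustration only, written in Python for consistency with the rest of the discussion rather than in C: the ServerMessage and ClientMessage types, their fields, and the "fit" message kind are all hypothetical and do not reflect Flower's actual wire protocol. The model is hand-coded (y = w * x) to show that no machine learning framework is needed on the client.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServerMessage:
    # Hypothetical message: what to do, plus the current global weights.
    kind: str            # "fit" is the only kind handled in this sketch
    weights: List[float]


@dataclass
class ClientMessage:
    # Hypothetical reply: the client's update and how many examples it used.
    update: List[float]
    num_examples: int


def handle_message(msg: ServerMessage, xs: List[float], ys: List[float]) -> ClientMessage:
    """Handle one server message with a hand-coded model y = w * x."""
    if msg.kind != "fit":
        raise ValueError(f"unsupported message kind: {msg.kind}")
    (w,) = msg.weights
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    return ClientMessage(update=[grad], num_examples=len(xs))


# The "server" selects this client and sends it a message; the client replies.
reply = handle_message(
    ServerMessage(kind="fit", weights=[0.0]),
    xs=[1.0, 2.0],
    ys=[2.0, 4.0],
)
```

In a real deployment the two message types would be serialized over whatever connection the device has established with the server; here they are passed as in-process objects to keep the sketch self-contained.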
That's awesome. I love that sort of client agnostic focus. It's cool. One of the things I was curious about, because as a practitioner, I'm kind of in and out and I'll do other things in my job.
And when I'm coming back in, I have to kind of go, how did I do that before? One of the things that I've noticed in the industry is that the barriers to accessing or utilizing machine learning are getting lower. There are a lot of tools around usability coming out. What does the story look like for Flower, and maybe for federated learning at large, as you have more users out there of various technical capability, and as the technical requirement gradually goes lower and lower because the tooling is better?
How will federated learning fit into that world, where more users with less specific skill in this area are accessing these tools and creating models of various types? What does that look like?
That's a great question. I sometimes say that we've been, or maybe we still are (we're not perfect on that), in a pre-TensorFlow era when it comes to federated learning.
For a long time, if you wanted to build a federated learning workload, you usually had a research-scientist type of person start out by prototyping it and making a simulation of it. If that converged, you could make the decision that you want to have this in production, but then you would basically start from scratch and implement it in a quote-unquote real system with, I don't know, Java or C++ or something like that. So you had to build these systems by hand. For example, there's a blog post that compares federated learning frameworks. Before Flower was around, the conclusion was really that if you want to build a federated learning system and you want to build it in a real production environment, then your best option is to just build it from scratch by hand.
They've recently updated this blog post to say that for their scenario, they chose to use Flower. Obviously I'm happy about that, but it's still not a super polished experience. Flower makes it a lot easier to start out on that journey, but there are still a couple of moving pieces that you should understand to make informed decisions about how to configure your workload, for example. It's obviously one of our priorities to make this even easier, to make it even less likely that, if you are not an expert on this, you configure or build something that might not be a good choice in production. So one of the things that we take very seriously is that we build in the right defaults. One of the defaults that the Flower framework follows, and for such types of workloads it's the go-to recommendation, is that when it gets updates from clients, it does not persist these updates in any way.
These individual updates from clients could allow you to peek into them and draw at least some minor conclusions about a client's dataset. Therefore the recommendation is to receive these updates, keep them only in memory, and only for the minimum amount of time absolutely necessary. Once an update is aggregated with other updates, you can safely discard it. Another very related thing is that, for example, the server does not log any client-specific metrics by default.
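The "keep in memory, aggregate, then discard" default can be illustrated with a small sketch. The InMemoryAggregator class and its method names are hypothetical, and the example-weighted average is an assumed aggregation rule chosen for illustration (in the style of federated averaging); the conversation only says that updates are aggregated and then discarded, not how.

```python
from typing import List, Tuple

Update = Tuple[List[float], int]  # (client weights, number of local examples)


class InMemoryAggregator:
    """Hypothetical server-side helper: holds updates only until aggregation."""

    def __init__(self) -> None:
        self._pending: List[Update] = []

    def receive(self, weights: List[float], num_examples: int) -> None:
        # Nothing is written to disk and nothing is logged per client.
        self._pending.append((weights, num_examples))

    def aggregate(self) -> List[float]:
        if not self._pending:
            raise ValueError("no updates to aggregate")
        total = sum(n for _, n in self._pending)
        dim = len(self._pending[0][0])
        averaged = [
            sum(w[i] * n for w, n in self._pending) / total for i in range(dim)
        ]
        # Individual updates are not needed once averaged, so drop them.
        self._pending.clear()
        return averaged

    def pending_count(self) -> int:
        return len(self._pending)


agg = InMemoryAggregator()
agg.receive([1.0, 2.0], num_examples=10)
agg.receive([3.0, 4.0], num_examples=30)
global_weights = agg.aggregate()
```

After aggregate() returns, only the combined result remains; the per-client inputs that could reveal something about an individual dataset are gone.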
So those are things that we are trying to build in, so that if you just start a server with all defaults, you get something that takes a sensible approach. But then obviously there are more advanced users, and they want to customize it. So the perspective is: make the defaults as safe as we can, and then allow more advanced users to customize these workloads.
So as we close out here, I'm interested to hear about one or a couple of things that really excite you about the future of Flower and maybe its applications within the wider context of federated learning.
What's the one thing or the couple of things that really get you excited about where this is headed, or maybe within the roadmap of Flower?
There are a ton of things that get me excited, both from a research perspective and from a practical perspective. From a research perspective, we just launched a preview of a new feature that we're calling the Virtual Client Engine. The Virtual Client Engine manages clients as virtual clients, so those clients don't actually exist in memory.
This sounds pretty trivial, but what it gives you is amazing scalability for your research workloads. We did a survey of research papers and looked at the scale of the workloads in those experiments. The vast majority of papers used up to 100 clients, and also up to 100 clients doing work concurrently, so training concurrently, for example. So you can have a large client pool, say a pool of 10,000 clients, but then only 100 of them would participate in the same round.
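The idea of clients that do not exist in memory until they are needed can be sketched like this. It is an illustration of the general technique under stated assumptions, not the Virtual Client Engine's implementation: ToyClient, run_round, and the client_fn factory are hypothetical names, the pool and round sizes are the 10,000 and 100 mentioned in the conversation, and the selected clients run one after another here rather than concurrently.

```python
import random
from typing import Callable, List


class ToyClient:
    """Hypothetical client; building one stands in for loading data and a model."""

    def __init__(self, cid: int) -> None:
        self.cid = cid

    def fit(self) -> int:
        # Placeholder for local training; returns the client id as a fake result.
        return self.cid


def run_round(
    pool_size: int,
    clients_per_round: int,
    client_fn: Callable[[int], ToyClient],
    rng: random.Random,
) -> List[int]:
    # Sampling from a range never materializes the whole pool in memory.
    selected = rng.sample(range(pool_size), clients_per_round)
    results = []
    for cid in selected:
        client = client_fn(cid)   # created only when selected
        results.append(client.fit())
        del client                # released before the next one is built
    return results


results = run_round(10_000, 100, ToyClient, random.Random(0))
```

Because the pool is only ever a range of ids, memory use depends on how many clients are active at once, not on how large the pool is.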
This is likely due to resource constraints, because those workloads can get very heavy. The systems that we read about from industry are at a vastly different scale. They have millions, tens of millions, or even hundreds of millions of clients in such a workload. This is quite an interesting and also quite an important challenge to address, because obviously we want to have research that eventually translates to the real world, to practical settings. If the scale in research is very different from the scale in practical settings, it's less likely that the research we are conducting will translate into the practical setting.
So the Virtual Client Engine is one thing where we demonstrated this on quite average hardware, actually. We ran a workload with 15 million clients in it and a thousand of these clients training concurrently, and this works super well. I'm quite excited about that one, and especially excited to see what the community is going to do with it. That's from a research perspective, and we have a couple of things in the pipeline that we are going to announce over the coming months. Then from a practical perspective, which is maybe even more exciting in terms of the real outcomes that we're going to see, one initiative that we are participating in, for example, is called MedPerf, which is hosted by MLCommons, the organization that emerged out of MLPerf.
MedPerf is a way to use federated evaluation to get a better understanding of the performance of medical AI models. It also requires federated infrastructure. We, the entire MedPerf group, put a paper on arXiv a couple of weeks ago, so it's highly recommended. This is something where you can really see that the real-world impact and real-world improvements that we're going to see from this can be very profound, because it's about medical AI, and getting better performance estimates in medical AI is actually a very fundamental challenge.
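As a rough illustration of what federated evaluation means in general, the sketch below has each site score a model on its own data and share only summary counts, which are then combined centrally. Everything here is hypothetical: the function names, the threshold "model", and the toy data are made up for the example, and this is not a description of how the initiative discussed above is actually implemented.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, int]  # (feature, label); toy data


def evaluate_locally(model: Callable[[float], int], data: Sequence[Example]) -> Tuple[int, int]:
    """Run at each site: returns (correct predictions, number of examples)."""
    correct = sum(1 for x, y in data if model(x) == y)
    return correct, len(data)


def federated_accuracy(site_results: List[Tuple[int, int]]) -> float:
    """Run centrally: combines per-site counts; raw data never leaves a site."""
    total_correct = sum(c for c, _ in site_results)
    total = sum(n for _, n in site_results)
    return total_correct / total


def model(x: float) -> int:
    return int(x > 0.5)  # hypothetical threshold classifier


site_a = [(0.9, 1), (0.2, 0), (0.7, 0)]                      # 2 of 3 correct
site_b = [(0.1, 0), (0.8, 1), (0.6, 1), (0.3, 0), (0.4, 1)]  # 4 of 5 correct
accuracy = federated_accuracy(
    [evaluate_locally(model, site_a), evaluate_locally(model, site_b)]
)
```

Combining counts rather than averaging per-site accuracies means larger sites contribute proportionally more to the overall estimate.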
Once we have these better estimates, it is much safer to roll out medical AI models much faster. Apart from MedPerf, there are also a couple of other initiatives in the medical AI space and the drug discovery space that I'm very excited about, because any advance our infrastructure helps to generate can have a very profound impact on society as a whole. So that's something I'm quite keen on contributing to.
Well, Daniel, I'm super excited about all of the things that you've mentioned, in terms of things on the roadmap of research with Flower or practical uses of Flower and federated learning, and I really appreciate you joining us and talking us through everything on the podcast.
Appreciate it, and we'll include some links in our show notes for Flower and all the wonderful things that you've talked about. Thank you so much for joining, and looking forward to keeping tabs on Flower.
Thanks for having me.
That's our show.
Thanks for listening. For more like this, check out our Master Feed. It is all changelog podcasts in one easy to consume place. Let your podcast app snag everything we produce and then pick and choose which ones to listen to.
Subscribe today at changelog.com slash master or just search for Changelog Master in your podcast app of choice. You'll find it. Special thanks to Breakmaster Cylinder for providing our music and to our longtime sponsors, Fastly, LaunchDarkly, and Linode. That's all for this week.
We'll talk to you next time.