Bandwidth for Changelog is provided by Fastly. Learn more at fastly.com. We move fast and fix things here at Changelog because of Rollbar. Check them out at rollbar.com. And we're hosted on Linode cloud servers. Head to linode.com/changelog.
Welcome to Practical AI, a weekly podcast that makes artificial intelligence practical, productive, and accessible to everyone. This is where conversations around AI, machine learning, and data science happen. Join the community and Slack with us around various topics of the show at changelog.com/community, and follow us on Twitter at @PracticalAIFM. Well, welcome to another episode of Practical AI.
Hi, my name is Daniel Whitenack. I'm a data scientist with SIL International, and normally I'm joined by my co-host Chris Benson, who is principal AI strategist at Lockheed Martin, but this week is kind of weird in a couple of respects, so Chris is out dealing with a personal thing, which I totally understand, and he'll be back next week, but also we're all just kind of, at least if you're in the U.S. or watching on from afar, it's kind of a crazy time right now. Yeah, there's a lot of people struggling and suffering and experiencing a lot of hardship, whether that be from the sort of police brutality that's happened, or the looting, or even just the ongoing struggle from COVID.
It's definitely a hard time right now, but there's important issues to talk about, actually not unrelated to this. We know as AI practitioners that a lot of how our models behave is driven by the data that we put in, and because we're often gathering data from a biased world, then often our models end up being biased and not fair, and so this is a real problem, so people like to talk about the sort of sentient problem or singularity that they're afraid of AI taking over the world in that way, but I think in our immediate terms, you know, these sorts of problems of bias and fairness and understanding why that happens, the explainability around our AI models is even more so important because of the things that are happening in our world and because of all of these things that are driving the data that we're using, so today I'm really excited to have an expert on this topic with us, and not only an expert on the topic of explainability and building things related to explainability, but also the CEO of a really innovative company doing a bunch of great things. The company is Darwin AI, and today I have with me Sheldon Fernandez, who's CEO of Darwin AI. Welcome, Sheldon.
Thank you for having me today. Yeah, definitely. So as we get started into this topic, I'd love to just hear a little bit about your background and how you got into the AI world and eventually found your way into Darwin AI. Yeah, it's quite an interesting story.
So I went to the University of Waterloo here in Canada, so I'm right now speaking to you from Toronto, Canada, home of the NBA champion Toronto Raptors, and will be champions for a little bit longer, it looks like. And Waterloo's kind of a tech hub, right? Correct. So for Americans who might be a little ignorant of Canadian things, that's like a big tech hub, right?
Correct. It's kind of like the MIT of Canada. So it's a very engineering-focused, heavy school, and a lot of our tech innovation comes from Waterloo. There's U of T, there's Montreal, but of course I'm going to be biased and say Waterloo's by far the best.
Of course. So I went to Waterloo to study computer engineering, started an enterprise software consulting company in 2001, and grew that to a size of 700. And we were acquired in 2017 by a company called Avanade. They're co-owned by Microsoft and Accenture.
So my job, and I was CTO of that company, was to bring emerging technologies to the enterprise about two or three years before the enterprise was ready to use them. And so when DeepMind accomplished what they did with AlphaGo, you might remember that in 2016, I remember really paying attention to that and thinking, this was significant. I followed computer chess, if you remember, for many years. And it was a very sad day for me when Deep Blue beat Kasparov.
I was very downcast. And my mom thought it was a girl, and when she found out it was Deep Blue beating Kasparov, I think she was half proud, but also half worried that she'd never get grandkids, right? Right, right.
But the thing with Go, the game of Go wasn't supposed to be conquered by machines until 2030 or even 2050. So when DeepMind did that, I remember thinking, this is significant, and how do they do this? And then really getting into deep learning and doing the Geoffrey Hinton deep learning course. And then when we got acquired, I had been speaking about deep learning.
A mutual friend said, go have a conversation with this academic team at Waterloo. It's a special team, and you'll have just a wonderful conversation. And I did. And they just had incredible technology.
And I was just supposed to advise them. I wasn't supposed to start another venture. I had just finished a 17-year journey and was going to take time off and drive my wife to work and watch The Price is Right and all the wonderful things you do when you're retired. But this team was just too special.
Not to be. Yeah, right? So I started advising them, and then the talks got more serious. I'm like, okay, I have to do this.
And so we formalized the company in 2017 and got our venture funding in 2018. And four months after that, my wife got pregnant with our first child. Oh, wow. So I actually have two startups.
I've got an artificial intelligence startup and a biological intelligence startup. Right, both on the same sort of timeline. Exactly. And they're both magical and exhausting in equal measure, but in different ways.
And so, yeah, it's just been an incredible ride. And our chief scientist, Professor Alexander Wong, is a Canada Research Chair in Artificial Intelligence. So this team just has an incredible amount of scholarship and innovation behind them. And so to work with them to take the product to market has just been incredibly exciting, especially given the use cases around AI and so forth.
So that's a quick journey. And then very quickly, although I did an engineering degree, I took some time off from my previous venture. I did a master's degree in theology. Oh, awesome.
Which was just out of interest and somehow that's significant now because of the ethics and the way we think about AI. So just a fascinating combination of things. Well, there's a number of interesting groups kind of exploring that connection. I know there's a group in Seattle.
Also, I work for a faith-based non-profit, and whether it's the value of human life or the sort of things that we started talking about with bias and fairness and all, yeah, it's interesting to have those conversations. So interesting to hear that background as well. Yeah, for sure.
As I was looking through the Darwin AI website and some of your work, I'll link to your website in our show notes. But I was looking specifically through the information about the platform, but also this page you have about research. And it seemed like there were a few kind of themes popping out that were really focused areas. One of those being kind of edge computing and running AI at the edge.
So I see things like EdgeSegNet, which is a compact network for segmentation, and EdgeSpeechNets for speech recognition at the edge. I also saw a theme of kind of generative machines and generating networks in some way. And then, of course, what we talked about a little bit at the beginning, which was related to explainability. So I was wondering if you could kind of give a little background on how those themes came up and maybe starting with the edge case.
Yeah. Why are people concerned about AI at the edge now? Why did that seem like a sort of direction you wanted to put some focus into? Yeah.
So let me bring this together and talk about how the core IP was formed because it kind of informs those three areas, right? Yeah, that'd be great. So our academics had been working with deep learning for about a decade, right? So well before it even entered the consciousness of the enterprise, our academics knew what this was.
They were familiar with the machinery and they were actually doing it for their own research. And they found it to be terribly difficult to develop deep learning neural networks. And they said, look, as academics, this is difficult. We can only imagine how hard this is going to be when non-academics encounter this.
And was that difficulty mostly related to like computational infrastructure difficulty or was it like the sort of complication around the tooling or sort of the theory behind like what's best to do when or what was the difficulty mainly focused around? So there were three difficulties, right? Yeah. The first was they said you need an incredible level of skill to develop these things.
The tool sets are immature, but the mathematical background you need is significant and a barrier. The second, as you say, is the computational overhead in running these networks. So our professor often jokes, he originally did this work for some of his scholarship and he didn't have the funding to pay for hundreds of thousands of dollars in Azure or GCP. So he had to invent a technique to make it quicker.
And then the third was it was so painstaking to do this because you had no understanding of how these networks came to this conclusion. It was like debugging a program without the source code, right? Right, right. So those were the three problems they encountered.
So they invented scholarship initially for their own purposes to address those problems. To scratch their own itch, essentially. Exactly. And so they termed it generative synthesis.
And the way it works is they use other machine learning techniques to probe and understand a neural net. They develop a very sophisticated mathematical understanding of the neural network. And then they use a generative approach to generate an entirely new family of neural nets that is a lot more compact than the original, is as good as the original, and can be explained, right? Yeah.
So as you're kind of describing that, I'm starting to get like vibes of like some of the meta learning, auto ML sort of things. And I know there's a lot of people interested in this. This seems like a sort of unique flavor of it, I guess, at least. Do you see those as kind of these generative techniques that you're talking about and like stuff that people might call auto ML or meta learning?
Do you see those as in the same family or? I'd say they're in the same family. They're analogous to what we do, but different, right? So AutoML does a search across a vast search space and gives you something that it thinks is appropriate.
Which in itself is computationally expensive. Exactly, right? So Google gives it to you for free, but you've got to do it in GCP and that's where they get you, right? Yeah.
It's not free. Whereas we will look at your data and then synthesize a new network from scratch. It's a lot more granular in terms of how we do it, but it is like conceptually similar. And so, yeah, that's the process that these guys invented.
And then, of course, we asked the question when we started the business, okay, what's the commercial potential? Like, what do you do with this? And one of the first ones, and this addresses your first question, was the edge-based scenario, right? What do you do when you need to deploy deep learning to a GPU or CPU?
You don't have, you know, three or four servers to run it. And so that was kind of the first place, or crevice, that we found when we started thinking about this tech. Yeah. And what have you seen now that you've kind of explored that space a bit and are working with clients and people that are doing things at the edge?
What do you see as the sort of real-world driving factor of people wanting to deploy AI at the edge? Because from my perspective, I hear different things. I hear, like, on the one side, like, privacy is the main issue, which I can definitely, you know, see that if data is not leaving the device. Yeah.
But then there's also, like, you know, your device and maybe you're running at a farm or maybe you're running at a factory where the connectivity is not that great. So it's a connectivity to the cloud thing. Yeah. Or maybe, and then I hear, like, a third set of things, which is, like, latency, right?
So you've got, like, your algorithm at the edge and it's really fast and you don't have to, like, wait for things to come back from the cloud or wherever. So from your perspective with clients, what are you seeing as sort of the driving factor there? So it depends on the vertical is what I would say, right? So when you're talking autonomous vehicles or aerospace or defense, they can't afford a round trip to the server.
One, because of latency reasons. If a car needs to make a decision or a drone needs to make a decision in milliseconds, you can't depend on that round trip. And they still have to do things at the edge. So that's the predominant, you know, motivation that I see in those verticals.
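The latency argument above can be made concrete with a small budget check. This is only a sketch: the deadline, inference times, and round-trip time are hypothetical numbers chosen for illustration, and the function names are made up for this example.

```python
# Illustrative latency budget check. All numbers are hypothetical assumptions,
# not measurements from any vehicle, drone, or network.

def total_latency_ms(inference_ms: float, network_round_trip_ms: float = 0.0) -> float:
    """Time from sensor input to decision: model inference plus any network round trip."""
    return inference_ms + network_round_trip_ms


def meets_budget(latency_ms: float, budget_ms: float) -> bool:
    """True when the decision arrives within the allowed time."""
    return latency_ms <= budget_ms


BUDGET_MS = 10.0          # assumed decision deadline
EDGE_INFERENCE_MS = 8.0   # assumed: compact model running on the device
CLOUD_INFERENCE_MS = 2.0  # assumed: larger server GPU, faster per inference
ROUND_TRIP_MS = 50.0      # assumed: network round trip to a data center

edge = total_latency_ms(EDGE_INFERENCE_MS)
cloud = total_latency_ms(CLOUD_INFERENCE_MS, ROUND_TRIP_MS)

print(f"edge:  {edge:.1f} ms, within budget: {meets_budget(edge, BUDGET_MS)}")
print(f"cloud: {cloud:.1f} ms, within budget: {meets_budget(cloud, BUDGET_MS)}")
```

Under these assumed numbers, the server is faster per inference but the round trip alone exceeds the deadline, which is the situation described for vehicles and drones.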
In consumer electronics and health, privacy is probably the more predominant factor, right? If you have a watch and that watch is using deep learning to do some kind of analysis on your heartbeat or detect COVID or whatever it is, you know, you as a consumer don't want that data shared with some central location that is going to aggregate and monetize that data. And so in that situation, privacy is the motivating factor to do it entirely on the edge. So it does differ depending on the vertical that you're dealing with.
Interesting. And just to get a sense, if we're talking, and this is another maybe a point of confusion that I see a lot of times is like when we're talking about AI at the edge, what is the edge? Is the edge, you know, a mobile device? Is it a Raspberry Pi with a camera in your house?
Is it like a legit computer, but it's like at the edge in a manufacturing place? So what are you seeing trends that way? I know there's a lot of, even NVIDIA came out with their new architecture. They have like the edge specific card that they're talking about.
So maybe what you see as the trend of the focus in the future, maybe it depends on the vertical again. Yeah, it's a great question. I mean, now I think we use the edge colloquially to describe a scenario where all the processing is done on device, whatever that might mean, with no processing done in the cloud, right? And so that could be a super powerful GPU in a car.
One of the autonomous trucking companies I know has eight GPUs in the truck, right? And so with that edge, I mean, there's more compute power there than Deep Blue had in 1997. Yeah, or that I ever use for training. Or that you ever use for whatever, right?
So it is an evolving term. I do think we generally mean on some kind of device or something that is autonomous and self-contained and is not being done on the server, right? So a satellite, a drone, your hand, not a computer in the traditional sense is the way we think of it. Gotcha.
And in these cases, what is the biggest concern or hurdle when you're getting to the edge? So you talk about like compact networks, you know, obviously for a thing that is like lower power, you know, if we're thinking about like a small computer, like a small smart camera or something like that, it's going to be low power. It's not going to have that much power and storage space and RAM and all that stuff. So is the compactness, is it dual purpose, both for the, like getting it on the device and the efficiency or what's the blocker there?
So again, it depends on vertical. Yeah. In the case of, for example, defense, they have pretty powerful, you know, devices already that are outfitted on whatever it is they're trying to do. So it's efficiency on those devices.
How many concurrent systems can I run with this hardware that I've already agreed is going to be on the device? In autonomous vehicles, it's I need my perception network to recognize the scene in front of me in 10 milliseconds. And I therefore need it to be really fast on this hardware, right? With consumer electronics, it's a bit different.
It's, look, accuracy is important, but it's not as mission critical as, you know, finding a child in a scene when a car's driving. Right. You know, when you're talking to Siri, it's, hey, it got your last name wrong, let me say it again. In that case, performance and accuracy on the device is usually the predominant factor.
So again, it differs, right? Depending on your use case.
Okay, so I would love to maybe dive into this generative model technology a little bit more. You mentioned that there's this sort of, I think what you call an inquisitor model that kind of studies something.
I'm not sure if I'm clear, you know, exactly the process that that goes through. So I guess from a practical perspective, when we're talking about this, is this a case where like, I still am using like the same types of models, like I have a convolutional net or I have a recurrent neural network or whatever it is, and I have another model that's performing this function? Or is everything happening together in the same sort of different model of some type? Right.
So I mean, it's happening underneath the hood with this technology that uses this inquisitor-generator pair. But, you know, you're giving it a neural network, and you're getting a number of neural networks generated that are more compact and work against your data with usually the same accuracy and are faster.
So the internals of how it works are interesting to academics, and we've issued papers on this without giving away the core IP. But really, you're giving me a neural network, which is a graph, and you're getting a much more compact graph as an output. And then our platform will provide the explainability and so forth. Gotcha.
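To give a rough sense of what "a much more compact graph" can mean, the sketch below counts parameters for two fully connected networks described only by their layer widths. It does not implement generative synthesis or anything from Darwin AI's platform. The layer widths and the `parameter_count` helper are invented for illustration.

```python
# Toy comparison of two dense networks described only by layer widths.
# The widths are made up; this is not generative synthesis.
from typing import Sequence


def parameter_count(layer_widths: Sequence[int]) -> int:
    """Weights plus biases for a fully connected network with the given layer widths."""
    total = 0
    for fan_in, fan_out in zip(layer_widths, layer_widths[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total


original = [784, 512, 512, 10]  # hypothetical network handed to a tool
compact = [784, 64, 32, 10]     # hypothetical smaller network handed back

orig_params = parameter_count(original)
compact_params = parameter_count(compact)

print(f"original: {orig_params} parameters")
print(f"compact:  {compact_params} parameters")
print(f"reduction factor: {orig_params / compact_params:.1f}x")
```

Parameter count is only one proxy for compactness. Whether the smaller network keeps the same accuracy has to be measured against the user's own data, as described above.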
It would still be up to, let's say if I'm an AI practitioner, I'm using this type of technology, it may still be up to me to determine, like, hey, here's a computer vision problem. I'm going to train a convolutional neural network on this data. But then afterwards, I'm going to provide it to the system and get a better architecture out. Is it kind of two-stage like that, or can you do, like, everything in one shot?
So you can do it in one shot. You can choose a popular public reference model. We're adding this feature to the platform. You can say, look, I have a computer vision problem.
I don't know what the best thing is. Is it Inception, is it ResNet? And we will take a public model and produce a really optimized version for you against your data. Or if you're a more intermediate or advanced user, you might already have a network. You might have already trained it and done all that pre-work.
And you're just going to give that to the platform and say, give me the best version of this against my data. So it can work in either way, depending on where you are in the process. Yeah. Yeah.
Personally, I kind of like this way of thinking about it. Because oftentimes when I talk to people about auto-ML or meta-learning or something like that, it seems like the end goal of where people want to get is, like, I just have data. And then, like, whatever sophisticated system I have figures out everything for me. Yeah.
Maybe a possibility in certain cases that have been very well studied. But I also know just from my own experience that every use case I come up with, it's, like, weird in some way that just doesn't match, like, something. That's our big thing. Like, we are proponents of human-machine collaboration.
Yeah. Right? You need a human in the loop to couple the laborious intelligence of AI with your own intuition as a human being. And that's not going away anytime soon.
Right? I mean, I've been in the technology field for 25 years. So many times I've heard, you don't ever have to code again. Right?
Yeah. There's all these tools. Software engineers are going to automate away software engineers. Right?
And, like, you hear that and you roll your eyes because you've seen it so many times. You know, I'm a Microsoft guy. I remember BizTalk. I don't know if you remember that product.
It was supposed to do away with all coders. And, like, you know, then you encounter it, and, like, one asymmetry in the data or one irregular use case blows the whole system up. So we have enough experience to know. Look, we're giving you a tool that takes away the arduous elements of deep learning.
But you still apply your creativity and understanding that gets you there a lot faster. And, like, I think that's going to be with us for some time. Yeah. Do you think that, I know one of the things, I forget where I saw this, I think maybe it was at a TensorFlow Dev Summit when they were talking about AutoML, that one of the things that they saw as interesting in this process is not so much automating away everything, but just learning new architectures that they wouldn't have guessed prior.
Is that something that you found in doing this different, it's a different approach, but you are still generating sort of new graphs, like you say. In looking at those new graphs and those new architectures, have surprising things come out from that in terms of, like, what's actually needed to solve certain problems? That's a great question. And one I would have to ask our, like, deep researchers.
Yeah. Do they look at the new architecture and does that give them an idea? The fact of the matter is, like, very few people are designing networks from the ground up. Yeah.
It's like, you know, like the big five basically do it because they've got the intellectual horsepower to do it. Now, where we do have insights, though, and maybe we'll get to this, is the explainability piece of why certain decisions are being made. Like, that is intriguing, and that teaches you things that just never would have occurred to you before. Yeah, so I totally agree with you.
I often, when I teach classes, I say, like, most of AI in practice is not sort of, like, drawing networks on the chalkboard and, like, starting with a blank chalkboard and then going. It's more like cooking in the sense that you get a recipe, and then you have to bring your ingredients to it, your data to it. And you might have to change the recipe a bit because you don't have these ingredients or those. That's a great analogy.
I'm going to steal that. Okay, please do. I like that a lot, but I will give you credit when I use it, yeah. It sounds good.
Yeah, at least I can contribute something to the AI community. Since we kind of got there naturally, let's talk a bit more about the explainability piece. And maybe we can actually start at a higher level there as well and talk about, like, let's say, in the absence of the things that you're doing and your team is doing. And actually, you know, many other teams are exploring explainability things.
For those that are maybe newer to AI or are in a company and exploring AI and are concerned about it, what are the main sort of reasons why we need explainability? And then at what level do we need explainability? Like, you know, because, like, for example, with GDPR, when people are talking about, oh, you give an explanation for how you process people's data. Well, like, I don't always know why my network did something, and I think it would be infinitely hard to describe everything.
So what are the main challenges and what expectations can we have for explaining? Okay, so great question. So first of all, the fundamental problem with machine learning and deep learning is you're essentially saying to these systems, here's a bunch of data. You infer your own rules as to how you're going to make a decision against this data, and all I care about is the results, right?
And the reason machine learning, you know, is so powerful is it is great at, you know, characterizing situations where the rules cannot be codified in human terms. And that's why it's great. An example I often give is, you know, giving a neural network a picture of something, a lion or tiger, and saying, hey, classify this. You know, and this is a profoundly difficult problem in computer graphics before neural networks.
Something my son can do now at 18 months was impossible, right? So it's wonderful that we can do that, but the explainability problem exists because we don't know how the neural network is orienting itself internally with its weights and so forth to reach that conclusion, right? And so the problem with explainability is if you don't understand how a decision is being made, you don't understand where it will fail. And if you don't understand where it will fail, there are all these edge cases that are lurking in the network with potentially catastrophic consequences.
So a very practical example I can give you. In the early days of Darwin, we worked with an autonomous vehicle company, and I've used this example in my writing a little bit, where their car, the perception network in their car, or the AI in their car, would turn left with increasing statistical frequency when the color of the sky was a certain shade of purple, right? So just pause on that. Like, you and I know that the color of the sky never influences the way you turn.
Maybe there's a volcano on the right or something, right? But that was mystifying to them. And so without explainability, they couldn't understand what are the drivers here? It's what we call a nonsensical correlation.
And so we were able to help them debug that they had done the training for the turning scenario in the Nevada desert when the color of the sky was that shade of purple, and that was the correlation the car had made, right? But in order to understand those nonsensical correlations, in order to identify the edge cases, you need to have some insight into why the neural network is doing what it's doing. And so that is why explainability is so important, is to make more robust networks and give the data scientists and the deep learning developers tools to make those more robust networks. Yeah, I think you really summarized that well.
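A simple way to see how a correlation like this can hide in training data is to compare label frequencies with and without the irrelevant feature. The sketch below uses invented counts. It is a plain data audit for illustration, not the explainability method described in the conversation.

```python
# Toy data audit for a "nonsensical correlation". The records are invented:
# each one is (sky_is_purple, steering_label) for a single training frame.
from collections import Counter

training_frames = (
    [(True, "left")] * 40        # assumed: left turns mostly recorded under a purple sky
    + [(True, "straight")] * 10
    + [(False, "left")] * 5
    + [(False, "straight")] * 45
)


def left_turn_rate(frames, sky_is_purple):
    """Fraction of frames labeled 'left' among those with the given sky condition."""
    labels = Counter(label for purple, label in frames if purple == sky_is_purple)
    total = sum(labels.values())
    return labels["left"] / total if total else 0.0


rate_purple = left_turn_rate(training_frames, True)   # 40 of 50 frames
rate_other = left_turn_rate(training_frames, False)   # 5 of 50 frames

print(f"left-turn rate, purple sky: {rate_purple:.2f}")
print(f"left-turn rate, other sky:  {rate_other:.2f}")
```

A model trained on frames like these can lower its loss by keying on sky color, even though the sky has nothing to do with steering. A gap this large between the two rates is a hint to collect turning data under other conditions.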
And I think some of this is, like, it's beginning to be on my mind so much more as I develop. I know just last week, just to shout out one of my favorite podcasts, which is the NLP Highlights podcast from the Allen Institute for AI. They had Marco Ribeiro on there, he's from Microsoft, and he was talking about behavioral testing of NLP models. And he basically used a bunch of commercially available systems in his paper, like, from Microsoft and Google and Amazon.
Ones they sell, for example, for, like, sentiment analysis. And he did some very kind of, like, so what he called minimum functionality tests that were not based on, like, a training set that they used to train the model, but just were, like, the minimum functionality you'd expect from a sentiment analysis. Like, can it get the sentiment of, I don't like the food. Yeah, yeah.
But then what he did is he made perturbations on that, like, changing, you know, I don't like burritos to I don't like oranges. So, like, simple perturbations that should not change the sentiment from, like, negative to positive. Right. Or changing, like, I like the U.S. to I like Turkey.
Right. And then seeing if that changed. And he actually found a huge percentage of failures in these commercially available systems on these kinds of minimum functionality tests.
Right. And some of those things were tied to the way things were represented in the data. Like, Turkey was represented in a very negative light in a lot of the data that this model was trained on. So I totally agree with you.
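The two kinds of checks described here, a minimum functionality test and a perturbation that should leave the prediction unchanged, can be sketched in a few lines. The `toy_sentiment` model below is a deliberately flawed, hypothetical stand-in with a built-in bias. It is not any commercial system, and the helper names are made up for this example.

```python
# Toy behavioral tests for a sentiment model. `toy_sentiment` is a deliberately
# flawed stand-in (hypothetical), not any commercial system.

NEGATIVE_WORDS = {"don't", "hate", "turkey"}  # "turkey" mimics a bias picked up from data


def toy_sentiment(text: str) -> str:
    """Label text 'negative' if it contains any word from NEGATIVE_WORDS."""
    words = text.lower().replace(".", "").split()
    return "negative" if any(w in NEGATIVE_WORDS for w in words) else "positive"


def minimum_functionality_test(model, cases):
    """Return the (text, expected) pairs the model gets wrong."""
    return [(text, expected) for text, expected in cases if model(text) != expected]


def invariance_test(model, pairs):
    """Return the pairs whose predicted label changes under a harmless word swap."""
    return [(a, b) for a, b in pairs if model(a) != model(b)]


mft_failures = minimum_functionality_test(
    toy_sentiment,
    [("I don't like the food.", "negative"), ("I like the food.", "positive")],
)
inv_failures = invariance_test(
    toy_sentiment,
    [
        ("I don't like burritos.", "I don't like oranges."),
        ("I like the U.S.", "I like Turkey."),
    ],
)

print("minimum functionality failures:", mft_failures)
print("invariance failures:", inv_failures)
```

The toy model passes the basic cases and the food swap, but flips its label when the country name changes. That mirrors the failure described above, where a name carries a negative association from the training data.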
In that case, like, this thing's already deployed, right? And you have this existing problem in the system. Yeah. And you don't know it until you hit it.
Right. Yeah. Whereas you're saying, like, if you're developing, you should ideally know where you're going to maybe hit some of those pitfalls or at least understand why you hit those so you can get the system better. And so sometimes the system will get it right, but for the wrong reasons.
Right. So a very popular example I've heard is there's a neural network that was trained to detect horses, recognize horses, and it performed admirably well. But what they didn't realize was, apparently, and this was news to me, many professional pictures of horses are copyrighted.
So it was actually looking at the copyright symbol at the bottom of the picture. And that was right. But of course, so what you do then is you say, oh, OK, let's remove the copyright from the picture and let it organically or naturally look at the features of a horse to detect things. So that is what you're trying to do, is align how it's triggering on data for decision making with your own intuition that, yes, this makes sense.
Right. Now, sometimes you will actually learn from explainability. The neural network will teach you something about explainability. So our professor does a lot of work with neural network and detection of disease.
And so actually, we've been in the news because when Corona became a serious thing here in Canada, we released an open source network called COVID-Net that detects Corona using chest X-rays. But his previous work was detecting lung cancer with CT scans. And we show this example where the neural network was looking at the walls of the lung, which had never occurred to radiologists, apparently. I'm probably oversimplifying a little.
But they actually started looking at it and thinking, huh, what maybe here can we learn from what the neural network is looking at? So explainability. So there's a second benefit in that sometimes, not often, but occasionally, it will actually teach the subject matter expert about a new way of thinking about the domain. So as we're thinking about this explainability piece, I'm kind of curious.
We kind of motivated, I guess, why explainability and some of the pitfalls that people can fall into and also this sort of dual benefit of also learning in some cases from explainability. But I'm curious, like from a practical perspective, as I'm using this system and I'm learning about my network, what is the sort of range of things that I can learn since, you know, there could be so many different types of like features that could be contributing to something. So like when I'm using the system, what sort of feedback am I getting? Is it like, you know, these portions of the network are doing something weird or interesting or is it more having to do with the data?
Like, you say this segment of the data is important for this prediction, or what's the sort of range of things? Yeah, it's a great question. So we really asked the question, how do we surface explainable insights that are most useful to developers? And what we saw was very few of them really go down to the architectural level and tweak individual weights and so forth.
It happens, but like in a very small minority of cases, most of them want explainability against data. Why is the network doing what it's doing against this data set or this family of data? So that's what we surface in the platform, right? You know, when we detected a lion being a lion, what were we looking at predominantly?
When the network got it wrong and different from the human labeler, what did it get wrong? When you remove the predominant, you know, inputs that the explainability algorithm detects are important, when you remove those, how does the prediction change? So that's kind of the data that we surface that we find really accelerates the deep learning development process. Gotcha.
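The "remove the predominant inputs and see how the prediction changes" idea described above is often called an occlusion or deletion test. Here is a minimal sketch under stated assumptions: `toy_model` is a hypothetical stand-in for a trained network, the 3x3 image is made up, and occlusion-based importance is a generic technique swapped in for illustration, not the algorithm the Darwin AI platform uses.

```python
def toy_model(image):
    """Stand-in classifier: the score is the mean of the top-left 2x2 patch.

    A real model would be a trained network; this toy keeps the sketch runnable.
    """
    patch = [image[r][c] for r in range(2) for c in range(2)]
    return sum(patch) / len(patch)


def occlusion_importance(model, image, baseline=0.0):
    """Score each pixel by how much the prediction drops when it is blanked."""
    base_score = model(image)
    importance = []
    for r, row in enumerate(image):
        importance_row = []
        for c, _ in enumerate(row):
            occluded = [list(x) for x in image]  # copy so the input is untouched
            occluded[r][c] = baseline
            importance_row.append(base_score - model(occluded))
        importance.append(importance_row)
    return importance


def delete_top_k(image, importance, k, baseline=0.0):
    """Blank the k most important pixels and return the modified copy."""
    ranked = sorted(
        (
            (importance[r][c], r, c)
            for r in range(len(image))
            for c in range(len(image[0]))
        ),
        reverse=True,
    )
    result = [list(x) for x in image]
    for _, r, c in ranked[:k]:
        result[r][c] = baseline
    return result


image = [
    [1.0, 0.8, 0.1],
    [0.6, 0.4, 0.2],
    [0.3, 0.2, 0.9],
]
importance = occlusion_importance(toy_model, image)
before = toy_model(image)
after = toy_model(delete_top_k(image, importance, k=2))
print(f"score before: {before:.2f}, after removing top-2 pixels: {after:.2f}")
```

If the score barely moves after the supposedly important inputs are removed, the explanation is probably not telling you what the model actually relies on. A large drop is evidence that it is.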
How do you balance the sort of range of data that people are dealing with? Is it a matter of kind of starting to specialize in a few different types of data, like text and images, and then like moving on past that, just in terms of product development, I mean, it's got to be a burden to sort of think about all of these weird scenarios. Yeah, so it's a great question. So right now, our explainability focuses on things you can represent visually, object detection, object classification.
Your question's a good one. How do you surface explainability for natural language translation or something that's inherently non-visual? So we're doing the visual stuff first because it's easier to surface, and then eventually we'll get to some of the other stuff, yeah. Yeah, I know that there's some interesting attempts out there.
Actually, we had a guy on the podcast who was talking about recurrent units in a neural network and how those behave in terms of the memory or what they pay attention to previously or forward in a text sequence and visualizing that sort of thing, and that was really interesting. I'll link that one in the show notes. So maybe there's some things that are possible there. I could definitely see it.
That in itself is a research topic, almost. Yeah, yeah, that's interesting. So in terms of the system, how do you make decisions about, like, do you provide, and I'm kind of curious about this just from like an AI startup perspective, because there's a lot of people out there trying to do different things, and some are like, well, you know, our system bolts onto TensorFlow or has a TensorFlow backend or PyTorch or whatever. Are you kind of providing a self-service portal for people to do things where, like, the frameworks and that sort of thing are transparent and they're really just kind of importing models and that sort of thing, or is it a kind of augmentation to their existing workflow?
Yeah, it's the latter. So we build on top of TensorFlow right now. You give us a TensorFlow model and the data, we do what we do, and then you get a TensorFlow output. It's not SaaS.
And so that was a big learning for us when we started Darwin. We wanted it to be SaaS because then we wouldn't have to expose anything. But the enterprise was quickly saying, we're not sharing our models, number one, and we're not sharing our data. So, like, you know, disabuse yourself of that fantasy right now.
So, yeah, so we provide an enterprise platform that sits on top of TensorFlow. We're adding support for PyTorch later this year. And, you know, that way your workflow doesn't have to change considerably. You just use the parts of the tool that you find are useful.
Awesome. And how big is the team now? Is it quickly growing or what's the... It's about 25.
Okay. So we're, you know, a smaller team in Waterloo. I live in Toronto and was commuting before COVID. But as the commercial traction increases and people start doing it, I imagine we'll be growing in the months ahead, yeah.
Yeah. And I'm always curious for organizations like yours, you know, if I go to your research page and look at all the great things that you've submitted, or if you look at a company like Hugging Face or others where it's still a relatively small team, but it's like there's a product and there's also, like, this great research coming out. Like, I don't know how you do it all. Is that, like, strategic partnership with the university as well or...?
Yeah, great question. So two of our co-founders are professors at Waterloo. Okay. We're in the unique position of having an organic academic connection with the university.
And part of their university responsibility is to publish scholarship and to publish, you know, academic papers and so forth. It's a natural part of our identity. It requires discipline to get the balance right because research is good and academics love to research things. But if you're a startup, you also need to channel that research into your product while at the same time giving your academics the latitude to explore things for the next, you know, the next-gen stuff.
So it requires discipline. We're hopefully getting it right, but it's something you're constantly thinking about. Yeah, yeah, for sure. I mean, you're doing great work.
I'm really happy to see a lot of it. From your perspective, maybe whether that's with edge computing, AI at the edge, or maybe it's with explainability, what are you excited about for the future in terms of, like, things coming down the road or things that you want to explore that seem like really interesting areas? What are some of those things that excite you personally? Yeah, I think one of the big things is really seeing significant deep learning use cases realized in the next, you know, couple years.
You know, there's this concept of the adjacent possible that you may have heard about, which is what can you do with your technology given where the world is at? And sometimes at Darwin, you know, we were maybe a year too early when we started, you know, and so the industry is now catching up and grappling with the problems that our academics knew they would have 10 years ago. And so what excites me is now that they're actually doing it, now that, you know, Lockheed is trying to do AI and implement robust AI at the edge, the problems we... foresaw are ones our tool set helps with.
So it's seeing how those use cases play out and just knowing what deep learning can do and the number of different areas where it can be used. I mean, healthcare, to take an obvious example, the amount of interest we've gotten because that vertical is really looking seriously at experimental technology to create a vaccine is incredible. Digital learning, like so many things. I think just the general applicability of it is what fascinates me.
I say sometimes it sort of reminds me of the internet in the early 90s. And I don't know if you remember that, but I was a teenager in high school and we didn't think in 1992 or 93 that these little signals going over telephone wires with modems would reimagine the world and reimagine industries. And to me, it's the same thing with artificial intelligence, only greater. So the potential there is incredible for me.
Yeah, that's really good to hear. And I can definitely see what you're saying. Even looking at conferences that I attended like three, four years ago and kind of leading up until now, it seemed like of course there was a lot of focus on like, you know, we did this cool thing with the model. We processed this much data with our big data platform. And now I think, you know, I see kind of two huge things, almost in some cases dominating discussion at conferences now, which is like the explainability and fairness and bias piece.
And then there's the like, how am I going to manage this now? Like it's a legit piece of our software stack and we've decided to buy in. So like, how do we actually like integrate AI? And so you see a bunch of, you know, platforms and tools and deployment systems and all those things out there.
So yeah, I think, you know, that's really exciting for sure, from my perspective, because of course this is practical AI. So, you know. Think about the first days when the very first developers wrote in Notepad or the text editor and they just wrote program code and compiled and that was it. And then it got more sophisticated.
We had IDEs and source control and QA and like, you know, we're speaking at an MLOps, DevOps, you know, virtual event later. And like, yeah, the enterprise is now like, oh, how do we manage this? How do you version models? How do you version data?
Like, you know, so it shows you the seriousness with which organizations are taking this, right? And the tooling needs to be there for it. Yeah. And then it's not going anywhere for sure.
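One simple way to approach the "how do you version models, how do you version data" questions raised above is to derive version IDs from the content itself. The sketch below uses only Python's standard `hashlib` and `json` modules. The helper names `content_version` and `build_manifest` and the byte strings are hypothetical, and dedicated MLOps tools handle this far more thoroughly.

```python
import hashlib
import json


def content_version(data: bytes, length: int = 12) -> str:
    """Return a short, deterministic version ID derived from the bytes."""
    return hashlib.sha256(data).hexdigest()[:length]


def build_manifest(model_bytes: bytes, dataset_bytes: bytes) -> dict:
    """Record which model version goes with which data version."""
    return {
        "model_version": content_version(model_bytes),
        "data_version": content_version(dataset_bytes),
    }


# Hypothetical artifacts; in practice these would be read from files on disk.
model_bytes = b"serialized-model-weights-v1"
dataset_bytes = b"image_id,label\n1,horse\n2,lion\n"

manifest = build_manifest(model_bytes, dataset_bytes)
print(json.dumps(manifest, indent=2))

# Changing a single label produces a different data version.
edited = dataset_bytes.replace(b"lion", b"tiger")
changed = content_version(edited) != manifest["data_version"]
print(f"data version changed after edit: {changed}")
```

Because the ID depends only on the bytes, two people with the same model and data get the same manifest, and any silent change to either one shows up as a new version.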
Yeah. I don't know. I feel like, and I've heard people refer to this, that there's kind of going to be this AI layer in the software stack now. Right.
And just like if you're a software engineer and you don't know anything at all about CI/CD, probably you need to learn a little bit about at least where you can interact with that system. Not that you have to like spin up your own or whatever, but like you're going to be interacting with this in a similar way. Like, okay, maybe you don't need to be an AI expert, but you're going to need to interact with these systems and they're going to be part of our lives. So exactly.
Yeah, for sure. Well, it's been really good to chat with you. Where can people find out more about Darwin AI and what sort of tips would you give people if maybe they're convinced now that the explainability is important? What kind of tips would you give them for maybe either getting started with it, with your system, or maybe just learning about explainability in general and the topic out there?
Any suggestions? Yeah. So we actually wrote a pretty lengthy explainability primer on Medium, which goes over the problem, which goes over traditional techniques. And so if you follow us on our Twitter, Darwin AI or our LinkedIn page, we post this all.
And so that's a really good place to start because we go over why the problem exists, what the common techniques are, what our technique is, our scholarship around this. I think that that's a really good starting point for people that are grappling with this and thinking about addressing the problem. Yeah. Yeah.
Great. And in light of a lot of the things going on in our nation and otherwise too, I definitely recommend one thing that I found really useful out there in terms of the sort of exposing maybe on the data side, like the biases and the fairness in your data. There's a great toolkit from IBM, AI Fairness 360. Even if you don't use their toolkit, you can learn a lot about the various ways people are looking at bias in data and other things like that.
So just for our listeners who might be interested on that topic. But really appreciate you joining us today, Sheldon. It's been a really great and timely conversation and really excited to see what comes out of Darwin AI and how the platform progresses. And I hope we can meet and chat at either a virtual or real conference or something sometime.
Yeah. No, thank you for having me. And as you alluded to at the beginning of the program, we're in a period of real challenge as a species. So Godspeed to everybody who's listening.
And there's a phrase I told my team this morning that I'm fond of. Martin Luther King, you know, the arc of the moral universe is long, but it bends towards justice. I believe that. And it's sometimes hard to see.
And we regress a little bit, but I do believe that's true. So, you know, let's just all stay strong and united. Yeah, thank you for that. And thank you for joining.
Thank you so much. Thank you for listening to Practical AI. We appreciate your time and your attention. Word of mouth is the number one way people find new podcasts.
If Practical AI has helped you on your AI journey, please do tell a friend. They'll thank you later. Special thanks to Breakmaster Cylinder for the beats and to our awesome partners for their support. Shout out to Fastly, Linode, and Rollbar. If you and your organization would benefit by speaking directly to the AI community, you should sponsor Practical AI.
Podcast advertising is highly effective and we would love to work with you. Head to changelog.com slash sponsor to learn more. That's all for now. We'll talk to you again next week.