Going full bore with Graphcore!

What this episode covers

Dave Lacey takes Daniel and Chris on a journey that connects the user interfaces that we already know - TensorFlow and PyTorch - with the layers that connect to the underlying hardware. Along the way, we learn about Poplar Graph Framework Software. If you are the type of practitioner who values ‘under the hood’ knowledge, then this is the episode for you.

Sponsors:
O'Reilly Media – Learn by doing: Python, data, AI, machine learning, Kubernetes, Docker, and more. Just open your browser and dive in. Learn more and keep your teams’ skills sharp at oreilly.com/changelog
RudderStack – Smart customer data pipeline made for developers. Connect your whole customer data stack. Warehouse-first, open source Segment alternative.
The Brave Browser – Browse the web up to 8x faster than Chrome and Safari, block ads and trackers by default, and reward your favorite creators with the built-in Basic Attention Token. Download Brave for free and give tipping a try right here on changelog.com.

Featuring:
Dave Lacey
Chris Benson
Daniel Whitenack

Show Notes:
Graphcore
Poplar Graph Framework Software

TRANSCRIPT · AUTO-GENERATED

This is one of the reasons that we're quite keen on having a very accessible but powerful software stack, because there are all these choices to make. It's a really great example you made there with GNNs, because it shows that what we might be doing in two or three years' time will be quite different to what we were doing two or three years ago. And if you think like five years before that, then it wasn't neural networks, right? It wasn't even deep learning then.

So I think the key for us, as much as having the right processor and having these graph techniques, if we can call them that as a family, is to have the capacity to do innovation. Big thanks to our partners Linode, Fastly, and LaunchDarkly. We love Linode. They keep it fast and simple.

Check them out at linode.com/changelog. Our bandwidth is provided by Fastly. Learn more at fastly.com, and get your feature flags powered by LaunchDarkly. Get a demo at launchdarkly.com.

Welcome to Practical AI, a weekly podcast that makes artificial intelligence practical, productive, and accessible to everyone. This is where conversations around AI, machine learning, and data science happen. Join the community and Slack with us around various topics of the show at changelog.com/community and follow us on Twitter. We're @PracticalAIFM.

Welcome to another episode of Practical AI. This is Daniel Whitenack. I'm a data scientist with SIL International. I'm joined as always by my co-host Chris Benson, who is a principal emerging technology strategist at Lockheed Martin.

How are you doing, Chris? I'm doing good. It's a beautiful spring day and we're about to talk AI with interesting people. You can't ask for much more than that, my friend. Spring has sprung.

Spring has sprung. Since you're in the South: one of my co-workers this morning was talking to me about how every 17 years, with cicadas, there's a cicada brood. It goes crazy. Entomologist.

Wait, is that what it is? I forget the name. I'm not one of those people. I'm on the settings box, but are you aware of this?

I had heard that actually before and it affects me in a fairly direct way. So long-time listeners and certainly Daniel knows that I do a lot of wildlife stuff. And so on the side, separate from this whole AI thing, I do a lot of wildlife stuff. And one of the things I do is I am a snake wrangler among other things as part of that wildlife thing.

So I am commonly picking up copperheads, with the appropriate safety equipment, by the way, just so no one gets the wrong idea. And copperheads love cicadas. A poisonous snake, for those that aren't in the US. Chris is very adventurous.

If you're not in the southeastern United States, a copperhead is a venomous snake. And so I have a lot of experience with those. And so totally off topic, of course. The cicadas?

Yes. Favorite meal of a copperhead? Oh, OK. It's their prime time.

That's right. Who knew we were going here at the beginning of the show? Yeah. Yeah.

Well, I mean, I'm sure that there's all sorts of interesting AI to predict the cicada brood numbers this year. But that's not the topic of the show. That's not what we're talking about today. I'm so sorry, folks.

Yeah. Actually, I'm pretty interested in the topic this week, because it seems like we talk a lot on the show about GPUs or accelerators or specialty cards for AI, but a lot of the time they're mentioned just in the context of the accelerators themselves, not the software component that goes along with them. And I don't know about you, but I've found a lot of times that I'm like, oh, I have access to this really cool card or something. And I want to use it.

But the problem is not having access to it. It's the software that I write for it has all of these issues. So I'm really excited to talk about that sort of software hardware interface today. Real life AI right there.

Yeah, exactly. Real life practical. Very practical. We have Dave Lacey, who is chief software architect at Graphcore.

Welcome, Dave. Hi, guys. Great to be here. Yeah.

And before we jump into things, do you want to give us just a brief sketch of your background? And how you got into doing what you're doing now? Sure. Yeah.

So I'm a computer scientist, particularly specialized in compilers and that area of computer science. So I did research in that in academia. And then over the last, well, I don't want to say how many years, I've been working in various companies on the software stack to target new types of hardware in different areas: most recently, as we'll talk about, at Graphcore in AI, and previously in HPC and in processing and so on. So I'm very much a software guy, trying to make sure that as we push the field forward and get new hardware to enable new things, we have software that can support that and let users actually program these wonderful new things that get created. Cool.

Yeah. And you mentioned a few different things. Of course, there's AI specific hardware. There's high performance computing or HPC.

I think a lot of times when people think about AI hardware, maybe they're thinking about GPUs, but I mean, the story is a lot more diverse than that. So maybe you could give us a little bit of a sketch of the landscape of AI-specific hardware, or hardware that targets AI workloads, I guess is a good way to put it. What are those categories of things out there? So I think the ones everyone knows about are CPUs, clearly, and they definitely have their place in this landscape for specific tasks.

And then obviously, more recently, GPUs: because of the nature of the highly parallel compute that they target, they've had quite a bit of success. I think you see quite a few companies out there, and I think ours (of course, it's where I'm from, and obviously I'm unbiased) is one of the leading, if not the leading, new type of chip in there, where what we're trying to do is produce a chip that's specifically for machine learning and AI. So one thing about other chips is you'll find that they were designed for something else: CPUs do all kinds of things, GPUs were for graphics, and our chip is something where we've gone, well, can we design a processor that's just for this space? And this space is really important.

You've got so many applications that it's worth designing a processor specifically for it. I think that's the new breed of processor you'll get, and I think the IPU is one, obviously, that really fits that. It allows us to do more in this space than other processors can, because we thought from the beginning about the kind of attributes you need for an AI chip. So what kind of data are we dealing with? What kind of data patterns does that involve?

What ramifications does that have for us, for example, on the memory hierarchy? It's very different to other processors, so that we can process data in a different way, and the same goes for the compute units and so on. That's where we find ourselves now, where we have specialized hardware for ML, AI, intelligent computing in general, and that's where the IPU sits for us at Graphcore. So as we've talked to different folks through different episodes, we're trying to understand where you fit in, and you're laying out the landscape still at a high level.

Can you talk a little bit about the different categories before we dive fully into Graphcore, just so that we know in our heads where to position it and how it fits in? Because people come into this conversation with different levels of understanding and knowledge. Can you differentiate the different pieces of the landscape, so that as we focus in on where you're at, we'll have some context? Yeah, yeah.

I think I can take that landscape and talk about it in terms of the types of processing the different processors do. Sure, sure. That's fine. Yeah, that'd be great.

Yeah. So a CPU very much deals with scalar processing. Scalar processing is dealing with individual numbers, maybe small groups of numbers together, like small vectors and things like that. And they'll do it.

And that's very good for general compute, right? Because really anything can be split down into individual numbers, right? So that's the CPU kind of space. GPUs, what they deal with primarily is large vectors of numbers.

I think that's very clear, right? So contiguous blocks of numbers come in and get dealt with in parallel, and of course because they're dealing with them in parallel, they can get better efficiency, so they can go faster. I'd say the IPU is what you could call a graph processor, in that it's dealing with data where you have to handle a lot of it in parallel, so it's a highly parallel processor. But the connections between the data, I mean, they don't come in big contiguous blocks of vectors.

Actually, you're going to have to take one number from over here and one number from over there, and then deal with them in parallel. And that's in some ways, you know, a way of characterising the graph processor, and it's one of the aspects you can think of in the categorization. Another categorization, if I may, is about the type of number you're dealing with.
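The three styles of processing described above can be sketched in a few lines of plain Python. This is only an illustration of the access patterns, not of how any real CPU, GPU, or IPU is programmed; the `data` values and the `edges` table are made up for the example.

```python
# Illustrative data; the values have no special meaning.
data = [0.5, 1.5, 2.0, 4.0, 8.0, 3.0, 1.0, 6.0]


def scalar_sum(values):
    """Scalar style: handle one number at a time."""
    total = 0.0
    for v in values:
        total += v
    return total


def vector_scale(values, start, length, factor):
    """Vector style: the same operation over one contiguous block."""
    block = values[start:start + length]
    return [factor * v for v in block]


# Graph style: each output node combines numbers from arbitrary positions.
# This edge table is hypothetical: node -> indices it reads from.
edges = {0: [7, 2], 1: [0, 5, 3], 2: [6]}


def graph_gather(values, edges):
    """Each node sums inputs that are scattered, not contiguous."""
    return {node: sum(values[i] for i in sources)
            for node, sources in edges.items()}


print(scalar_sum(data))
print(vector_scale(data, 2, 3, 2.0))
print(graph_gather(data, edges))
```

The point of the third function is that every node could be computed in parallel, but none of them reads a contiguous block.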

So again, both CPUs and GPUs have a wide range of number formats: they deal with integers, and also fixed-point numbers and floating-point numbers with different precisions. But in AI, I think we're finding a lot that what you need to deal with is floating-point numbers, so fractional numbers. You're representing probability distributions most of the time, right? So you don't need a lot of precision. What you need is a lot of range, like when you think of a probability distribution, that's kind of what it is.

So again, naturally, you're dealing with numbers that are low-precision floating point, with the ranges you have to deal with, and some randomness involved, sometimes injected into it. So you can think about the shape of the data and the type of the data, and which processor deals with it. That's a reasonable way to describe that landscape across the chips. I'm sure there are other ways you can split it up as well.
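To make "low precision, wide range" concrete, here is a minimal sketch that round-trips ordinary Python floats through IEEE 754 half precision (16 bits) using the standard library's `struct` module. Half precision is used purely as a familiar example of a low-precision format; the transcript does not say which formats the IPU uses, and the helper name `to_half` is made up.

```python
import struct


def to_half(x):
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack('<e', struct.pack('<e', x))[0]


# Precision: only about three decimal digits survive.
print(to_half(0.1))

# Integers above 2048 can no longer all be represented exactly.
print(to_half(2048.0), to_half(2049.0))

# Range: the largest finite half-precision value is 65504.
print(to_half(65504.0))
```

For values like probabilities, losing digits in the fourth decimal place often matters less than being able to represent both very small and fairly large magnitudes.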

But I quite like those as a way of splitting these things out. And it leads you to think, oh, what kind of applications could these be good at, by thinking about them in that way. Yeah, I see your point on that, absolutely. And maybe you could connect.

So you talked about the graph nature of the IPU, the graph processor, and taking a number from over there and how that works. Could you map that onto the typical AI types of tasks that we're trying to perform, and how that connects for people who, for example, started using TensorFlow before TensorFlow 2, or are maybe familiar with saving graphs or loading them? Does that connect to what you're saying, or what's the connection? Yeah, yeah.

So there are several graphs, I guess. Everything's a graph; that's one of the things. But one of the graphs people talk about in things like TensorFlow is the compute graph, which is the way data flows around between the tensor operations.

And that's only one of the larger, coarse-grained things you can think about. But I guess I'm talking more about the graphs that connect the information together. So if you think about what operations you're doing, quite a lot of them are matrix multiplies or convolutions. Now, let's look at, say, convolutions for images.

That's a really good example, right? Because what you have there is not quite the all-to-all connectivity of the matrix multiply operation. You have a kernel which is going to be applied. The same thing is applied to a lot of different parts of the image, across the image axes.

And then you can see how the compute and the data are spread out. So you have a particular kind of fan-out graph there. And then you can get into other operations.

You get into other neural networks. Say you look at things like ResNeXt or EfficientNet in image-based neural networks. You can see that you actually get grouped convolutions or separable convolutions. Again, that's where you're splitting up the data.

Then you're recombining it in a fine-grained way. So it's those kinds of things that require hardware that can cope with efficiently moving that data around, and also a software stack that can target that efficiently as well. We shouldn't forget the software stack, as always. I always like to remind people that you need the software to target it as well.
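The difference between all-to-all connectivity and the sparser patterns of convolutions and grouped convolutions can be shown by listing which inputs each output reads. This is a minimal sketch in one dimension with made-up sizes; the function names are illustrative and do not come from any framework.

```python
def dense_edges(n_in, n_out):
    """Matrix multiply: every output depends on every input."""
    return [(i, o) for o in range(n_out) for i in range(n_in)]


def conv1d_edges(n_in, kernel):
    """1D convolution without padding: each output reads only
    `kernel` neighbouring inputs, and the kernel is reused."""
    n_out = n_in - kernel + 1
    return [(o + k, o) for o in range(n_out) for k in range(kernel)]


def grouped_edges(channels, groups):
    """Grouped channel mixing: an output channel reads only the
    input channels that belong to its own group."""
    size = channels // groups
    return [(i, o)
            for o in range(channels)
            for i in range(channels)
            if i // size == o // size]


print(len(dense_edges(8, 8)))    # all-to-all
print(len(conv1d_edges(8, 3)))   # local fan-out
print(len(grouped_edges(8, 4)))  # split into four groups
```

Counting the edges shows how much less data has to move in the convolutional and grouped cases, and also that the movement is no longer one big contiguous block.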

It takes them both. Yeah, yeah. And then you get into even more recent things that people are doing. Imagine that a lot of what we do at the moment, we do in a dense way.

You take the dense data coming in and do a big matrix multiply on it. But really that can be quite inefficient. We've got really, really big models, which is great. Actually, it's pushing things forward with these really big language models and so on.

But, you know, it's not very efficient. And you'll see that the shape of that data means there's actually quite a lot of sparsity in there. A lot of that data could be zero. Or there could be even more complex data patterns.

And I think that push for efficiency is another space where you'll naturally get a graph structure. It naturally crops up anywhere you have irregular connections between the data. So I think there are lots of different ways of looking at the graph within a neural network or any kind of machine learning program. But I think the key one to me is: how is the data connected together?
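A small sketch of why sparsity changes the shape of the work: a dense matrix-vector product touches every weight, while a sparse one touches only the non-zero entries, which are scattered irregularly. The weights below are invented for the example, and the triple-list format is just one simple choice, not a claim about any particular library.

```python
def dense_matvec(matrix, vector):
    """Multiply every weight, including the zeros."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def to_sparse(matrix):
    """Keep only non-zero entries as (row, col, value) triples."""
    return [(r, c, w)
            for r, row in enumerate(matrix)
            for c, w in enumerate(row)
            if w != 0.0]


def sparse_matvec(triples, vector, n_rows):
    """Visit only the stored entries; the access pattern is irregular."""
    out = [0.0] * n_rows
    for r, c, w in triples:
        out[r] += w * vector[c]
    return out


weights = [[0.0, 2.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 1.0],
           [3.0, 0.0, 0.0, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]

triples = to_sparse(weights)
print(len(triples), "multiplies instead of", len(weights) * len(x))
print(sparse_matvec(triples, x, len(weights)))
```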

Yeah. And I know that at some point a presentation I saw (I forget which of the major AI conferences it was) was talking about trends year over year in terms of what people are focusing on. And I know that there was a big boost in people looking more and more at what are called graph neural networks. Does that connect directly to what you're saying, or to the other things in some way?

Because I know a lot of people are talking about these things. And my understanding is, you know, in some cases when people say a graph neural network, they just have a way of encoding the graph into a tensor. And then you still have the tensor operations. But in other cases, there are actually graph-native operations that you can do within the network itself.

I'm not an expert in these things yet. Has that connected with what your company is doing at all? Yeah. Yeah.

So I think we're quite interested in that, and actually some of the research groups we've worked with are interested in that kind of network. It's yet another type of graph, but the graph is not the model itself, but the data that the model is consuming. What we found is that there's a lot of choice there. There's a lot of choice in how you represent the data structure.

There are many ways to do that. And then the actual basic processing operation you'll do will be different depending on that. So do you represent it as a list of edges, or a dense matrix, or a bit vector, or things like this? Actually, I think what people really want is to explore those choices at the moment.
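The three representations mentioned here (a list of edges, a dense matrix, a bit vector) can be shown side by side for one tiny graph. This is a sketch with an invented four-node graph; the helper names are hypothetical.

```python
# A small directed graph: (source, destination) pairs.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
n = 4


def to_dense(edges, n):
    """Dense adjacency matrix: n * n entries, mostly zero here."""
    matrix = [[0] * n for _ in range(n)]
    for src, dst in edges:
        matrix[src][dst] = 1
    return matrix


def to_bitrows(edges, n):
    """One integer per node; bit d is set if there is an edge to d."""
    rows = [0] * n
    for src, dst in edges:
        rows[src] |= 1 << dst
    return rows


def neighbours_from_bits(bits):
    """Decode one bit-vector row back into a neighbour list."""
    return [d for d in range(bits.bit_length()) if (bits >> d) & 1]


print(to_dense(edges, n))
print(to_bitrows(edges, n))
print(neighbours_from_bits(to_bitrows(edges, n)[2]))
```

The basic operation differs with the representation: the edge list is walked pair by pair, the dense matrix suits a matrix multiply, and the bit vector suits bitwise operations.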

So it's pushing things forward. This is one of the reasons that we're quite keen on having a very accessible, powerful software stack, because there are always choices to make. I think it's a really great example you made there with GNNs, because it shows that what we might be doing in two or three years' time will be quite different to what we were doing two or three years ago. And if you think like five years before that, then it wasn't neural networks, right?

So it wasn't deep learning and stuff. So I think the key for us, as much as having the right processor and having these graph techniques, if we can call them that as a family, the important thing is to have the capacity to do innovation. A part of that, I think, actually does come down to the software quite a bit.

It comes down to making sure that people can modify it in a good way and extend it, as well as having the easy out-of-the-box thing where they just want to run a TensorFlow program or something like that. I think GNNs are a very good example of why you want that kind of flexibility. So my head is also still spinning a little bit on something that you were talking about a little while ago, and that is the software and the hardware. You need them both.

Over the decades, the world has been very, very CPU focused, up until this most recent time period where we're really digging into AI and all of these new hardware architectures have come into being. Given the fact that we're having all this innovation and all these hardware architectures are coming out, how do you approach writing the software that goes with that hardware and makes that hardware run, especially when you think about the fact that we have a long history of being specific to different types of CPUs, but now the sky's the limit on what's happening? How does that change the act and the thought process and the planning for building software? I think it does a lot actually.

I think CPUs are amazing. I think that they've been so successful in fixing an architecture, fixing all aspects of the architecture, the way the instruction set runs, the way the memory hierarchy works, and then applying it to lots of different fields and having that success. I think it's a remarkable thing. I don't know if it works for all fields.

I think what we find in AI is one thing that we've been very keen on right from the start, and I definitely see other people talking about it here, is this idea of co-design. So co-design is this general idea where you want to design several aspects of a system together. And so clearly one of the big aspects of co-design is designing your software stack and your hardware architecture at the same time. And that's something we're very keen on.

At Graphcore, actually, right from the beginning, we were very keen on making sure we worked on how the software was going to target the chip, making sure we modelled everything right up to full neural network applications as we designed the chip, and pushing that all the way through. I think later on in the company, what happens is you need to extend that software stack and really invest in building that culture. And now we've got more software people than hardware people, so it's a really big part of what we do.

We're as much a software company as a hardware company. Even right at the beginning, it was very much the two together. I don't think it stops there, actually. I think you can also co-design, and I'm seeing more and more of this, the machine learning algorithm architecture itself along with the software and hardware and the system.

Particularly in training, right, the large-scale, large-data machine learning training, which is the world we're operating in at Graphcore, you're not talking about one processor. You're talking about hundreds or thousands of processors working together.

And they all need to be connected by networking and cables, and put into racks, and the power management and all this kind of stuff has to be put around it. And really you need to co-design all of that together. And that's good. I had someone use the term architecture before to describe this.

And I think, yeah, why not? That's a fair enough term. That's fascinating what you're saying. I'm not trying to cut you off.

I actually want you to expand on that. The idea of doing software hardware, the actual ML architecture and the system, that seems daunting because all of those individual topics are people spend their whole careers doing that. How do you blend that in a cohesive way? It seems like your team make up is very important for that.

I would imagine so. Absolutely. I think for us, yes, team make-up is important in the company. We've really stressed that.

I think partnerships are very important for us. So we've always been quite keen on working with commercial internet and consumer companies that want to do large-scale deployments behind search engines or whatever, that kind of thing. But we also want to work with research groups, both university and company research groups, to see where that's going. And it's not only us; a lot of companies do that.

Well, I think that's been part of why there's this keenness to bring ML researchers into companies to work on that. So I think the team make-up and the communication between those groups is really important in the company. And partly it's just an awareness of wanting to do it.

It's about having that communication, really valuing it, and having the right kind of culture within a company to foster that. We really feel that to get all of this working, we have to work this way. There is a flip side, I have to say, in that you've got to be careful not to design out the future, if you like, because you could interpret this idea as being: we'll harden a piece of software into a chip. So we'll take, oh, I don't know, ResNet-50, to give a good example.

We'll make a chip that will do that really well. I don't think that's really co-design, because you have to also design for generality, and design not just for today's algorithms but for the algorithms of three or four years' time as well. Because a chip has to last that long and people want that flexibility. Is that a little bit of an artifact left over from an earlier time, when you were still thinking about CPUs, and things would start in software: you'd solve a problem in software and it would stick and be maintained over the years, and finally it would get incorporated into chips?

Is that a legacy mindset that maybe has been brought forward? It doesn't work when you're advancing on the ML architectures as rapidly as we are. I mean, if you bind it into the chip, you're stuck with that, while we're seeing rapid advancement over the last few years in terms of where things are going. Exactly that.

I think that's exactly right. As everyone who subscribes to arXiv will see, the fire hose of innovation here is just huge, and you have to plan for that. I think good, flexible software is a key to that. But then you have to design your hardware to match that as well.

So exactly. I think you've hit the nail on the head. We're in a really fast-moving space here. It's not settled.

There's still a long way to go before we settle down on exactly what the algorithms are. I'd love to get into the practicalities. And I guess what I'm thinking of is you have your Graphcore processor, the IPU, over here on this side. And over here, you have already established community frameworks like TensorFlow and PyTorch.

And obviously people want to use those. They want to use Keras or whatever. And I'm looking at a diagram, which we'll link in our show notes, about your Poplar Graph Framework Software, which is very well written. But you've got the frameworks on this side and you've got the processor on this side.

Could you explain, just for people, what it takes to connect something written in TensorFlow or PyTorch, like people are used to, to this new way of processing on the graph processor? What's in between there? Sure. Let's take TensorFlow as an example and I'll see if I can walk through that diagram from left to right.

So you have your TensorFlow program, written in Python usually. And what that describes, usually, is some model that you want to optimize, and then you apply an optimizer to it. And that model really is about this compute graph, right? So it's a series of linear algebra operations connected together, and out of the end pops an output, which you can evaluate to see how close your optimizer is getting to learning a good solution.

That graph gets explicitly represented in TensorFlow. So a data structure gets created, which is that graph. At that stage, it's called the core graph in TensorFlow. So it's pretty much what you've written, more or less.

And then what happens is, well, the first thing that happens really is that that gets differentiated. So we create the backwards path to say how we calculate gradients on that. And that gets wrapped: that whole data structure then represents a big loop outside that says feed data in and update my weights as I calculate these gradients. We take that whole graph and then it gets passed first through the TensorFlow compiler flow.
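The three stages described so far (a forward compute graph, a derived backward pass, and an outer loop that feeds data and updates weights) can be sketched for the smallest possible model. This is hand-written plain Python for a one-weight linear model, not TensorFlow; the function names, learning rate, and data are assumptions made for illustration.

```python
def forward(w, b, x, target):
    """Forward graph: pred = w * x + b, loss = (pred - target) ** 2.
    Returns the loss and the values the backward pass needs."""
    pred = w * x + b
    diff = pred - target
    return diff * diff, (x, diff)


def backward(saved):
    """Backward graph, derived by differentiating the forward one."""
    x, diff = saved
    dloss_dpred = 2.0 * diff
    return dloss_dpred * x, dloss_dpred  # gradients for w and b


def train(data, steps=200, lr=0.05):
    """Outer loop: feed data in, compute gradients, update weights."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        for x, target in data:
            _, saved = forward(w, b, x, target)
            grad_w, grad_b = backward(saved)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train(data)
print(w, b)
```

In a real framework the backward function is generated automatically from the forward graph, and the whole loop is what gets handed to the compiler.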

And that's not a Graphcore thing. That's from the TensorFlow upstream developers. They'll take that graph, they'll canonicalize it into small operations. They'll do some optimizations on it.

And they'll convert it into what's called an HLO graph, the XLA graph. So this goes through their compiler infrastructure, which is called XLA. So at the end of that, you have a slightly lower-level, split-up graph, where we're still talking about quite big linear algebra operations like matrix multiplies or something like that. But it's been reduced down and tidied up and made a bit closer to the hardware.

So at that point, our Graphcore TensorFlow backend takes over. And the first thing it will do is a few more optimizations on that data structure at that level. For example, in the chip, we have a hardware unit for doing exponentials and sigmoids and the kind of things that come up in certain nonlinearities. So it will recognize those patterns in that graph and say, well, we'll replace them with one special operator that will go down to the hardware.
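The kind of pattern recognition described here can be sketched as a rewrite over a tiny expression graph: find the subgraph 1 / (1 + exp(-x)) and replace it with a single sigmoid node. The nested-tuple graph format, the operator names, and `fuse_sigmoid` are all invented for this illustration; they are not the HLO format or Graphcore's actual backend.

```python
import math


def fuse_sigmoid(node):
    """Rewrite 1 / (1 + exp(-x)) into ('sigmoid', x), bottom-up."""
    if not isinstance(node, tuple):
        return node  # a constant or a variable name
    op, *args = node
    args = [fuse_sigmoid(a) for a in args]
    if op == 'div' and args[0] == 1.0:
        denom = args[1]
        if isinstance(denom, tuple) and denom[0] == 'add' and denom[1] == 1.0:
            e = denom[2]
            if isinstance(e, tuple) and e[0] == 'exp':
                inner = e[1]
                if isinstance(inner, tuple) and inner[0] == 'neg':
                    return ('sigmoid', inner[1])
    return (op, *args)


def evaluate(node, env):
    """Reference interpreter, used to check the rewrite is safe."""
    if isinstance(node, str):
        return env[node]
    if not isinstance(node, tuple):
        return node
    op, *args = node
    v = [evaluate(a, env) for a in args]
    if op == 'neg':
        return -v[0]
    if op == 'exp':
        return math.exp(v[0])
    if op == 'add':
        return v[0] + v[1]
    if op == 'mul':
        return v[0] * v[1]
    if op == 'div':
        return v[0] / v[1]
    if op == 'sigmoid':
        return 1.0 / (1.0 + math.exp(-v[0]))
    raise ValueError(op)


graph = ('mul', 'w', ('div', 1.0, ('add', 1.0, ('exp', ('neg', 'x')))))
fused = fuse_sigmoid(graph)
print(fused)
```

After the rewrite, a backend only has to map one `sigmoid` node onto its dedicated unit instead of four separate operations.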

So we'll do that kind of thing, and then it will basically convert those operations into an even lower-level form of graph, which is much more fine-grained than that. So we have something called PopLibs, which are libraries that implement things like matrix multiplies or nonlinearity operations or things like that in Poplar. So we should talk about Poplar briefly, because I haven't introduced that yet. Poplar is our graph programming framework, so that is a way of representing graphs that run natively on our device and do these kinds of operations.

And in Poplar, we have graphs that break it down to the individual processing unit. So in each of our chips, we have about 1,400 cores, processors on the chip, each of which has hardware threading in there. So we've got over 7,000 parallel compute units, and the Poplar graph represents the graph at that level. And PopLibs is what then says, well, I've got this matrix multiply to do.

So I split that over those parallel units in an efficient way, so that's where it does partitioning and accessing and stuff like that. Then we have the Poplar graph compiler, which will take that fine-grained, low-level graph and create actual code for the device. We then go into the graph engine, which then runs it. There are quite a few levels, quite a few compilers involved.
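As a rough picture of what splitting a matrix multiply over parallel units means, the sketch below divides the output rows into near-equal chunks and computes each chunk independently. This is plain Python and deliberately naive; it is not the Poplar or PopLibs API, and real partitioning on the device takes memory and communication into account in ways not shown here.

```python
def partition_rows(n_rows, n_workers):
    """Split row indices into near-equal contiguous chunks."""
    base, extra = divmod(n_rows, n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def matmul_rows(a, b, rows):
    """Work for one unit: only the output rows it was given."""
    cols = range(len(b[0]))
    inner = range(len(b))
    return {r: [sum(a[r][k] * b[k][c] for k in inner) for c in cols]
            for r in rows}


def partitioned_matmul(a, b, n_workers):
    """Run every chunk (sequentially here; each chunk is independent
    and could run on a separate compute unit) and stitch the rows."""
    result = {}
    for rows in partition_rows(len(a), n_workers):
        result.update(matmul_rows(a, b, rows))
    return [result[r] for r in range(len(a))]


a = [[1, 2], [3, 4], [5, 6]]
b = [[0, 1], [1, 0]]
print([list(c) for c in partition_rows(7, 3)])
print(partitioned_matmul(a, b, 2))
```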

And if we counted them, there's like five or six different compilers that have to interact to get that efficient implementation down on the device. There are some other things that go on, like sometimes you might want a multi-chip model, so at a high level, you'll do model pipelining and things like that to get efficient models spread across multiple chips. But fundamentally, that's the flow. That was great.

And by the way, I don't think I can recall anyone ever taking us through even a generalized version of that, not specific to your system. So I appreciate that very much. That was super fascinating. And I think as you mentioned, Dave, there's all of these layers.

And of course, you're connecting to different frameworks that are out there. It's intriguing to me from a software development standpoint, with these frameworks updating and new architectures and operations being added and all of that. I guess, one, how do you keep up that pace and test that whole pipeline of things? Is it a matter of having reference implementations of all of these different models, and essentially running tests against the compilation of those on new versions of your framework? Or how does that work?

Exactly that. You know, there's a lot of investment. Well, one is development. There's a lot of software development.

And then a lot of investment in regression systems and test infrastructure all has to be done, because you have to have a really robust, comprehensive software product. There's no way round that for it to be usable. And to blow our own trumpet a bit here, I think Graphcore, amongst the AI hardware startups, is really well advanced in this space. All the elements are there, like documentation: you can see the tutorials and documentation on how you do that and so on.

So I think we have to try and keep on top of that, and keep that work going. I think the other thing I'd say that really helps with this is being very open. So that's making sure that things are documented and making sure people have access.

This is something you do have on other platforms, but I think we try to be really leading in this: letting people know what Poplar, the low-level graph framework, is, explaining it, and making sure people can write custom operators in it. And the libraries I talked about are open source on GitHub, that kind of thing. We're trying to make sure our TensorFlow and PyTorch packages are open source as well. So by having a more open infrastructure, it makes it easier for the community at large to help you adapt to new things as well.

And we find that as we're getting more popular and more widely used, we get more community involvement like that. I think that's an important part of it as well.

So Dave, I'm curious. As an AI practitioner, one of the things I would love to know from you, as you've spent all of this time making AI programs be sympathetic to certain architectures and certain tasks: do you have any good advice or help for AI practitioners out there in terms of knowing how to tailor our AI programs or models more generally to be efficient for a certain dataset or task? Are there common challenges or common pitfalls that you see people falling into that could be mitigated with some best practices?

Yeah, it's a really good question. It depends on whether you're talking about task performance, how good your model is at a particular task, or whether you're talking about compute efficiency, how fast it runs. On task performance, I think people need to be aware that those two are not separate islands. The architecture, whether you're aware of it or not, even if you're just taking what other people do, has been affected by the underlying compute platform and what's efficient on it. If you were trying to study a particular model architecture that got good task performance but was really, really slow, you probably wouldn't pursue it, even though it might be quite a good way to go if you were looking at task performance independent of those things. So the hardware you have for your problem affects how likely you are to be able to explore certain things.

I think it's a good question on best practice there. So I think one thing that's useful is just a basic understanding of what's going on underneath. And probably the important things there, if you're interested in the compute efficiency, are knowing how things like, for example, batch size or the sizes of certain axes of your matrices actually affect the hardware beneath and what could be faster.
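As a rough illustration of why batch size and axis sizes matter, here is a minimal Python sketch. It is not Graphcore or Poplar code: the function names, the 2-byte element size and the tile width of 16 are all assumptions chosen for illustration, and the model assumes each operand is moved exactly once, which real hardware will not match exactly. It estimates how many floating-point operations a dense layer performs per byte of data moved, and shows how an axis might be padded to a hardware-friendly multiple.

```python
def dense_layer_intensity(batch, d_in, d_out, bytes_per_elem=2):
    """Estimate FLOPs per byte moved for y = x @ W (simplified model).

    x has shape (batch, d_in), W has shape (d_in, d_out).
    bytes_per_elem=2 is an assumption (16-bit floats).
    """
    # One multiply and one add for every term of every output element.
    flops = 2 * batch * d_in * d_out
    # Assume x, W and y are each read or written exactly once.
    elems_moved = batch * d_in + d_in * d_out + batch * d_out
    return flops / (elems_moved * bytes_per_elem)


def pad_to_multiple(size, multiple):
    """Round an axis size up to the next multiple.

    The multiple stands in for a hypothetical hardware tile width.
    """
    return -(-size // multiple) * multiple


if __name__ == "__main__":
    for batch in (1, 8, 64):
        print(batch, dense_layer_intensity(batch, 1024, 1024))
    print(pad_to_multiple(1000, 16))
```

Under these assumptions, going from batch size 1 to 64 on a 1024 by 1024 layer raises the operations per byte from about 1 to about 57, which is one reason larger batches often keep the hardware busier.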

I think the other thing is being very aware of the floating-point behavior of various platforms. That can vary a lot between platforms, and a good platform should document that. You should be aware of the tools they have to show when things are overflowing or underflowing and so on, to know when you might be losing task performance not because of the model structure, but because of the data format you've used there. And we're seeing new techniques coming in to help with that: you'll see things like automatic loss scaling, for example, in the software stack, to try to make things more adaptive so you don't have to think about those things. But I think it is worth having a good surface-level understanding of what's going on under the hood.

Yeah, you're probably familiar with Bill Kennedy in the Go world, maybe you are as well, but we spent some time working together, and in our conversations he was always talking about this idea of mechanical sympathy as you're writing code, which I think gets to a lot of what you're saying, Dave. Not all software engineers have to also be hardware engineers, but there is an element of developing a mechanical sympathy for what you're writing for that helps you write really robust and good software, and I think that's really valuable. In Go, it's understanding that if I initialize a variable this way, it actually does something different than if I initialize it that way, in terms of the memory that's allocated and the copies that are made and all those things.
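To make the earlier point about overflow and underflow concrete, here is a small sketch using only the Python standard library. It is an illustration under stated assumptions, not any platform's actual tooling: the helper names are hypothetical, the 16-bit format is IEEE 754 half precision as implemented by the `struct` module's `"e"` format, and the fixed scale of 1024 is an arbitrary choice standing in for the general idea of loss scaling.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the fp16 range;
        # model that case as an overflow to infinity.
        return math.copysign(math.inf, x)


def check_fp16(values):
    """Count finite values that become inf, and nonzero values that become 0."""
    overflow = sum(
        1 for v in values if math.isinf(to_fp16(v)) and not math.isinf(v)
    )
    underflow = sum(1 for v in values if v != 0.0 and to_fp16(v) == 0.0)
    return {"overflow": overflow, "underflow": underflow}


def scaled_roundtrip(grad: float, loss_scale: float) -> float:
    """Store grad * loss_scale in fp16, then undo the scale in full precision."""
    return to_fp16(grad * loss_scale) / loss_scale


if __name__ == "__main__":
    print(check_fp16([1.0, 70000.0, 1e-8]))
    print(to_fp16(1e-8), scaled_roundtrip(1e-8, 1024.0))
```

In this sketch a value of 1e-8 is lost entirely when stored directly in half precision, but survives with a small rounding error when it is scaled up before storage and scaled back down afterwards. Adaptive schemes adjust that scale during training rather than fixing it up front.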

And that sort of knowledge, I think, is really cool. It's interesting to hear you talk about it in the context of AI. I think it's something that I definitely want to develop a little bit better intuition for in my own work.

I think that leads into what I was going to ask.

We're thinking along the same lines in terms of questions. You've taken us into this and explained that compiler stack, and we've talked about the interface being PyTorch or TensorFlow. I'm coming at it as a practitioner, as is Daniel, and you presumably have different types of users that need different amounts of that mechanical sympathy, as Daniel described it from Bill Kennedy, or knowledge of what's under the hood at different levels. How far do I need to go? Should I learn Poplar? Am I going that far?

Am I going beyond that? If the answer is no for me, what other users are learning that? How do you break that out?

How do you know who should be addressing what? Because that's a question we get all the time from people in this field: there seems to be so much stuff, and I don't actually know what to address first, second and third. So can you break that down a little bit from that end-user perspective?

Yeah, it'd be unrealistic to say that everyone needs to become a full-stack developer, with the full stack here meaning being incredibly comfortable all the way down through the hardware. I think that's a really rare thing.

You do get people like that. I mean, they're great, and you hire them when you can. But basically, you're not going to get that.

So I think what we find is it's hard for a sole practitioner. What we find in the companies we work with is they do that by teams, right? So you have very much an implementation-focused team and an algorithm- and model-focused team, and the challenge is to make them work hand in hand and be co-designing, as I talked about earlier, working together and just being honest about that.

You're not going to get this all in one person. I wouldn't suggest that people try and really do that unless they're super interested in all the parts. You always need to pick where you want to go and so on. But I think, despite what I said about these low parts of the stack, the majority of what I'd call ML scientists or data scientists or something like that don't need to know about those low-level details.

They need to just have some sympathy with it. And I think that's it. So maybe just dabble a bit to understand it a little, rather than trying to learn it really well.

I mean, our user base is definitely like that, right? You might say that maybe about 90% of people are programming at the PyTorch or TensorFlow level, maybe with a bit of understanding of what's underneath, and then you've got a small group working lower down. But those groups are very important for us, because what they can do is add new functionality, new capability to a stack or an application or framework or something like that. Actually, maybe this is a case for people to specialize a bit.

And then if you want to be that person who understands the details, do that, but find good people to work with who understand the high level, and vice versa. If you really want to work at the high level, learn the lower parts a bit, have some sympathy, but then try to either work with the community or set up your professional team structure to make sure you've got people with those other skills as well. We have a mix at Graphcore. We have some people who are that full-stack type, trying to go up and down the stack, though we're in a bit of a different position because we're on the implementation side of it.

And then we have people who stay specialized if that's where they want to be.

So Dave, I'm interested in, of course, the space is developing rapidly. And I know Graphcore and IPUs have been gaining traction. There are good use cases out there.

We'll definitely link to some of the materials on the website and all that. I'm wondering, as you look to the future, this next year or two with Graphcore, where do you see things going, and what are you excited about seeing in terms of the development of this technology over the next year or two?

I think the algorithm space will continue to move very quickly. So I'm very excited to see the fact that we might be doing very different things even two years from now to what we're doing today.

Not long ago we weren't talking about Transformers at all, and now they're in everything we do. So that's very exciting. From more of a systems point of view, I think what's happening in the data center will be very interesting: the way systems are put together, the way to get efficiency, the way the software, the processor and the network will have to work together is a fascinating space. That will evolve, I think, quite quickly in the next couple of years.

From the software point of view, I'm interested in how the current frameworks and stacks evolve as we try to go towards getting more efficiency. I guess I'm talking about parameter efficiency here, I suppose, because that's going to lead to new algorithms.

And this is a personal view, but are these very linear-algebra-based frameworks that we have now, the ones that are very popular at the moment, the right ones? I don't think that's a given. That's very exciting, right? But there's still that uncertainty about that kind of thing.

There are all kinds of questions out there about how it's going to grow and scale. And that's not even talking about the actual applications that are going to come out of this space as well. So all the way through, top to bottom, there's loads of exciting stuff coming up, I'm absolutely sure of that.

Yeah, awesome.

Well, Dave, we really appreciate you joining us. This is super fascinating, and I'm really excited by what Graphcore is doing. We'll make sure to put a bunch of links in our show notes for listeners. Definitely check out what Graphcore is doing.

Their white paper and all the information about the Poplar software framework. It's really cool. And of course, the hardware. And yeah, I was just really enthused by the conversation and I have a lot going on in my mind that I want to think about more.

So appreciate that. Thank you for joining us, Dave.

Thank you very much.

It's been a pleasure. It's been a good talk.

Thank you for listening to Practical AI. We appreciate your time and your attention.

Follow the show on Apple Podcasts, Spotify, or your favorite podcast app. Your neural networks will thank you. We are also on the web at practicalai.fm. There you'll find recommended episodes, our favorites and a free sign-up to join the community.

Practical AI is hosted by Chris Benson and Daniel Whitenack. It's produced by Jerod Santo with music by Breakmaster Cylinder. Thanks again to our sponsors Fastly, Linode, and LaunchDarkly. That's our show.

We hope you enjoyed it and we'll talk to you again next week.


Frequently Asked Questions

How long is this episode of Changelog Master Feed?

This episode is 44 minutes long.

When was this Changelog Master Feed episode published?

This episode was published on April 13, 2021.
