
EPISODE · Jun 22, 2020 · 50 MIN

Roles to play in the AI dev workflow

from Changelog Master Feed · host Practical AI LLC

This Fully Connected episode has it all: news, updates on AI/ML tooling, discussions about AI workflow, and learning resources. Chris and Daniel break down the various roles to be played in AI development, including scoping out a solution, finding AI value, experimentation, and more technical engineering tasks. They also point out some good resources for exploring bias in your data/model and monitoring for fairness.

Sponsors:
DigitalOcean – DigitalOcean’s developer cloud makes it simple to launch in the cloud and scale up as you grow. They have an intuitive control panel, predictable pricing, team accounts, worldwide availability with a 99.99% uptime SLA, and 24/7/365 world-class support to back that up. Get your $100 credit at do.co/changelog.
The Brave Browser – Browse the web up to 8x faster than Chrome and Safari, block ads and trackers by default, and reward your favorite creators with the built-in Basic Attention Token. Download Brave for free and give tipping a try right here on changelog.com.
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com.
Rollbar – We move fast and fix things because of Rollbar. Resolve errors in minutes. Deploy with confidence. Learn more at rollbar.com/changelog.

Featuring:
Chris Benson – Website, GitHub, LinkedIn, X
Daniel Whitenack – Website, GitHub, X

Show Notes:
Streamlit:
Streamlit funding announcement
Previous Practical AI episode about Streamlit

GPU acceleration in Windows Subsystem for Linux

Fairness and bias:
Google’s explanation of bias in their ML crash course
IBM fairness 360
Google’s responsible AI practices
Driven Data’s Deon project
Previous Practical AI episode about bias in AI and hiring
US Department of Defense Ethical principles for AI

Chris’s personal, COVID-related blog post

Upcoming Events: Register for upcoming webinars here!



TRANSCRIPT · AUTO-GENERATED

I'm gonna say something slightly controversial, I think. And that is that I think of AI development as a component of software development, which a lot of data scientists will say, no, it's not, no, it's not. But when I'm looking at it in production and I'm looking at us actually managing that, I see it in that larger context because all of those other activities are happening around it. Bandwidth for Changelog is provided by Fastly. Learn more at fastly.com.

We move fast and fix things here at Changelog because of Rollbar. Check them out at rollbar.com. And we're hosted on Linode cloud servers. Head to linode.com/changelog. This episode is brought to you by DigitalOcean. DigitalOcean's developer cloud makes it simple to launch in the cloud and scale up as you grow.

They have an intuitive control panel, predictable pricing, team accounts, worldwide availability with a 99.99% uptime SLA, and 24/7/365 world-class support to back that up. DigitalOcean makes it easy to deploy, scale, store, secure and monitor your cloud environments. Head to do.co/changelog to get started with a $100 credit.

Again, do.co/changelog. Welcome to Practical AI, a weekly podcast that makes artificial intelligence practical, productive and accessible to everyone. This is where conversations around AI, machine learning and data science happen. Join the community and Slack with us around various topics of the show at changelog.com/community, and follow us on Twitter; we're @PracticalAIFM.

Welcome to another Fully Connected episode of the Practical AI podcast. This is where Chris and I keep you fully connected with everything that's happening in the AI community. We'll take some time to discuss some of the latest AI news and we'll dig into a few learning resources to help you level up your machine learning game. I'm Daniel Whitenack.

I'm a data scientist with SIL International. And I'm joined as always by my co-host, Chris Benson, who is a principal AI strategist at Lockheed Martin. It's been quite a season in our lives, Chris. Oh boy, 2020 has definitely had an impact on my life.

Yeah, definitely. I think we would not be right to just ignore everything that's happening in our world as we enter into these conversations. Of course, we've got the unrest that's really happening in our country, but around the world as a result of injustices and police brutality and sort of systematic racism that's happened in our country, but also around the world. And then that kind of piled on top of COVID virus related things.

And then that piled on top of the economic impact and fallout of that and unemployment. Of course, these things are not separate from AI things. And I think probably over the course of these coming years, I think it'll be years of fallout from everything that's happening though. Totally.

It'll impact our conversations. It will. Yeah, it's all real life. And a couple of thoughts there.

You talked about the injustice of what's happening, in terms of Black Lives Matter being able to come back out and be meaningful in this discussion, which I think is fantastic. It's a time of change right now. It's a time of massive shift. And I know it impacts everybody in the audience.

I know for me, you mentioned COVID. And you and I had talked a little bit about it before the show. And so I'll share very briefly with the audience what's happened recently to me. I'm actually choking up a little bit.

So my mother-in-law recently died of COVID. So it's impacted my family. And just wanted to share that with the audience. I've been kind of missing in action for a little while.

I know you did an episode with Darwin AI recently. And I thank you for doing that. And I know with the unrest we paused the show briefly. But I've been kind of out of action.

Just wanted to let folks know. I know so many people that say they hear about COVID in the news but it hasn't touched their lives in a direct way. And speaking as someone who has had it touch them directly, it is a serious disease. And so I just hope everybody will follow the safety guidelines and be aware: when you lose someone that you love, it changes how you see it.

When you have other family members that have it, then you're worrying about them. And when you have a whole family in isolation, it makes a difference. So stay safe, people. I appreciate it.

And just wanted to let you know it's real. And it's touched my life. And thank you for letting me say that. It was important to me.

Yeah. Thank you for sharing, Chris. I know it takes a lot to share that as well. And I know my thoughts and prayers have been with you and your family.

And yeah, I think it's just another data point to motivate people to take things seriously. But also, I think the AI community, a lot of the people that listen to this podcast, there are many meaningful ways that people can contribute, whether it's on the COVID and virus related front, whether it's on the racial injustice side of things or the economic side of things. Of course, there's community things that we can all do, being good neighbors, caring for people. But then also being tech people, being AI people.

I mean, there are some real intersections with AI technology. I mean, of course, on the policing front and that side of things, we've seen increased usage of things like facial recognition and other things that are concerning for certain groups, algorithmic decisions that are impacting certain groups. On the virus side of things, there's a whole bunch of AI people that are trying to come up with beneficial applications to help that scenario, not necessarily all predicting COVID outcomes, but helping people get the right information. We had the episode with the COVID QA group that was working on that.

We also talked about the CORD-19 dataset as related to COVID. There are ways that AI people can contribute, both in terms of data annotation, in terms of coding, and in terms of jumping into open source projects. So I'd really encourage people, if you're interested in those things, or wanting to know how to contribute to those things, or wanting to know how to make your voice heard in terms of good AI ethics related things, to reach out to us on our Slack channel, our Slack team.

You can find us at changelog.com/community, on our LinkedIn page, or on Twitter. We really are wanting to have some discussions around these topics and point people to good resources. So I'm really hoping that people reach out and find some of those ways to contribute.

You know, I'm so glad that not only did you bring up the practitioner side of being an AI professional, or an enthusiast, but also the AI ethics side. And as we've talked about before, I'm very involved in AI ethics. And so, you know, as we talked about injustice, both the technical skills that you have there, and the incredibly deep, rich thinking that we hear from people in this community, you have a voice and you can shape the future. This is really something that we have a role to play in.

So I am asking our listenership to engage, engage with these issues in real life and bring your expertise and your skills to bear on this. Yeah, and later on, normally in these fully connected episodes, we take some time at the end to share some learning resources. I've pulled in a few that I've run into over the years as related to bias and fairness in AI. And so, we'll talk about those later on in the episode and maybe some places where you can find out about some of those things.

But before we get there, we do wanna kind of acknowledge that there are a lot of, you know, encouraging and exciting things coming out in the AI community in terms of advancing various efforts and various toolkits. And that's one of the things that we wanted to do in this episode was highlight a couple of those. The first of those that I saw, which really excited me, was the announcement from Streamlit that they finished a Series A of funding for actually $21 million, which is kind of crazy. If you remember, we had Streamlit on the podcast, that was episode 66, and we talked all about the Streamlit project and everything, so we definitely recommend you go back and listen to that.

But in general, I think Streamlit is an incredible project. I don't know if you've been following it at all, Chris. Certainly after we had the conversation with the team, I found it incredibly inspirational. And, you know, Streamlit is an open source framework to turn Python scripts into interactive apps.

And I know prior to us engaging them, I wasn't really aware of that, but it's a super cool approach and it's showing the creativity. So, yeah. Yeah, so I know for me, I have no, well, I don't know if I want to say absolutely zero, but I don't have much exposure and experience in terms of like front end engineering or building actual graphical interfaces or web apps or anything like that. At the same time, often, when you're trying to integrate a machine learning application into a business process, there's a very human side of that which becomes very difficult if you aren't able to let people interact with what you're building in a visual way.

So, I'm thinking right now, I attended a couple workshops recently on active learning and sort of human in the loop methods. And so you could have this scenario where, you know, maybe you're working on like the workshops we were talking about like translation applications, machine translation applications. And sometimes when you deploy that, you might want to have a model in the loop that tries to identify like, you know, bad translations or something that your machine translation application is producing and then have a user actually review and look at those and correct them. And so you've got this kind of graphical piece, but also the user piece, the non-technical user piece potentially, that's interacting with that.
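As a rough illustration of the human-in-the-loop setup described here, the sketch below routes low-scoring machine translations to a reviewer and records the correction. Everything in it is hypothetical: the `Translation` class, the `quality` score (imagined as coming from some quality-estimation model), and the 0.6 threshold are assumptions for illustration, not anything named in the episode.

```python
from dataclasses import dataclass


@dataclass
class Translation:
    source: str
    output: str
    quality: float  # hypothetical score in [0, 1] from a quality-estimation model


def review_queue(items, threshold=0.6):
    """Return the translations scoring below the threshold, worst first."""
    flagged = [t for t in items if t.quality < threshold]
    return sorted(flagged, key=lambda t: t.quality)


def apply_correction(item, corrected_text):
    """Record a reviewer's fix; corrected pairs could later feed retraining."""
    return Translation(item.source, corrected_text, quality=1.0)


# Made-up example data.
batch = [
    Translation("Guten Morgen", "Good morning", 0.95),
    Translation("Das ist mir Wurst", "That is sausage to me", 0.30),
    Translation("Bis bald", "Until soon", 0.55),
]

for item in review_queue(batch):
    print(f"needs review ({item.quality:.2f}): {item.source!r} -> {item.output!r}")
```

The reviewer-facing part is where a small graphical front end comes in: someone non-technical needs to see each flagged pair and type a correction.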

And so, yeah, I see that scenario popping up all the time, and Streamlit, I think, fits right in there, which is why it seems to me that it's getting a lot of attention: there's an often-seen pain point that isn't really well dealt with. I agree, and not only that, but I think that our community often undervalues what that user experience should be. We tend to go straight to the model, we go straight to the training and talk about the latest algorithms. And when you're doing this stuff for real in production and you have a community that is expecting performance from you, not fulfilling the appropriate type of user interface and user experience really detracts from the work that you are doing in AI.

If neither you nor the people you work with have those skill sets, you can lose the value in something that would otherwise be great very, very quickly. And as I've worked in a professional context, that point has been driven home to me over and over again. So I tend to approach AI from a user perspective, even if I'm the developer doing the work. And Streamlit is talking about the ways that they'll use this money; they want to extend the application.

And I should mention too, this is an open source application that you can use. So you just pip install streamlit, if I remember right, and then run it locally. And they have a whole bunch of different customizations that you can add, like little sliders and text input and file upload and plotting and all sorts of ways you can configure it. And so I think that when they're talking about extending, of course extending that and the customizability of it and customized layouts, they also talk about building in programmable state.
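To make the widgets mentioned here concrete, below is a minimal sketch of a Streamlit script that uses a text input, a slider, and a file upload. The word-counting task and the `top_words` and `render_app` helpers are made up for illustration. The script would be saved as something like `app.py` (a hypothetical filename) and launched with `streamlit run app.py` after `pip install streamlit`.

```python
from collections import Counter


def top_words(text, n):
    """Return the n most common lowercase words as (word, count) pairs."""
    return Counter(text.lower().split()).most_common(n)


def render_app():
    # Imported lazily so top_words stays usable without Streamlit installed.
    import streamlit as st

    title = st.text_input("Page title", "Top words")
    st.title(title)
    n = st.slider("How many words to show", 1, 20, 5)
    uploaded = st.file_uploader("Upload a plain-text file", type="txt")

    # Fall back to a made-up sample sentence until a file is uploaded.
    if uploaded is not None:
        text = uploaded.read().decode("utf-8")
    else:
        text = "the model and the data and the user"

    st.write(dict(top_words(text, n)))


if __name__ == "__main__":
    try:
        render_app()
    except ModuleNotFoundError:
        print("Streamlit is not installed; try `pip install streamlit`.")
```

Wrapping the UI in a function is a choice made here to keep the pure helper separate; Streamlit scripts are often written as plain top-level code instead.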

One of the things I was curious about, of course, because I've always used Streamlit just as an open source application, is that if they're raising money, they're obviously a business. And so I think the other thing that they're going to devote that effort into is Streamlit for Teams, which in my understanding is some combination of sharing and deploying, in a secure way, Streamlit apps that are actually sort of production applications and not just demos or proof-of-concept sort of things or little tools and that sort of thing. Yeah, I'm looking forward to seeing some of the other things they choose to do as they go into this new phase. So we may have to revisit with them at some point as they get some of this work done and they're able to use that capital well.

Changelog News is the best way to keep up with the ever-changing world of software. We track, blog and contextualize the coolest projects, the best practices and the biggest stories each and every week. Make changelog.com your daily destination or hit the snooze button and subscribe to our weekly newsletter that hits inboxes on Sunday mornings. Join more than 15,000 enthusiastic readers.

It'll cost you exactly zero dollars and you can subscribe right now at changelog.com/weekly. So I know you had a topic that you wanted to go into, which I think is a good one, but before we do that, I just wanted to mention one other thing that actually just before recording today, I saw as I was scrolling through Twitter, which is GPU accelerated training now supported in Windows Subsystem for Linux. And I have to admit, I have not been a Windows user for quite some time, but in my understanding, there are quite a few of them out there. There are, yes, there are a few.

Yeah, quite a number. And I know, for example, when I taught a couple courses at Purdue over the last few years, of course, the lab machines are Windows machines, or at least some of them. And so it was always a struggle for me to kind of figure out the best ways of doing sort of AI experiments and programming in that environment. And mostly that's just my unfamiliarity with that whole world, but yeah, this is pretty cool.

So I guess Windows subsystem for Linux or WSL enables the users of Windows to run a native, unmodified Linux kernel or Linux command line directly on Windows. So that's pretty cool in and of itself. But now I guess the step is that they're adding the GPU acceleration to that and connecting up things nicely to CUDA and those sorts of things. Yeah, and I think that's great.
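For anyone trying this out, a quick sanity check is to confirm that the shell really is running under WSL and that an NVIDIA GPU is visible from it. The sketch below uses only the Python standard library. It assumes that WSL kernels report "microsoft" in `/proc/version` and that the `nvidia-smi` tool is on the `PATH` once GPU support is set up; the function names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def running_under_wsl(version_text=None):
    """Guess whether this is WSL by looking for 'microsoft' in /proc/version."""
    if version_text is None:
        proc_version = Path("/proc/version")
        version_text = proc_version.read_text() if proc_version.exists() else ""
    return "microsoft" in version_text.lower()


def list_gpus():
    """Return the GPU lines printed by `nvidia-smi -L`, or [] if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    result = subprocess.run([exe, "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


if __name__ == "__main__":
    print("WSL detected:", running_under_wsl())
    gpus = list_gpus()
    print("GPUs visible:", gpus if gpus else "none")
```

An empty GPU list does not say why the GPU is missing (driver, WSL version, or hardware), so it is only a first check.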

And like you, I have not recently been a Windows user. Once upon a time I was on Windows and moved away, but I've been hearing they're really embracing open source in recent years, and that's definitely brought me back around. I wouldn't consider them for a while, before they took that approach. So total kudos to Microsoft for making that very hard change. It's hard to steer a big organization in a very different direction. So I've been very impressed.

I think it's a fantastic step forward to have the GPU support in that. And the funny thing is I keep hearing from people that are using it that Windows Subsystem for Linux is incredibly usable. And I work in an organization that has a lot of Windows users. And so I'm getting really, really good feedback on the work they've done and being able to utilize that Linux kernel.

It's not a second-class citizen; as I understand it, it really does a good job. So now seeing that they have that support may change the landscape a little bit as that gets adopted over the next couple of years. Yeah, I think the ability to run unmodified Linux things in Windows, that part sort of rings true right away for me. And it's cool that you could do the GPU accelerated stuff.

I guess in terms of my own workflows, often I don't have the GPU in my laptop or sitting on my desk, but I'm using it on either a remote computer or in the cloud. So in that case, if I was on Windows, I think the important thing would be the command line stuff and scripting things and all those sorts of things that I could do in the way that I'm used to. But I know also that people build a lot of great systems for gaming computers, for example, that are Windows based. This is where my mind's kind of going with this, I guess: there are all of these gaming computers out there with GPUs, and games, for the most part, run on a Windows system.

And I'm also not a gamer, so I'm really speaking outside of my domain. But it seems like now this would make it maybe easier to buy a sort of off-the-shelf gaming computer or gaming laptop that's Windows and then use the GPU on that for AI purposes. Whereas before, maybe you have to buy that and then install Linux and figure out all the drivers and blah, blah, blah, blah. Maybe that makes that process easier, I'm not sure.

So I would agree. I'm not much of a gamer, but I think that makes a lot of sense. I actually think I'm probably gonna try Windows Subsystem for Linux out in this context. So like yesterday, I didn't have a chance today, but yesterday I was logging into a dedicated DGX-2 and I got all 16 GPUs for myself.

And that was a lot of fun doing some work on there. It was a lot of fun. And so I might have to pull out a Windows laptop and do the same thing.

I did it from my Mac going in. But yeah, I think I'm gonna give it a whirl. You can write up a blog post about Windows laptop versus DGX-2. There you go.

There you go. I figured there will be a clear winner, but it would be interesting to do the comparison. Well, I can start on the Windows side and use that as a client and log into the DGX and we'll use both systems so we can make that work. Yeah, sure.

Cool. Well, let us know. I'll be interested to hear from people if and when they start getting into this Windows mix of things. But moving on, I think you were mentioning a topic to me that I think is pretty interesting and oftentimes very confusing for people.

And I know that we've touched on it before. You wanna mention what you were thinking there? Sure. So I do quite a bit of mentoring for people, not only at my employer, but just in general.

And people will reach out and ask for advice. And probably the thing that people ask about most often is they're trying to figure out how to orient their own careers around an AI/ML focus. I've been pretty open that I came from the software development world and reoriented my own career some years back on this. And it's completely doable.

I think it's a myth that everybody in AI is a data scientist. I think it's a myth that you have to have a PhD or some other university-based experience to get into this field. It's certainly not the case. It wasn't the case for me or for a lot of people that I've worked with.

And I think in a previous episode, I don't recall which one, but I mentioned the fact that because I've been in my career now for, I don't know, 25 years-ish in that frame, I was around when the web was taking off. And that was the early part of my career was when the web went from the internet with no web into the web that was initially just academic and then took off. And I have observed as we've gone through this AI revolution that it follows many of the same trends of a brand new field that is exploding outward. And in the beginning, people thought computer science was the thing.

You had to have a computer science degree to do that. But one role changed into many roles very rapidly. And there was a lot of diversity that got introduced, as well as in the skills you needed and the level of experience to do different roles. It got complicated.

And that's good. It's a sign of maturity. And we're definitely seeing that in this field. And so a lot of people, when they're trying to figure out, how do I do this?

How do I fit into this new exciting AI world? That's where I really want to be in the years to come. But that's not where my education has been. That's not where my previous experience has been.

And one of those things that I start with with people that I wanted to address today is there's not one role out there that you have to find your way into. There's many ways. And actually, it might be a role that you're already playing in a slightly different context. It may be that you can kind of evolve your way into this.

And so if you're already working with databases and other data sources, data lakes, that's one area that's now very involved in the big data input that goes into these AI models and stuff. So I really wanted to talk in a practical sense and have a conversation about what are different avenues people might be able to take to get into this fun field? Yeah. And I think along with that, of course, there's, like you say, there's a lot of jargon and job titles out there that people hear and might be confusing as to how they fit in like data scientists versus machine learning engineer or research scientist or data engineer.

But maybe it would be good to kind of talk about the various pieces of the AI workflow and where certain people might fit in in terms of a team of people working on these sorts of solutions. That's a great idea. From my perspective, when you're thinking about the workflow that often happens here, there is sort of an initial phase, which involves a lot of problem defining and scoping in terms of what may or may not be possible and what might be good to experiment with or try. And also an exploratory kind of phase of data gathering and pre-processing in an exploratory and interactive way doing some model training and sort of proof of concept, evaluation, and validation of a certain process.

So for example, if you're a manufacturing company and you say, oh, we've got this problem on our manufacturing line, and we think maybe we could stick a camera in this location and detect this problem or something like that, you have to figure out, OK, well, what would I want as my input and output data? What's actually going to be fed in? Could this camera be placed? What would be the appropriate output that would actually be useful?

And then in an exploratory way, could I actually gather some of the data which would allow me to train that sort of model? And if I could gather that data, what sort of model might I go after? And all of this stuff is very iterative and fuzzy. I guess this is the fuzzy phase?

I don't know if you'd agree with me. I think a lot of these projects start out that sort of way. It does. There is expertise required on the front end.

In real life, you don't jump in to model development. I think there's this kind of perception of come join us, hop on, pick an environment, whatever you care about, and build a model. But there's a whole lot of work that goes into it on the front end before you even get to exploring in the data context, you've got to figure out what is it that you think you want to build and why? And why on earth would this particular approach be the right approach?

And why would AI bring value in versus similar solution? Yeah, I mean, that's a great point. And there might be five different ways of approaching a solution to the problem. And if building a neural network is the most expensive approach to doing that.

And when I say expensive, I mean the amount of effort and time and resources necessary to do it, why would you do that? If you can get a result that's just as good from some other algorithmic approach and you need whatever problem you're going to solve, you need expertise as far as being a domain expert on that problem area. And that might mean working with the business side of your company on what it is that they're trying to provide for customers. Because at the end of the day, that's what a company is there to do.
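The point about not reaching for the most expensive approach can be written down as a simple selection rule: among the approaches that meet the accuracy the business actually needs, pick the one that costs the least effort. The sketch below is only an illustration; the `Candidate` class, the approach names, and all of the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float      # measured on held-out data (made-up numbers below)
    effort_weeks: float  # rough estimate of time and people needed


def cheapest_good_enough(candidates, min_accuracy):
    """Return the lowest-effort candidate that meets the accuracy target."""
    viable = [c for c in candidates if c.accuracy >= min_accuracy]
    if not viable:
        return None
    return min(viable, key=lambda c: c.effort_weeks)


options = [
    Candidate("threshold rule", 0.88, 1),
    Candidate("gradient-boosted trees", 0.93, 4),
    Candidate("neural network", 0.94, 12),
]

choice = cheapest_good_enough(options, min_accuracy=0.92)
print(choice.name if choice else "nothing meets the target")
```

Here the neural network is slightly more accurate, but the selection rule passes it over because a cheaper option already clears the bar. Setting that bar is where the domain expertise comes in.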

And we're just barely touching on the front end of this process. So there are so many ways to engage in this AI process that we're talking about that don't require that you have a PhD in data science from a top university and have 30 years of data under your belt. Yeah, I think actually there's this category of contribution, I guess we can call it. This problem defining, scoping, exploratory stuff.

In fact, I think there is a solution architect role here where you do need some type of knowledge about AI systems and what is possible and what is feasible and what isn't feasible and what's overkill and what's not overkill and appropriate usage. And scoping in terms of how long this is going to take or how much data we might need. But those are skills that you can pick up without knowing the difference between LSTM and GRU. That level of detail is not required, I think, for this sort of thing.

Although I may not be one of them, there are people out there that I think really enjoy that going into a situation or a problem, maybe dealing with a client on a shorter time scale, a few months, and scoping out a potential solution and then passing that off to another team to actually do some more implementation and production-related things. Absolutely. I'm one of those people sometimes. It's one of the things that I do in my own job.

And I'll tell you, having built up some expertise in the field, if you can go talk to people in the front end and help them figure out what it is they should be thinking about, what's going to serve the need, it can be quite fulfilling. And it does take some understanding and expertise of the field to be able to do that successfully. If you go in and only do the business analysis without any background at all, and no interest in developing the background, you won't be as effective at being able to decide that. So strategy is a key part of the front end of this process.

Yep. And I think once the problem starts shaping up, and this seems like it's going to be a valuable thing to do, there's still that exploratory phase of getting an initial proof of concept data set together, proving out that this will actually work and produce the type of value that we want. And oftentimes, in this stage of things, I think getting a brute force solution is how I think about it, in the sense that this thing might not be optimized in every way.

It might not have the exact accuracy or performance that we want, but all of the right things are plumbed together. And the right type of data is coming in. The right type of pre-processing is happening. The right type of model is producing some result, which is then being used to create something of value.

That rough plumbing of those things together requires now some technical skill, but this doesn't have to be a fine-tuned C++ application that runs with super high performance on an embedded device out in the field. This is proving out that the thing works and developing the right type of solution. So I think it's a more technical level, but it's not as hardcore software engineering or data engineering as it could be. When you say that, I agree with everything you just said.
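A "brute force" proof of concept in this sense can be as small as the sketch below: data comes in, gets pre-processed, a deliberately crude model is fit, and a result comes out and is evaluated. None of the stages is optimized, but all of them are connected. The data, the stage names, and the midpoint-threshold "model" are stand-ins made up for illustration.

```python
def load_records():
    """Stand-in for data gathering: hard-coded (reading, label) pairs."""
    return [(0.2, 0), (0.9, 1), (0.4, 0), (0.8, 1), (0.7, 1), (0.1, 0)]


def preprocess(records):
    """Clamp readings into [0, 1]; real cleaning steps would go here."""
    return [(min(max(x, 0.0), 1.0), y) for x, y in records]


def train(records):
    """Fit the crudest possible model: the midpoint between the class means."""
    pos = [x for x, y in records if y == 1]
    neg = [x for x, y in records if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, x):
    """Label a reading 1 if it is at or above the learned threshold."""
    return int(x >= threshold)


def evaluate(threshold, records):
    """Fraction of records the threshold model labels correctly."""
    hits = sum(predict(threshold, x) == y for x, y in records)
    return hits / len(records)


def run_pipeline():
    records = preprocess(load_records())
    train_set, test_set = records[:4], records[4:]
    model = train(train_set)
    return model, evaluate(model, test_set)


if __name__ == "__main__":
    threshold, accuracy = run_pipeline()
    print(f"threshold={threshold:.3f} accuracy={accuracy:.2f}")
```

Once the plumbing holds, each stage can be swapped for something better (real data loading, a real model) without changing the overall shape.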

And the way I would express that is that AI development fits very well into an agile software development process, where you're having to iterate and you learn from that iteration and you make those adjustments and you go back. And that happens both at the model level. And it also happens in terms of how you're going to choose to deploy and do the engineering you need to accomplish that. I very much, and I'm going to say something slightly controversial, I think, and that is that I think of AI development as a component of software development, which a lot of data scientists will say, no, it's not, no, it's not.

But when I'm looking at it in production and I'm looking at us actually managing that, I see it in that larger context, because all of those other activities are happening around it. So definitely. We deserve a better internet and the Brave team has the recipe for bringing it to us. Start with Google Chrome, keep the extensions, the dev tools and the rendering engine that make Chrome great.

Rip out the Google bits, we don't need them. Mix in ad and tracker blocking by default, quick access to the Tor network for true private browsing and an opt-in reward system, so you can get paid to view privacy-respecting ads and turn around and use those rewards to support your favorite web creators like us. Download Brave today using the link in the show notes and give tipping a try on changelog.com.

I really liked where you were headed with what you're saying, Chris, in terms of AI development being viewed as a sort of subcategory of software development. I think this fits very well into the kind of mindset of another person we had on the show, Joel Grus, from the Allen Institute for AI. We'll link to his episode. I think he's mainly working on the AllenNLP project. And I think he had a lot more things to say about that and why it's useful.

I definitely think that we kind of started talking about the more technical exploratory stuff where you're trying to figure out what you're going to do and start plumbing the right pieces together and validate a solution. You will see some difference in industry, at least from my perspective. Sometimes at an organization, the people that are doing that are not the same people that are, at the end of the day, involved in producing the production system that's actually implemented. And then you'll see other organizations where at least there is some overlap between the team that does this sort of exploratory work and the team that actually produces production systems.

From my perspective, the latter has a big advantage because if you have total separation between those groups, then when something goes wrong in production, the production team will, maybe in a non-confrontational way, but basically, at the end of the day, say, well, this is a problem with the solution and the model and the way it was developed, not a problem with our implementation. And then the people that did the exploratory work and validated the solution will say, no, our solution's great. There must be something in the implementation. No one's taking ownership of it.

And no one's taking ownership of the robustness of it, in particular, like in how robust the solution is. So I think in a perfect world, there is some overlap between the groups that do those things. No, I agree with you completely. And I think the reason that second group has the advantage is because they are able to learn from those earlier processes.

So if you have one group doing a prototype, they've gone through that process and they've learned what they need to know. And if they're going to hand it off to a production-only group, well, that group is starting from zero again, or from whatever documentation came out of that first thing. So there's certainly an advantage to the learning process, which is why AI/ML development is best served in a larger agile development process. And if you're in that software development world and you're hearing this, these should be familiar terms to you.

And those are all potential inroads for you and your career and your particular interest in this to translate existing skills and existing interests into this AI world and be able to do that. And there's no point where you're ever done. You can continue to migrate across that space by always learning and always deciding where you want to go next and doing that. Yeah.

I think that's crucial for career development in general, but especially in this one. Yeah. And even in the phase of this that's exploratory, I often use this analogy, which listeners will be familiar with. A lot of AI development is more akin to cooking according to a recipe than it is to some intense research and development.

And so even in that exploratory phase, it's taking pieces of things that have been done before and putting them together in a unique solution, which is very similar to software engineering. And if you were to produce a proof of concept in software engineering, the difference, I think, is a sort of tool set difference, maybe, that some software engineers might be a little bit uncomfortable with. Like, in this exploratory phase, you might have a Jupyter notebook that shows, here's how I ingest data. And then here's how I preprocess the data. And then here's how I train my model.

And then here's how I do inference. And then when you move into the production side of things, maybe it gets a little bit more comfortable in terms of the tooling for software engineers, where you would take that notebook and then say, well, I'm not going to run my notebook in production, I've got to take out this data gathering piece and make it a Docker container that's going to run in Kubernetes on AWS. And then I've got to take out this pre-processing container and figure out how to run it in parallel over a large data set in the cloud. And then I've got to take my training piece and pull that out and dockerize that and figure out how to run it on some GPU accelerated infrastructure.
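As a rough illustration of the refactoring described here, the sketch below splits a notebook-style flow into separate stage functions, so that each one could later be packaged and run on its own. The function names, the toy data, and the trivial threshold "model" are all hypothetical stand-ins chosen for illustration; nothing here comes from the episode, and a real project would swap in its own data sources and training code.

```python
from statistics import mean


def ingest():
    """Hypothetical data-gathering stage: returns raw (feature, label) rows."""
    return [("1.0", 0), ("2.0", 0), ("8.0", 1), ("9.0", 1), ("bad", 1)]


def preprocess(rows):
    """Parse features to floats and drop rows that fail to parse."""
    clean = []
    for raw, label in rows:
        try:
            clean.append((float(raw), label))
        except ValueError:
            continue  # skip malformed rows instead of crashing the stage
    return clean


def train(rows):
    """Toy 'model': a threshold halfway between the two class means."""
    negatives = [x for x, y in rows if y == 0]
    positives = [x for x, y in rows if y == 1]
    return {"threshold": (mean(negatives) + mean(positives)) / 2}


def infer(model, x):
    """Inference stage: classify one value against the learned threshold."""
    return int(x >= model["threshold"])


def run_pipeline():
    """Chain the stages the way the notebook cells would run top to bottom."""
    return train(preprocess(ingest()))


if __name__ == "__main__":
    model = run_pipeline()
    print(model, infer(model, 7.5))
```

Because each stage only depends on the output of the previous one, any of them can be moved behind its own entry point (a container, a batch job) without touching the others.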

And those pieces still carry through, but the tool set and the way you go about it definitely changes. Yeah, that's a great point there. And that is that at different points, you may have different people involved in the maturation, the maturity aspect of this process. And so it's really common for software developers to look at a Jupyter notebook for the first time and scoff at it and say, no, I grew up in software development best practices.

I'm looking at this Jupyter notebook, and it's like, why would you do that? But if you were the data scientist that's trying to put the model together, it's a fantastic way of iterating rapidly. And your job at that point is not to produce production software, it's to test and try different things out. You may be implementing a transfer learning approach where you're then trying to customize that transfer learning into the specific solution you need.

And likewise, the data scientist needs to recognize, when you deploy it, you're not deploying that notebook. You were using the notebook for what it's good for, but it has to be a software component. It's a model that's wrapped in a software component that's being deployed out into a larger software system at the end. And so there's a role for all of these things.
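To make "a model that's wrapped in a software component" a little more concrete, here is a minimal sketch of such a wrapper. The `ModelComponent` class, its fields, and `toy_model` are hypothetical names invented for this example; the point is only that the component, not the notebook, owns input validation and versioning.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class ModelComponent:
    """Hypothetical wrapper that exposes a trained model to other software."""

    predict_fn: Callable[[Sequence[float]], float]
    n_features: int
    version: str = "0.1.0"

    def predict(self, features):
        # Validate input shape here so callers get a clear error early.
        if len(features) != self.n_features:
            raise ValueError(
                f"expected {self.n_features} features, got {len(features)}"
            )
        values = [float(v) for v in features]
        return {
            "prediction": self.predict_fn(values),
            "model_version": self.version,
        }


def toy_model(values):
    """Stand-in for a real trained model: returns the mean of the inputs."""
    return sum(values) / len(values)


component = ModelComponent(predict_fn=toy_model, n_features=3)

if __name__ == "__main__":
    print(component.predict([1, 2, 3]))
```

The surrounding system only ever calls `component.predict(...)`, so the model behind it can be retrained or replaced without changing the callers.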

And so leave your biases at the door, leave them there, look for why each tool or each role is so important, and recognize that. Because I've seen people fall down in that way many times. Yep. I know, for example, we had a question in our Slack recently in a discussion about, hey, I hear all of this stuff about training, and I'm able to run these examples.

But then when I try to do this inference in production, the performance is so terrible. Why is no one talking about this, or why is it so hard to find resources about this? Great question. There definitely are resources out there.

And I think, like the commenter said, it would be great to have even a full episode about that side of things and model optimization. That is another piece of the puzzle that changes when you move later on into a project. If I'm running this on an edge device in a manufacturing plant, it's going to have its own concerns. If I'm doing it on a mobile device, it'll have different challenges. If I'm doing it on a beefy cloud instance, then you have maybe more flexibility.
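One simple starting point for the inference-performance question raised here is to measure per-request latency on the actual target hardware before optimizing anything. The sketch below is a minimal example using only the standard library; `measure_latency` and `toy_predict` are hypothetical names, and `toy_predict` is a placeholder for whatever real model call is being deployed.

```python
import math
import time
from statistics import median


def measure_latency(predict_fn, inputs, warmup=3):
    """Time each call to predict_fn and summarize the results in milliseconds."""
    if not inputs:
        raise ValueError("inputs must not be empty")

    # Warm-up calls are not timed, so one-time setup costs do not skew results.
    for x in inputs[:warmup]:
        predict_fn(x)

    timings_ms = []
    for x in inputs:
        start = time.perf_counter()
        predict_fn(x)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    timings_ms.sort()
    # Nearest-rank 95th percentile over the sorted timings.
    p95_index = math.ceil(0.95 * len(timings_ms)) - 1
    return {
        "n": len(timings_ms),
        "median_ms": median(timings_ms),
        "p95_ms": timings_ms[p95_index],
        "max_ms": timings_ms[-1],
    }


def toy_predict(x):
    """Placeholder for a real model call; does a small amount of arithmetic."""
    return sum(i * x for i in range(1000))


stats = measure_latency(toy_predict, list(range(50)))

if __name__ == "__main__":
    print(stats)
```

Running the same measurement on the edge device, the mobile device, and the cloud instance gives comparable numbers, which makes it easier to see where optimization effort is actually needed.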

But you may have latency issues you want to deal with or something in responding to people. That's a great question from the listener. And I love how you led into that. And really, I'm not sure it's an official term or not at this point.

But we have conversations I know in my own collection of colleagues about this all the time. We refer to it as AI engineering. And I think the thing that is so crucial about that is to recognize that two years ago, we were talking about the edge as kind of an exception case, because people were really deploying most often into servers or locally or whatever. And it was more of a standard, well-known environment.

But going forward, most things will be at the edge. As you make models and the utility of models pervasive in our society and our culture, you're going to see edge devices being the targets of that deployment in so many different ways. And so that requires that you rethink your engineering to accommodate that. Once upon a time, deploying software was really code-centric.

And you'd think about just processors and stuff like that. But now it's all about data. If you are deploying to some sort of mobile platform, maybe it's an autonomous vehicle, you have telemetry from that vehicle. You have sensors in that vehicle.

You have cameras in that vehicle. And to provide the level of performance you need to be able to do real-time inference on that requires special knowledge of engineering on getting the right data in the right way to the right place at the right time so that it can be acted upon. And you no longer are doing static data that you're running through a server or something. AI engineering is crucial for making this stuff actually work.

It's later in the process than what we were talking about. But after that data scientist has been working in the Jupyter notebook, you've got to either put it out there in the world or it's useless. It doesn't do anything for you. Another piece of this puzzle is actually, I think.

So there's the AI workflow and the different phases along a project all the way from kind of solution architecting or consulting to the very technical side of AI engineering things. But then there's also, I think, you could look at that workflow in different domains or verticals. And that's going to look very different. Of course, in maybe the manufacturing world, you're going to be thinking a lot about computer vision and running things on edge devices in potentially hazardous conditions where they might have to do.

You have a lot of device issues. In other cases, in web space, if you have a web app that you're dealing with or software as a service company, then you might be running your models a lot of time in the cloud. And maybe you're dealing with a lot of natural language processing issues and dialogue-related issues with customer service and all of that. And each of those sets of problems has its own tooling and its own methods and its own community and its own way of going about things.

And so I think another thing to think about when you're thinking about the lay of the land is also the domain. And I think, like you said, this happens in software engineering, too. And people have specialized in certain areas of software engineering. And AI, I think, will be no different.

There's a lot of specialization that can happen. Yeah, I think my own experience definitely bears that out. If I look at, counting my current employment, the last three organizations that I've been a part of, all three had an AI role. In the first one, we were working with clients, and it was server-based.

It was kind of what I think of as a little bit old school now. It's funny that it doesn't take very long for something to become old school because it evolves so fast. But yes, we were deploying models into big servers that were resource-rich. And then in the next organization I went to, we were focused on warehouse spaces and introducing robotics and cameras and different things that make logistics work.

And that presented a different set of challenges that were specific to the domain. And then now I've moved into the defense industry. And I focus on autonomous platforms and other adjacent technologies. And some of the previous things certainly had an effect.

But this is a new domain that has its own specific constraints and challenges. And that's the case. So we are definitely seeing diversity in how AI is conceived and implemented depending on the context that you're using it in. Yep.

Well, one thing that's true across all of these workflows and domains is that definitely you're going to have to deal with bias in your data and model fairness. And this kind of brings us to the end of our conversation, where we're going to share some learning resources with you. And I think in light of our current climate and things going on in our world, it's only natural to share some resources about bias in your data and model fairness. I think that one of those resources, which maybe is a good jumping off point, there's a nice write up in Google's machine learning crash course about fairness and types of bias.

And I thought this was pretty interesting. And maybe certain branches of science have similar terminology around this sort of thing. I think about, like, survey science, for example, which thinks about bias a lot in populations and those sorts of things. So this was really helpful for me to kind of pick up some of this terminology and examples.

They actually go through and talk about reporting bias, automation bias, selection bias, group attribution bias, and others, and give examples of those types of biases and how they can creep into your data, which I thought was incredibly useful. I don't know how familiar you are with some of these things, Chris, but it was really helpful for me, because I was not familiar with the sort of categories that you could think about bias in. Yeah, totally. And in the involvement I have in the AI ethics space, bias is a huge part of it.

It's probably the concern that most people associate most with AI ethics. It's the thing that people think about first. And so understanding those different types of bias and how they impact an outcome and how they can result in unexpected outcomes, which can be incredibly common, is pretty important. So it's a good first way to get into that.
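As one concrete example of how a bias type like selection bias can be checked for in data, the sketch below compares how often each group appears in a sample against that group's share of a reference population. The function name `representation_gaps`, the group labels, and the reference shares are all hypothetical and made up for illustration; this is not taken from Google's crash course.

```python
from collections import Counter


def representation_gaps(sample_groups, reference_shares):
    """Return sample share minus reference share for each reference group.

    A positive gap means the group is over-represented in the sample,
    a negative gap means it is under-represented.
    """
    if not sample_groups:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


# Hypothetical sample: group "a" is 80% of the data but 50% of the population.
sample = ["a"] * 8 + ["b"] * 2
gaps = representation_gaps(sample, {"a": 0.5, "b": 0.5})

if __name__ == "__main__":
    for group, gap in gaps.items():
        print(f"{group}: {gap:+.2f}")
```

A large gap does not prove the resulting model is unfair, but it is a cheap signal that the way the data was collected deserves a closer look.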

And kind of going back, I think it's particularly applicable as we have this episode at this particular time, given the large public response to injustice, to think about how some of these tools, I've already heard, are being used in unexpected ways against protesters, for instance, even ones that are not breaking the law in any way. And as we think about different types of bias here, think about how you want these tools to be applied. You know, facial recognition can occur long before or after a protest event by following people through cameras and doing automatic tracking. There's a lot of impact on how we may want to think about this.

I'd also encourage people, just a couple more quick mentions here, to take a look at IBM's Fairness 360 website. It includes a really great breakdown of the various ways that people are dealing with fairness: pre-processing of data, in-processing or model change (actual changes to your model that you can make), and also post-processing and monitoring of your predictions. They talk about a whole variety of things with great examples.
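To give a feel for what monitoring predictions for fairness can look like, here is a minimal sketch that computes the rate of positive predictions per group and the gap between the highest and lowest rate (often called the demographic parity difference). This is plain Python written for illustration, not the Fairness 360 API, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    """Fraction of positive (== 1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group positive rate (0 means equal)."""
    rates = positive_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical predictions for two groups of four people each.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
grps = ["a", "a", "a", "a", "b", "b", "b", "b"]

if __name__ == "__main__":
    print(positive_rates(preds, grps))
    print(demographic_parity_difference(preds, grps))
```

Tracking a number like this over time on production predictions is one way to notice when a deployed model starts treating groups differently, though which fairness metric is appropriate depends heavily on the application.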

So check that out. Also, Google's responsible AI practices. They have a great write-up and discussion of fairness and bias. There's also a good project from DrivenData called Deon, which includes a nice checklist.

If you like checklists, you can sort of start with a default checklist and update it to make sure that you're checking for certain things like bias and fairness in a project. And that can be embedded within your repository, within a Jupyter notebook or other things. So we'll link to all of those in our show notes. I think it's well worth people's time to take a look at those things and educate themselves about how bias can creep into their process.

Totally. There's one other one that I'll throw out that has been useful beyond the industry that it started in, and that is because of the process that the US Department of Defense entered into on their AI ethical principles. We had a show where we addressed that in depth previously. They went out into industry and academia and solicited feedback from many, many different people in the space, many of them luminaries whose names you would recognize. And if you Google DOD AI principles, you'll find that they have their five, just like Google and Microsoft and all the other players do. But I've noticed recently that they're being adopted in completely different use cases, because they're not necessarily specific to the industry that they were formed in.

So that's a really good one that I end up interacting with quite a lot. Awesome. Well, it's been great to have a conversation with you again, Chris, great to have you back. And looking forward to our future conversations and how those will be shaped with our ever-changing world in the future.

But I appreciate our listeners hanging with us through this spring, with changes in our schedule and also changes in your life and being in different places than you normally would be. Glad that you continued to stick with us and looking forward to more conversations. Absolutely. And for my part, I just want to thank the listeners for bearing with us as we started the show and for my sharing what had happened to me.

In the show notes, I'm also going to include a link to my experience of COVID in a way. So if it's something you're interested in and want to know somebody that's actually dealt with it in a first-hand way, you can check that out in addition to the normal notes for the show. Thank you so much for listening. Thank you for listening to this episode of Practical AI.

People ask us all the time. They say, hey, how can I support your work? One easy way is to leave a five-star review on Apple Podcasts. Tell folks why you listen and why they should, and it only takes about 30 seconds.

And believe it or not, those ratings and reviews really do help us rank higher in AI-related search results. Practical AI is hosted by Daniel Whitenack and Chris Benson. It's produced by Jerod Santo. That's me.

And our music is brought to you by the one and only Breakmaster Cylinder. We are sponsored by amazing people at companies who get it. Thanks again to Fastly, Linode, and Rollbar. Did you know we have a Master Feed of all Changelog podcasts?

We do. It's your one-stop shop for everything we produce. If you like this show, you'll love The Changelog, Brain Science, and Go Time. Check it out at Changelog.com or search for Changelog Master in your favorite podcast app.

You'll find us. That's it for now. We'll talk to you again next week.


Frequently Asked Questions

How long is this episode of Changelog Master Feed?

This episode is 50 minutes long.

When was this Changelog Master Feed episode published?

This episode was published on June 22, 2020.

What is this episode about?

This full connected has it all: news, updates on AI/ML tooling, discussions about AI workflow, and learning resources. Chris and Daniel breakdown the various roles to be played in AI development including scoping out a solution, finding AI value,...
