Episode 81: Research, Engineering, and Product in Machine Learning with Aarti Bagul
Aarti Bagul is a machine learning engineer at Snorkel AI. Before Snorkel, she worked closely with Andrew Ng in various capacities: (1) at AI Fund helping build ML companies from scratch internally and investing in ML companies, (2) as an ML engineer at his startup Landing AI, (3) as head TA for his deep learning class CS230, and (4) as an assistant in his research lab at Stanford. Aarti graduated with a master's in Computer Science from Stanford, where she participated in the Threshold Venture and Greylock X fellowships. Before Stanford, she got her bachelor's in Computer Science and Computer Engineering from NYU with the highest honors. During her time at NYU, she worked in David Sontag's lab on machine learning applications to clinical medicine and at Microsoft Research as a research intern for John Langford, where she contributed to Vowpal Wabbit, an open-source project.
First published
01/20/2022
Genres:
technology
careers
Similar Episodes
Episode 88: Sales Engineering and Future of Work with Evan Cummack
Release Date: 04/03/2022
Description: Show Notes
(02:00) Evan shared his upbringing: born and raised in a small coastal town on New Zealand’s North Island, he later studied Software Engineering and Business.
(03:55) Evan recalled working as a software solution architect at NEC Corporation back in New Zealand.
(06:17) Evan talked about his decision to join Twilio in 2011 as one of the company’s early employees, right after its Series B financing.
(08:40) Evan shared his perspectives on joining startups and big companies as a new grad.
(13:01) Evan provided insights on the attributes of exceptional sales engineers, given his time building the first iteration of Twilio’s global pre-sales team.
(17:30) Evan unpacked the evolution of his career at Twilio: working as a product manager, a director of product & engineering, and a general manager of IoT & wireless.
(22:51) Evan dissected Twilio’s unique “middle-out” sales strategy, which has hugely impacted the company’s growth from Series B through to IPO and beyond.
(29:03) Evan went over the untapped opportunity being enabled by new cellular IoT technologies.
(33:25) Evan explained his decision to embark on a new journey as the CEO of Fin.com after a decade at Twilio.
(37:26) Evan talked about the need for workflow automation and how Fin’s product features are built to address that.
(40:35) Evan went over Fin’s remote performance optimization capabilities that help teams thrive in a remote-first environment.
(42:56) Evan shared valuable hiring lessons for attracting the right leaders who are excited about Fin’s mission.
(45:38) Evan shared the hurdles his team had to go through while finding early customers for Fin (as it pivoted to building a SaaS product).
(48:02) Evan talked about the qualities of Jeff Lawson that made him such a great CEO.
(50:41) Closing segment.
Evan’s Contact Info: Twitter, LinkedIn
Fin’s Resources: Website, LinkedIn, Twitter
“Fin.com Raises $20M from Coatue” (Sep 2021)
“Customers Operations Benchmarks for 2022” (Nov 2021)
“Fin’s new Experiments Product Enables CX teams to Confidently Deliver Business Process Changes that Maximize Business Impact” (Dec 2021)
Mentioned Content
People: Jack Dorsey, Bret Taylor, Paul Buchheit
Book: “Startup CXO: A Field Guide to Scaling Up Your Company’s Critical Functions and Teams” (by Matt Blumberg)
About the show
Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths, from scientists and analysts to founders and investors, to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.
Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing [email protected]. Subscribe by searching for Datacast wherever you get podcasts, or click one of the links below:
Listen on Spotify
Listen on Apple Podcasts
Listen on Google Podcasts
If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.
This is a public episode. If you’d like to discuss this with other subscribers or get access to bonus episodes, visit datacast.substack.com/subscribe
Explicit: No
Episode 81: Research, Engineering, and Product in Machine Learning with Aarti Bagul
Release Date: 01/20/2022
Description: Timestamps
(02:00) Aarti shared her upbringing, growing up in India and moving to New York for her undergraduate studies.
(04:47) Aarti recalled her academic experience getting dual degrees in Computer Science and Computer Engineering at New York University.
(07:17) Aarti shared details about her involvement with the ACM chapter and the Women in Computing club at NYU.
(10:46) Aarti shared valuable lessons from her research internships.
(14:16) Aarti discussed her decision to pursue an MS degree in Computer Science at Stanford University.
(20:27) Aarti reflected on what she learned as the Head Teaching Assistant for CS 230, one of Stanford’s most popular Deep Learning courses.
(23:59) Aarti shared her thoughts on ML applications in both clinical and administrative healthcare settings.
(26:47) Aarti unpacked the motivation and empirical work behind CheXNet, an algorithm that can detect pneumonia from chest X-rays at a level exceeding practicing radiologists.
(29:39) Aarti went over the implications of MURA, a large dataset of musculoskeletal radiographs containing over 40,000 images from close to 15,000 studies, for ML applications in radiology.
(32:50) Aarti went over her experience working briefly as an ML engineer at Andrew Ng’s startup Landing AI and applying ML to visual inspection tasks in manufacturing.
(36:56) Aarti talked about her participation in external entrepreneurial initiatives such as the Threshold Venture Fellowship and the Greylock X Fellowship.
(43:41) Aarti reminisced about her time in a hybrid ML engineer/product manager/VC associate role at AI Fund, which works intensively with entrepreneurs during their startups’ most critical and risky phase, from 0 to 1.
(48:43) Aarti shared advice that AI Fund companies tended to receive regarding product-market fit and go-to-market strategy.
(54:04) Aarti walked through her decision to join Snorkel AI, the startup behind the popular Snorkel open-source project capable of quickly generating training data with weak supervision.
(56:36) Aarti reflected on the difference between being an ML researcher and an ML engineer.
(01:00:18) Closing segment.
Aarti’s Contact Info: LinkedIn, Twitter, Google Scholar
People: Andrew Ng, John Langford, David Sontag
Books and Papers
“The Art of Doing Science & Engineering” (by Richard Hamming)
“Deep Medicine: How AI Can Make Healthcare Human Again” (by Eric Topol)
“CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning” (Dec 2017)
“MURA: Large Dataset for Abnormality Detection in Musculoskeletal Radiographs” (May 2018)
Explicit: No
Episode 71: Trusted AI with Saishruthi Swaminathan
Release Date: 09/09/2021
Description: Timestamps
(01:59) Saishruthi talked about her upbringing, growing up in a rural town in India with no Internet connection and no computers.
(05:50) Saishruthi discussed her undergraduate studies in Electrical Engineering at Sri Sairam Engineering College in the early 2010s.
(11:56) Saishruthi mentioned the projects and learnings from her two years working at Tata Consultancy Services as an instrumentation engineer.
(15:57) Saishruthi went over her MS degree in Electrical Engineering at San Jose State University and her journey into data science.
(22:20) Saishruthi shared the initial hurdles she faced transitioning back to school and assimilating to US culture.
(26:10) Saishruthi touched on her work with San Jose City on disaster management.
(28:20) Saishruthi went over her job search process, eventually landing a data science position at IBM.
(32:16) Saishruthi unpacked lessons learned from public speaking.
(35:20) Saishruthi summarized IBM’s data science and machine learning initiatives.
(37:02) Saishruthi brought up various projects happening at IBM’s Center for Open Source Data and AI Technologies, whose mission is to make open-source AI models dramatically easier to create, deploy, and manage in the enterprise.
(39:40) Saishruthi unpacked the qualities needed to contribute to open-source projects and their role in shaping the development of ML technologies.
(44:50) Saishruthi dissected examples of bias in ML, identified solutions to combat unwanted bias, and presented tools for that (as delivered in her talk titled “Digital Discrimination: Cognitive Bias in Machine Learning”).
(49:12) Saishruthi shared her thoughts on the evolution of research and applications within the Trusted AI landscape.
(54:07) Saishruthi discussed the core value propositions of IBM’s Elyra, a set of AI-centric extensions to JupyterLab that aims to help data practitioners deal with the complexities of the model development lifecycle.
(56:11) Saishruthi briefly shared the challenges of developing Coursera courses on data visualization with Python and with R.
(01:00:47) Saishruthi went over her passion for movements such as Women In Tech and Girls Who Code.
(01:03:27) Saishruthi shared details about her initiative to bring education to rural children.
(01:06:36) Closing segment.
Saishruthi’s Contact Info: Twitter, LinkedIn, Medium, GitHub, Coursera
Mentioned Content
Talks: “Digital Discrimination: Cognitive Bias in Machine Learning” (All Things Open 2020)
Projects: AI Fairness 360, AI Explainability 360, Adversarial Robustness Toolkit, Model Asset Exchange, Data Asset Exchange, Elyra
Courses: Data Visualization with Python, Data Visualization with R
Explicit: No
Episode 78: Open-Source Investing and Data Product Management with Julia Schottenstein
Release Date: 01/03/2022
Description: Timestamps
(01:40) Julia compared growing up in New York with moving to San Francisco.
(03:05) Julia discussed her overall undergraduate experience at Stanford, getting dual degrees in Computer Science and Management Science & Engineering.
(05:40) Julia went over her time as an Investment Banker at Qatalyst Partners, notably working on Microsoft’s acquisition of LinkedIn.
(09:11) Julia talked about her career transition to venture capital, working as an associate investor at New Enterprise Associates.
(10:46) Julia emphasized the importance of getting up to speed and forming an investment thesis as a new investor.
(15:05) Julia discussed her Series A investment in Metabase, an open-source business intelligence software project.
(18:36) Julia unpacked her investment(s) in Sentry, an application monitoring platform that helps developers monitor apps in real time to catch bugs early.
(20:14) Julia explained her investment in the Series B round for Anyscale, an end-to-end computing platform that makes building and managing a scaled application across clouds as easy as developing an app on a single computer.
(23:03) Julia contextualized her investments in the seed round for Datafold, a data observability platform that equips analytics engineers with the tools to address data quality issues.
(24:24) Julia shared typical hiring and go-to-market decisions that companies need to make (depending upon their growth stages and product strategies).
(27:05) Julia mentioned her Metabase application to help investors pick winning open-source startups.
(29:05) Julia rationalized her switch to becoming a product manager at dbt Labs.
(30:34) Julia peeked into the roadmap of dbt Cloud, a hosted service that helps data analysts and engineers productionize dbt deployments.
(33:34) Julia went over an under-invested area and the role of interoperability within the broader data tooling ecosystem.
(37:56) Julia reflected on the difference between being a venture investor and a product manager.
(41:05) Closing segment.
Julia’s Contact Info: LinkedIn, Twitter
dbt’s Resources: Slack Community, Coalesce 2021 Replays, dbt Learn, GitHub, Events and Meetups
Mentioned Content
People
Tristan Handy (Founder and CEO of dbt Labs)
Ali Ghodsi (Co-Creator of Apache Spark, Co-Founder and CEO of Databricks)
Dan Levine (General Partner at Accel Partners)
Book: “Working Backwards: Insights, Stories, and Secrets from Inside Amazon” (by Bill Carr and Colin Bryar)
Notes
My conversation with Julia was recorded back in May 2021. Since the podcast was recorded, a lot has happened at dbt Labs! I’d recommend:
Reading Julia’s recent blog posts on adopting CI/CD and introducing Environment Variables in dbt Cloud.
Watching the talk replays from Coalesce, dbt’s 2nd annual analytics engineering conference.
Listening to Season 1 of the Analytics Engineering Podcast, where Julia co-hosts with Tristan Handy to go deep into the hopes, dreams, motivations, and failures of leading data and analytics practitioners.
Explicit: No
Similar Podcasts
DataCast
Release Date: 08/27/2020
Authors: Mohammad Reza Mashoufi
Description: The DataCast podcast: a podcast about Big Data, Data Mining, Machine Learning, and other terms that are trending today.
Explicit: No
DataCast - Habeas Data FND
Release Date: 04/07/2021
Authors: DataCast - Habeas Data FND
Description: DataCast is the podcast of Habeas Data, the news portal of Nacional de Direito-UFRJ.
Explicit: No
دیتاکست | گفتوگو با طعم علم داده (Datacast | Conversation with a Taste of Data Science)
Release Date: 04/29/2021
Authors: Masoud Kaviani
Description: In the Datacast podcast, we look at how problems in data science are solved and talk to people who have experience working in machine learning, data science, data mining, and artificial intelligence. The Datacast podcast is supported by Chistio (Chistio.ir). See all episodes: Chistio.ir/category/datacast
Explicit: No
Datacast: Data & Analytics at Scale
Release Date: 09/11/2020
Authors: Teradata
Description: Using data and analytics to create actionable insights at scale
Explicit: No