EPISODE · May 6, 2020 · 42 MIN

Modern Natural Language Processing and AI during COVID-19 with Daniel Whitenack

from HumAIn Podcast · host David Yakobovitch

Daniel Whitenack is a Ph.D.-trained data scientist working with Pachyderm. Daniel develops innovative, distributed data pipelines which include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (ODSC, Spark Summit, PyCon, GopherCon, JuliaCon, and more), teaches data science/engineering with Purdue University and Ardan Labs, maintains the Go kernel for Jupyter, and is actively helping to organize contributions to various open source data science projects.

Episode Links:
– Daniel Whitenack’s LinkedIn: https://www.linkedin.com/in/danielwhitenack/
– Daniel Whitenack’s Twitter: @dwhitena
– Daniel Whitenack’s Website: https://datadan.io/

Podcast Details:
– Podcast website: https://www.humainpodcast.com
– Apple Podcasts: https://podcasts.apple.com/us/podcast/humain-podcast-artificial-intelligence-data-science/id1452117009
– Spotify: https://open.spotify.com/show/6tXysq5TzHXvttWtJhmRpS
– RSS: https://feeds.redcircle.com/99113f24-2bd1-4332-8cd0-32e0556c8bc9
– YouTube Full Episodes: https://www.youtube.com/channel/UCxvclFvpPvFM9_RxcNg1rag
– YouTube Clips: https://www.youtube.com/channel/UCxvclFvpPvFM9_RxcNg1rag/videos

Support and Social Media:
– Check out the sponsors above, it’s the best way to support this podcast
– Support on Patreon: https://www.patreon.com/humain/creators
– Twitter: https://twitter.com/dyakobovitch
– Instagram: https://www.instagram.com/humainpodcast/
– LinkedIn: https://www.linkedin.com/in/davidyakobovitch/
– Facebook: https://www.facebook.com/HumainPodcast/
– HumAIn Website Articles: https://www.humainpodcast.com/blog/

Outline: Here are the timestamps for the episode:

(00:00) – Introduction

(02:13) – Being online is pretty normal for me and my team. I am fairly often on calls with people all across the U.S., but also in Singapore, India, Africa and elsewhere, mostly via Zoom.

(02:55) – Our India teammates went fully remote from their office because they're all programmers and software engineers, so they're all working from home.

(03:56) – What's really boosted NLP in the last couple of years are large-scale language models. Oftentimes an AI model that processes text has one encoder, or a series of encoders, for a task such as text classification. What's been really interesting is the large-scale language models that have been trained, like GPT-2, BERT and ELMo, and there are a bunch of others. They're trained on a massive set of data, sometimes even for multiple languages, so that you can apply the model to a wide range of tasks, like translation, sentiment analysis or text classification, just by fine-tuning on one of those tasks with a much smaller amount of data than was required before. That led to this explosion in the application of AI and NLP.

(06:12) – The size of the models has increased a lot and they're processing a lot of data. The word embeddings, the representations of text that are learned in the model, encode a lot about language in general. A couple of studies have shown that you can recover from these embeddings the traditional syntactic structure of text that linguists are familiar with, like grammars, so a lot of information is encoded in these embeddings.

(08:07) – Transfer learning depends a lot on the parent model that you transfer from. There are very multilingual models out there, some including a hundred or so languages (104 or 109, maybe). There are actually 7,117 languages currently being spoken in the world. If we think about a multilingual model with 104 languages in the embeddings that its language model supports, that's a drop in the bucket. Some tasks, like speech-to-text or text-to-speech, especially in NLP platforms, only support maybe 10 to 20 languages, so there's a long way to go in terms of NLP for the world's languages.

(11:29) – I'm really hoping that what we start to see in 2020 is an acceleration of this technology through the long tail of languages. With 7,000 languages, if we tackle one language every six or 12 months, it's going to take us a long time to support things like translation or speech-to-text in all of them. So I'm hoping that we see some sort of rapid adaptation technology come about in 2020 that will let us tackle 40, 50 or a hundred more languages at a time.

(13:46) – Teams are starting to leverage those existing resources, which I don't think have really been tapped into because they're archived in weird ways and aren't in the formats that AI people are typically used to working in. We're just at the tipping point where we can really jump in and use a lot of that data in creative ways.

(15:17) – There are certain languages that maybe aren't being used in the same way that they were before. There are other languages that would be used digitally but just aren't supported yet. There are economic concerns and literacy concerns all wrapped up in this, and we have a lot of data around all of those things.

(18:09) – For chatbots in general, I would say that there's less support than there is for a general technology like Google Translate or machine translation. So it's fewer languages than that, but you can again do some creative things to bridge the gap, like using transfer learning and other techniques to build custom components under the hood to support new languages. whoever does crack the nut of rapidly.

(22:38) – Imagine going into a new language community with a virtual assistant. Imagine if that virtual assistant had the ability to query a natural language. There are still other pieces of that puzzle, like document search, but this is a big step in the right direction.

(26:40) – There's a lot of disruption, that's definitely true, and there are a lot of people experiencing real suffering out there. But at the same time there are also some new opportunities arising.

(36:15) – Our show is really focused, as you might have guessed, on the practicalities of being an AI developer these days, not only for those who are currently AI developers but also for those who would like to be. So we dig into a bunch of the different technology.

(38:03) – Reinforcement learning and generative adversarial networks (GANs) both get a lot of hype because of some of the things they power, like deepfakes. But we haven't really entered a season where reinforcement learning and GANs are powering a lot of enterprise applications the way that deep learning models have actually penetrated.

Advertising Inquiries: https://redcircle.com/brands
Privacy & Opt-Out: https://redcircle.com/privacy
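
The language-coverage figures quoted at (08:07) and (11:29) can be turned into rough arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the six-month pace applied to batches of 50 or 100 languages is an assumption made here for comparison (the episode only mentions tackling "40, 50, a hundred languages more at a time").

```python
import math

# Figures quoted in the episode outline.
LANGUAGES_SPOKEN = 7117    # languages currently spoken in the world
MODEL_LANGUAGES = 104      # languages in one large multilingual model


def coverage_percent(supported: int, total: int = LANGUAGES_SPOKEN) -> float:
    """Share of the world's languages that are supported, in percent."""
    return 100.0 * supported / total


def years_to_cover(remaining: int, languages_per_step: int, months_per_step: int) -> float:
    """Years needed to add `remaining` languages at a fixed pace."""
    steps = math.ceil(remaining / languages_per_step)
    return steps * months_per_step / 12


if __name__ == "__main__":
    remaining = LANGUAGES_SPOKEN - MODEL_LANGUAGES
    print(f"Coverage with {MODEL_LANGUAGES} languages: "
          f"{coverage_percent(MODEL_LANGUAGES):.2f}%")
    # Assumed paces: (languages added per step, months per step).
    for batch, months in [(1, 6), (1, 12), (50, 6), (100, 6)]:
        years = years_to_cover(remaining, batch, months)
        print(f"{batch} language(s) every {months} months: {years:.1f} years")
```

Under these assumptions, 104 languages is roughly 1.5% of the total, and adding one language at a time would take thousands of years, which is why the outline stresses adapting to many languages at once.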

🆕 In this episode: Daniel Whitenack, Modern Natural Language Processing and AI during COVID-19.
🚀 Today's episode is sponsored by Code Story. (http://www.codestory.co/)
💙 Show your support for HumAIn with a monthly membership. (http://www.humainpodcast.com/membership)
🎧 Learn more about your ad-choices. (www.humainpodcast.com/advertise)
📰 Receive subscriber-only content with our newsletter. (http://www.humainpodcast.com/newsletter)

