EPISODE · Dec 8, 2025 · 1H 42M

We don't know what most microbial genes do. Can genomic language models help? (Yunha Hwang, Ep #7)

from Owl Posting · host Abhishaike Mahajan

Note: Thank you to rush.cloud and latch.bio for sponsoring this episode!

Rush is augmenting drug discovery for all scientists with machine-driven superintelligence.

LatchBio is building agentic scientific tooling that can analyze a wide range of scientific data, with an early focus on spatial biology. Clip on them in the episode.

If you’re at all interested in sponsoring future episodes, reach out!

***

This is an interview with Yunha Hwang, an assistant professor at MIT (and co-founder of the non-profit Tatta Bio). She is working on building and applying genomic language models to help annotate the function of the (mostly unknown) universe of microbial genomes.

There are two reasons you should watch this episode.

One, Yunha is working on an absurdly difficult and interesting problem: microbial genome function annotation. Even for E. coli, one of the most studied organisms on Earth, we don’t know what half to two-thirds of its genes actually do. For a random microbe from soil, that number jumps to 80-90%. Her lab is one of the leading groups working to apply deep learning to the problem, and last year it released a paper that increasingly feels foundational to the field (with prior Owl Posting podcast guest Sergey Ovchinnikov as an author!). We talk about that paper, its implications, and where the future of machine learning in metagenomics may go.

And two, I was especially excited to film this so I could help shed some light on a platform that she and her team at Tatta Bio have developed: SeqHub. There’s been a lot of discussion online about AI co-scientists in the biology space, but I have increasingly felt a vague suspicion that people are trying to be too broad with them. It feels like the value of these tools comes not from general scientific reasoning, but rather from deep integration with how a specific domain of research engages with its open problems. SeqHub feels like one of the few systems that mirrors this viewpoint, and while it isn’t something I can personally use (its use case is primarily annotating and sharing microbial genomes, neither of which I work on!), I would still love for it to succeed. If you’re in the metagenomics space, you should try it out!

YouTube: https://youtu.be/w6L9-ySnxZI?si=7RBusTAyy0Ums6Oh
Spotify: https://open.spotify.com/episode/2EgnV9Y1Mm9JV5m9KAY6yL?si=J5ZmF2i3TtuT10D40jjgaw
Apple Podcast: https://apple.co/4pu4TRB
Transcript: https://www.owlposting.com/p/we-dont-know-what-most-microbial

Timestamps:
00:02:07 – Introduction
00:02:23 – Why do microbial genomes matter
00:04:07 – Deep learning acceptance in metagenomics
00:05:25 – The case for genomic “context” over sequence matching
00:06:43 – OMG: the only ML-ready metagenomic dataset
00:09:27 – gLM2: A multimodal genomic language model
00:11:06 – What do you do with the output of genomic language models?
00:17:41 – How will OMG evolve?
00:20:26 – Why train on only microbial genomes, as opposed to all genomes?
00:22:58 – Do we need more sequences or more annotations?
00:23:54 – Is there a conserved microbial genome ‘language’?
00:28:11 – What non-obvious things can this genomic language model tell you?
00:33:08 – Semantic deduplication and evaluation
00:37:33 – How does benchmarking work for these types of models?
00:41:31 – Gaia: A genomic search engine
00:44:18 – Even ‘well-studied’ genomes are mostly unannotated
00:50:51 – Using agents on Gaia
00:54:53 – Will genomic language models reshape the tree of life?
00:59:18 – Current limitations of genomic language models
01:08:54 – Directed evolution as training data
01:12:35 – What is Tatta Bio?
01:19:02 – Building Google for genomic sequences (SeqHub)
01:25:46 – How to create communities around scientific OSS
01:29:06 – What’s the purpose in the centralization of the software?
01:35:37 – How will the way science is done change in 10 years?

Get full access to Owl Posting at www.owlposting.com/subscribe



Frequently Asked Questions

How long is this episode of Owl Posting?

This episode is 1 hour and 42 minutes long.

When was this Owl Posting episode published?

This episode was published on December 8, 2025.

What is this episode about?

Note: Thank you to rush.cloud and latch.bio for sponsoring this episode! Rush is augmenting drug discovery for all scientists with machine-driven superintelligence. LatchBio is building agentic scientific tooling that can analyze a wide range of...

Can I download this Owl Posting episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.