EPISODE · Dec 31, 2025 · 19 MIN
245: Benchmarking DNA foundation models
from Base by Base · host Gustavo Barra
Feng H et al., Nat Commun - A comprehensive, unbiased benchmark compares five DNA foundation models across 57 datasets and multiple tasks, finding mean token embeddings improve classification and that model strengths vary by task and pre-training.

Key terms: DNA foundation models, mean token embedding, sequence classification, variant effect, gene expression.

Study Highlights: The study evaluated DNABERT-2, NT-v2, HyenaDNA, Caduceus-Ph, and GROVER on 57 datasets spanning sequence classification, gene expression prediction, variant effect quantification, and TAD recognition. Mean token embedding consistently and significantly outperformed summary-token and max pooling for sequence classification. Model performance was task-dependent: Caduceus-Ph excelled at human TFBS and promoter tasks, NT-v2 led pathogenic variant identification, HyenaDNA scaled efficiently and benefited from multi-species pre-training, while specialized models outperformed general foundations on QTL prediction. Zero-shot embeddings provided modest gene expression prediction and NT-v2 attention patterns did not reveal inherent TAD recognition.

Conclusion: Mean token pooling yields more robust sequence-level representations and model choice should align with task, input length, and pre-training data for best genomic performance.

Music: Enjoy the music based on this article at the end of the episode.

First author: Feng H
Journal: Nat Commun
DOI: 10.1038/s41467-025-65823-8

Reference: Feng H, Wu L, Zhao B, Huff C, Zhang J, Wu J, Lin L, Wei P & Wu C. Benchmarking DNA foundation models for genomic and genetic tasks. Nat Commun. 2025;16:10780. https://doi.org/10.1038/s41467-025-65823-8

License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/

Support: Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00

Official website: https://basebybase.com

On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics.

Episode link: https://basebybase.com/episodes/dna-foundation-models-benchmark

QC: This episode was checked against the original article PDF and publication metadata for the episode release published on 2025-12-31.

QC Scope:
- article metadata and core scientific claims from the narration
- excludes analogies, intro/outro, and music
- transcript coverage: Audited the transcript's coverage of core scientific claims: DNA foundation models benchmarking, pooling strategies (mean token embedding), zero-shot embeddings with a downstream classifier, VEQ dichotomy, multispecies pre-training and cross-species generalization, long-sequence performance, TAD recognition limitations
- transcript topics: DNA foundation models and zero-shot embeddings; Pooling strategies for sequence representations (mean token embedding vs summary/max pooling); Downstream classification using zero-shot embeddings (random forest); Variant effect quantification: pathogenic vs QTL (VEQ dichotomy); Multispecies pre-training and cross-species generalization; Cross-species transfer in promoter identification (Arabidopsis example)

QC Summary:
- factual score: 10/10
- metadata score: 10/10
- supported core claims: 8
- claims flagged for review: 0
- metadata checks passed: 4
- metadata issues found: 0

Metadata Audited:
- article_doi
- article_title
- article_journal
- license

Factual Items Audited:
- Mean token embedding consistently improves sequence classification across all foundation models and yields measurable AUROC gains.
- Zero-shot embeddings with frozen weights are evaluated with a downstream random forest classif...
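
Illustration (not from the article): the three pooling strategies named in the highlights each turn a per-token embedding matrix into one sequence-level vector. The sketch below uses a made-up 3-token, 3-dimensional matrix rather than real model output. The function names, the optional padding mask, and the choice of position 0 as the summary token are assumptions for illustration only; real models differ in where, or whether, they place a summary token.

```python
from typing import List, Optional

# One row per token, one column per embedding dimension.
Matrix = List[List[float]]


def mean_pool(tokens: Matrix, mask: Optional[List[bool]] = None) -> List[float]:
    """Average each dimension over the tokens kept by the mask (all tokens by default)."""
    if mask is None:
        mask = [True] * len(tokens)
    kept = [row for row, keep in zip(tokens, mask) if keep]
    if not kept:
        raise ValueError("mean_pool needs at least one unmasked token")
    return [sum(col) / len(kept) for col in zip(*kept)]


def max_pool(tokens: Matrix) -> List[float]:
    """Take the per-dimension maximum over all tokens."""
    return [max(col) for col in zip(*tokens)]


def summary_token(tokens: Matrix, index: int = 0) -> List[float]:
    """Return one token's embedding; index 0 is an assumed [CLS]-style position."""
    return list(tokens[index])


# Toy values only, chosen so the arithmetic is easy to follow.
toy = [
    [1.0, 0.0, 2.0],  # treated as the summary token in this sketch
    [3.0, 4.0, 0.0],
    [2.0, 2.0, 4.0],
]

print(mean_pool(toy))                       # [2.0, 2.0, 2.0]
print(max_pool(toy))                        # [3.0, 4.0, 4.0]
print(summary_token(toy))                   # [1.0, 0.0, 2.0]
print(mean_pool(toy, [False, True, True]))  # [2.5, 3.0, 2.0]
```

In the zero-shot setup described in these notes, a pooled vector of this kind would be computed with the model weights frozen and then passed to a downstream classifier (a random forest, per the QC topics above).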