EPISODE · Jan 3, 2023 · 36 MIN
NLP research by & for local communities
from Changelog Master Feed · host Practical AI LLC
While at EMNLP 2022, Daniel got a chance to sit down with an amazing group of researchers creating NLP technology that actually works for their local language communities. Just Zwennicker (Universiteit van Amsterdam) discusses his work on a machine translation system for Sranan Tongo, a creole language that is spoken in Suriname. Andiswa Bukula (SADiLaR), Rooweither Mabuya (SADiLaR), and Bonaventure Dossou (Lanfrica, Mila) discuss their work with Masakhane to strengthen and spur NLP research in African languages, for Africans, by Africans.

The group emphasized the need for more linguistically diverse NLP systems that work in scenarios of data scarcity, non-Latin scripts, rich morphology, etc. You don’t want to miss this one!

Featuring:
Just Zwennicker – LinkedIn
Andiswa Bukula – X
Rooweither Mabuya – X
Bonaventure Dossou – Website, GitHub, LinkedIn, X
Daniel Whitenack – Website, GitHub, X

Show Notes:

EMNLP 2022 papers from the guests:
Towards a general purpose machine translation system for Sranantongo
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages

Other links relevant to the discussion:
Masakhane
Lanfrica
The South African Centre for Digital Language Resources (SADiLaR)

Upcoming Events: Register for upcoming webinars here!