PODCAST · technology

Kana & Mari’s SoundRepos

by Kana & Mari

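Kana and Mari introduce, by voice, notable repositories they have found on GitHub that relate to "sound," such as TTS, MIDI, and audio. They offer a light-footed tour of the open-source world where sound and code intersect.

Kana and Mari's profiles: Kana – Newbie Esports Caster; Mari – Newbie Esports Analyst

Note: The scripts for this show are generated automatically using generative AI. The content may contain errors, so please enjoy it as reference information.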

101

wildminder/awesome-ai-voice

List of open-source TTS, voice cloning, and music generation models

May 13, 2026

1m
100

mahimairaja/voiceai

Set of resources to help those building Voice AI agents

May 12, 2026

1m
99

PowerBeef/QwenVoice

Vocello is a local-first voice generation app for Apple Silicon Macs. Public beta for macOS 26; QwenVoice v1.2.3 remains the stable macOS 15 fallback.

May 11, 2026

1m
98

r9y9/ttslearn

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

May 10, 2026

1m
97

livekit-examples/kitt

Talk to ChatGPT in real time using LiveKit

May 9, 2026

1m
96

yl4579/PL-BERT

Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions

May 8, 2026

1m
95

ElmTran/praises

Praises is a text-to-speech tool that can help you read text easily.

May 7, 2026

1m
94

Elleo/pied

Pied makes it simple to install and manage text-to-speech Piper voices for use with Speech Dispatcher.

May 6, 2026

1m
93

1neReality/MITSUHA

World's First Multilingual Inexpensive Therapeutic Sophisticated Ultra-responsive Holographic Agent. In simple terms, an AI you can talk to and it'll talk back with a body using VTube Studio.

May 5, 2026

2m
92

rishikksh20/iSTFTNet-pytorch

iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

May 4, 2026

1m
91

frostming/tetos

A unified interface for multiple Text-to-Speech (TTS) providers.

May 3, 2026

1m
90

atomicoo/FCH-TTS

A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese, Japanese, Korean, Russian, and Tibetan (tested so far).

May 2, 2026

1m
89

Executedone/Chinese-FastSpeech2

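Continues training on the Biaobei (标贝) dataset and improves on the original FastSpeech2 model by introducing a prosody representation and a prosody prediction module, making Chinese pronunciation more vivid and rhythmic.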

May 1, 2026

1m
88

JackismyShephard/ultimate-rvc

An app for creating audio-based content such as song covers and speech using Retrieval-based Voice Conversion.

Apr 30, 2026

2m
87

mathigatti/midi2voice

Singing synthesis from MIDI file

Apr 29, 2026

1m
86

trymirai/uzu

A high-performance inference engine for AI models

Apr 28, 2026

1m
85

maum-ai/univnet

Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)

Apr 27, 2026

1m
84

LSimon95/megatts2

Unofficial implementation of Megatts2

Apr 26, 2026

1m
83

makerjackie/MTTS

A Demo of Mandarin/Chinese TTS frontend

Apr 25, 2026

1m
82

haoheliu/voicefixer_main

General Speech Restoration

Apr 24, 2026

1m
81

ManimCommunity/manim-voiceover

Manim plugin for all things voiceover

Apr 23, 2026

1m
80

travisvn/obsidian-edge-tts

Free, high quality text-to-speech for your Obsidian notes, leveraging Microsoft Edge's Read Aloud API.

Apr 22, 2026

1m
79

zlargon/google-tts

Google TTS (Text-To-Speech) for node.js

Apr 21, 2026

1m
78

developersdigest/ai-devices

AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more

Apr 20, 2026

1m
77

zarazhangrui/personalized-podcast

Turn any content into a personalized AI podcast. NotebookLM-style, except you control the script, voices, and hosts. Listen in Apple Podcasts, Spotify, or any podcast app.

Apr 19, 2026

1m
76

debpalash/OmniVoice-Studio

A cinematic audio dubbing, cloning, and voice generation studio

Apr 18, 2026

1m
75

akdeb/ElatoAI

Realtime Voice AI with 100+ Models on Arduino ESP32 for AI Toys, Companions, and Devices

Apr 17, 2026

2m
74

izwi-ai/izwi

Voice AI runtime. Local first transcription, speaker diarization, TTS, and voice cloning with an OpenAI compatible API.

Apr 16, 2026

1m
73

moonshine-ai/moonshine

Very low latency speech to text, intent recognition, and text to speech, for building voice agents and interfaces

Apr 15, 2026

1m
72

Saganaki22/ComfyUI-OmniVoice-TTS

OmniVoice TTS nodes for ComfyUI - Zero-shot multilingual text-to-speech with voice cloning, voice design, and multi-speaker dialogue

Apr 14, 2026

1m
71

OpenMOSS/MOSS-TTS-Nano

MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed for realtime speech generation, can run directly on CPU without a GPU, and keeps the deployment stack simple enough for local demos, web serving, and lightweight product integration.

Apr 13, 2026

1m
70

Aratako/T5Gemma-TTS

Multilingual TTS model with voice cloning and duration control, based on T5Gemma encoder-decoder LLM

Apr 12, 2026

1m
69

lmnt-com/wavegrad

A fast, high-quality neural vocoder.

Apr 11, 2026

1m
68

mbzuai-oryx/LLMVoX

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Apr 10, 2026

2m
67

Adri6336/gpt-voice-conversation-chatbot

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

Apr 9, 2026

1m
66

richardr1126/openreader

An open-source read-along document reader server with high-quality TTS options, synchronized highlighting, and audiobook export for EPUB, PDF, DOCX, TXT, and MD.

Apr 8, 2026

1m
65

Aratako/Irodori-TTS

A Flow Matching-based Text-to-Speech Model with Emoji-driven Style Control

Apr 7, 2026

1m
64

LlmKira/fast-langdetect

⚡️ 80x faster Fasttext language detection out of the box | Split text by language

Apr 6, 2026

1m
63

Sharrnah/whispering-ui

Native UI for the Whispering Tiger project - https://github.com/Sharrnah/whispering (live transcription / translation)

Apr 5, 2026

1m
62

keonlee9420/Expressive-FastSpeech2

PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

Apr 4, 2026

1m
61

Agents365-ai/video-podcast-maker

AI-powered video podcast creation skill for coding agents. Supports Bilibili & YouTube, multi-language (zh-CN/en-US), 6 TTS engines (Edge/Azure/ElevenLabs/OpenAI/Doubao/CosyVoice), 4K Remotion rendering.

Apr 3, 2026

1m
60

funnyzak/tts-now

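A cross-platform text-to-speech assistant built on the speech synthesis APIs of cloud platforms (Alibaba Cloud, iFLYTEK, and others). Supports quick single-text synthesis and batch synthesis. Runs on Windows, macOS, and Linux.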

Apr 2, 2026

1m
59

yandexdataschool/speech_course

YSDA course in Speech Processing.

Apr 1, 2026

1m
58

FlorianEagox/WeeaBlind

A program to dub non-English media with modern AI speech synthesis, diarization, and voice cloning!

Mar 31, 2026

1m
57

sipeter/CloneTTS

A lightweight, offline Android Text-to-Speech (TTS) engine that runs locally on the device, enabling seamless system-wide voice cloning and high-fidelity text reading. Supports offline speaker extraction, zero-barrier voice cloning, and dual-engine system-wide audiobook listening.

Mar 30, 2026

1m
56

TrevorS/voxtral-mini-realtime-rs

Voxtral ASR & TTS running natively and in the browser. A Rust implementation of Mistral's Voxtral mini realtime ASR / TTS using the Burn ML framework

Mar 29, 2026

2m
55

Poeschl/Hassio-Addons

The repository for my Home Assistant Supervisor Add-ons.

Mar 28, 2026

1m
54

Migushthe2nd/MsEdgeTTS

A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API. https://www.npmjs.com/package/msedge-tts

Mar 27, 2026

1m
53

keonlee9420/Comprehensive-Transformer-TTS

A Non-Autoregressive Transformer-based Text-to-Speech model, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

Mar 26, 2026

1m
52

Rongjiehuang/GenerSpeech

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

Mar 25, 2026

1m


ABOUT THIS SHOW

HOSTED BY

Kana & Mari

Produced by Aquariumy Studio Inc.
