Kana & Mari’s SoundRepos

PODCAST · technology

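Kana and Mari introduce, by voice, notable repositories they found on GitHub that relate to sound, such as TTS, MIDI, and audio. They offer a light-footed guide through the open-source world where sound and code intersect. Kana and Mari's profiles: Kana – Newbie Esports Caster; Mari – Newbie Esports Analyst. Note: the scripts for this show are generated automatically with generative AI. They may contain errors, so please enjoy them as reference information.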

  1. 101

    wildminder/awesome-ai-voice

    List of open-source TTS, voice cloning, and music generation models

  2. 100

    mahimairaja/voiceai

    A set of resources to help those building Voice AI agents

  3. 99

    PowerBeef/QwenVoice

    Vocello is a local-first voice generation app for Apple Silicon Macs. Public beta for macOS 26; QwenVoice v1.2.3 remains the stable macOS 15 fallback.

  4. 98

    r9y9/ttslearn

    ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

  5. 97

    livekit-examples/kitt

    Talk to ChatGPT in real time using LiveKit

  6. 96

    yl4579/PL-BERT

    Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions

  7. 95

    ElmTran/praises

    Praises is a text-to-speech tool that can help you read text easily.

  8. 94

    Elleo/pied

    Pied makes it simple to install and manage text-to-speech Piper voices for use with Speech Dispatcher.

  9. 93

    1neReality/MITSUHA

    World's First Multilingual Inexpensive Therapeutic Sophisticated Ultra-responsive Holographic Agent. In simple terms, an AI you can talk to and it'll talk back with a body using VTube Studio.

  10. 92

    rishikksh20/iSTFTNet-pytorch

    iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

  11. 91

    frostming/tetos

    A unified interface for multiple Text-to-Speech (TTS) providers.

  12. 90

    atomicoo/FCH-TTS

    A fast Text-to-Speech (TTS) model. Works well for English, Mandarin/Chinese, Japanese, Korean, Russian and Tibetan (so far).

  13. 89

    Executedone/Chinese-FastSpeech2

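    Further trained on the Biaobei (标贝) dataset, with improvements to the original FastSpeech2 model: prosody representations and a prosody prediction module are introduced to make Chinese pronunciation more vivid and rhythmic.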

  14. 88

    JackismyShephard/ultimate-rvc

    An app for creating audio-based content such as song covers and speech using Retrieval-based Voice Conversion.

  15. 87

    mathigatti/midi2voice

    Singing synthesis from MIDI file

  16. 86

    trymirai/uzu

    A high-performance inference engine for AI models

  17. 85

    maum-ai/univnet

    Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)

  18. 84

    LSimon95/megatts2

    Unofficial implementation of Megatts2

  19. 83

    makerjackie/MTTS

    A Demo of Mandarin/Chinese TTS frontend

  20. 82

    haoheliu/voicefixer_main

    General Speech Restoration

  21. 81

    ManimCommunity/manim-voiceover

    Manim plugin for all things voiceover

  22. 80

    travisvn/obsidian-edge-tts

    Free, high quality text-to-speech for your Obsidian notes, leveraging Microsoft Edge's Read Aloud API.

  23. 79

    zlargon/google-tts

    Google TTS (Text-To-Speech) for node.js

  24. 78

    developersdigest/ai-devices

    AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more

  25. 77

    zarazhangrui/personalized-podcast

    Turn any content into a personalized AI podcast. NotebookLM-style, except you control the script, voices, and hosts. Listen in Apple Podcasts, Spotify, or any podcast app.

  26. 76

    debpalash/OmniVoice-Studio

    A cinematic audio dubbing, cloning, and voice generation studio

  27. 75

    akdeb/ElatoAI

    Realtime Voice AI with 100+ Models on Arduino ESP32 for AI Toys, Companions, and Devices

  28. 74

    izwi-ai/izwi

    Voice AI runtime. Local first transcription, speaker diarization, TTS, and voice cloning with an OpenAI compatible API.

  29. 73

    moonshine-ai/moonshine

    Very low latency speech to text, intent recognition, and text to speech, for building voice agents and interfaces

  30. 72

    Saganaki22/ComfyUI-OmniVoice-TTS

    OmniVoice TTS nodes for ComfyUI - Zero-shot multilingual text-to-speech with voice cloning, voice design, and multi-speaker dialogue

  31. 71

    OpenMOSS/MOSS-TTS-Nano

    MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed for realtime speech generation, can run directly on CPU without a GPU, and keeps the deployment stack simple enough for local demos, web serving, and lightweight product integration.

  32. 70

    Aratako/T5Gemma-TTS

    Multilingual TTS model with voice cloning and duration control, based on T5Gemma encoder-decoder LLM

  33. 69

    lmnt-com/wavegrad

    A fast, high-quality neural vocoder.

  34. 68

    mbzuai-oryx/LLMVoX

    LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

  35. 67

    Adri6336/gpt-voice-conversation-chatbot

    Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

  36. 66

    richardr1126/openreader

    An open-source read-along document reader server with high-quality TTS options, synchronized highlighting, and audiobook export for EPUB, PDF, DOCX, TXT, and MD.

  37. 65

    Aratako/Irodori-TTS

    A Flow Matching-based Text-to-Speech Model with Emoji-driven Style Control

  38. 64

    LlmKira/fast-langdetect

    ⚡️ 80x faster Fasttext language detection out of the box | Split text by language

  39. 63

    Sharrnah/whispering-ui

    Native UI for the Whispering Tiger project - https://github.com/Sharrnah/whispering (live transcription / translation)

  40. 62

    keonlee9420/Expressive-FastSpeech2

    PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

  41. 61

    Agents365-ai/video-podcast-maker

    AI-powered video podcast creation skill for coding agents. Supports Bilibili & YouTube, multi-language (zh-CN/en-US), 6 TTS engines (Edge/Azure/ElevenLabs/OpenAI/Doubao/CosyVoice), 4K Remotion rendering.

  42. 60

    funnyzak/tts-now

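    A cross-platform text-to-speech assistant built on cloud speech synthesis APIs (Alibaba Cloud, iFLYTEK, etc.). Supports quick single-text synthesis and batch synthesis. Runs on Windows, macOS, and Linux.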

  43. 59

    yandexdataschool/speech_course

    YSDA course in Speech Processing.

  44. 58

    FlorianEagox/WeeaBlind

    A program to dub non-English media with modern AI speech synthesis, diarization, and voice cloning!

  45. 57

    sipeter/CloneTTS

    A lightweight, offline Android Text-to-Speech (TTS) engine enabling seamless system-wide voice cloning and high-fidelity text reading.

  46. 56

    TrevorS/voxtral-mini-realtime-rs

    Voxtral ASR & TTS running natively and in the browser. A Rust implementation of Mistral's Voxtral mini realtime ASR / TTS using the Burn ML framework

  47. 55

    Poeschl/Hassio-Addons

    The repository for my Home Assistant Supervisor Add-ons.

  48. 54

    Migushthe2nd/MsEdgeTTS

    A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API. https://www.npmjs.com/package/msedge-tts

  49. 53

    keonlee9420/Comprehensive-Transformer-TTS

    A Non-Autoregressive Transformer-based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

  50. 52

    Rongjiehuang/GenerSpeech

    PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

HOSTED BY

Kana & Mari

Produced by Aquariumy Studio Inc.
