PODCAST · technology

Kana & Mari’s SoundRepos

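Kana and Mari introduce, by voice, notable repositories they found on GitHub that relate to "sound," such as TTS, MIDI, and audio. They offer a light-footed tour of the open-source world where sound and code intersect. Profiles: Kana (Newbie Esports Caster) and Mari (Newbie Esports Analyst). Note: the scripts for this show are generated automatically with generative AI. They may contain errors, so please enjoy them as reference information.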

  1. 101

    gpustack/vox-box

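    Vox Box is an OpenAI API-compatible speech recognition (speech-to-text) and speech synthesis (text-to-speech) server. It lets you switch between backend models such as Whisper, FunASR, Bark, Dia, and CosyVoice, and it exposes APIs including /v1/audio/speech, /v1/audio/transcriptions, /v1/models, /v1/voices, and /health.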

  2. 100

    zhenye234/CoMoSpeech

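    CoMoSpeech is a speech synthesis repository, based on diffusion models and consistency models, for generating speech and singing voice from text. It aims for fast inference through one-step generation and includes code for training a student model by distilling a teacher model, for inference, and for training on LJSpeech. It uses a HiFi-GAN vocoder to generate waveforms from mel spectrograms.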

  3. 99

    jscrane/TTS

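    A Text-to-Speech (TTS) library for Arduino. It keeps English vocabulary, phoneme conversion rules, and voice data in PROGMEM, and performs speech synthesis on Arduino-family boards using PWM or DAC output.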

  4. 98

    hegedustibor/htgo-tts

    A Text-to-Speech (TTS) library for Go. It uses the Google Translate speech generation API to convert text to MP3, and it handles saving to a file and playback. Playback is supported both through mplayer and natively through go-mp3 + oto/v2.

  5. 97

    mtkresearch/BreezeApp

    BreezeAPP is a purely on-phone AI application developed for the Android and iOS platforms. Download it from the App Store to use multiple AI features without a network connection. The source code is provided by MediaTek Research. We aim to promote two ideas: one is free to choose one's own LLM to run on a phone, and any dev can create purely phone-based AI apps easily.

  6. 96

    netease-youdao/Confucius4-TTS

    Confucius4-TTS: a Multilingual and Cross-Lingual Zero-Shot TTS Engine

  7. 95

    Lyrcaxis/KokoroSharp

    Fast local TTS inference engine in C# with ONNX runtime. Multi-speaker, multi-platform and multilingual. Integrate on your .NET projects using a plug-and-play NuGet package, complete with all voices.

  8. 94

    OEvortex/llm4free

    LLM4Free — All-in-one Python toolkit for web search, AI interaction (40+ free providers), digital utilities, and more. Formerly WebScout.

  9. 93

    p0p4k/pflowtts_pytorch

    Unofficial implementation of NVIDIA P-Flow TTS paper

  10. 92

    OpenMOSS/MOSS-Audio-Tokenizer

    MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, it supports streaming and variable bitrates, delivering SOTA reconstruction and strong performance in generation and understanding—serving as a unified interface for next-generation native audio language models.

  11. 91

    worldwonderer/video-recap-skills

    Turn any video into a narration recap with a Claude Code skill. It edits any video into a Chinese-narrated recap video and supports export to Jianying (剪映).

  12. 90

    thuhcsi/Crystal

    Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.

  13. 89

    dunky11/voicesmith

    [WIP] VoiceSmith makes training text to speech models easy.

  14. 88

    ekwek1/soprano-factory

    Soprano-Factory: Train your own 2000x realtime text-to-speech model

  15. 87

    small-cactus/M.I.L.E.S

    M.I.L.E.S, a GPT-4-Turbo voice assistant, self-adapts its prompts and AI model, can play any Spotify song, adjusts system and Spotify volume, performs calculations, browses the web and internet, searches global weather, delivers date and time, autonomously chooses and retains long-term memories. Available for macOS and Windows.

  16. 86

    Yazdi9/Talking_Face_Avatar

    Avatar Generation For Characters and Game Assets Using Deep Fakes

  17. 85

    XilinJia/Podcini

    Open source podcast instrument for Android supporting contents from YouTube and YT Music as well as normal podcasts.

  18. 84

    LonePheasantWarrior/TalkifyTTS

    An Android text-to-speech (TTS) engine powered by cloud-based large language models. Supports models such as Doubao, Tencent, Microsoft, and Qwen.

  19. 83

    rishikksh20/FastSpeech2

    PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech

  20. 82

    herimor/voxtream

    VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control

  21. 81

    Hagsten/Talkify

    Javascript Text to speech library

  22. 80

    robinhad/ukrainian-tts

    Ukrainian TTS (text-to-speech) using ESPNET

  23. 79

    foyoux/pygtrans

    Google Translate client with APIKEY support; translates 100,000 entries in one go.

  24. 78

    CMsmartvoice/One-Shot-Voice-Cloning

    One Shot Voice Cloning based on Unet-TTS

  25. 77

    keonlee9420/DiffSinger

    PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)

  26. 76

    AIFSH/ComfyUI-GPT_SoVITS

    A ComfyUI custom node for GPT-SoVITS. You can now do voice cloning and TTS in ComfyUI.

  27. 75

    yl4579/HiFTNet

    HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

  28. 74

    asiff00/On-Device-Speech-to-Speech-Conversational-AI

    This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming architecture for fluid conversations with immediate responses and natural interruption handling.

  29. 73

    BinWang28/audio-ai-hub

    The hub for audio AI research: papers, open models, benchmarks & datasets across audio LLMs, speech recognition, TTS, music & audio generation.

  30. 72

    Xerophayze/TTS-Story

    TTS-Story is a web-based multi‑voice TTS studio for turning tagged scripts into audiobooks—featuring full speaker management, chunk review/regeneration, a job queue and library system, and local GPU or API backends including Kokoro, Chatterbox, VOX CPM, Pocket-TTS, Kitten-TTS, IndexTTS-2, QWEN3 TTS and Omnivoice engines

  31. 71

    AutoArk/GPA

    [AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion with one tiny model!

  32. 70

    elevenlabs/skills

    Collections of skills for building with ElevenLabs

  33. 69

    KevinMIN95/StyleSpeech

    Official implementation of Meta-StyleSpeech and StyleSpeech

  34. 68

    ddxfish/sapphire

    She's the AI agent you come home to.

  35. 67

    shell-nlp/gpt_server

    gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, TTS, text-to-image, image editing, and text-to-video.

  36. 66

    mlalma/kokoro-ios

    Kokoro TTS for iOS and macOSX

  37. 65

    keonlee9420/DailyTalk

    Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023

  38. 64

    DBJD-CR/astrbot_plugin_proactive_chat

    An AstrBot plugin that enables the Bot to send proactive messages in private and group chats, featuring context awareness, persistent data, dynamic emotions, do-not-disturb periods, and TTS integration. It also has an independent WebUI for personalized configuration.

  39. 63

    devnen/Kitten-TTS-Server

    Self-host the ultra-lightweight Kitten TTS model with this enhanced API server with an intuitive Web UI, large text processing for audiobooks, and GPU acceleration.

  40. 62

    leaonline/easy-speech

    Cross browser Speech Synthesis also known as Text to speech or TTS; no dependencies; uses Web Speech API

  41. 61

    rendchevi/nix-tts

    Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation

  42. 60

    HITsz-TMG/VideoClaw

    A fully automated AI video-generation employee | Your First AIGC Coworker. Chat an Idea. Get a Film.

  43. 59

    ChaituRajSagar/gemini-youtube-automation

    A fully autonomous AI Agent/Python pipeline that utilizes Large Language Models (LLMs) like Gemini to generate content, produce videos, and automatically upload educational videos to YouTube.

  44. 58

    wildminder/awesome-ai-voice

    List of open-source TTS, voice cloning, and music generation models

  45. 57

    mahimairaja/voiceai

    Set of with to help those building Voice AI agents

  46. 56

    PowerBeef/QwenVoice

    Vocello is a local-first voice generation app for Apple Silicon Macs. Public beta for macOS 26; QwenVoice v1.2.3 remains the stable macOS 15 fallback.

  47. 55

    r9y9/ttslearn

    ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

  48. 54

    livekit-examples/kitt

    Talk to ChatGPT in real time using LiveKit

  49. 53

    yl4579/PL-BERT

    Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions

  50. 52

    ElmTran/praises

    Praises is a text-to-speech tool that can help you read text easily.
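The Vox Box entry (episode 101) lists an OpenAI-compatible /v1/audio/speech endpoint. As a rough sketch of what a client call to such an endpoint might look like, the snippet below only builds the HTTP request and does not send it. The base URL, port, model name, and voice name are hypothetical placeholders, and the payload fields (`model`, `input`, `voice`) are assumed from the OpenAI audio speech API rather than taken from the Vox Box documentation.

```python
import json
import urllib.request


def build_speech_request(base_url, text, model, voice):
    """Build (but do not send) a POST request for an OpenAI-style
    /v1/audio/speech endpoint.

    The payload shape is an assumption based on the OpenAI API;
    a given server may require additional or different fields.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # All four arguments are hypothetical placeholders.
    req = build_speech_request(
        "http://localhost:8080",
        "Hello from SoundRepos",
        "example-tts-model",
        "example-voice",
    )
    print(req.get_method(), req.full_url)
```

With a server actually running, passing the request to `urllib.request.urlopen` and reading the response body would be the next step; the response is expected to be audio bytes, which can be written to a file.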

HOSTED BY

Kana & Mari

Produced by Aquariumy Studio Inc.

Frequently Asked Questions

How many episodes does Kana & Mari’s SoundRepos have?

Kana & Mari’s SoundRepos currently has 50 episodes available on PodParley. New episodes are automatically indexed when they're published to the podcast feed.

What is Kana & Mari’s SoundRepos about?

Kana and Mari introduce, by voice, notable GitHub repositories related to "sound," such as TTS, MIDI, and audio, and guide listeners through the open-source world where sound and code intersect. The show's scripts are generated automatically with generative AI and may contain errors, so treat them as reference information.

How often does Kana & Mari’s SoundRepos release new episodes?

Kana & Mari’s SoundRepos has 50 episodes. Check the episode list to see recent publication dates and frequency.

Where can I listen to Kana & Mari’s SoundRepos?

You can listen to Kana & Mari’s SoundRepos on PodParley by clicking any episode. We provide an embedded audio player for direct listening, and you can also subscribe via your preferred podcast app using the RSS feed.

Who hosts Kana & Mari’s SoundRepos?

Kana & Mari’s SoundRepos is created and hosted by Kana & Mari.