EPISODE · Feb 21, 2026 · 1 MIN
waybarrios/vllm-mlx
from Kana & Mari’s SoundRepos · host Kana & Mari
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.