EPISODE · Jul 1, 2025 · 37 MIN
Building Enterprise RAG: Lessons from 2+ Years of Production Deployments
from YAAP (Yet Another AI Podcast) · host AI21
Building production AI systems is hard, especially when you're pioneering entirely new categories. In this episode, Yuval speaks with Guy Becker, Group Product Manager at AI21, to trace the evolution from task-specific models to agent planning and orchestration systems. Guy shares hard-won lessons from building some of the first RAG-as-a-service offerings, at a time when there were no handbooks to follow.

Key Topics:
Task-specific models vs. general LLMs: Why focused, smaller models with pre- and post-processing beat general-purpose LLMs for business use cases.
Building RAG before it was cool: Creating one of the first RAG-as-a-service platforms in early 2023 without any established patterns.
The one-size-fits-all problem: Why chunking strategies, embedding models, and retrieval parameters need customization per use case.
From SaaS to on-prem: Scaling deployment models for enterprise customers with sensitive data.
When RAG breaks down: Multi-hop queries, metadata filtering, and why semantic search isn't always enough.
Multi-agent orchestration: How AI21 Maestro uses automated planning to break complex queries into parallelizable subtasks.
Production lessons: Evaluation strategies, quality guarantees, and building explainable AI systems for enterprise.