Metrics Driven Development

Author: Practical AI LLC
Date: August 29, 2024
Duration: 42:12

How do you systematically measure, optimize, and improve the performance of LLM applications (like those powered by RAG or tool use)? Ragas is an open source effort that has been trying to answer this question comprehensively, and they are promoting a “Metrics Driven Development” approach. Shahul from Ragas joins us to discuss Ragas in this episode, and we dig into specific metrics, the difference between benchmarking models and evaluating LLM apps, generating synthetic test data and more.


Sponsors:

  • AssemblyAI – Turn voice data into summaries with AssemblyAI’s leading Speech AI models. Built by AI experts, their Speech AI models include accurate speech-to-text for voice data (such as calls, virtual meetings, and podcasts), speaker detection, sentiment analysis, chapter detection, PII redaction, and more.

There's a lot of noise out there about artificial intelligence, but cutting through the hype to find what's genuinely useful can be a challenge. That's the space where Practical AI operates. Hosted by the team at Practical AI LLC, this technology podcast moves beyond abstract theory to explore how AI, machine learning, and large language models are actually being applied right now. Each episode features unscripted conversations with a diverse mix of experts, developers, business leaders, and curious minds. You'll hear tangible discussions about implementing machine learning systems, the realities of MLOps, the evolution of neural networks, and the practical implications of breakthroughs in deep learning and GANs. The dialogue is grounded in real-world scenarios, focusing on how these technologies solve problems, drive productivity, and create value in accessible ways. Whether you're a professional building models, a business person integrating AI tools, or an enthusiast eager to understand the landscape, this podcast offers a clear, conversational entry point. It’s about making sense of a complex field through the lens of practical application, demystifying the concepts that are shaping our world without losing sight of how they work on the ground.
Language: en-us
Episodes: 100

Practical AI
Podcast Episodes
How is AI shaping democracy?

Duration: 48:23
As AI increasingly shapes geopolitics, elections, and civic life, its impact on democracy is becoming impossible to ignore. In this episode, Daniel and Chris are joined by security expert Bruce Schneier to explore how AI…
Controlling AI Models from the Inside

Duration: 43:55
As generative AI moves into production, traditional guardrails and input/output filters can prove too slow, too expensive, and/or too limited. In this episode, Alizishaan Khatri of Wrynx joins Daniel and Chris to explore…
2025 was the year of agents, what's coming in 2026?

Duration: 51:15
In this start-of-year FC episode, Chris and Daniel break down what really mattered in AI in 2025, and what to expect in 2026. They explore the rise of AI agents, the practical reality of multimodal AI, and how reasoning…
Beyond chatbots: Agents that tackle your SOPs

Duration: 45:53
As AI reshapes the workplace, employees and leaders face questions about meaningful work, automation, and human impact. In this episode, Jason Beutler, CEO of RoboSource, shares how companies can rethink workflows, integ…
The AI engineer skills gap

Duration: 45:33
Chris and Daniel talk with returning guest Ramin Mohammadi about how those seeking AI Engineer/Data Science jobs are expected to come in as mid-level engineers (not entry level). They explore this growing g…
Technical advances in document understanding

Duration: 49:18
Chris and Daniel unpack how AI-driven document processing has rapidly evolved well beyond traditional OCR with many technical advances that fly under the radar. They explore the progression from document structure models…
Chris on AI, autonomous swarming, home automation and Rust!

Duration: 1:37:09
This episode is a special crossover between the Practical AI podcast and The Changelog podcast. Chris was recently invited by longtime friends Jerod Santo and Adam Stacoviak, cohosts of The Changelog, to join them on the…
Beyond note-taking with Fireflies

Duration: 48:59
Fireflies CEO, Krish Ramineni shares how the company is transforming AI-powered note-taking into a deeper layer of knowledge automation. He breaks down the technology behind real-time functionality like Live Assist, the…
Autonomous Vehicle Research at Waymo

Duration: 52:08
Waymo’s VP of Research, Drago Anguelov, joins Practical AI to explore how advances in autonomy, vision models, and large-scale testing are shaping the future of driverless technology. The conversation dives into the dual…
Are we in an AI bubble?

Duration: 49:41
Dan and Chris unpack whether today’s surge in AI deployment across enterprise workflows, manufacturing, healthcare, and scientific research signals a lasting transformation or an overhyped bubble. Drawing parallels to th…