Fast Inference with Hassan El Mghari

Author: Software Huddle
April 8, 2025
Duration: 53:06
Today we have Hassan back on the show. Hassan was one of our first guests for Huddle when he was working at Vercel, but since then, he's joined Together AI, one of the hottest companies in the world. They just raised a massive Series B round. Hassan joins us to talk about Together AI, inference optimization, and building AI applications. We touch on a bunch of topics like customer uses of AI, best practices for building apps, and what's next for Together AI.

Timestamps
01:42 Opportunity at Together AI
04:26 Together raised a big round
06:06 Vision Behind Together AI
08:32 Problems in running Open Source Models
11:40 Speed For Inference
14:24 Fine Tuning
19:23 One or Two Models or a Combination of them
21:32 Serverless
22:21 Cold Start issues?
27:46 How much data do you need?
30:00 Balancing Reliability and Cost
34:07 How customers are using Together
42:36 Agent Recipes
47:03 Typical Mistakes building AI apps

Every week on Software Huddle, Alex DeBrie and Sean Falconer sit down with a different expert from across the tech landscape. The conversations are less about quick tips and more about substantive discussions, digging into the real challenges and decisions behind building software, launching products, and navigating the industry's constant shifts. You'll hear from practitioners who have been in the trenches, offering perspectives that blend deep technical knowledge with hard-won business and entrepreneurial experience. Alex brings his specialized expertise as the author of The DynamoDB Book and an AWS Data Hero, while Sean contributes a unique viewpoint shaped by over two decades as an engineer, founder, and marketing executive, recognized as a Snowflake Data Superhero. Together, they create a space where complex topics in software development and technology trends become accessible and genuinely engaging. This podcast is for anyone who wants to move beyond surface-level news and understand the "why" behind the tools and strategies shaping our digital world. Tune in for a thoughtful huddle that feels more like a candid conversation between colleagues than a formal interview.
Language: en-us Episodes: 79

Software Huddle
Podcast Episodes
Developer Education and Training with Craig Dennis from Twilio

Duration: 52:13
In this episode, we spoke with Craig Dennis from Twilio about developer education and training. Craig's been working in the developer education space for a long time and has a ton of experience. And, of course, Twilio is…
Generative AI and LLMs with Dash Desai from Snowflake

Duration: 34:31
If you've been involved with the Snowflake world, today's guest probably can skip an introduction as he is the demo king from the Snowflake Summit and well-known within the Snowflake builder community. We're talking abou…
V0 by Vercel, Bun, RCS on Apple Devices, Retool Breach, & more

Duration: 49:55
Our special episode is here, and it's all about the latest news. Join Sean and Alex for an in-depth discussion. Follow Alex: https://twitter.com/alexbdebrie Follow Sean: https://twitter.com/seanfalconer Software Huddle ⤵…
Building Generative AI Apps with Hassan El Mghari from Vercel

Duration: 45:41
Over the past couple of months, Generative AI has taken the world by storm. OpenAI’s launch of ChatGPT was a turning point. With each iteration, the Transformer has improved its capabilities while the underlying compute…
Scaling MySQL with Sam Lambert from PlanetScale

Duration: 1:21:18
What would it look like if databases were built for developers rather than operators? Sam Lambert is the CEO of PlanetScale, a company that provides a managed MySQL database solution. PlanetScale uses Vitess, a database…