Fast Inference with Hassan El Mghari

Author: Software Huddle April 8, 2025 Duration: 53:06
Today we have Hassan back on the show. Hassan was one of our first guests for Huddle when he was working at Vercel, but since then, he's joined Together AI, one of the hottest companies in the world. They just raised a massive Series B round. Hassan joins us to talk about Together AI, inference optimization, and building AI applications. We touch on a bunch of topics like customer uses of AI, best practices for building apps, and what's next for Together AI.

Timestamps
01:42 Opportunity at Together AI
04:26 Together raised a big round
06:06 Vision Behind Together AI
08:32 Problems in running Open Source Models
11:40 Speed For Inference
14:24 Fine Tuning
19:23 One or Two Models or a Combination of them
21:32 Serverless
22:21 Cold Start issues?
27:46 How much data do you need?
30:00 Balancing Reliability and Cost
34:07 How customers are using Together
42:36 Agent Recipes
47:03 Typical Mistakes building AI apps

Every week on Software Huddle, Alex DeBrie and Sean Falconer sit down with a different expert from across the tech landscape. The conversations are less about quick tips and more about substantive discussions, digging into the real challenges and decisions behind building software, launching products, and navigating the industry's constant shifts. You'll hear from practitioners who have been in the trenches, offering perspectives that blend deep technical knowledge with hard-won business and entrepreneurial experience. Alex brings his specialized expertise as the author of The DynamoDB Book and an AWS Data Hero, while Sean contributes a unique viewpoint shaped by over two decades as an engineer, founder, and marketing executive, recognized as a Snowflake Data Superhero. Together, they create a space where complex topics in software development and technology trends become accessible and genuinely engaging. This podcast is for anyone who wants to move beyond surface-level news and understand the "why" behind the tools and strategies shaping our digital world. Tune in for a thoughtful huddle that feels more like a candid conversation between colleagues than a formal interview.
Language: en-us Episodes: 79

Software Huddle
Podcast Episodes
Navigating Large Language Models with Vino Duraisamy from Snowflake

Duration: 59:42
In this episode, we spoke with Vino Duraisamy, Developer Advocate at Snowflake. Vino has been working as a data and AI engineer for her entire career across companies like Apple, Treeverse, and now Snowflake. And in this…
AGI is Surely Coming with Former Snowflake CEO Bob Muglia

Duration: 59:14
Today we have the former CEO of Snowflake and a 23-year veteran of Microsoft, Bob Muglia, on the show. In this interview, we discuss Bob's book, Datapreneurs, which takes you on a journey about the people behind the first re…
reInvent BTS, Sam Altman, SEC on Solarwinds, Apple RCS, and more

Duration: 51:19
Our special episode is back, and we have a special guest this time. Join Sean, Alex & Merritt in this fun conversation. Timestamps: 00:00 Introduction 01:19 What is a CISO 08:10 Balance of Power 13:50 reInvent BTS 19:45…
AI-driven Database Cache with Ben Hagan from PolyScale

Duration: 57:10
PolyScale is a database cache, specifically designed to cache just your database. It is completely plug and play, and it allows you to scale a database without a huge amount of effort, cost, and complexity. PolyScale curr…
Building for Scale with Mario Žagar from Infobip

Duration: 50:00
In this episode, we spoke with Mario Žagar, a Distinguished Engineer at Infobip. Infobip is a tech unicorn based out of Croatia that is a global leader in omnichannel communication, bootstrapping its way to a staggering…
Distributed Financial Databases with Joran Dirk Greef of TigerBeetle

Duration: 1:03:39
In this episode we spoke with Joran Dirk Greef, who's the co-founder at TigerBeetle. TigerBeetle is a Financial Transactions Database that's focused on correctness and safety while hitting orders of magnitude more perfor…
First Year as a Startup Founder and CEO with Nucleus's Evis Drenova

Duration: 44:42
In this episode, we spoke with Evis Drenova, CEO and co-founder of Nucleus, a Y Combinator graduate from 2022 focused on making it easy to deploy, build, and manage on Kubernetes. Evis left Skyflow, where he was one of t…
Architecting Real-time Analytics with Dhruba Borthakur of Rockset

Duration: 1:09:30
In this episode, we spoke with Dhruba Borthakur, the CTO and co-founder of Rockset. Rockset is a search and analytics database hosted on the cloud. Dhruba was the founding engineer of the RocksDB project at Fac…