E181: Why Multimodal Is the Future of AI Data Workloads

Author: Robby (MTF); Tim (Essence VC)
September 10, 2025
Duration: 36:31

Chang She is Co-Founder & CEO of LanceDB, the multimodal lakehouse platform. Their open source data format, lance, has over 5K stars on GitHub; it is a modern columnar data format for ML and LLMs, implemented in Rust.

LanceDB has raised $41M from investors including Theory Ventures, CRV, and Essence VC.

In this episode, we dig into:

  • Early focus: autonomous vehicles; solved real-time analysis limits with Lance format → 9,000% performance gain.

  • Multi-modal AI taking off (vision, audio, text); Midjourney & Runway as pioneers; audio now a major category.

  • How they built trust through open source.

  • Integrated workflows (data prep + search + embedding) going beyond vector DBs; education needed to show full value.

  • Cloud/serverless launch in 2023–24 enabled seamless local-to-production use.

  • Future bets: audio infra, robotics, spatial reasoning; vector DBs risk irrelevance if they don’t evolve.


Building a company around open source software is a unique and often misunderstood path, full of specific challenges and rare opportunities. The Open Source Startup Podcast digs into that journey directly with the people who have navigated it, moving beyond theory to the practical realities shared in conversation. Hosts Robby and Tim bring their distinct perspectives from MTF and Essence VC to these discussions, creating a space where founders speak candidly. You’ll hear from the architects behind names like HashiCorp, MongoDB, and Vercel, as well as leaders from Chronosphere, DBT, and mobile.dev, as they unpack their experiences. This podcast focuses on the pivotal decisions around community building, monetization strategies, and maintaining project ethos under the pressures of scaling a business. Each episode serves as a detailed case study, revealing how these companies turned publicly available code into sustainable, impactful enterprises. The dialogue naturally explores the tensions between open collaboration and commercial needs, offering a real-world blueprint that is both instructive and nuanced. For anyone curious about the intersection of community-driven development and venture-scale growth, this series provides an essential and unfiltered resource.
Language: English
Episodes: 100

Open Source Startup Podcast
Podcast Episodes
E142: Redefining Self-Serve Analytics with Dremio

Duration: 41:26
Tomer Shiran is Founder of Dremio, the data lakehouse platform for self-service analytics and AI based on open source frameworks Apache Arrow, which the Dremio team created, and Apache Iceberg. Dremio has raised over $40…
E139: Taking on AWS with an Open Source Alternative

Duration: 38:05
Umur Cubukcu is Co-Founder of Ubicloud, the open source and portable cloud that can reduce cloud spend by 3–10x. Their project, also called ubicloud, has over 3K stars and provides elastic compute, block storage, virtual…
E138: The Database Pioneer Behind Ingres, Postgres & DBOS

Duration: 38:28
Michael Stonebraker is a legendary database system pioneer as the founder of Ingres, Postgres, and now DBOS. His work while at Berkeley and then MIT has been central to many relational database companies. His new company…
E137: Monitoring Infrastructure with Chalk Marks

Duration: 40:13
John Viega is Co-Founder & CEO of Crash Override, the open source monitoring platform based on the Chalk project which has 22K stars on GitHub. Crash Override has raised $14M from investors including SYN Ventures, BVP &…
E136: Creating the Vector Database for AI Application Developers

Duration: 39:35
Jeff Huber is Co-Founder of Chroma, the open source vector database. Their open source project, also called chroma, has 13K stars on GitHub. Chroma has raised $20M from investors including Quiet Ventures and Bloomberg Be…
E135: Riding the Homebrew Wave

Duration: 42:31
John Britton & Mike McQuaid are Co-Founders of Workbrew, the company that provides additional features and support for companies using Homebrew. Homebrew's main project, brew, is a wildly popular open source project with…
E134: Making Complex Data RAG-Ready with Unstructured

Duration: 37:06
Brian Raymond is Founder & CEO of Unstructured, the platform to extract and transform complex data for use with every major vector database and LLM framework. Their open source project has 7K stars on GitHub and includes…
E133: Reinventing Authorization with Google's Zanzibar Paper

Duration: 39:25
Jake Moshenko is Co-Founder & CEO of AuthZed, the scalable authorization platform based on Google's Zanzibar white paper. Their open source permissions database, SpiceDB, has 5K stars on GitHub and enables fine-grained acc…