Open-Weight AI Models

Author: softwareengineeringdaily.com April 28, 2026 Duration: 53:13
Open-weight models are AI systems whose trained parameters are publicly released, which allows developers to run, fine-tune, and deploy them independently rather than accessing them only through a hosted API. While closed-weight models from companies like OpenAI or Anthropic are delivered as managed services, open-weight models give organizations direct control over how the models are deployed and used. The performance of these models is steadily improving, and they have become credible alternatives for production workloads, with advantages in customization and data privacy.

Fireworks AI is building a platform focused on serving and customizing open-weight models at scale. The platform includes optimized inference infrastructure, multi-hardware support across NVIDIA and AMD, and reinforcement fine-tuning capabilities.

Benny Chen is a Co-Founder of Fireworks AI. In this episode, he joins Gregor Vand to discuss his path from Meta’s ML infrastructure teams to co-founding Fireworks AI, why open-weight models are becoming increasingly competitive, how custom kernels and speculative decoding improve performance, reinforcement fine-tuning, and much more.

Gregor Vand is a security-focused technologist who has previously been a CTO across cybersecurity, cyber insurance, and general software engineering companies. He is based in Singapore and can be found via his profile at vand.hk or on LinkedIn.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

For anyone curious about how the code running our world actually gets built, Software Engineering Daily offers a clear and consistent look behind the curtain. This isn't about hype cycles or surface-level news; it's a deep, technical conversation with the engineers, architects, and thinkers who are shaping our digital infrastructure. Each episode focuses on a specific technology, practice, or problem, breaking down complex systems into understandable parts. You'll hear detailed discussions on everything from database architectures and programming language design to the organizational challenges of scaling teams and the real-world trade-offs made in production systems. Hosted by softwareengineeringdaily.com, the podcast serves as a reliable source for developers who want to stay informed and inspired, translating the rapid pace of technological change into substantive, lasting knowledge. It’s for professionals who believe that understanding the "how" and "why" is just as important as knowing the "what." By dedicating time to thorough exploration, this podcast provides context that shorter formats simply cannot, making it an essential resource for anyone building the future, one line of code at a time. Tune in to hear unfiltered insights from the people on the front lines, discussing the tools and decisions that define modern software engineering.
Language: en-us Episodes: 100

Software Engineering Daily
Podcast Episodes
RxJS with Ben Lesh

Duration: 50:53
RxJS is an open-source library for composing asynchronous and event-based programs. It provides powerful operators for transforming, filtering, combining, and managing streams of data, from user input and web requests to…
Small AI Models with Yoeven Khemlani

Duration: 42:20
JigsawStack is a startup that develops a suite of custom small models for tasks such as scraping, forecasting, vOCR, and translation. The platform is designed to support collaborative knowledge work, especially in resear…
Streamlining Cloud Infrastructure Deployments with Jake Cooper

Duration: 43:25
Railway is a software company that provides a popular platform for deploying and managing applications in the cloud. It automates tasks such as infrastructure provisioning, scaling, and deployment and is particularly kno…
Building Open Infrastructure for AI with Illia Polosukhin

Duration: 50:12
Illia Polosukhin is a veteran AI researcher and one of the original authors of the landmark Transformer paper, Attention is All You Need, which he co-authored during his time at Google Research. He has a deep background…
TypeScript with Jake Bailey

Duration: 48:10
TypeScript is a statically typed superset of JavaScript that adds optional type annotations and modern language features to improve developer productivity and code safety. The TypeScript compiler performs type checking a…
MCP Security at Wiz with Rami McCarthy

Duration: 56:07
Wiz is a cloud security platform that helps organizations identify and remediate risks across their cloud environments. The company’s platform scans layers of the cloud stack, including virtual machines, containers, and…
AI at Anaconda with Greg Jennings

Duration: 49:47
Anaconda is a software company that's well-known for its solutions for managing packages, environments, and security in large-scale data workflows. The company has played a major role in making Python-based data science…
ByteDance’s Container Networking Stack with Chen Tang

Duration: 47:57
ByteDance is a global technology company operating a wide range of content platforms around the world, and is best known for creating TikTok. The company operates at a massive scale, which naturally presents challenges i…
WayForward Games with Tomm Hulett and Voldi Way

Duration: 46:02
WayForward is a renowned video game studio that was founded in 1990. The company has developed games for publishers such as Capcom, Konami, and Nintendo and has released their games across major hardware platforms from t…