Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721

Author: Sam Charrington

March 4, 2025

Duration: 49:29

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Today, we're joined by Niklas Muennighoff, a PhD student at Stanford University, to discuss his paper, “s1: Simple Test-Time Scaling.” We explore the motivations behind s1, as well as how it compares to OpenAI's o1 and DeepSeek's R1 models. We dig into the different approaches to test-time scaling, including parallel and sequential scaling, as well as s1’s data curation process, its training recipe, and its use of model distillation from Google Gemini and DeepSeek R1. We explore the novel "budget forcing" technique developed in the paper, which lets the model think longer on harder problems and optimizes test-time compute for better performance. Additionally, we cover the evaluation benchmarks used, the comparison between supervised fine-tuning and reinforcement learning, and similar projects like the Hugging Face Open R1 project. Finally, we discuss the open-sourcing of s1 and its future directions. The complete show notes for this episode can be found at https://twimlai.com/go/721.

Hosted by industry analyst and commentator Sam Charrington, The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) serves as a vital conduit between cutting-edge research and its real-world implications. This isn't just a series of technical lectures; it's a series of conversations that unpack how AI and machine learning are actively reshaping industries and societal structures. Each episode connects you directly with leading researchers, engineers, and innovative thinkers who are defining the frontiers of the field. The discussions go beyond abstract theory to explore the practical challenges, ethical considerations, and business transformations driven by these technologies. Whether you're a data scientist deep in the code, a tech-savvy leader strategizing implementation, or simply fascinated by the future of intelligent systems, this podcast provides the context and depth needed to stay informed. By focusing on the people behind the algorithms and the ideas powering the platforms, Sam creates a resource that is both intellectually substantive and genuinely engaging, building a thoughtful community around one of the most significant technological shifts of our time.

Author: Sam Charrington

Language: English

Episodes: 100

Podcast Episodes

Building Real-World LLM Products with Fine-Tuning and More with Hamel Husain - #694

24.07.2024

Duration: 1:20:05

Today, we're joined by Hamel Husain, founder of Parlance Labs, to discuss the ins and outs of building real-world products using large language models (LLMs). We kick things off discussing novel applications of LLMs and…

Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

17.07.2024

Duration: 57:54

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in gene…

Decoding Animal Behavior to Train Robots with EgoPet with Amir Bar - #692

09.07.2024

Duration: 43:16

Today, we're joined by Amir Bar, a PhD candidate at Tel Aviv University and UC Berkeley, to discuss his research on visual-based learning, including his recent paper, “EgoPet: Egomotion and Interaction Data from an Animal…

How Microsoft Scales Testing and Safety for Generative AI with Sarah Bird - #691

01.07.2024

Duration: 57:12

Today, we're joined by Sarah Bird, chief product officer of responsible AI at Microsoft. We discuss the testing and evaluation techniques Microsoft applies to ensure safe deployment and use of generative AI, large langua…

Long Context Language Models and their Biological Applications with Eric Nguyen - #690

25.06.2024

Duration: 45:41

Today, we're joined by Eric Nguyen, PhD student at Stanford University. In our conversation, we explore his research on long context foundation models and their application to biology, particularly Hyena, and its evolutio…

Accelerating Sustainability with AI with Andres Ravinet - #689

18.06.2024

Duration: 47:46

Today, we're joined by Andres Ravinet, sustainability global black belt at Microsoft, to discuss the role of AI in sustainability. We explore real-world use cases where AI-driven solutions are leveraged to help tackle en…

Gen AI at the Edge: Qualcomm AI Research at CVPR 2024 with Fatih Porikli - #688

11.06.2024

Duration: 1:10:41

Today we’re joined by Fatih Porikli, senior director of technology at Qualcomm AI Research. In our conversation, we covered several of the Qualcomm team’s 16 accepted main track and workshop papers at this year’s CVPR co…

Energy Star Ratings for AI Models with Sasha Luccioni - #687

04.06.2024

Duration: 48:26

Today, we're joined by Sasha Luccioni, AI and Climate lead at Hugging Face, to discuss the environmental impact of AI models. We dig into her recent research into the relative energy consumption of general purpose pre-tr…

Language Understanding and LLMs with Christopher Manning - #686

27.05.2024

Duration: 56:10

Today, we're joined by Christopher Manning, the Thomas M. Siebel professor in Machine Learning at Stanford University and a recent recipient of the 2024 IEEE John von Neumann medal. In our conversation with Chris, we dis…

Chronos: Learning the Language of Time Series with Abdul Fatir Ansari - #685

20.05.2024

Duration: 43:05

Today we're joined by Abdul Fatir Ansari, a machine learning scientist at AWS AI Labs in Berlin, to discuss his paper, "Chronos: Learning the Language of Time Series." Fatir explains the challenges of leveraging pre-trai…