Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727

Author: Sam Charrington

April 14, 2025

Duration: 1:34:06

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

In this episode, Emmanuel Ameisen, a research engineer at Anthropic, returns to discuss two recent papers: "Circuit Tracing: Revealing Language Model Computational Graphs" and "On the Biology of a Large Language Model." Emmanuel explains how his team developed mechanistic interpretability methods to understand the internal workings of Claude by replacing dense neural network components with sparse, interpretable alternatives. The conversation explores several fascinating discoveries about large language models, including how they plan ahead when writing poetry (selecting the rhyming word "rabbit" before crafting the sentence leading to it), perform mathematical calculations using unique algorithms, and process concepts across multiple languages using shared neural representations. Emmanuel details how the team can intervene in model behavior by manipulating specific neural pathways, revealing how concepts are distributed throughout the network's MLPs and attention mechanisms. The discussion highlights both capabilities and limitations of LLMs, showing how hallucinations occur through separate recognition and recall circuits, and demonstrates why chain-of-thought explanations aren't always faithful representations of the model's actual reasoning. This research ultimately supports Anthropic's safety strategy by providing a deeper understanding of how these AI systems actually work. The complete show notes for this episode can be found at https://twimlai.com/go/727.

Hosted by industry analyst and commentator Sam Charrington, The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) serves as a vital conduit between cutting-edge research and its real-world implications. This isn't just a series of technical lectures; it's a series of conversations that unpack how AI and machine learning are actively reshaping industries and societal structures. Each episode connects you directly with leading researchers, engineers, and innovative thinkers who are defining the frontiers of the field. The discussions go beyond abstract theory to explore the practical challenges, ethical considerations, and business transformations driven by these technologies. Whether you're a data scientist deep in the code, a tech-savvy leader strategizing implementation, or simply fascinated by the future of intelligent systems, this podcast provides the context and depth needed to stay informed. By focusing on the people behind the algorithms and the ideas powering the platforms, Sam creates a resource that is both intellectually substantive and genuinely engaging, building a thoughtful community around one of the most significant technological shifts of our time.

Author: Sam Charrington

Language: English

Episodes: 100

Podcast Episodes

Building Real-World LLM Products with Fine-Tuning and More with Hamel Husain - #694

24.07.2024

Duration: 1:20:05

Today, we're joined by Hamel Husain, founder of Parlance Labs, to discuss the ins and outs of building real-world products using large language models (LLMs). We kick things off discussing novel applications of LLMs and…

Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - #693

17.07.2024

Duration: 57:54

Today, we're joined by Albert Gu, assistant professor at Carnegie Mellon University, to discuss his research on post-transformer architectures for multi-modal foundation models, with a focus on state-space models in gene…

Decoding Animal Behavior to Train Robots with EgoPet with Amir Bar - #692

09.07.2024

Duration: 43:16

Today, we're joined by Amir Bar, a PhD candidate at Tel Aviv University and UC Berkeley, to discuss his research on visual-based learning, including his recent paper, “EgoPet: Egomotion and Interaction Data from an Animal…

How Microsoft Scales Testing and Safety for Generative AI with Sarah Bird - #691

01.07.2024

Duration: 57:12

Today, we're joined by Sarah Bird, chief product officer of responsible AI at Microsoft. We discuss the testing and evaluation techniques Microsoft applies to ensure safe deployment and use of generative AI, large langua…

Long Context Language Models and their Biological Applications with Eric Nguyen - #690

25.06.2024

Duration: 45:41

Today, we're joined by Eric Nguyen, PhD student at Stanford University. In our conversation, we explore his research on long context foundation models and their application to biology, particularly Hyena, and its evolutio…

Accelerating Sustainability with AI with Andres Ravinet - #689

18.06.2024

Duration: 47:46

Today, we're joined by Andres Ravinet, sustainability global black belt at Microsoft, to discuss the role of AI in sustainability. We explore real-world use cases where AI-driven solutions are leveraged to help tackle en…

Gen AI at the Edge: Qualcomm AI Research at CVPR 2024 with Fatih Porikli - #688

11.06.2024

Duration: 1:10:41

Today we’re joined by Fatih Porikli, senior director of technology at Qualcomm AI Research. In our conversation, we covered several of the Qualcomm team’s 16 accepted main track and workshop papers at this year’s CVPR co…

Energy Star Ratings for AI Models with Sasha Luccioni - #687

04.06.2024

Duration: 48:26

Today, we're joined by Sasha Luccioni, AI and Climate lead at Hugging Face, to discuss the environmental impact of AI models. We dig into her recent research into the relative energy consumption of general purpose pre-tr…

Language Understanding and LLMs with Christopher Manning - #686

27.05.2024

Duration: 56:10

Today, we're joined by Christopher Manning, the Thomas M. Siebel professor in Machine Learning at Stanford University and a recent recipient of the 2024 IEEE John von Neumann medal. In our conversation with Chris, we dis…

Chronos: Learning the Language of Time Series with Abdul Fatir Ansari - #685

20.05.2024

Duration: 43:05

Today we're joined by Abdul Fatir Ansari, a machine learning scientist at AWS AI Labs in Berlin, to discuss his paper, "Chronos: Learning the Language of Time Series." Fatir explains the challenges of leveraging pre-trai…