DeepSeek: What Happened, What Matters, and Why It’s Interesting

Author: Helen and Dave Edwards January 28, 2025 Duration: 25:58

Stay Human, from the Artificiality Institute

Technology

First:

- Apologies for the audio! We had a production error…

What’s new:

- DeepSeek has made breakthroughs in two areas: how AI systems are trained (making training much more affordable) and how they run in real-world use (making them faster and more efficient).

Details

- FP8 Training: Working With Less Precise Numbers

- Traditional AI training uses high-precision numbers, typically 16- or 32-bit floating point

- DeepSeek found you can use less precise 8-bit numbers (like rounding $10.857643 to $10.86)

- This cuts memory and computation needs significantly, with minimal impact on model quality

- Like teaching someone math using rounded numbers instead of carrying every decimal place
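
To make the rounding idea concrete, here is a minimal sketch that rounds a number to a fixed count of mantissa bits. It is only an illustration: the helper name `round_to_mantissa_bits` is ours, and it ignores exponent range, overflow, and the scaling tricks a real FP8 training recipe would need. One common FP8 layout (E4M3) keeps 3 mantissa bits, while FP16 keeps 10.

```python
import math

def round_to_mantissa_bits(x: float, bits: int) -> float:
    """Round x to the nearest value with `bits` explicit mantissa bits.

    Illustrative helper (not from the episode): exponent limits and
    subnormals are ignored, so this only shows the loss of precision.
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (bits + 1)     # +1 accounts for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)

value = 10.857643
as_fp16_like = round_to_mantissa_bits(value, 10)  # 10.859375
as_fp8_like = round_to_mantissa_bits(value, 3)    # 11.0
print(as_fp16_like, as_fp8_like)
```

With only 3 mantissa bits the value lands on 11.0, which is a coarser rounding than the dollars-and-cents analogy suggests. That is why low-precision training has to be engineered carefully rather than simply switched on.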

- Learning from Other AIs (Distillation)

- Traditional approach: AI learns everything from scratch by studying massive amounts of data

- DeepSeek's approach: Use existing AI models as teachers

- Like having experienced programmers mentor new developers
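
One classic way to have a "teacher" model guide a "student" is to push the student's output probabilities toward the teacher's. The sketch below shows that soft-label loss in plain Python. It is a generic textbook form, not necessarily the method DeepSeek used; another common variant simply fine-tunes the student on text the teacher wrote. The function names and the temperature value are our assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature = softer."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)                       # subtract max for stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical scores over three candidate next words.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]
poor_student = [0.5, 4.0, 1.0]
print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, poor_student))
```

The student that ranks the candidates the way the teacher does gets the smaller loss, so training on this signal pulls the student toward the teacher's behavior.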

- Trial & Error Learning (for their R1 model)

- Started with some basic "tutoring" from advanced models

- Then let it practice solving problems on its own

- When it found good solutions, these were fed back into training

- Led to "Aha moments" where R1 discovered better ways to solve problems

- Finally, polished its ability to explain its thinking clearly to humans
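
The "practice, then keep the good solutions" step can be sketched as a simple loop. This is a toy under stated assumptions: the "model" is a hypothetical noisy adder, the checker is exact, and the sketch only shows how verified answers are collected. It does not show the reinforcement learning update itself.

```python
import random

def practice_round(problems, propose, is_correct, attempts=8, seed=0):
    """Collect verified solutions that could be fed back into training."""
    rng = random.Random(seed)
    kept = []
    for problem in problems:
        for _ in range(attempts):
            answer = propose(problem, rng)
            if is_correct(problem, answer):  # only verified answers are kept
                kept.append((problem, answer))
                break                        # move on after the first success
    return kept

# Toy stand-ins (hypothetical): each problem is a pair of numbers to add,
# and the "model" guesses the sum with a small random error.
def noisy_adder(problem, rng):
    a, b = problem
    return a + b + rng.choice([-1, 0, 1])

def check_sum(problem, answer):
    a, b = problem
    return answer == a + b

new_training_examples = practice_round(
    [(2, 3), (10, 7), (4, 4)], noisy_adder, check_sum
)
print(new_training_examples)
```

The key ingredient is a reliable checker: math and coding problems work well here because an answer can be verified automatically.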

- Smart Team Management (Mixture of Experts)

- Instead of one massive system that does everything, built a team of specialists

- Like running a software company with:

- 256 specialists who focus on different areas

- 1 generalist who helps with everything

- Smart project manager who assigns work efficiently

- For each task, only need 8 specialists plus the generalist

- More efficient than having everyone work on everything
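
Here is a rough sketch of the "project manager" at work, using the numbers above: 256 specialists, 8 chosen per task, plus 1 generalist who always helps. Everything else is an assumption made for illustration: the specialists are trivial functions, the scores are made up, and the weighting scheme is one simple choice among several.

```python
NUM_SPECIALISTS = 256   # routed experts, per the notes above
TOP_K = 8               # specialists consulted for each task

def pick_specialists(scores, k=TOP_K):
    """Return the indices of the k highest-scoring specialists."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

def team_output(x, scores, specialists, generalist, k=TOP_K):
    """Generalist output plus a score-weighted mix of the chosen specialists."""
    chosen = pick_specialists(scores, k)
    total = sum(scores[i] for i in chosen)   # assumes positive scores
    out = generalist(x)
    for i in chosen:
        out += (scores[i] / total) * specialists[i](x)
    return out

# Hypothetical stand-ins: specialist i just multiplies its input by (i + 1).
specialists = [lambda x, i=i: x * (i + 1) for i in range(NUM_SPECIALISTS)]
generalist = lambda x: x
scores = [float(i + 1) for i in range(NUM_SPECIALISTS)]

chosen = pick_specialists(scores)
active_fraction = (TOP_K + 1) / (NUM_SPECIALISTS + 1)
print(chosen, round(active_fraction, 3))
```

Only 9 of the 257 team members do any work on a given task, which is roughly 3.5% of the team. That is where the efficiency gain comes from.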

- Efficient Memory Management (Multi-head Latent Attention)

- Traditional AI is like keeping complete transcripts of every conversation

- DeepSeek's approach is like taking smart meeting minutes

- Captures key information in compressed format

- Similar to how JPEG compresses images
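
The "meeting minutes" idea can be sketched as storing one small compressed vector per word instead of the full key and value vectors, then expanding it when needed. This is a simplified single-head illustration with random weights and hypothetical sizes. A real implementation learns these weights and handles details such as multiple attention heads and position information, which are left out here.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

HIDDEN, LATENT, HEAD_DIM = 64, 8, 64   # hypothetical sizes

rng = random.Random(0)
W_down = random_matrix(LATENT, HIDDEN, rng)    # compress: the "minutes"
W_up_k = random_matrix(HEAD_DIM, LATENT, rng)  # expand minutes into a key
W_up_v = random_matrix(HEAD_DIM, LATENT, rng)  # expand minutes into a value

cache = []   # one small latent vector per token, not full keys and values

def remember(hidden_state):
    cache.append(matvec(W_down, hidden_state))

def recall(position):
    latent = cache[position]
    return matvec(W_up_k, latent), matvec(W_up_v, latent)

remember([rng.gauss(0, 1) for _ in range(HIDDEN)])
key, value = recall(0)
floats_stored = len(cache[0])           # 8 numbers kept per token
floats_without_compression = len(key) + len(value)   # 128 numbers
print(floats_stored, floats_without_compression)
```

In this toy setup the cache holds 8 numbers per token instead of 128. As with JPEG, the compression is lossy, so the trick is learning a compression that keeps what matters.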

- Looking Ahead (Multi-Token Prediction)

- Traditional AI predicts one word (token) at a time

- DeepSeek looks ahead and predicts two words (tokens) at once

- Like a skilled reader who can read ahead while maintaining comprehension
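
One way to picture the difference is in the practice questions the model is given during training. The sketch below builds both kinds of targets from a short sentence. The function names are ours, and this shows only how the targets differ, not how the model is built to predict them.

```python
def next_token_targets(tokens):
    """Standard setup: given the words so far, predict the next one."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def multi_token_targets(tokens, depth=2):
    """Look-ahead setup: given the words so far, predict the next `depth`."""
    last_start = len(tokens) - depth
    return [
        (tokens[:i], tuple(tokens[i:i + depth]))
        for i in range(1, last_start + 1)
    ]

sentence = ["the", "cat", "sat", "down"]
print(next_token_targets(sentence))
print(multi_token_targets(sentence))
# The second call yields:
# (["the"], ("cat", "sat")) and (["the", "cat"], ("sat", "down"))
```

Each look-ahead example asks the model to commit to two words at once, which gives it more to learn from per example.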

Why This Matters

- Cost Revolution: Training costs of $5.6M (vs hundreds of millions) suggest a future where AI development isn't limited to tech giants.

- Working Around Constraints: Shows how limitations can drive innovation—DeepSeek achieved state-of-the-art results without access to the most powerful chips (at least that’s the best conclusion at the moment).

What’s Interesting

- Efficiency vs Power: Challenges the assumption that advancing AI requires ever-increasing computing power - sometimes smarter engineering beats raw force.

- Self-Teaching AI: R1's ability to develop reasoning capabilities through pure reinforcement learning suggests AIs can discover problem-solving methods on their own.

- AI Teaching AI: The success of distillation shows how knowledge can be transferred between AI models, potentially leading to compounding improvements over time.

- IP for Free: If DeepSeek can be such a fast follower through distillation, what advantage does OpenAI, Google, or any other company gain from releasing a novel model?

