#228 - GPT 5.2, Scaling Agents, Weird Generalization

Author: Skynet Today December 17, 2025 Duration: 1:26:42
Our 228th episode with a summary and discussion of last week's big AI news! Recorded on 12/12/2025
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/
In this episode:
OpenAI's latest model GPT-5.2 demonstrates improved performance and enhanced multi-modal capabilities but comes with increased costs and a different knowledge cutoff date.
Disney invests $1 billion in OpenAI to generate Disney character content, creating unique licensing agreements across characters from Marvel, Pixar, and Star Wars franchises.
The U.S. government imposes new AI chip export rules involving security reviews, while simultaneously moving to prevent states from independently regulating AI.
DeepMind releases a paper outlining the challenges and findings in scaling multi-agent systems, highlighting the complexities of tool coordination and task performance.
Timestamps:
(00:00:00) Intro / Banter
(00:01:19) News Preview
Tools & Apps
(00:01:58) GPT-5.2 is OpenAI’s latest move in the agentic AI battle | The Verge
(00:08:48) Runway releases its first world model, adds native audio to latest video model | TechCrunch
(00:11:51) Google says it will link to more sources in AI Mode | The Verge
(00:12:24) ChatGPT can now use Adobe apps to edit your photos and PDFs for free | The Verge
(00:13:05) Tencent releases Hunyuan 2.0 with 406B parameters
Applications & Business
(00:16:15) China set to limit access to Nvidia’s H200 chips despite Trump export approval
(00:21:02) Disney investing $1 billion in OpenAI, will allow characters on Sora
(00:24:48) Unconventional AI confirms its massive $475M seed round
(00:29:06) Slack CEO Denise Dresser to join OpenAI as chief revenue officer | TechCrunch
(00:31:18) The state of enterprise AI
Projects & Open Source
(00:33:49) [2512.10791] The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality
(00:36:27) Claude 4.5 Opus' Soul Document
Research & Advancements
(00:43:49) [2512.08296] Towards a Science of Scaling Agent Systems
(00:48:43) Evaluating Gemini Robotics Policies in a Veo World Simulator
(00:52:10) Guided Self-Evolving LLMs with Minimal Human Supervision
(00:56:08) Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning
(01:00:39) [2512.07783] On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
(01:04:42) Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
(01:09:42) Google’s AI unit DeepMind announces UK 'automated research lab'
Policy & Safety
(01:10:28) Trump Moves to Stop States From Regulating AI With a New Executive Order - The New York Times
(01:13:54) [2512.09742] Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs
(01:17:57) Forecasting AI Time Horizon Under Compute Slowdowns
(01:20:46) AI Security Institute focuses on AI measurements and evaluations
(01:21:16) Nvidia AI Chips to Undergo Unusual U.S. Security Review Before Export to China
(01:22:01) U.S. Authorities Shut Down Major China-Linked AI Tech Smuggling Network
Synthetic Media & Art
(01:24:01) RSL 1.0 has arrived, allowing publishers to ask AI companies pay to scrape content | The Verge
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Keeping up with artificial intelligence can feel like drinking from a firehose. Every week brings a new breakthrough, a surprising application, or an urgent ethical debate. Last Week in AI, from the team at Skynet Today, is here to turn that torrent into a clear, digestible stream. Instead of getting lost in the noise, you'll get a thoughtful rundown of the developments that actually have impact, explained without unnecessary jargon. Each episode feels like a conversation with well-informed friends who have done the homework for you, sifting through research papers, product launches, and industry announcements to highlight what's substantive. You'll hear nuanced discussions that go beyond the headlines, considering the real-world implications of new models, policy shifts, and corporate moves in the tech landscape. This podcast doesn't just tell you what happened; it provides context on why it matters for developers, businesses, and society at large. It’s an efficient way to stay informed and critically engaged with a field that is reshaping our world at a breathtaking pace. Tune in for a consistently insightful analysis that makes the complex world of AI feel accessible and relevant, week after week.
Language: English Episodes: 100

Last Week in AI
Podcast Episodes
#200 - ChatGPT Roadmap, Musk OpenAI Bid, Model Tampering

Duration: 1:48:09
Our 200th episode with a summary and discussion of last week's big AI news! Recorded on 02/14/2025 Join our brand new Discord here! https://discord.gg/nTyezGSKwP Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to…
#199 - OpenAI's o3-mini, Gemini Thinking, Deep Research, s1

Duration: 1:37:46
Our 199th episode with a summary and discussion of last week's big AI news! Recorded on 02/09/2025 Join our brand new Discord here! https://discord.gg/nTyezGSKwP Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to…
#198 - DeepSeek R1 & Janus, Qwen2.5, OpenAI Agents

Duration: 1:37:26
Our 198th episode with a summary and discussion of last week's big AI news! Recorded on 01/31/2025 Join our brand new Discord here! https://discord.gg/nTyezGSKwP Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to…
AI Computing Hardware - Past, Present, and Future

Duration: 2:04:24
A special one-off episode with a deep dive into the past, present, and future of how computer hardware makes AI possible. Join our brand new Discord here! https://discord.gg/nTyezGSKwP Read our text newsletter and co…
#197 - AI in Gmail+Docs, MiniMax-01, Titans, Transformer^2

Duration: 1:23:52
Our 197th episode with a summary and discussion of last week's big AI news! Recorded on 01/17/2025 Join our brand new Discord here! https://discord.gg/nTyezGSKwP Hosted by Andrey Kurenkov and guest-hosted by the folks fr…
#196 - Nvidia Digits, Cosmos, PRIME, ICLR, InfAlign

Duration: 1:46:34
Our 196th episode with a summary and discussion of last week's* big AI news! *and sometimes last last week's Recorded on 01/10/2025 Join our brand new Discord here! https://discord.gg/nTyezGSKwP Hosted by Andrey Kurenkov…
#195 - OpenAI o3 & for-profit, DeepSeek-V3, Latent Space

Duration: 1:39:05
Our 195th episode with a summary and discussion of last week's* big AI news! *and sometimes last last week's Recorded on 01/04/2025 Join our brand new Discord here! https://discord.gg/nTyezGSKwP Note: apologies for Andre…
#194 - Gemini Reasoning, Veo 2, Meta vs OpenAI, Fake Alignment

Duration: 1:59:55
Our 194th episode with a summary and discussion of last week's* big AI news! *and sometimes last last week's Recorded on 12/19/2024 Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and f…
#193 - Sora release, Gemini 2, OpenAI's AGI Rule, US AI Czar

Duration: 2:05:28
Our 193rd episode with a summary and discussion of last week's* big AI news! *and sometimes last last week's Note: this one was recorded on 12/13, so the news is a bit outdated... will get things back on track soon! Host…
#192 - ChatGPT Pro, Amazon Nova, GenFM, Llama 3.3, Genie 2

Duration: 1:58:13
Our 192nd episode with a summary and discussion of last week's* big AI news! *and sometimes last last week's Note: this one was recorded on 12/04, so the news is a bit outdated... Hosted by Andrey Kurenkov and Jeremie H…