#243 - GPT 5.5, DeepSeek V4, AI safety sabotage

Author: Skynet Today May 3, 2026 Duration: 1:52:22
Our 243rd episode with a summary and discussion of last week's big AI news! Recorded on 04/29/2026
Hosted by Andrey Kurenkov and Jeremie Harris
Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
OpenAI released GPT-5.5 with strong coding-oriented improvements, a system card discussing chain-of-thought monitorability and misalignment testing, higher pricing than GPT-5.4, and notable quirks like a system-prompt warning about “goblins.”
xAI launched Grok Voice Think Fast 1.0, claiming large benchmark leads for real-time voice agents and reporting major Starlink customer-support automation and sales conversion impact.
DeepSeek open-sourced DeepSeek V4 (Pro and Flash), featuring MoE scaling and 1M-token context via hybrid/compressed attention changes, while Tencent released Hunyuan 3 preview with weaker benchmark performance; a new long-horizon agent benchmark (ClawMark) shows low task success rates.
Major business, legal, and policy updates include Google’s planned up-to-$40B investment and 5GW compute commitment to Anthropic, Meta’s AWS Graviton deal, China blocking Meta’s Manus acquisition, a revamped OpenAI–Microsoft agreement, ongoing Musk–OpenAI trial developments, and new safety/security research on sabotage, document degradation under delegation, and bit-flip attacks.
Timestamps:
(00:00:10) Intro / Banter
(00:02:00) News Preview
(00:02:26) Response to listener comments
(00:02:55) Sponsors

Tools & Apps
(00:05:55) OpenAI Unveils Its New, More Powerful GPT-5.5 Model - The New York Times
(00:23:33) xAI Launches grok-voice-think-fast-1.0: Topping τ-voice Bench at 67.3%, Outperforming Gemini, GPT Realtime, and More - MarkTechPost
(00:29:00) Claude can now plug directly into Photoshop, Blender, and Ableton | The Verge

Projects & Open Source
(00:29:38) China's DeepSeek releases preview of long-awaited V4 model as AI race intensifies
(00:47:05) Tencent Unveils Hy3 preview; Model Enhances Agent Capabilities and Real-World Usability - Tencent 腾讯
(00:50:14) ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

Applications & Business
(00:53:03) Google Plans to Invest Up to $40 Billion in Anthropic
(00:56:26) Meta will use hundreds of thousands of AWS Graviton chips
(00:59:51) China blocks Meta's $2 billion takeover of AI startup Manus
(01:01:45) OpenAI shakes up partnership with Microsoft, capping revenue share payments
(01:07:13) Elon Musk Testifies of AI Risk at Trial, Says OpenAI Tried to ‘Steal’ a Charity - WSJ
(01:11:50) Judge rejects DOJ bid to delay Anthropic appeal in Pentagon dispute
(01:14:42) Google’s Gemini can now run on a single air-gapped server — and vanish when you pull the plug
(01:19:07) DeepMind's David Silver just raised $1.1B to build an AI that learns without human data | TechCrunch

Policy & Safety
(01:22:47) Evaluating whether AI models would sabotage AI safety research
(01:28:59) LLMs Corrupt Your Documents When You Delegate
(01:32:50) Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability
(01:39:53) Memorandum on Adversarial Distillation of American AI Models
(01:41:41) Teen boys are dating their AI chatbots—and experts warn it could kill their careers | Fortune
(01:43:57) Announcing the Anthropic Economic Index Survey
(01:45:21) Scoop: CISA lacks access to Anthropic's Mythos

Synthetic Media & Art
(01:48:03) Taylor Swift Files to Trademark Voice and Likeness to Protect Against AI Misuse

Research & Advancements
(01:49:15) Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Keeping up with artificial intelligence can feel like drinking from a firehose. Every week brings a new breakthrough, a surprising application, or an urgent ethical debate. Last Week in AI, from the team at Skynet Today, is here to turn that torrent into a clear, digestible stream. Instead of getting lost in the noise, you'll get a thoughtful rundown of the developments that actually have impact, explained without unnecessary jargon. Each episode feels like a conversation with well-informed friends who have done the homework for you, sifting through research papers, product launches, and industry announcements to highlight what's substantive. You'll hear nuanced discussions that go beyond the headlines, considering the real-world implications of new models, policy shifts, and corporate moves in the tech landscape. This podcast doesn't just tell you what happened; it provides context on why it matters for developers, businesses, and society at large. It’s an efficient way to stay informed and critically engaged with a field that is reshaping our world at a breathtaking pace. Tune in for a consistently insightful analysis that makes the complex world of AI feel accessible and relevant, week after week.
Author: Skynet Today Language: English Episodes: 100

Last Week in AI
Podcast Episodes
#245 - TML-Interaction, Claude For Legal, Sam Altman on Stand

Duration: 1:49:14
Our 245th episode with a summary and discussion of last week's big AI news! Recorded on 05/13/2026 Hosted by Andrey Kurenkov and Jeremie Harris Feel free to email us your questions and feedback at andreyvkurenkov@gmail.c…
#244 - GPT-5.5 Instant, Grok 4.3, OpenAI vs Musk

Duration: 1:55:16
Our 244th episode with a summary and discussion of last week's big AI news! Recorded on 05/08/2026 Hosted by Andrey Kurenkov and Jeremie Harris Feel free to email us your questions and feedback at andreyvkurenkov@gmail.c…
#242 - ChatGPT Images 2.0, Qwen 3.6 Max, Kimi-K2.6

Duration: 1:30:48
Our 242nd episode with a summary and discussion of last week's big AI news! Recorded on 04/22/2026 Hosted by Andrey Kurenkov and Jeremie Harris Feel free to email us your questions and feedback at andreyvkurenkov@gmail.c…
#241 - Opus 4.7, Muse Spark, GPT-5.4-Cyber, HY-World 2.0

Duration: 1:59:48
Our 241st episode with a summary and discussion of last week's big AI news! Recorded on 04/18/2026 Hosted by Andrey Kurenkov and Jeremie Harris Feel free to email us your questions and feedback at andreyvkurenkov@gmail.c…
#240 - Project Glasswing, Claude Mythos, GLM-5.1, emotion concepts

Duration: 1:44:30
Our 240th episode with a summary and discussion of last week's big AI news! Recorded on 04/08/2026 (sorry I keep releasing stuff late, will get better with it soon!) Hosted by Andrey Kurenkov and Jeremie Harris Feel free…
#239 - RIP Sora, Claude Openclaw, HyperAgents

Duration: 1:37:42
Our 239th episode with a summary and discussion of last week's big AI news! FYI: this one has pretty out-of-date news; I was traveling last week and failed to upload... apologies. Recorded on 03/25/2026 Hosted by Andrey…
#238 - GPT 5.4 mini, OpenAI Pivot, Mamba 3, Attention Residuals

Duration: 2:00:49
Our 238th episode with a summary and discussion of last week's big AI news! Recorded on 03/18/2026 Hosted by Andrey Kurenkov and Jeremie Harris Feel free to email us your questions and feedback at andreyvkurenkov@gmail.c…
#237 - Nemotron 3 Super, xAI reborn, Anthropic Lawsuit, Research!!!

Duration: 2:27:19
Our 237th episode with a summary and discussion of last week's big AI news! Recorded on 03/13/2026 Hosted by Andrey Kurenkov and Jeremie Harris Feel free to email us your questions and feedback at andreyvkurenkov@gmail.c…
#236 - GPT 5.4, Gemini 3.1 Flash Lite, Supply Chain Risk

Duration: 1:28:34
Our 236th episode with a summary and discussion of last week's big AI news! Recorded on 03/06/2026 Hosted by Andrey Kurenkov and Jeremie Harris Feel free to email us your questions and feedback at andreyvkurenkov@gmail.c…
#235 - Sonnet 4.6, Deep-thinking tokens, Anthropic vs Pentagon

Duration: 1:41:48
Our 235th episode with a summary and discussion of last week's big AI news! Recorded on 02/27/2026 Hosted by Andrey Kurenkov and Jeremie Harris Feel free to email us your questions and feedback at andreyvkurenkov@gmail.c…