#201 - GPT 4.5, Sonnet 3.7, Grok 3, Phi 4

Author: Skynet Today March 5, 2025 Duration: 58:37

Our 201st episode with a summary and discussion of last week's big AI news! Recorded on 03/02/2025

Join our brand new Discord here! https://discord.gg/nTyezGSKwP

Hosted by Andrey Kurenkov and guest host Sharon Zhou

Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Read our text newsletter and comment on the podcast at https://lastweekin.ai/.

In this episode:
- The release of GPT-4.5 from OpenAI, Anthropic's Claude 3.7, and Grok 3 from xAI, comparing their features, costs, and capabilities.
- Discussion of new tools and applications, including Sesame's new voice assistant and Google's AI coding assistant, Gemini Code Assist, highlighting their unique benefits.
- OpenAI's continued user growth despite competition, pricing for Google's text-to-video platform, and HP acquiring Humane and shutting down its AI Pin.
- Insights into new research on alignment and specification gaming in LLMs, including papers on fine-tuning causing broad misalignment and Google's multi-agent system for scientific collaboration.

Timestamps + Links:
(00:00:00) Intro / Banter
(00:01:36) News Preview

Tools & Apps
(00:02:33) OpenAI announces GPT-4.5, warns it’s not a frontier AI model
(00:07:22) Anthropic launches a new AI model that ‘thinks’ as long as you want
(00:11:14) New Grok 3 release tops LLM leaderboards
(00:16:43) Sesame is the first voice assistant I’ve ever wanted to talk to more than once
(00:18:30) Google launches a free AI coding assistant with very high usage caps
(00:20:45) Rabbit shows off the AI agent it should have launched with
(00:22:23) Mistral’s Le Chat tops 1M downloads in just 14 days

Applications & Business
(00:24:06) OpenAI Tops 400 Million Users Despite DeepSeek’s Emergence
(00:27:37) Google’s new AI video model Veo 2 will cost 50 cents per second
(00:29:52) HP is buying Humane and shutting down the AI Pin

Projects & Open Source
(00:31:44) Microsoft launches next-gen Phi AI models
(00:33:47) OpenAI introduces SWE-Lancer: A Benchmark for Evaluating Model Performance on Real-World Freelance Software Engineering Work
(00:37:12) SWE-Bench+: Enhanced Coding Benchmark for LLMs

Research & Advancements
(00:40:00) Towards an AI co-scientist
(00:42:52) Magma: A Foundation Model for Multimodal AI Agents

Policy & Safety
(00:47:32) Demonstrating specification gaming in reasoning models
(00:51:03) Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Last Week in AI

Keeping up with artificial intelligence can feel like drinking from a firehose. Every week brings a new breakthrough, a surprising application, or an urgent ethical debate. Last Week in AI, from the team at Skynet Today, is here to turn that torrent into a clear, digestible stream. Instead of getting lost in the noise, you'll get a thoughtful rundown of the developments that actually have impact, explained without unnecessary jargon. Each episode feels like a conversation with well-informed friends who have done the homework for you, sifting through research papers, product launches, and industry announcements to highlight what's substantive. You'll hear nuanced discussions that go beyond the headlines, considering the real-world implications of new models, policy shifts, and corporate moves in the tech landscape. This podcast doesn't just tell you what happened; it provides context on why it matters for developers, businesses, and society at large. It’s an efficient way to stay informed and critically engaged with a field that is reshaping our world at a breathtaking pace. Tune in for consistently insightful analysis that makes the complex world of AI feel accessible and relevant, week after week.

Author: Skynet Today Language: English Episodes: 100

Podcast Episodes

#161 - Claude 3 beats GPT-4, Stability CEO resigns, DBRX, TacticAI, UN resolution on AI

31.03.2024

Duration: 1:36:24

Our 161st episode with a summary and discussion of last week's big AI news! Check out our sponsor, the SuperDataScience podcast. You can listen to SDS across all major podcasting platforms (e.g., Spotify, Apple Podcasts,…

#160 - Nvidia's new GPU, Microsoft pays for Inflection AI, Grok-1 open sourced, Jeremie's Action Plan

24.03.2024

Duration: 1:39:34

Our 160th episode with a summary and discussion of last week's big AI news! Read our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai…

#159 - Inflection-2.5, Devin, OpenAI board update, SIMA, EU AI Act passed

18.03.2024

Duration: 1:00:03

Our 159th episode with a summary and discussion of last week's big AI news! Check out our sponsor, the SuperDataScience podcast. You can listen to SDS across all major podcasting platforms (e.g., Spotify, Apple Podcasts,…

#158 - Claude 3, Elon Musk sues OpenAI, StarCoder 2, AI-Generated Spam

10.03.2024

Duration: 1:48:45

Our 158th episode with a summary and discussion of last week's big AI news! Read our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai…

#157 - Gemini controversy, new Mistral models, DeepMind's Genie & Griffin, AI Warfare is here

03.03.2024

Duration: 1:44:57

Our 157th episode with a summary and discussion of last week's big AI news! Check out our sponsor, the SuperDataScience podcast. You can listen to SDS across all major podcasting platforms (e.g., Spotify, Apple Podcasts,…

#156 - OpenAI's Sora, Gemini 1.5, BioMistral, V-JEPA, AI Task Force, Fun!

25.02.2024

Duration: 1:45:06

Our 156th episode with a summary and discussion of last week's big AI news! Read our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai…

#155 - ChatGPT memory, Altman seeks trillions, California AI regulation, art gen lawsuit

16.02.2024

Duration: 1:45:41

Our 155th episode with a summary and discussion of last week's big AI news! Correction: Andrey said CLIP came out with DALL-E 2; it came out alongside the first DALL-E. Check out our sponsor, the SuperDataScience podcast…

#154 - Google Gemini, Waymo Collision, Smaug-72B, EU AI Act final text, image watermarks

11.02.2024

Duration: 1:37:16

Our 154th episode with a summary and discussion of last week's big AI news! Read our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai…

#153 - Taylor Swift Deepfakes, ChatGPT features, Meta-Prompting, two new US bills

04.02.2024

Duration: 1:46:06

Our 153rd episode with a summary and discussion of last week's big AI news! Check out our sponsor, the SuperDataScience podcast. You can listen to SDS across all major podcasting platforms (e.g., Spotify, Apple Podcasts,…

#152 - live translation on phones, Meta aims at AGI, AlphaGeometry, political deepfakes

28.01.2024

Duration: 1:27:18

Our 152nd episode with a summary and discussion of last week's big AI news! Read our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai…