Can Grok and Claude run a business? We just did it

Author: Wes Roth and Dylan Curious | Date: December 29, 2025 | Duration: 1:28:44

Andon Labs tests AI autonomy by letting agents run businesses in messy reality, with real customers and real consequences. In VendingBench, an agent starts with $500 and an empty vending machine. It researches trends and suppliers, emails wholesalers, restocks, tracks sales, and iterates for profit. When the agent was deployed at Anthropic, humans red-teamed it with sob stories, discount demands, and bizarre requests like tungsten cubes, triggering “bank runs” of freebie seekers. Long histories caused drift and hallucinations, including dramatic escalations and invented security reports. In multi-agent setups, supervisor agents often amplified each other into hype or doom. Better tools and memory compression help, but long-horizon planning stays fragile.



Every week, Wes Roth and Dylan Curious sit down to untangle the rapid and often bewildering evolution of artificial intelligence. Their AI Pod is a direct line to the people creating the tools and theories that are reshaping society. You’ll hear candid conversations with the researchers, engineers, and entrepreneurs working on everything from practical robotics and synthetic biology to the philosophical questions surrounding superintelligence. Rather than just reporting the headlines, this podcast digs into the context behind them, exploring how today's experiments in autonomous systems or startup disruptions become tomorrow's everyday reality. The tone is curious and accessible, breaking down complex ideas without oversimplifying the profound implications. For anyone trying to make sense of where this technology is headed and who is steering it, tuning in to this series feels like getting a thoughtful briefing from informed friends. It’s that blend of timely news analysis and deep-dive interviews that makes the AI Pod a consistent source for understanding the forces at the cutting edge.
Language: en-us | Episodes: 36

AI Pod by Wes Roth and Dylan Curious | Artificial Intelligence News and Interviews With Experts
Podcast Episodes