Hardening Agents for E-commerce Scale: From RL Alignment to Reliability // Panel 2

Author: Demetrios
Date: December 2, 2025
Duration: 29:16

Thanks to Prosus Group for collaborating on the Agents in Production Virtual Conference 2025.


Abstract //

The discussion centers on technical yet practical themes, such as the use of advanced post-training techniques like Direct Preference Optimization (DPO) and Parameter-Efficient Fine-Tuning (PEFT) to keep LLMs stable while specializing them for e-commerce domains. We compare the implementation challenges of Computer-Using Agents in automating legacy enterprise systems with the stability issues that conversational agents face when production inputs become unpredictable. We also analyze the role of cloud infrastructure in supporting the continuous, iterative training loops required by Reinforcement Learning-based agents for e-commerce.


Bio //

Paul van der Boor (Panel Host) //

Paul van der Boor is a Senior Director of Data Science at Prosus and a member of its internal AI group.


Arushi Jain (Panelist) //

Arushi is a Senior Applied Scientist at Microsoft, working on LLM post-training for Computer-Using Agents (CUA) through Reinforcement Learning. She previously completed Microsoft’s competitive 2-year AI Rotational Program (MAIDAP), building and shipping AI-powered features across four product teams.


She holds a Master’s in Machine Learning from the University of Michigan, Ann Arbor, and a Dual Degree in Economics from IIT Kanpur. At Michigan, she led the NLG efforts for the Alexa Prize Team, securing a $250K research grant to develop a personalized, active-listening socialbot. Her research spans collaborations with Rutgers School of Information, Virginia Tech’s Economics Department, and UCLA’s Center for Digital Behavior.


Beyond her technical work, Arushi is a passionate advocate for gender equity in AI. She leads the Women in Data Science (WiDS) Cambridge community, scaling participation in her ML workshops from 25 women in 2020 to 100+ in 2025 and empowering women and non-binary technologists through education and mentorship.


Swati Bhatia //

Passionate about building and investing in cutting-edge technology to drive positive impact.


Currently shaping the future of AI/ML at Google Cloud.


10+ years of global experience across the U.S., EMEA, and India in product, strategy & venture capital (Google, Uber, BCG, Morpheus Ventures).


Audi Liu //

I’m passionate about making AI more useful and safe.


Why? Because AI will be ubiquitous in every workflow, powering our lives just as electricity revolutionized our society. It’s pivotal that we get it right.


At Inworld AI, we believe all future software will be powered by voice. As a Sr Product Manager at Inworld, I'm focused on building a real-time voice API that empowers developers to create engaging, human-like experiences. Inworld offers state-of-the-art voice AI at a radically accessible price: No. 1 on Hugging Face and Artificial Analysis, instant voice cloning, rich multilingual support, real-time streaming, and emotion plus non-verbal control, all for just $5 per million characters.


Isabella Piratininga //

Experienced Product Leader with over 10 years in the tech industry, shaping impactful solutions across micro-mobility, e-commerce, and leading organizations in the new economy, such as OLX, iFood, and now Nubank. I began my journey as a Product Owner during the early days of modern product management, contributing to pivotal moments like scaling startups, mergers of major tech companies, and driving innovation in digital banking.


My passion lies in solving complex challenges through user-centered product strategies. I believe in creating products that serve as a bridge between user needs and business goals, fostering value and driving growth. At Nubank, I focus on redefining financial experiences and empowering users with accessible and innovative solutions.


Check out all the talks from the conference here: https://go.mlops.community/carzle

Get some "I hallucinate more than ChatGPT" t-shirts here: https://go.mlops.community/NL_RY25_Merch


Hosted by Demetrios, MLOps.community is a space for honest, meandering talks about the real work of making artificial intelligence systems actually work. This isn't about hype or theoretical papers; it's about the messy, practical, and often surprising journey of taking models from a notebook into a live environment. You'll hear from engineers and practitioners who are in the trenches, discussing the tools, the frustrations, and the occasional breakthroughs that define the day-to-day. The conversations are deliberately relaxed, covering everything from traditional machine learning pipelines to the new world of large language models and even the intangible "vibes" of team culture and process. Each episode peels back a layer on what "production" really means, whether that involves deploying a predictive service, managing an agentic system, or maintaining reliability as everything scales. Tuning in to this podcast feels like grabbing a coffee with colleagues who aren't afraid to dig into the technical nitty-gritty while keeping the tone conversational and accessible. It's for anyone who builds, manages, or is just curious about the operational backbone that allows AI to deliver value, offering a grounded perspective often missing from the broader conversation.