Kubernetes Shake-ups, Platform Reality, and AI-Native SRE

Author: Teller's Tech - DevOps, SRE and Cloud Podcast
Date: November 21, 2025
Duration: 15:53

In this episode of Ship It Weekly, Brian digs into three big themes for anyone running Kubernetes or building internal platforms.

First, Kubernetes is officially retiring Ingress NGINX. The project stays in best-effort maintenance until March 2026, after which there will be no further releases, bug fixes, or security updates. We talk about what that actually means if you’re still using it and how to think about choosing and rolling out a replacement ingress controller.

Second, we look at how CNCF is defining platform engineering and what “platform as a product” looks like in practice, plus some hard-earned lessons from running Kubernetes in production.

Third, we talk about AI as a first-class workload on Kubernetes. CNCF’s new Certified Kubernetes AI Conformance Program aims to standardize how AI runs on K8s, and recent writing on SRE in the age of AI looks at what reliability means when systems learn and drift.

In the lightning round, we hit good reads on database migrations, Postgres upgrades, and a distributed priority queue on Kafka. We wrap with the human side of incidents: fixation during incident response and using incidents as landmarks for the tradeoffs you’ve been making over time.

If you’re on a platform team, responsible for SLOs, or the person people ping when “Kubernetes is weird,” this one should give you concrete questions to take back to your roadmap and runbooks.

Links from this episode

https://kubernetes.io/blog/2025/11/11/ingress-nginx-retirement/

https://www.haproxy.com/blog/ingress-nginx-is-retiring

https://www.cncf.io/blog/2025/11/19/what-is-platform-engineering/

https://www.cncf.io/announcements/2025/11/11/cncf-launches-certified-kubernetes-ai-conformance-program-to-standardize-ai-workloads-on-kubernetes/

https://devops.com/sre-in-the-age-of-ai-what-reliability-looks-like-when-systems-learn/

Lightning round

https://www.cncf.io/blog/2025/11/18/top-5-hard-earned-lessons-from-the-experts-on-managing-kubernetes/

https://www.tines.com/blog/zero-downtime-database-migrations-lessons-from-moving-a-live-production

https://palark.com/blog/postgresql-upgrade-no-data-loss-downtime/

https://klaviyo.tech/building-a-distributed-priority-queue-in-kafka-1b2d8063649e

https://sreweekly.com/sre-weekly-issue-497/

https://ferd.ca/ongoing-tradeoffs-and-incidents-as-landmarks.html


For anyone building or running modern systems, the sheer volume of news, tools, and incident reports can be overwhelming. Ship It Weekly cuts through that noise. This isn't a surface-level scan of headlines. Host Brian Teller digs into the latest significant outages, major software releases, and insightful post-mortems, focusing squarely on the practical implications for DevOps, SRE, and platform engineering work.

Each episode of the podcast breaks down a couple of key stories, providing the crucial context often missing from tech news. You'll hear analysis that translates events into actionable insights, answering the "so what?" for your own infrastructure and processes. The show also includes a quick rundown of tools or updates actually worth your attention, saving you hours of browsing.

The tone is direct and informed, favoring depth over breadth. It’s designed for engineers and technical leaders who need a concise, reliable filter for the week's most relevant developments. Listen to this podcast for a focused recap that prioritizes what actually matters, delivered without fluff. You get the news, plus the necessary interpretation to understand how it might affect your systems, your team, and your on-call rotation. It's a weekly briefing that respects your time while aiming to make you more effective.
Language: English
Episodes: 37

Ship It Weekly - DevOps, SRE, Platform and Cloud Engineering News
Podcast Episodes
GitHub Runner Pricing Pause, Terraform Cloud Limits, and AI in CI

Duration: 12:06
This week on Ship It Weekly, Brian looks at how the “platform tax” is showing up everywhere: pricing model shifts, CI dependencies, and new security boundaries thanks to AI agents. We start with GitHub Actions. GitHub ann…
IBM Buys Confluent, React2Shell, and Netflix on Aurora

Duration: 16:14
In this episode of Ship It Weekly, Brian powers through a cold and digs into a very “infra grown-up” week in DevOps. First up, IBM is buying Confluent for $11B. We talk about what that means if you’re on Confluent Cloud t…