Early Twitter's fail-whale wars | Dmitriy Ryaboy

Author: Ronak Nathani, Guang Yang
Date: August 13, 2024
Duration: 1:08:46

A veteran of early Twitter's fail whale wars, Dmitriy joins the show to chat about the time when 70% of the Hadoop cluster got accidentally deleted, the financial reality of writing a book, and how to navigate acquisitions.

Segments:
(00:00:00) The Infamous Hadoop Outage
(00:02:36) War Stories from Twitter's Early Days
(00:04:47) The Fail Whale Era
(00:06:48) The Hadoop Cluster Shutdown
(00:12:20) "First Restore the Service Then Fix the Problem. Not the Other Way Around."
(00:14:10) War Rooms and Organic Decision-Making
(00:16:16) The Importance of Communication in Incident Management
(00:19:07) That Time When the Data Center Caught Fire
(00:21:45) The "Best Email Ever" at Twitter
(00:25:34) The Importance of Failing
(00:27:17) Distributed Systems and Error Handling
(00:29:49) The Missing README
(00:33:13) Agile and Scrum
(00:38:44) The Financial Reality of Writing a Book
(00:43:23) Collaborative Writing Is Like Open-Source Coding
(00:44:41) Finding a Publisher and the Role of Editors
(00:50:33) Defining the Tone and Voice of the Book
(00:54:23) Acquisitions from an Engineer's Perspective
(00:56:00) Integrating Acquired Teams
(01:02:47) Technical Due Diligence
(01:04:31) The Reality of System Implementation
(01:06:11) Integration Challenges and Gotchas

Show Notes:
- Dmitriy Ryaboy on Twitter: https://x.com/squarecog
- The Missing README: https://www.amazon.com/Missing-README-Guide-Software-Engineer/dp/1718501838
- Chris Riccomini on how to write a technical book: https://cnr.sh/essays/how-to-write-a-technical-book

Stay in touch:
- Make Ronak's day by signing up for our newsletter to get our favorite parts of the convo straight to your inbox every week :D https://softwaremisadventures.com/

Music: Vlad Gluschenko, "Forest"
License: Creative Commons Attribution 3.0 Unported: https://creativecommons.org/licenses/by/3.0/deed.en


Behind every line of code, there's a person with a story, and that's where Software Misadventures finds its pulse. Hosts Ronak Nathani and Guang Yang pull up a chair with engineers, founders, and investors, but the conversation rarely stays in the technical manual. Instead, it wanders into the human territory of career detours, hard-won insights, and those unpredictable stumbles that often teach the most. This podcast is built on the idea that the journey is just as important as the destination, especially in the fast-moving tech world. You'll hear guests recount the projects that went sideways, the decisions they'd rethink, and the moments of clarity that emerged from the chaos. It’s a refreshingly honest look at the industry, emphasizing that expertise isn't just about what you build, but what you learn when things don't go as planned. Tune in for conversations that are less about perfect solutions and more about the real, sometimes messy, process of creating with technology. Each episode offers a blend of professional wisdom and personal narrative, making it a compelling listen for anyone curious about the lives woven into our digital landscape.
Language: English
Episodes: 55

Software Misadventures
Podcast Episodes
Ryan Underwood - On debugging the Linux kernel - #4

Duration: 1:02:57
Ryan Underwood is a Staff SRE and tech lead on the Helix and Zookeeper SRE team at LinkedIn. Prior to LinkedIn, he was an SRE at Machine Zone and Google. Apart from his regular responsibilities, Ryan's interest and exper…