This is Fine! A podcast about resilience engineering and software

Colette Alexander and Clint Byrum

33 episodes

  • SRECon Americas 2026 recap

    14/04/2026 | 55 mins.
    Colette’s talk at SRECon intro: https://www.usenix.org/conference/srecon26americas/presentation/alexander

    Clint’s talk at SRECon intro: https://www.usenix.org/conference/srecon26americas/presentation/byrum

    Dan Slimmon is an excellent engineer (per Clint’s shoutout) and ALSO an excellent podcast creator/host: https://techblows.net/

    Michelle Brush’s Keynote summary is here: https://www.usenix.org/conference/srecon26americas/presentation/brush

    Jevons Paradox: https://en.wikipedia.org/wiki/Jevons_paradox

    Dr. Nicole Forsgren’s talk summary: https://www.usenix.org/conference/srecon26americas/presentation/forsgren

    DORA is always worth a dive into if you haven’t taken a look yet: https://dora.dev/

    The blog post Colette mentioned comparing AI gold rush to Mao’s Revolution: https://leehanchung.github.io/blogs/2026/04/05/the-ai-great-leap-forward/

    Many people have written about why MTTR is a bad metric to track; you can read a write-up from Adrian Hornsby here: https://newsletter.resiliumlabs.com/p/mttr-problems-better-incident-metrics

    And watch the OG, Courtney Nash, speak about it here: https://www.youtube.com/watch?v=uhCgBOHo8EY

    Beth Long’s SRE Soundbath: https://www.usenix.org/conference/srecon26americas/presentation/long

    Vanessa Huerta-Granda’s talk is summarized here: https://www.usenix.org/conference/srecon26americas/presentation/huerta-granda

    Martin Smith and Abe Hoffman’s talk is summarized here: https://www.usenix.org/conference/srecon26americas/presentation/hoffman

    Some information about Metrist: https://vault42consulting.com/about/portfolio/metrist

    AI Agents Good Bad and Ugly talk: https://www.usenix.org/conference/srecon26americas/presentation/budichenko

    The CAST talk: https://www.usenix.org/conference/srecon26americas/presentation/barroso

    Engineering a Safer World by Nancy Leveson is worth a look: https://bookshop.org/p/books/engineering-a-safer-world-systems-thinking-applied-to-safety-nancy-g-leveson/57b01ef464f9f81b?ean=9780262533690&next=t

    Erik Hollnagel wrote the book on FRAM and it has a lot of support in the safety world across industries: https://functionalresonance.com/ and https://etn-peter.eu/2021/02/11/fram-in-a-nutshell/ are good resources.

    Daria Barteneva’s closing keynote on game theory and SRE was great: https://www.usenix.org/conference/srecon26americas/presentation/barteneva

    Some good stuff on Above the Line/Below the Line, if you’re curious:
    https://queue.acm.org/detail.cfm?id=3380777

    https://www.youtube.com/watch?v=xA5U85LSk0M

    Lorin Hochstein’s closing keynote on storytelling was rad: https://www.usenix.org/conference/srecon26americas/presentation/hochstein

    SRECon EMEA 2026 (in Dublin) has their CFP up: https://www.usenix.org/conference/srecon26emea/call-for-participation

    As always, you can check out the Resilience in Software Foundation at resilienceinsoftware.org
  • The 2025 DORA Report w/special guest Fred Hebert

    12/03/2026 | 59 mins.
    You can find the 2025 DORA Report here: https://dora.dev/research/2025/dora-report/

    Read more of Fred’s work/opinions here: https://ferd.ca/

    If you want to know more about Lund’s Human Factors and Systems Safety program, you can read here: https://www.humanfactors.lth.se/

    DORA has some good writeups of generative leadership and Westrum’s model here: https://dora.dev/capabilities/generative-organizational-culture/

    We can reset the counter, it’s been 0 episodes since we mentioned Lorin’s Law: https://surfingcomplexity.blog/2017/06/24/a-conjecture-on-why-reliable-systems-fail/

    Fred writes well about the Law of Stretched Systems: https://ferd.ca/the-law-of-stretched-cognitive-systems.html

    We’re still trying to schedule a DORA event with our friends who make the report, but keep an eye out on https://resilienceinsoftware.org/events - it will pop up there when we do!
  • Building and Revising Adaptive Capacity Sharing for Technical Incident Response with Beth Adele Long

    26/02/2026 | 1h 9 mins.
    The Keweenaw snow gauge that Colette mentioned is a tourist attraction. If you want to see where measurements stand for the season, you can find them here: https://www.pasty.com/snow/

    The paper we’re talking about today can be found here: https://www.sciencedirect.com/science/article/abs/pii/S0003687020301903

    If you want to know more about SNAFU Catchers, you can see their website here: https://www.snafucatchers.com/

    They produced the STELLA report: https://snafucatchers.github.io/

    Richard Cook’s Bone Talk is kind of famous - here’s a version from REDeploy: https://www.youtube.com/watch?v=8LbePBiOvZ4

    Some writing from New Relic about NERFs: https://newrelic.com/blog/observability/best-practices-incident-commander-training

    We failed to mention it in the podcast itself, but Michael Wettick did a great thesis at Lund on asking for help in software operations incidents: https://lup.lub.lu.se/luur/download?func=downloadFile&recordOId=9150096&fileOId=9150099

    Speaking of Hitchhiker’s Guide, Etsy has some cool merch: https://www.etsy.com/listing/1071043200/dont-panic-hitchhikers-guide-to-the

    You can find David Woods’ paper on Graceful Extensibility here: https://link.springer.com/article/10.1007/s10669-018-9708-3

    You can sign up for our Paper Club event on this paper, happening March 17th, here: https://resilienceinsoftware.org/events/164680
  • Outsourcing and Resilience

    12/02/2026 | 41 mins.
    Colette mentioned Menlo Innovations https://menloinnovations.com/ and Atomic Object https://atomicobject.com/ who both build custom software for folks. The CEO of Menlo is Richard Sheridan who wrote Joy, Inc. - https://bookshop.org/p/books/joy-inc-how-we-built-a-workplace-people-love-richard-sheridan/7677689?ean=9781591847120&next=t

    Chad Todd’s thesis on Handovers in Software Operations is worth a read: https://lup.lub.lu.se/luur/download?func=downloadFile&recordOId=9076274&fileOId=9076276

    Clint refers to Zingerman’s and their servant leadership model, one of Colette’s favorite places to learn about leadership from. If you want to know more, go to https://www.zingtrain.com/ and in particular, read https://shop.zingtrain.com/products/a-lapsed-anarchists-approach-to-being-a-better-leader
  • The Messy 9 and Coding with AI - A Panel Discussion

    01/02/2026 | 1h 43 mins.
    Special thanks to John Allspaw, Sheeri Cabral, Martin Smith, and David Woods for joining us!

    Ben Affleck’s been making the promo rounds, but the specific convo we reference is recapped here: https://www.moviemaker.com/ben-affleck-ai-explains/

    The Messy 9 are:
    congestion
    cascade
    conflict
    lag
    saturation
    friction
    tempo
    surprise
    tangles

    Dave’s been doing a set of videos on Resilience Engineering, some of which have some crossover with the Messy 9 - you can find the first one here:
    https://resiliencefoundations.github.io/video-1-introduction-pt-1-it's-all-about-viability.html

    Previous TiF episode on the messy 9:
    https://www.thisisfinepod.com/the-pod/complex-systems-and-the-messy-nine-wspecial-guests-dave-woods-and-john-allspaw

    Richard Cook on Above the Line/Below the Line:
    Written - https://dl.acm.org/doi/pdf/10.1145/3379510
    A good excerpt from a talk from John Allspaw on Above the Line/Below the Line: https://www.youtube.com/watch?v=8bxj-FLEi10&list=PLb1aZTnPf3-OMChMkrr6WsokRI6LOnuem

    Colette mentioned the competence knowledge model: https://en.wikipedia.org/wiki/Four_stages_of_competence

    There’s a good argument based on the conversation here that AI makes it harder for Consciously Incompetent people to graduate to Conscious Competence. And, in Martin’s case, it makes Unconsciously Competent folks need to backtrack into Conscious Competence to “teach” it how to do things they don’t always think about.

    We can reset the clock to 0 episodes since we’ve mentioned the Ironies of Automation: https://ckrybus.com/static/papers/Bainbridge_1983_Automatica.pdf

    There is a good blog on Jamie Zawinski’s saying on regular expressions here: https://regex.info/blog/2006-09-15/247

    Alex Gorbachev and The Battle Against Any Guess seems to have become a paper: https://www.researchgate.net/publication/251255185_Battle_Against_Any_Guess

    Dave talks about Robust Yet Fragile as part of Resilience Engineering here: https://www.youtube.com/watch?v=gFotUdLL2zs

    Lorin Hochstein’s blog post that Dave is referencing is https://surfingcomplexity.blog/2026/01/19/amdahl-gustafson-coding-agents-and-you/

    Fred writes a good one on the Law of Stretched Systems: https://ferd.ca/the-law-of-stretched-cognitive-systems.html

    The 1985 paper Dave keeps mentioning could be any number of things he released that year, but I have a hunch it’s this one: https://ojs.aaai.org/aimagazine/index.php/aimagazine/article/view/511 or this one: https://link.springer.com/chapter/10.1007/978-3-642-50329-0_11

    Dave references a lot of things about the economic sustainability of AI, and Ed Zitron has been writing quite a bit about that for the last year and change. See: https://www.wheresyoured.at/wheres-the-money/

    https://www.wheresyoured.at/big-tech-2tr/

    Among others.

About This is Fine! A podcast about resilience engineering and software

A podcast about resilience engineering and software. Ever wondered why things on the internet break? Do you work in software and wish that you could have a Dear-Abby-like call-in show that could answer your deepest questions about how to make your workplace suck less? We're here to help! Write us anonymously at our open question form. Email us at: [email protected]. Call us and leave a voicemail, or text us at: (401) 592-7574