Resilience Bites #7 - LinkedIn Rewind (week 13)
March 29th, 2025
Hi everyone,
Welcome to my weekly LinkedIn recap!
I'm thrilled to see how our community continues to grow around this newsletter. Your engagement and insightful comments on last week's edition have been truly motivating.
Now, let's jump into this week's highlights!
The LinkedIn Roundup
Using GameDays to replace false confidence with awareness
An engineer's feedback, "By doing gamedays, I realized I was absolutely not ready to be on-call," reveals a crucial but overlooked benefit of chaos engineering exercises. While GameDays are often promoted as confidence-builders, they work in both directions: building justified confidence where it is warranted and replacing false confidence with healthy awareness where it is not. This team had lost half its engineers and implemented two-hour "mini-gamedays" to prepare junior replacements for on-call duties. These controlled exercises revealed knowledge gaps that would otherwise have stayed hidden until a real customer-impacting incident exposed them. Teams develop true readiness only through controlled exposure to challenging scenarios where failure is safe and learning is prioritized. That environment lets both hesitant and overconfident engineers calibrate their sense of their actual capabilities.
Read the full post on LinkedIn
AI has transformed from a simple coding assistant into a valuable intellectual sparring partner that challenges thinking, identifies weaknesses in arguments, and accelerates learning through active dialogue. Unlike human colleagues who might have biases or limited availability, AI provides unfiltered feedback without concern for ego, making it an ideal devil's advocate that helps overcome confirmation bias. The technology dramatically speeds up research by analyzing vast amounts of information quickly and enables accelerated learning through continuous questioning. Rather than passive information consumption, this "boxing with AI" approach creates a personalized, interactive learning experience that could revolutionize education by providing students with patient, adaptable tutors that respond to endless questions and help explore topics at an individual pace.
Read the full post on LinkedIn
"Best practices" for resilient systems often fail because they promote static, rigid answers for unique, evolving contexts. These standardized approaches break down because they freeze in time (solving yesterday's problems, not tomorrow's), focus on components rather than interactions, promote compliance over understanding, and create false security. Best practices can serve as useful starting points, but they cause trouble when treated as perfect blueprints rather than adaptable guidance. A more effective approach focuses on building contextual understanding of why practices work in specific situations, developing system intuition through chaos experiments, creating space for adaptation by questioning standard responses, and investing in detection capabilities to notice changing conditions. The critical skill is "seeing what is really there": staying curious and addressing actual problems rather than applying generic solutions.
Read the full post on LinkedIn
The problem with scheduled GameDays
Scheduled GameDays provide limited insights because teams operate in "best behavior" mode when they know an exercise is coming, showing their idealized responses rather than natural reactions. Research suggests that managers demonstrate greater adaptive capacity during actual incidents than during prepared exercises, much as someone's home looks different during a planned visit than during an unexpected one. Surprise GameDays reveal an organization's true resilience by exposing natural communication patterns, how teams default to procedures versus demonstrating flexibility, and how quickly the organization detects unexpected conditions. While these unannounced exercises may create more initial stress, they provide something far more valuable: an honest assessment of adaptive capacity that carefully scheduled exercises cannot match. If your GameDays always proceed flawlessly, you're likely not learning enough about your organization's actual resilience.
Read the full post on LinkedIn
New study identifies adaptive range and capacity as critical
A new study examining incident response in high-reliability organizations identifies adaptive range and capacity as critical resilience factors. Organizations need a spectrum of possible responses (adaptive range) and the ability to flexibly navigate this spectrum (adaptive capacity). The research identified eight bipolar categories that form this range, including following procedures versus innovating responses, simplifying versus acknowledging complexity, and communicating openly versus maintaining discretion. The most resilient organizations demonstrate "both-and" thinking rather than "either-or" approaches, moving fluidly along each spectrum as conditions change. Interestingly, managers showed greater adaptive capacity during serious accidents than during near-misses, suggesting that adaptation emerges under pressure. These findings support the view that resilience isn't about preventing every failure but about developing the capacity to respond effectively when things go wrong.
Read the full post on LinkedIn
On “normal engineers” and adaptive capacity
"When your systems are designed to be used by normal engineers, all that excess brilliance they have can get poured into the product itself, instead of wasting it on navigating the system." - Charity Majors
Designing systems for "normal engineers" preserves cognitive bandwidth for innovation and product development. This directly connects to resilience engineering principles, where adaptive capacity enables organizations to handle unexpected challenges. When systems are intuitive, engineers maintain mental energy for solving novel problems rather than struggling with tools or processes. This creates what resilience engineers call "cognitive headroom"—mental space that allows for graceful extensibility (stretching beyond normal operating parameters) and requisite variety (having diverse approaches to match diverse problems). By reducing the cognitive load required for basic operations, organizations create capacity for adaptation and innovation precisely when it's most needed: during complex problem-solving or incident response.
Read the full post on LinkedIn