Cipher Skin and High-Reliability Theory - Pt. 2

Cipher Skin is a powerful tool that can help organizations ranging from athletic teams to industrial companies continuously monitor the variables that produce complex failures.

In part 1 of this series, we talked about how complex failures happen, and how they arise not so much from the pieces of a system as from how those pieces interact.

To read part 1 of this series, go here: CS + HRT Part 1

Complex failures occur when one or more individual components of a system undergo a “drift to low performance”: a degradation in quality gradual enough that it goes unnoticed. Eventually, these components interact in an unforeseen way that produces a disaster.
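To make the idea of drift concrete, here is a minimal sketch of one way to flag it: compare a rolling average of recent readings against an early baseline. This is an illustration only, not a description of how Cipher Skin works. The function name `detect_drift`, the 10% tolerance, the window sizes, and the sample scores are all hypothetical, and the sketch assumes positive readings where lower means worse.

```python
from statistics import mean


def detect_drift(readings, baseline_size=5, window=5, tolerance=0.10):
    """Return the index of the first reading at which the rolling mean of the
    last `window` readings falls more than `tolerance` (as a fraction) below
    the mean of the first `baseline_size` readings, or None if it never does.

    Assumes positive readings where a lower value means worse performance.
    """
    if len(readings) < baseline_size + window:
        return None  # not enough data to compare a recent window to a baseline
    baseline = mean(readings[:baseline_size])
    for end in range(baseline_size + window, len(readings) + 1):
        recent = mean(readings[end - window:end])
        if recent < baseline * (1 - tolerance):
            return end - 1
    return None


# Hypothetical performance scores that lose one point per measurement.
readings = [100 - i for i in range(30)]

# No single step looks alarming: each change is only one point.
largest_step = max(abs(a - b) for a, b in zip(readings, readings[1:]))

drift_index = detect_drift(readings)
print(largest_step, drift_index)
```

In this made-up series no individual change is larger than one point, yet the rolling average has slipped more than 10% below the baseline by the fifteenth reading. A check that only looks at step-to-step changes would never raise a flag.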

A body of research called high-reliability theory (HRT) looks at how people and organizations can avoid these failures. HRT addresses several key concepts that make drifts visible and preventable.

Preoccupation with failure

A form of situational awareness, a preoccupation with failure is an “active consideration of all the places and moments where you don’t want to fail.” It’s a knowledge of system traps and cognitive biases that lead to ruin, and a constant eye towards them. A High Reliability Organization does not wait for a disaster to occur to start looking for signs of danger.

Cipher Skin provides valuable insight into the behavior of a complex system, whether a human body or a pipeline, and enables active monitoring of the components of failure.

Reluctance to simplify

Marcus Aurelius wrote, “Of each particular thing, ask: ‘What is it, in its own construction?’”

This perspective resists the temptation to immediately categorize problems in broad strokes and instead carefully considers the context and pathway by which they occur. Aurelius warns that we must look at any given warning signal with a wide range of mental models and perspectives.

For example, dismissing a recurring pain as “training soreness” without considering alternative root causes is always the wrong decision. Assuming that a newly built pipeline won’t require intensive, proactive monitoring would be a similar error.

Sensitivity to operations

In combat, the men and women on the ground know the most about the conditions they are facing. They are the most sensitive to the rapidly changing environment, and therefore the best equipped to receive data inputs, triage, prioritize, and decide on the next steps. One would think that the order of operations would reflect that logic, but that is not always the case. Sometimes a manager (or commanding officer) with limited clarity (i.e., one who can see only a map, which is an abstract representation) is influenced by external (read: uninformed) interests, to the detriment of the people on the ground who are best equipped to drive the operation.

Just as an effective special operations unit relies on the real-time observations of its people on the ground, the most reliable data for decision-making about risk management comes directly from the source, whether that’s measured in real time on the body or locally on every segment of a pipeline. The closer you are to the situation, the more reliable and complete your information will be. Better information produces better decisions.

Deference to expertise

Seek the most qualified person for the job and skip past the second-hand bro-advice.

If you’ve got a piece of equipment that keeps malfunctioning, learn from the equipment specialist what’s going wrong and how to fix it. If you’re trying to improve your running times, ask someone who professionally helps people do that, not some guy who sometimes runs fast.

But beware that expertise can have unintended consequences. Especially in environments without specific and immediate feedback loops, many experts gradually develop mental shortcuts or biases that are not actually beneficial. In fact, in many occupations (including many parts of the medical community), people become worse at their jobs over time.

When assessing the value of an expert, ask: Have this person’s decisions been immediately and visibly proven right or wrong for the duration of their career, or could they have been working from a flawed theory this whole time and not known it? If they’re right or wrong, how would we truly know? (For a fascinating long read on how this happened in the field of arson investigation, go here.)

Avoiding this pitfall is critical when managing complex systems, whether that means the human body or industrial operations.

Being an expert at one metric of success is myopic and dangerous. Knowing everything there is to know about pipe corrosion isn’t helpful if a pipeline breaks due to asymmetric expansion caused by a temperature differential. How does any complex system improve without taking into account the entire system? The answer is obvious. The system eventually fails.

True expertise requires the ability to see an entire system at once and know with certainty whether an assumption is valid or not. It requires good data, amassed over time.

Commitment to resilience

Resilience is a property of a system as a whole.

It’s what enables a special operator to absorb strain and preserve function despite adversity, to recover rapidly from setbacks, and to learn and grow from those episodes.

It’s what results from the ability to simultaneously monitor and account for the behavior of each component of a system as well as how all of those components are constantly interacting and influencing one another.
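As a rough illustration of watching components and their interactions at the same time, the sketch below checks hypothetical pipeline segment temperatures two ways: each segment against its own limit, and each pair of neighboring segments against a limit on the difference between them. The function name `check_segments`, both thresholds, and the readings are assumptions made up for this example.

```python
def check_segments(temps_c, max_temp_c=80.0, max_differential_c=25.0):
    """Return (component_alerts, interaction_alerts).

    component_alerts: indexes of segments hotter than max_temp_c.
    interaction_alerts: (i, i + 1) pairs of adjacent segments whose
    temperature difference exceeds max_differential_c.
    """
    component_alerts = [i for i, t in enumerate(temps_c) if t > max_temp_c]
    interaction_alerts = [
        (i, i + 1)
        for i in range(len(temps_c) - 1)
        if abs(temps_c[i] - temps_c[i + 1]) > max_differential_c
    ]
    return component_alerts, interaction_alerts


# Hypothetical readings in degrees Celsius, one per pipeline segment.
temps = [40.0, 42.0, 75.0, 45.0]

component_alerts, interaction_alerts = check_segments(temps)
print(component_alerts, interaction_alerts)
```

Every segment in this made-up data is within its own limit, so a component-by-component check reports nothing. The problem only appears in the relationship between neighbors, where segment 2 runs far hotter than the segments on either side.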

We cannot build resilient systems if we can only see a few pieces of them in isolation, or if we can only see those pieces periodically.
