Management can get things wrong

For some reason I’ve been thinking about bad management decisions and when employees get to “disobey” management. One of the classic examples of recent times was NASA management’s overruling of the Morton Thiokol engineers who told them that the O-rings wouldn’t work in cold weather. I went looking and found a post I wrote on the subject back in January 2007.

A remark in Thomas Kida’s splendid book Don’t Believe Everything You Think (Prometheus) snagged my attention yesterday. Page 193:

However, overconfidence can also cause catastrophic results. Before the space shuttle Challenger exploded, NASA estimated the probability of a catastrophe to be one in one hundred thousand launches.

What?! thought I. They did!?! They can’t have! Can they? I was staggered at the idea, for many reasons. One, NASA is run by science types, it’s packed to the rafters with engineers, it couldn’t be that far off. Two, I remember a lot of talk – after the explosion, to be sure – about the fact that everyone at NASA, emphatically including all astronauts, knows and has always known that the space shuttle is extremely risky. Three, the reasons the shuttle is extremely and obviously risky were also widely canvassed: a launch is a controlled explosion and the shuttle is sitting on top of tons of highly volatile fuel. Four, a mere drive in a car is a hell of a lot riskier than a one in one hundred thousand chance, so how could the shuttle possibly be less risky?

There was no footnote for that particular item, so I found Kida’s email address and asked him if he could remember where he found it. He couldn’t, but he very very kindly looked through his sources and found it: it’s in a book which in turn cites an article by Richard Feynman in Physics Today. I knew Feynman had written about the Challenger and NASA, but no details. The article is not online, but there is interesting stuff at Wikipedia – interesting, useful, and absolutely mind-boggling. They can have, they did. Just for one thing, my ‘One’ was wrong – NASA is apparently not run by science types, it’s run by run things types. Well silly me, thinking they’d want experts running it.

Feynman was requested to serve on the Presidential Rogers Commission which investigated the Challenger disaster of 1986. Feynman devoted the latter half of his book What Do You Care What Other People Think? to his experience on the Rogers Commission…Feynman’s account reveals a disconnect between NASA’s engineers and executives that was far more striking than he expected. His interviews of NASA’s high-ranking managers revealed startling misunderstandings of elementary concepts. In one example, early stress tests resulted in some of the booster rocket’s O-rings cracking a third of the way through. NASA managers recorded that this result demonstrated that the O-rings had a “safety factor” of 3, based on the 1/3 penetration of the crack. Feynman incredulously explains the gravity of this error: a “safety factor” refers to the practice of building an object to be capable of withstanding more force than it will ever conceivably be subjected to. To paraphrase Feynman’s example, if engineers built a bridge that could bear 3,000 pounds without any damage, even though it was never expected to bear more than 1,000 pounds in practice, the safety factor would be 3. If, however, a truck drove across the bridge and it cracked at all, the safety factor is now zero: the bridge is defective. Feynman was clearly disturbed by the fact that NASA management not only misunderstood this concept, but in fact inverted it by using a term denoting an extra level of safety to describe a part that was actually defective and unsafe.

Christ almighty.
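For what it’s worth, the arithmetic behind Feynman’s bridge example is easy to spell out. Here’s a minimal sketch of the safety-factor idea, using only the numbers from the quoted passage (the function and its name are mine, purely for illustration):

```python
# A minimal illustration of the "safety factor" concept from the quoted passage.
# The 3,000 lb / 1,000 lb figures come from Feynman's bridge example; the
# function itself is just my own framing of it.

def safety_factor(load_capacity: float, expected_load: float) -> float:
    """Ratio of the load a part can withstand to the load it will ever see in service."""
    return load_capacity / expected_load

# A bridge built to bear 3,000 lb, never expected to carry more than 1,000 lb:
print(safety_factor(3_000, 1_000))  # 3.0 -- a real margin of safety

# But if the bridge cracks under an ordinary 1,000 lb load, there is no margin
# at all: the part failed inside its design envelope, so the effective safety
# factor is 0, not 3. Calling one-third O-ring erosion a "safety factor of 3"
# is exactly the inversion Feynman objected to.
```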

Feynman continued to investigate the lack of communication between NASA’s management and its engineers and was struck by the management’s claim that the risk of catastrophic malfunction on the shuttle was 1 in 10^5; i.e., 1 in 100,000…Feynman was bothered not just by this sloppy science but by the fact that NASA claimed that the risk of catastrophic failure was “necessarily” 1 in 10^5. As the figure itself was beyond belief, Feynman questioned exactly what “necessarily” meant in this context – did it mean that the figure followed logically from other calculations, or did it reflect NASA management’s desire to make the numbers fit? Feynman…decided to poll the engineers themselves, asking them to write down an anonymous estimate of the odds of shuttle explosion. Feynman found that the bulk of the engineers’ estimates fell between 1 in 50 and 1 in 100. Not only did this confirm that NASA management had clearly failed to communicate with their own engineers, but the disparity engaged Feynman’s emotions…he was clearly upset that NASA presented its clearly fantastical figures as fact to convince a member of the laity, schoolteacher Christa McAuliffe, to join the crew.

That’s one of the most off-the-charts examples of wishful thinking in action I’ve ever seen.

After writing that post I read Feynman’s report. It’s a great, educational read.

Read Appendix F.

It appears that there are enormous differences of opinion as to the probability of a failure with loss of vehicle and of human life. The estimates range from roughly 1 in 100 to 1 in 100,000. The higher figures come from the working engineers, and the very low figures from management. What are the causes and consequences of this lack of agreement? Since 1 part in 100,000 would imply that one could put a Shuttle up each day for 300 years expecting to lose only one, we could properly ask “What is the cause of management’s fantastic faith in the machinery?”

What indeed?
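Feynman’s “300 years” figure is just division, but it’s worth putting next to the engineers’ own numbers. A quick back-of-the-envelope sketch (the failure rates are the ones quoted above; the labels and framing are mine):

```python
# Back-of-the-envelope check on the Appendix F figures: at each claimed failure
# rate, how long would daily Shuttle launches run before one expected loss?
# (The rates are the ones quoted above; the rest is my own framing.)

LAUNCHES_PER_YEAR = 365

rates = {
    "management (1 in 100,000)": 1 / 100_000,
    "engineers, optimistic (1 in 100)": 1 / 100,
    "engineers, pessimistic (1 in 50)": 1 / 50,
}

for label, failure_rate in rates.items():
    launches_per_loss = 1 / failure_rate
    years = launches_per_loss / LAUNCHES_PER_YEAR
    print(f"{label}: one expected loss per {launches_per_loss:,.0f} launches, "
          f"about {years:.1f} years of daily flights")

# Management's figure works out to roughly 274 years of daily launches per
# expected loss -- Feynman's "a Shuttle up each day for 300 years expecting to
# lose only one". The engineers' figures work out to a loss within a few
# months, which is why the gap between the two estimates is so damning.
```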

Management is not always right. Sometimes, in an emergency, an employee has better information; sometimes an employee has to do the right thing rather than obey a supervisor.
