NE499/515 - Lecture 10: Safety Culture and the Boeing 737 MAX Airplane Crashes
Summary
TL;DR: This lecture addresses the importance of a healthy safety culture in preventing criticality accidents. It uses case studies, including the Space Shuttle Challenger disaster and the Boeing 737 MAX crashes, to illustrate how a poor safety culture and management pressures can lead to fatal outcomes. The lecture emphasizes that operators must feel empowered to raise safety concerns and that management must support a culture that prioritizes safety over production pressures.
Takeaways
- 🚨 Safety culture is crucial in preventing accidents and is influenced by shared attitudes, values, goals, and practices within an organization.
- 📈 The probability of failure can escalate rapidly, as illustrated by the hypothetical reactor test scenario, underscoring the importance of vigilance.
- 👥 A healthy safety culture is not mandated or enforced but is cultivated through operators' attitudes and questioning of unsafe conditions.
- 🛠 The Space Shuttle Challenger disaster was a result of poor safety culture, where management pressures overrode an engineer's critical safety warning.
- ✈️ The Boeing 737 MAX crashes were a consequence of a flawed safety culture, where cost-cutting measures and inadequate training led to tragic outcomes.
- 🔍 A single point of failure, like the MCAS system in the Boeing 737 MAX, can have deadly consequences if not properly managed.
- 👩‍🏫 Training is vital; inadequate training on the MCAS system contributed to the Boeing crashes, highlighting the need for comprehensive safety education.
- 🔄 Economic pressures can compromise safety, as seen in the Deepwater Horizon accident, where cost-saving decisions increased the risk of a blowout.
- 🤝 Operator involvement and management buy-in are essential for a strong safety culture, ensuring that rules are understood and followed.
- 🔄 Routine self-assessments and audits can identify weaknesses in safety controls and are necessary for continuous improvement.
- 🌟 Setting the right example and fostering a team mentality can encourage a proactive approach to safety and prevent accidents.
Q & A
What is the main focus of the lecture series?
-The lecture series focuses on nuclear criticality safety, discussing how an unhealthy safety culture can lead to criticality accidents.
What is the significance of the hypothetical case study presented in the lecture?
-The hypothetical case study illustrates the dilemma of proceeding with a high-risk test under pressure, drawing parallels to real-world disasters like the Space Shuttle Challenger accident.
What does the term 'safety culture' refer to in the context of the lecture?
-Safety culture refers to a set of shared attitudes, values, goals, and practices within an organization that prioritize safety in day-to-day operations.
Why is it crucial for operators to be comfortable raising safety concerns?
-Operators must feel comfortable raising safety concerns to prevent accidents, as they are often the first to notice potential hazards in their work environment.
What role did a poor safety culture play in the Boeing 737 MAX crashes?
-A poor safety culture at Boeing led to the design of the Maneuvering Characteristics Augmentation System (MCAS) with a single point of failure, inadequate pilot training, and a failure to address warning signs, contributing to the crashes.
How did management pressures contribute to the Space Shuttle Challenger disaster?
-Management pressures led to the decision to launch the Challenger despite safety concerns raised by an engineer about the cold temperatures affecting the shuttle's O-rings.
What are some ways to cultivate a healthy safety culture in an organization?
-Cultivating a healthy safety culture involves getting operator involvement, management buy-in, routine self-assessments, audit closeout meetings, tracking corrective actions, and promoting a questioning attitude.
Why is it important for criticality safety engineers to network with each other?
-Networking allows criticality safety engineers to share experiences, learn from mistakes, and find better ways to ensure safety, as illustrated by the case of an expert noticing an abnormal condition at Y-12.
What is the significance of the ANS standards mentioned in the lecture?
-The ANSI/ANS standards, specifically ANSI/ANS-8.19 and ANSI/ANS-8.20, provide guidelines for developing and maintaining a healthy safety culture in nuclear and other high-risk industries.
How can operators be encouraged to follow safety rules?
-Operators are more likely to follow safety rules if they understand their purpose and see the rules as making everyone safer, rather than as inconvenient restrictions.
Outlines
🚀 The Impact of Safety Culture on Criticality Accidents
This paragraph introduces the concept of safety culture and its critical role in preventing accidents, particularly in the context of a hypothetical nuclear reactor test. It presents a scenario where a new reactor design is to be tested with a significant risk of failure. The dilemma of proceeding with the test despite the risk is highlighted, drawing parallels to the Space Shuttle Challenger disaster, where poor safety culture led to a tragic outcome. The importance of fostering a culture that values safety over operational pressures is emphasized.
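The risk escalation in this scenario can be made concrete with a few lines of arithmetic. This is a sketch using the lecture's hypothetical numbers; the expected-fatality figures are simple expected values added for illustration, not from the lecture:

```python
# Hypothetical failure probabilities from the lecture's reactor-test scenario.
p_initial = 1 / 100_000   # original estimate of a fatal test failure
p_revised = 1 / 100       # one engineer's revised estimate on the cold morning

# The revised estimate is roughly 1000 times higher than the original.
escalation = p_revised / p_initial  # ≈ 1000

# Expected fatalities among the 7 reactor operators under each estimate.
operators = 7
expected_initial = operators * p_initial  # ≈ 0.00007
expected_revised = operators * p_revised  # ≈ 0.07
```

Even the revised 1-in-100 estimate "isn't super likely," which is exactly why a healthy safety culture, not probability alone, has to drive the go/no-go decision.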
✈️ The Boeing 737 MAX Tragedy: A Case Study in Safety Culture
Paragraph 2 delves into the technical flaws and safety culture issues that led to the Boeing 737 MAX disasters. It explains how the Maneuvering Characteristics Augmentation System (MCAS) was designed with a single point of failure and was not properly communicated to pilots, leading to two fatal crashes. The narrative underscores the consequences of management pressures that override safety concerns, resulting in inadequate training and a lack of transparency about the MCAS system.
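The single-point-of-failure problem can be illustrated with a toy probability calculation. The failure rate below is hypothetical (the lecture gives no numbers), and the sketch assumes the two sensors fail independently:

```python
# Assumed per-sensor false-positive probability per flight (hypothetical).
p_false = 0.01

# Single-sensor trigger: one spurious reading is enough to activate the system,
# so that sensor is a single point of failure.
p_single = p_false

# Two-sensor agreement: both independent sensors must report instability
# before the system activates.
p_voting = p_false * p_false

# Requiring agreement cuts spurious activations by a factor of 1 / p_false.
reduction = p_single / p_voting  # ≈ 100
```

The same logic is why the lecture stresses that going from two sensors toward reliance on a single one made the single point of failure "even worse": any redundancy that could have vetoed a bad reading was removed.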
🛠️ Cultivating a Strong Safety Culture in Operations
This section discusses strategies for developing a robust safety culture, particularly in the context of physical material operations. It emphasizes the need for operator involvement, management support, and routine self-assessments. The paragraph also touches on the importance of addressing economic pressures that can compromise safety. It provides examples of how historical events, management decisions, and economic factors have influenced safety culture and led to accidents.
🌐 Networking and Learning for Enhanced Safety Culture
The final paragraph focuses on the value of networking and continuous learning in enhancing safety culture. It mentions the role of professional societies and standards in sharing best practices and learning from mistakes. The paragraph concludes with a call to action for attendees to engage with these resources and to consider the broader implications of safety culture in their work.
Keywords
💡Nuclear Criticality Safety
💡Safety Culture
💡Hypothetical Case Study
💡Feedback Mechanisms
💡Risk Acceptance
💡Space Shuttle Challenger
💡Boeing 737 MAX
💡MCAS (Maneuvering Characteristics Augmentation System)
💡Economic Pressures
💡Self-Assessments
💡Root Cause Analysis
Highlights
Unhealthy safety culture can increase the likelihood of criticality accidents.
Hypothetical case study of a high visibility advanced reactor development program.
Risk assessment of a reactor test with a 1 in 100,000 chance of failure.
The dilemma of proceeding with a test despite potential funding loss.
Engineer's warning of unstable feedback mechanisms due to cold temperatures.
The ethical decision-making involved in conducting a risky test.
The real-life example of the Space Shuttle Challenger disaster.
The role of safety culture in the Challenger disaster and its implications.
Definition of safety culture as shared attitudes, values, goals, and practices.
Importance of operators questioning what could go wrong and refusing unsafe operations.
The impact of poor safety culture on multiple nuclear and non-nuclear accidents.
Case study of the Boeing 737 MAX and the Maneuvering Characteristics Augmentation System (MCAS).
The design flaws and single point of failure in the MCAS system.
The inadequate training and lack of awareness of the MCAS system among pilots.
The consequences of ignoring warning signs and the subsequent accidents.
The role of historical events and management pressures in shaping safety culture.
The influence of economic pressures on safety decisions and their impact on accidents.
Strategies to cultivate a strong safety culture in physical material operations facilities.
The importance of operator involvement, management buy-in, and routine self-assessments.
The value of audit closeout meetings and tracking corrective actions.
Encouraging a safety goal setting and questioning attitude among operators.
The significance of setting the right example and fostering a collaborative environment.
The benefits of networking and continuous learning for criticality safety engineers.
Transcripts
hello everyone and welcome back to the
nuclear criticality safety lecture
series
today we're going to discuss how having
an unhealthy safety culture can make
criticality accidents more likely and
let's begin by discussing a hypothetical
case study
let's say that you're in charge of a
very high visibility
20 billion dollar advanced reactor
development program and your team has
just finished developing a prototype for
a new reactor design concept at the
idaho national laboratory
you plan to begin operating and testing
this prototype for the very first time
which will be a very high visibility
event the entire nuclear engineering
world will be watching and there is a 1
in 100,000 chance that this test will
end in failure killing the seven reactor
operators
would you perform this test
it will be viewed as a public failure if
you decide to delay this test if you
delay this test for too long then the
funding agencies are likely to run out
of patience and cancel your expensive
reactor development program
you've spoken with the seven reactor
operators and they each understand the
risk involved with this test and have
voluntarily accepted this risk
so would you perform this test
let's say that you do decide to perform
this test and that on the morning of the
test one of the engineers says that the
cold idaho temperatures have made the
reactor's feedback mechanisms unstable
the engineer now says that there is a 1
in 100 chance that this test will fail
and kill the seven operators however the
other engineers disagree
would you still perform this test
seeing this probability of failure
quickly increase by a factor of 1000 is
certainly worrying but the operators are
still willing to accept this risk of
failure and a 1 in 100 failure rate
still isn't super likely
so be honest would you perform this test
or would you risk closing your reactor
development program and suffering a
potentially fatal blow to your otherwise
bright career
it turns out that this hypothetical
scenario isn't hypothetical at all but
instead of testing a reactor prototype
this test really was launching the space
shuttle challenger
one engineer thought that the cold
temperatures during that morning would
make the shuttle's o-rings fail to seal
allowing hot high-pressure gas from the
burning solid fuel to escape and
destroy the shuttle but management
decided to overrule this one engineer's
advice and to launch anyway
because of its poor safety culture nasa
allowed management pressures to override
the concerns of an experienced engineer
and caused the deaths of all seven
astronauts aboard the challenger
so what is safety culture and how do we
make sure that physical material
operations do not fall into the same
trap as nasa did
safety culture is a set of shared
attitudes values goals and practices
within an organization
safety culture is not an enforcement
issue and it's not how many safety
training videos you make your staff
suffer through it's the attitude that
operators bring with them during normal
day-to-day operations
you cannot cultivate a healthy safety
culture by mandating it or by enforcing
it you must make sure that operators
approach a task with safety in mind and
that they question what could go wrong
and what doesn't feel right
they must be comfortable raising these
concerns and with refusing to perform an
operation if things feel unsafe
so how important is it to have a healthy
safety culture
well a poor safety culture has factored
into multiple nuclear and non-nuclear
accidents including the fukushima
daiichi accident the space shuttle
challenger disaster multiple criticality
accidents most notoriously the russian
criticality accidents and the boeing 737
max airplane crashes which we will now
discuss
in 2016 the new airbus a320 neo
commercial airliners entered service
these airplanes were bigger cleaner and
as much as 15 percent more fuel
efficient than competing designs and by
october of 2019 the airbus a320 had
surpassed the boeing 737 as the
best-selling airliner
boeing responded to this competition by
upgrading their existing 737 design they
chose to simply upgrade the design to
avoid the costly process of completely
recertifying their design and retraining
their pilots
the design upgrades caused the engines
to be moved forward on the airplane
which changed the plane's center of
gravity and its aerodynamics
this introduced some instability to the
airliner and so they introduced the
maneuvering characteristics augmentation
system or mcas to compensate
the mcas system used one of two sensors
on the airplane to detect instability if
it detected instability then it would
respond automatically to re-stabilize
the airplane
re-stabilizing the plane involved
pushing the airplane's nose down which
would also cause the plane to lose
altitude
unfortunately the mcas system's design
introduced some potentially deadly
consequences
because only one of its two
sensors was needed to trigger its
response to an instability the mcas
system was designed around a single
point of failure and was likely to see
false positives for unstable conditions
furthermore the mcas system's actuator
which again would lower the airplane's
nose was set to respond automatically
pilots would be flying the plane and all
of a sudden the plane's nose would lower
and it would lose altitude
this might not be an issue if the pilots
knew what was going on and how to
respond to it but unfortunately the mcas
was not mentioned in the flight crew
operations manual and moving the
airplane's control yoke would not
disengage the mcas
so when the mcas activated some pilots
didn't know how to turn it off
as it was designed the mcas system
required proper installation of both
sensors to be effective improperly
installed sensors would falsely trip the
mcas and cause planes to lose altitude
unfortunately boeing decided to reduce
the number of mcas sensors from two to
only one which made our single point of
failure even worse
to make matters even worse pilots were
improperly trained on the mcas system
many pilots only learned about the mcas
from a two-hour ipad training video
because of this pilots began reporting
mcas-caused issues early in 2018 these
conditions could not be replicated most
likely because they were caused by the
mcas sensor randomly failing or being
iced over
because they couldn't replicate the
conditions the problem was incorrectly
assumed to be resolved they just thought
that it went away on its own
on october 29th of 2018
lion air flight 610 took off from
jakarta and the pilots quickly
experienced difficulty controlling the
airplane after takeoff the airplane's
mcas system was falsely detecting an
unstable condition and as you can see in
this plot kept responding by trying to
lower the airplane's nose multiple times
because of this the pilots were unable
to gain much altitude reaching only
about 5500 feet until the mcas lowered
the nose one final time causing the
airplane to crash into the java sea
killing all 189 passengers and crew
as a result of this accident the us faa
and boeing issued warnings and training
advisories to all 737 max series
operators but these advisories were not
fully implemented
several months later a similar accident
took place on march 10th of 2019.
the mcas system activated in error
during ethiopian airlines flight 302
causing the airplane to continuously
lose altitude and crash into the ground
at nearly 700 miles per hour which
killed all 157 passengers and crew upon
impact
boeing tried to cover up the cause of
these accidents but was later sued for
fraud and settled for 2.5 billion
dollars in damages
additionally all 737 max airliners
across the globe were grounded following
these accidents
in november of 2020 the faa allowed the
737 max airliners to re-enter service
subject to a list of mandated design
changes and training changes
this example shows how a poor safety
culture at boeing led to these accidents
boeing engineers succumbed to management
pressures and designed a workaround to
avoid a costly 737 recertification
process this workaround was one single
point failure away from causing
dangerous conditions
some pilots weren't even aware that the
mcas was installed on their planes and
they certainly didn't know how to
deactivate it in case it was triggered
in error
the natural intuitive way for
deactivating the mcas which was to move
the control yoke didn't actually work
and it didn't actually shut off the mcas
lastly when the warning signs of an
accident appeared boeing assumed that
the problem had resolved itself and they
failed to investigate further
so as we see a facility safety culture
is influenced by
one historical events if the operators
have always done it that way and didn't
run into problems in the past then
they're likely to assume that it's safe
to do things that way even if operating
procedures or the site license forbid
doing things that way
we saw this during the tokaimura accident
where operators multi-batched 18.8
percent enriched uranium probably
because they had already done it in the
past for low enrichment uranium with no
consequences
another factor affecting a site safety
culture is management changes or
pressures
nasa managers allowed the space shuttle
program to unduly pressure them to
launch the challenger
after the accident the rogers commission
recommended that nasa restructure the
space shuttle program's management to
prevent project managers from being
pressured by the space shuttle
organization to launch under unsafe
conditions
we see this effect often in criticality
safety management pressure has led to
multiple criticality accidents which is
why the ansi ans standards state that a
criticality safety program should remain
independent of operations the crit
safety staff should not be subject to
production pressures
economic pressures can also affect a
site safety culture and as we've seen
these pressures can also lead to
criticality accidents
economic pressures also played a role in
the 2010 deepwater horizon accident
where decisions that british petroleum
halliburton and transocean made
regarding the rig's blowout preventer
made a blowout significantly more likely
on november 9th of 2010 a report by the
oil spill commission criticized the
rig's poor management decisions and
stated that there had been a rush to
completion on the well
these management decisions were made to
save money and to get the oil rig up and
running faster and the co-chair of the
oil spill report was quoted as saying
that there was not a culture of safety
on that rig so in light of these
accidents how do we cultivate a strong
safety culture in physical material
operations facilities some things that
we can do include
first getting operator involvement and
buy-in
operators are more likely to follow
rules if they understand them especially
if the rules can be inconvenient
operators need to understand why rules
exist and to understand that these rules
make everyone safer
operations staff also probably know the
facility better than the crit engineers
and often they can offer valuable
insight on potential upset conditions or
easy but effective ways to implement
criticality safety controls
management also needs buy-in to
criticality safety they're the ones who
are likely to pressure operations to
skirt the rules to save time and they
also have access to resources to help
develop and sustain a criticality safety
program
routine self-assessments can also help
to develop a culture of safety in
physical material operations facilities
these assessments can include
walk-throughs where operations staff
show crit engineers how they usually
perform their work which provides an
opportunity for crit engineers to
identify strengths and weaknesses in
their criticality safety controls
these walkthroughs can also help to
identify the root causes of any current
or likely abnormalities since they give
us a chance to see how things are really
done in practice
it is worth noting that these
self-assessments are only as good as the
corrective actions that they prompt
identifying a problem and doing nothing
about it won't make anything safer
along these lines having audit closeout
meetings allows engineers and operators
a chance to reflect on how well the
criticality safety controls are
operating under normal conditions and
also after an abnormal condition arises
the meetings allow us to document
compliance with criticality safety
controls and to assess the adequacy of
the posted warnings and controls the
availability and continued use of the
controls and the adequacy of the
existing criticality safety evaluations
tracking corrective actions also helps
to develop a safety culture by tracking
these actions we can both ensure that
they have been implemented
notice if they have been removed or are
no longer functioning and reflect on the
root causes of an issue
if we continuously have to implement a
corrective action in response to a
seemingly random but reoccurring
abnormal event then chances are that
there's probably some underlying root
cause lurking about
managers should also encourage a safety
goal setting and a questioning attitude
in their facilities if operators
understand that going home safe each
night is the goal of crit safety and
that they're welcome to ask questions
about things that seem off then they're
more likely to notice potentially
dangerous upset conditions or even
better to proactively suggest ways to
make operations more safe
we should also seek to instill the
attitude of we're in it together rather
than it's us versus them
criticality safety engineers should be
seen as an ally to operations not as an
adversary
operators who think that you're only
there to give them a hard time aren't
very likely to pay much attention to
your suggestions they're much more
likely to cooperate and to proactively
work with you when they understand that
you're both on the same team
lastly we should also strive to set the
right example
saying that someone asked a stupid
question discourages everyone in the
room from ever asking a question in the
future
instead we should all demonstrate our
commitment to safety and show that we
want to help operators do their jobs
we should also seek to hire
knowledgeable instructors and make sure
that management continuously
demonstrates that it values safety
additionally it's very important to send
our crit safety engineers to conferences
and to support them to participate on
the ansi ans standards committees
these activities get crit engineers
talking to one another which allows them
to share their stories their experiences
to learn from each other's mistakes and
maybe to learn better ways to accomplish
their jobs
one of our homework assignments will
cover a case study where an expert
criticality safety engineer from bwxt
noticed an abnormal condition at y-12
this engineer called another expert
criticality safety engineer to bounce
ideas off of and to review his
calculations this engineer might not
have known another expert safety
engineer that he could reach out to with
no notice had they not already met and
become friends over many years at
american nuclear society conferences
networking isn't just about getting a
job offer having a healthy network of
colleagues in the criticality safety
field allows us to help each other and
to learn from each other
this concludes our lecture on safety
culture if you're interested in learning
more about growing a healthy safety
culture then i recommend reading the
ansi/ans-8.19 and 8.20 standards in the
following lectures we will continue
looking into criticality safety from an
operator's perspective and will discuss
ways to facilitate positive interactions
with operations staff