
How to be safe by fostering successes rather than reducing failures

Erik Hollnagel
Professor & Industrial Safety Chair
MINES ParisTech – Crisis and Risk Research Centre
Sophia Antipolis, France
Email: erik.hollnagel@crc.ensmp.fr
© Erik Hollnagel, 2008
The meaning of safety
From French sauf = unharmed / except.
SAFETY = FREEDOM FROM UNACCEPTABLE RISK
Guiding questions: What can go wrong? How can it be done? How much risk is acceptable? How much risk is affordable?
[Diagram: an unexpected event disrupts normal performance and leads to unwanted outcomes (accidents, incidents, …) that threaten life, property, and money. Safety therefore covers both the prevention of unwanted events and the protection against unwanted outcomes.]
Safety as reduction/elimination of risk
The common understanding of safety implies a distinction between:
- A normal state where everything works as it should and where the outcomes/products are acceptable (positive or as intended).
- A failed state where normal operations are disrupted or impossible, and where the outcomes/products are unacceptable (negative or not as intended).
The purpose of safety (management) is to maintain the normal state by preventing disruptions or disturbances.
Safety efforts are normally driven by what has happened in the past, and are therefore reactive.
The level of safety is measured by the absence of negative outcomes. But what happens when there is no measurable change?


First there were technical failures
[Chart: percentage of attributed cause, 1960–2005 – accidents attributed to technology and equipment.]


… and technical analysis methods
[Timeline 1900–2010: FMEA, Fault tree, FMECA, HAZOP.]


Then came the “human factor”
[Chart: percentage of attributed cause, 1960–2005 – attributions shift from technology/equipment towards human performance.]


… and human factors analysis methods
[Timeline 1900–2010 – technical methods (Root cause, Domino, FMEA, Fault tree, FMECA, HAZOP) followed by human factors methods (THERP, HCR, HPES, CSNI, AEB, Swiss Cheese, HEAT, HERA, TRACEr, RCA, ATHEANA).]
Then came “organisational failures” ...
[Chart: percentage of attributed cause, 1960–2005 – attributions now split across technology/equipment, human performance, and organisation. Which will be the most unreliable component?]
… and organisational analysis methods
[Timeline 1900–2010 – the earlier technical and human factors methods, now joined by organisational and systemic methods: MORT, TRIPOD, MTO, STEP, AcciMap, MERMOS, CREAM, STAMP, FRAM (legend: technical, human factors, organisational, systemic).]
Safety measured by accidents/incidents
European Technology Platform on Industrial
Safety (ETPIS) milestones:
- 25% reduction in accidents by 2020
- Programmes in place by 2020 to continue
accident reduction at a rate of > 5% per year.
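The compound effect of the ETPIS “> 5% per year” target can be checked with a little arithmetic; the sketch below is only illustrative (the function name and the starting figure of 100 are hypothetical, not from the slides):

```python
def accidents_after(initial_rate: float, annual_reduction: float, years: int) -> float:
    """Accident rate left after compounding a fixed annual percentage reduction."""
    return initial_rate * (1.0 - annual_reduction) ** years

# A sustained 5% yearly reduction roughly halves the rate within 14 years:
remaining = accidents_after(100.0, 0.05, 14)
print(round(remaining, 1))  # → 48.8 (since 0.95**14 ≈ 0.49)
```

Note that such a target still only counts negative outcomes, which is exactly the point the following slides question.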

“Safety is a dynamic non-event” (Karl Weick). But how can a non-event be measured?


Safety = (1 - Risk)
“By 2020 a new safety paradigm will have been widely adopted in European industry. Safety is seen as a key factor for successful business and an inherent element of business performance. As a result, industrial safety performance will have progressively and measurably improved in terms of reduction of
- reportable accidents at work,
- occupational diseases,
- environmental incidents and
- accident-related production losses.
It is expected that an ‘incident elimination’ culture will develop where safety is embedded in design, maintenance, operation and management at all levels in enterprises. This will be identifiable as an output from this Technology Platform meeting its quantified objectives.”
Note that the measurements are all negative or unwanted outcomes.


OECD Indicators for Patient Safety
Operative and post-operative complications:
- Complications of anaesthesia
- Postoperative hip fracture
- Postoperative pulmonary embolism (PE) or deep vein thrombosis (DVT)
- Postoperative sepsis
- Technical difficulty with procedure
Sentinel events:
- Transfusion reaction
- Wrong blood type
- Wrong-site surgery
- Foreign body left in during procedure
- Medical equipment-related
- Medication errors
Hospital-acquired infections:
- Ventilator pneumonia
- Wound infection
- Infection due to medical care
- Decubitus ulcer
Obstetrics:
- Birth trauma – injury to neonate
- Obstetric trauma – vaginal delivery
- Obstetric trauma – caesarean section
- Problems with childbirth
Other care-related adverse events:
- Patient falls
- In-hospital hip fracture or fall


Corporate HSE Indicators
1. Fatal accidents (number)
2. Total recordable injury frequency (TRIF)
3. Lost-time injury frequency (LTIF)
4. Serious HSE incident frequency (SIF)
5. Accidental oil spill (number and volume)
6. Emissions of CO2 and relative emissions of CO2
7. Emissions of NOx and relative emissions of NOx
8. Energy consumption and energy efficiency
9. Non-hazardous waste recovery rate
10. Climate KPI
11. Global warming potential (total quantity in CO2 equivalents)
12. Sickness absence (percentage)
13. Theft (number of instances and costs)
14. Criminal damage (number of instances and costs)
15. Violence and threats (number of instances and costs)
16. Robbery (number of instances and costs)
17. Special security incidents (number and cost)
18. Work-related illness (WRI)
19. Actual and potential transportation personal injuries
... ... ...
(HSE = health, safety, security and environment)
Theories and models of the negative
- Accidents are caused by people, due to carelessness, inexperience, and/or wrong attitudes.
- Technology and materials are imperfect, so failures are inevitable.
- Organisations are complex but brittle, with limited memory and unclear distribution of authority.


Theory W: Traditional safety perspective
Things go right because:
- Systems are well designed and scrupulously maintained.
- Designers can anticipate every contingency.
- Procedures are complete and correct.
- People behave as they are expected to – as they are taught.
Common assumptions:
- Humans are a liability and performance variability is a threat.
- The purpose of design is to prevent adverse outcomes by constraining performance variability.
- Accidents are due to failures or malfunctions of components (“human errors”, equipment malfunctions).
- Risks can be represented by linear combinations or chains of failures or malfunctions. Example: event tree – fault tree.
The purpose of risk assessment is to determine in a systematic manner the relation between adverse outcomes (= severe accidents) and their causes.


Theory W: Safety by constraints
[Diagram: at the level of the individual, team, and organisation (sharp end, blunt end), barriers, regulations, procedures, standardization, and elimination constrain performance. Normal function leads to success (accident free); malfunction (root cause) – through slow drift or abrupt transition – leads to failure (accidents, incidents).]
Safety is achieved by constraining performance.


My god, it's full of stars ...



... but most of it is Dark Matter
According to current theories, the universe consists of 5% ordinary matter, 25% dark matter, and 70% dark energy. Dark matter and dark energy are the “fudge factors” needed to make cosmology consistent. We can see the stars, but we need “dark matter” to explain what we see.
Likewise, in safety management people tend to notice only what goes wrong (the “stars”). But to understand it we also need to look at the “unknown” background: normal performance. We can “see” what goes wrong, but we can only understand it against a background of “normal performance”.


Performance variability is necessary
- Systems are so complex that work situations are always underspecified – hence partly unpredictable.
- Few – if any – tasks can successfully be carried out unless procedures and tools are adapted to the situation. Performance variability is both normal and necessary.
- Many socio-technical systems are intractable. The conditions of work therefore never completely match what has been specified or prescribed.
- Individuals, groups, and organisations normally adjust their performance to meet existing conditions (resources, demands, conflicts, interruptions).
- Because resources (time, manpower, information, etc.) are always finite, such adjustments will invariably be approximate rather than exact.
[Diagram: performance variability can lead to success as well as to failure.]
Performance variability is inevitable
Accounting for the sources and range of normal performance variability:
- Inherent variability (psychological / physiological phenomena).
- Ingenuity and creativity – adaptability (overcoming constraints and underspecification).
- Organizationally induced performance variability (meeting demands, stretching resources).
- Socially induced variability (meeting expectations, informal work standards).
- Contextually induced performance variability (performance conditions).
Efficiency versus thoroughness
Thoroughness – time to think: recognising the situation, choosing and planning.
Efficiency – time to do: implementing plans, executing actions.
- If thoroughness dominates, there may be too little time to act: pending actions are neglected and new events are missed.
- If efficiency dominates, actions may be badly prepared or wrong: pre-conditions are missed and only the expected results are looked for.
[Diagram: time needed versus time available.]


Efficiency-Thoroughness Trade-Off
For distributed work it is necessary to trust what others do; it is impossible to check everything.
- Thoroughness: confirm that input is correct; consider secondary outcomes and side-effects.
- Efficiency: trust that input is correct; assume someone else takes care of outcomes.
One way of managing time and resource limitations is to think only one step back and/or one step ahead.
Theory Z: Revised safety perspective
Things go right because people:
- Learn to overcome design flaws and functional glitches.
- Adjust their performance to meet demands.
- Interpret and apply procedures to match conditions.
- Can detect and correct when things go wrong.
Increasing complexity has made modern technological systems intractable, hence underspecified. Humans are therefore an asset without which the proper functioning of modern technological systems would be impossible.
Revised assumptions:
- Accidents are due to unexpected combinations of actions rather than action failures. Example: ETTO.
- Risks can be represented by dynamic combinations of performance variability. Example: functional resonance.
Theory Z: Safety by management
[Diagram: at the level of the individual, team, and organisation (sharp end, blunt end), physiological, psychological, social, organisational, and environmental factors shape normal function (performance variability). Variability that leads to success (no accidents or incidents) should be “amplified”; variability that leads to failure (accidents, incidents) should be “dampened”.]
Performance variability is needed for normal functioning (successes). Failures cannot be prevented by eliminating performance variability.
Safety is achieved by managing unwanted combinations of performance variability without adversely affecting successes.
From the negative to the positive
- Negative outcomes are caused by failures and malfunctions. Safety = reduced number of adverse events. Response: eliminate failures and malfunctions as far as possible.
- Safety = ability to respond when something fails. Response: improve the ability to respond to adverse events.
- All outcomes (positive and negative) are due to performance variability. Safety = ability to succeed under varying conditions. Response: improve resilience.


Resilience and safety management
Resilience is the intrinsic ability of a system to adjust its functioning prior to, during, or following changes and disturbances, so that it can sustain required operations even after a major mishap or in the presence of continuous stress.
A practice of Resilience Engineering / proactive safety management requires that all levels of the organisation are able to:
- Respond to regular and irregular threats in an effective, flexible manner (the actual).
- Monitor threats and revise risk models (the critical).
- Anticipate threats, disruptions and destabilizing conditions (the potential).
- Learn from past events – understand correctly what happened and why (the factual).
Resilience = “being better at”
- Responding (the actual): knowing what to do, being capable of doing it.
- Monitoring (the critical): knowing what to look for (attention).
- Anticipating (the potential): finding out and knowing what to expect.
- Learning (the factual): knowing what has happened.
Resilience engineering measures how safe a system is by what it is able to do – hence measures of the positive rather than the negative.
As Low As Reasonably Practicable
- Unacceptable region (intolerable risk): risks must be eliminated or contained at any cost. INVEST!
- Tolerability region (tolerable risk): risks will be eliminated or contained if not too costly; should be eliminated, contained, or otherwise responded to; or may be eliminated, contained, or otherwise responded to.
- Broadly acceptable region (negligible risk): risks might be assessed when feasible. SAVE!
The incentive is to save (ALARP) rather than to invest.
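Read as a decision rule, the ALARP regions amount to comparing a risk estimate against two thresholds. A minimal sketch (the threshold values and function name are assumptions for illustration only, not figures from the slides):

```python
def alarp_region(risk_per_year: float,
                 upper: float = 1e-3,
                 lower: float = 1e-6) -> str:
    """Classify an annual risk estimate into the three ALARP regions.

    NOTE: the default thresholds are hypothetical illustration values,
    not limits taken from the presentation.
    """
    if risk_per_year >= upper:
        return "unacceptable: must be eliminated or contained at any cost"
    if risk_per_year > lower:
        return "tolerable: reduce unless cost is grossly disproportionate"
    return "broadly acceptable: might be assessed when feasible"

print(alarp_region(1e-4))  # falls in the tolerability (ALARP) region
```

The middle branch is where the save-rather-than-invest tension plays out: reduction is required only while its cost stays proportionate to the benefit.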
As high as reasonably practicable
- Responding (the actual): Which events? How were they found? Is the list revised? How is readiness ensured and maintained?
- Monitoring (the critical): How are indicators defined? Lagging or leading? How are they “measured”? Are effects transient or permanent? Who looks where and when? How, and when, are they revised?
- Anticipating (the potential): What is our “model” of the future? How long do we look ahead? What risks are we willing to take? Who believes what and why?
- Learning (the factual): What, when (continuously or event-driven), from what (successes or failures), how (qualitative, quantitative), by individual or by organisation?


Conclusion: Two approaches to safety
Eliminate the negative (common safety approach, Theory W):
- Efforts to maintain or improve safety focus on what can go wrong and result in adverse outcomes (failures).
- Theories, models, and methods aim to explain or predict how things can go wrong – with varying degrees of success.
- Some also propose solutions, focusing on M, T, and O issues – again with varying degrees of success.
Accentuate the positive (resilience engineering, Theory Z):
- Efforts to maintain or improve safety look at what goes right (successes), as well as at what should have gone right.
- Theories, models, and methods aim to describe how things go right, but sometimes fail, and how humans and organisations cope with internal and external intractability and unpredictability.
