
Future Trends in Process Safety
Prof. Nancy Leveson
Engineering Systems
Aeronautics and Astronautics

MIT

You've carefully thought out all the angles
You've done it a thousand times
It comes naturally to you
You know what you're doing, it's what you've
been trained to do your whole life.
Nothing could possibly go wrong, right?

Think Again

Topics
Lessons from Texas City
New factors in process accidents
Safety as a control problem
Conclusions

Leadership
Safety requires passionate and effective leadership
Tone is set at the top of the organization
Not just sloganeering but real commitment
Setting priorities
Adequate resources assigned
A designated, high-ranking leader

Safety and productivity are not in conflict if you take a
long-term view

Managing and Controlling Safety


Need clear definition of expectations, responsibilities,
authority, and accountability at all levels of safety control
structure
Entire control structure must together enforce the system
safety property
Unsafe changes must be eliminated or controlled
through system design or detected and fixed before they
lead to an accident.
Planned changes (MOC process)
Unplanned changes

Visibility and Communication


Downward and upward communication
Requires a positive, open, trusting environment
Need effective measurement and monitoring of
process safety performance (e.g., injury rates are not
useful and are misleading)

Avoid culture of denial


If managers do not want to hear, people stop talking

Information and Appropriate Feedback


Good accident/incident investigation and follow
through
Identification and correction of systemic causal
factors.
Ensuring thorough reporting of incidents and near
misses

Thorough hazard identification, analysis, and control


Effective process safety audit system to ensure
adequate process safety performance

Oversight and Control


Results of operating experience, process hazard
analyses, audits, near misses, or accident
investigations must be used to improve process
operations and process safety management system.
Address promptly and track to completion the
deficiencies found during assessments, audits,
inspections and incident investigation.

Fumbling for his recline button, Ted unwittingly instigates a disaster

Process Safety vs. Personal Safety


All behavior influenced by context in which it occurs
Both physical and social context
Personal safety focuses on changing individual
behavior
Process (system) safety focuses on design of system
in which behavior occurs

To understand why process accidents occur and to
prevent them, need to:
Understand current context (system design)
Create a design that effectively ensures safety

The Enemies of Safety


Complacency
Arrogance
Ignorance

Factors in Complacency
Discounting risk
Over-relying on redundancy
Unrealistic risk assessment
Ignoring low-probability, high-consequence events
Assuming risk decreases over time
Ignoring warning signs

Topics
Lessons from Texas City
New factors in process accidents
New technology
System accidents
New types of human error

Safety as a control problem


Conclusions

Accident with No Component Failures

Types of Accidents
Component Failure Accidents
Single or multiple component failures
Usually assume random failure

System Accidents
Arise in interactions among components
Related to interactive complexity and tight coupling
Exacerbated by introduction of computers and
software

Safety vs. Reliability


Safety and reliability are NOT the same
Sometimes increasing one can even decrease the
other.
Making all the components highly reliable will have no
impact on system accidents.

For relatively simple, electro-mechanical systems
with primarily component failure accidents, reliability
engineering can increase safety
For complex systems, need something more

Humans in Process Safety


Usually define human error as deviation from normative
procedures, but operators always deviate from standard
procedures
Normative vs. effective procedures
Sometimes violation of rules has prevented accidents

Cannot effectively model human behavior by
decomposing it into individual decisions and acts and
studying it in isolation from
Physical and social context
Value system in which it takes place
Dynamic work process

Less successful actions are a natural part of the search by
operators for optimal performance

New Operator Roles and Errors


High tech automation changing cognitive demands on
operators

Supervising rather than directly monitoring


Doing more cognitively complex decision-making
Dealing with complex, mode-rich systems
Increasing need for cooperation and communication

Human-factors experts complaining about technology-centered automation


Designers focus on technical issues, not on supporting
operator tasks
Leads to clumsy automation

Errors are changing, e.g., errors of omission vs. commission

Impacts on System Design


Design for error tolerance
Alarm management (managing by exception)
Matching tasks to human characteristics
Design to reduce human errors
Providing information and feedback
Training and maintaining skills

Topics
Lessons from Texas City
New factors in process accidents
Safety as a control problem
New approaches to hazard analysis
Design for safety
Risk analysis and management

Conclusions

STAMP: A Systems Model of Accident Causality
Systems-Theoretic Accident Model and Processes
Safety treated as a control problem, not a failure
problem
Accidents are not simply an event or chain of
events
Involve a complex, dynamic process
Arise from interactions among humans,
machines and the environment

A Broad View of Control


Does not imply need for a controller
Component failures and dysfunctional interactions may
be controlled through design
(e.g., redundancy, interlocks, fail-safe design)
or through process
Manufacturing processes and procedures
Maintenance processes
Operations

Does imply the need to enforce safety constraints in
some way

STAMP (2)
Safety is an emergent property that arises when system
components interact with each other within a larger
environment
A set of safety constraints related to behavior of
system components enforces that property
Accidents occur when interactions among system
components violate those constraints
Goal of process (system) safety engineering is to
identify the safety constraints and enforce them in the
system design

Example Safety Constraints

Build safety in by enforcing constraints on behavior


Controller contributes to accidents not by failing but by:
1. Not enforcing safety-related constraints on behavior
2. Commanding behavior that violates safety constraints

System Safety Constraint:


Water must be flowing into reflux condenser whenever catalyst is
added to reactor

Software (Controller) Safety Constraint:


Software must always open water valve before catalyst valve
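
The software constraint above can be read as an ordering interlock on valve commands. The following is a minimal sketch, not the controller from the original example: the class name `ReactorController`, its methods, and the exception type are hypothetical. It also tracks only the commanded valve state; the system-level constraint is about water actually flowing, which a real design would need to confirm through feedback from the process.

```python
from dataclasses import dataclass


class SafetyConstraintViolation(Exception):
    """Raised when a command would violate the safety constraint."""


@dataclass
class ReactorController:
    # Hypothetical state: both valves start closed.
    water_valve_open: bool = False
    catalyst_valve_open: bool = False

    def open_water_valve(self) -> None:
        self.water_valve_open = True

    def open_catalyst_valve(self) -> None:
        # Software constraint: water valve must already be open.
        if not self.water_valve_open:
            raise SafetyConstraintViolation(
                "catalyst valve commanded open before water valve"
            )
        self.catalyst_valve_open = True

    def close_catalyst_valve(self) -> None:
        self.catalyst_valve_open = False

    def close_water_valve(self) -> None:
        # Assumption: closing water while catalyst is being added would
        # also break the system-level constraint, so it is refused too.
        if self.catalyst_valve_open:
            raise SafetyConstraintViolation(
                "water valve commanded closed while catalyst valve is open"
            )
        self.water_valve_open = False


controller = ReactorController()
controller.open_water_valve()
controller.open_catalyst_valve()  # allowed: water valve was opened first
```

Note that nothing in this sketch "fails" in the reliability sense when the constraint is violated; the hazard would come from a command issued in the wrong order, which is why the check sits in the command path.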

STAMP (3)
Systems are not static
A socio-technical system is a dynamic process
continually adapting to achieve its ends and to react
to changes in itself and its environment
Systems and organizations migrate toward accidents
(states of high risk) under cost and productivity
pressures in an aggressive, competitive environment
Preventing accidents requires designing a control
structure to enforce constraints on system behavior
and adaptation that ensures safety

Example Control Structure

Controlling and managing dynamic systems requires visibility and feedback
[Figure: control loop. The Controller, which holds a Model of the Process, sends Control Actions to the Controlled Process and receives Feedback from it.]

Relationship Between Safety and Process Models
Accidents occur when models do not match process
and
Incorrect control commands given
Correct ones not given
Correct commands given at wrong time (too early, too
late)
Control stops too soon

(Note the relationship to system accidents)
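
The four cases listed above can serve as a checklist applied to each control action in a system. The sketch below is one possible way to generate such a checklist; the enum name, the function name, and the example control actions are assumptions made for illustration, and the category wording is taken from the list above.

```python
from enum import Enum
from itertools import product


class InadequateControl(Enum):
    INCORRECT_COMMAND_GIVEN = "incorrect control command given"
    CORRECT_COMMAND_NOT_GIVEN = "correct command not given"
    WRONG_TIMING = "correct command given at wrong time (too early, too late)"
    STOPPED_TOO_SOON = "control stops too soon"


def checklist(control_actions):
    """Pair every control action with every inadequate-control category."""
    return list(product(control_actions, InadequateControl))


# Hypothetical control actions, borrowed from the reactor example.
for action, category in checklist(["open water valve", "open catalyst valve"]):
    print(f"{action}: {category.value}?")
```

Each printed line is a question for the analyst, not a finding: whether a given pairing is actually hazardous depends on the system context.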

Relationship Between Safety and Process Models (2)
How do they become inconsistent?

Wrong from beginning


Missing or incorrect feedback
Not updated correctly
Time lags not accounted for

Resulting in
Uncontrolled disturbances
Unhandled process states
Inadvertently commanding system into a hazardous state
Unhandled or incorrectly handled system component
failures
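
To make the "missing or incorrect feedback" case concrete, here is a toy simulation, assuming a hypothetical tank-filling controller that was not part of the original material. The controller decides from its process model (`believed_level`), not from the real process (`actual_level`); all names and numbers are illustrative.

```python
def simulate(feedback_available, steps=20, capacity=10.0,
             fill_rate=1.0, target=5.0):
    """Return (overflowed, final_level) for a toy tank-filling loop.

    feedback_available(step) says whether a level reading reaches the
    controller at that step.
    """
    actual_level = 0.0    # state of the controlled process
    believed_level = 0.0  # controller's model of the process
    for step in range(steps):
        # The control action is chosen from the model.
        pump_on = believed_level < target
        if pump_on:
            actual_level += fill_rate
        # The model is updated only when feedback arrives.
        if feedback_available(step):
            believed_level = actual_level
        if actual_level > capacity:
            return True, actual_level  # hazardous state reached
    return False, actual_level


print(simulate(lambda step: True))      # feedback at every step
print(simulate(lambda step: step < 3))  # feedback lost after step 2
```

In the second run no component behaves differently from its design: the pump pumps and the control logic follows its rule. The hazardous state arises because the model stops matching the process.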

Modeling Accidents Using STAMP


Two types of models are used:
1. Static safety control structure
2. Behavioral dynamics (system dynamics)
Dynamic processes behind change in the safety
control structure, i.e., why it may change (e.g.,
degrade) over time

Simplified System Dynamics Model of Columbia Accident

Uses for STAMP


Basis for new, more powerful hazard analysis techniques
(STPA)
Safety-driven design
More comprehensive accident/incident investigation and root
cause analysis
Organizational and cultural risk analysis
Defining safety metrics and performance audits
Designing and evaluating potential policy and structural improvements
Identifying leading indicators of increasing risk (canary in the coal mine)

New risk management tools


New holistic approaches to security

STAMP-Based Hazard Analysis (STPA)


Supports a safety-driven design process where
Hazard analysis influences and shapes early design
decisions
Hazard analysis iterated and refined as design evolves

Goals (same as any hazard analysis)


Identification of system hazards and related safety
constraints necessary to ensure acceptable risk
Accumulation of information about how safety constraints can be
violated, which is used to eliminate, reduce and control
hazards in system design, development, manufacturing,
and operations

STPA (2)
STPA process
Starts with identifying system requirements and
design constraints necessary to maintain safety.
Then STPA assists in
Top-down refinement into requirements and safety
constraints on individual components.
Identifying scenarios in which safety constraints can be
violated.
Using results to eliminate or control hazards in design,
operations, etc.
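
The top-down refinement step can be pictured as a traceability structure from hazards to system-level constraints to constraints on individual components. The following is only a sketch of such bookkeeping, not an STPA tool: the class names and the `unallocated` helper are hypothetical, and the hazard wording is an assumption. The two constraints are the reactor constraints from the earlier example.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentConstraint:
    component: str
    text: str


@dataclass
class SystemConstraint:
    text: str
    refined_into: list = field(default_factory=list)


@dataclass
class Hazard:
    text: str
    constraints: list = field(default_factory=list)


def unallocated(hazards):
    """System constraints not yet refined into any component constraint."""
    return [c for h in hazards for c in h.constraints if not c.refined_into]


software = ComponentConstraint(
    component="software controller",
    text="Software must always open water valve before catalyst valve",
)
system = SystemConstraint(
    text="Water must be flowing into reflux condenser whenever catalyst "
         "is added to reactor",
    refined_into=[software],
)
# Hazard wording below is assumed for illustration.
hazard = Hazard(text="Catalyst in reactor without condenser cooling",
                constraints=[system])

print(unallocated([hazard]))  # empty list: every constraint is allocated
```

A check like `unallocated` is a design choice for catching gaps early; it says nothing about whether the component constraints are sufficient, which is what the scenario-identification step addresses.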

Copyright Nancy Leveson, Aug. 2006

Comparison of STPA with Traditional HA Techniques
Top-down (vs. bottom-up like FMECA)
Considers more than just component failure and
failure events (includes these but more general)
Guidance in doing analysis (vs. FTA)
Handles dysfunctional interactions and system
accidents, software, management, etc.

Comparisons (2)
Concrete model (not just in head)
Not physical structure (HAZOP) but control (functional)
structure
General model of inadequate control (based on control
theory)
HAZOP guidewords based on model of accidents being
caused by deviations in system variables
Includes HAZOP model but more general

Fault trees concentrate on component failures, miss
system accidents

Risk Analysis and Risk Management


[Figure: two plots against Time, "Effectiveness and Credibility of ITA" and "System Technical Risk".]

Identifying Lagging vs. Leading Indicators

Number of waivers issued is a good indicator of risk in Space Shuttle
operations, but it lags the rapid increase in risk
[Figure: "System Technical Risk" (0 to 1 Risk Units) and "Outstanding Accumulated Waivers" (0 to 400 Incidents) plotted against Time (Month, 0 to 1000) for the run labeled "Unsuccessful ITA 0613".]

Number of incidents under investigation is a better leading indicator
[Figure: plot against Time.]
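
One simple way to check whether an indicator leads or lags risk is to compare when each series first reaches a given fraction of its own peak. The sketch below assumes that approach; it is not the method used to produce the plots, and the three series are made-up numbers chosen only to illustrate the lagging and leading cases.

```python
def first_crossing(series, fraction=0.5):
    """Index at which a non-negative series first reaches fraction * max."""
    threshold = fraction * max(series)
    for t, value in enumerate(series):
        if value >= threshold:
            return t
    raise ValueError("series never reaches the threshold")


def lead_time(indicator, risk, fraction=0.5):
    """Positive: indicator rises before risk (leading). Negative: lagging."""
    return first_crossing(risk, fraction) - first_crossing(indicator, fraction)


# Synthetic, illustrative data only (one value per time step).
risk = [0, 0, 0.1, 0.5, 0.9, 1.0, 1.0, 1.0]
waivers = [0, 0, 0, 10, 50, 150, 300, 400]
incidents = [0, 5, 20, 30, 35, 40, 40, 40]

print(lead_time(waivers, risk))    # negative: waivers lag risk
print(lead_time(incidents, risk))  # positive: incidents lead risk
```

The half-of-peak threshold is an arbitrary choice; a real analysis would need to justify the threshold and test it against historical data.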

Managing Tradeoffs Among Risks


Good risk management requires understanding
tradeoffs among
Schedule
Cost
Performance
Safety

Example: Schedule Pressure and Safety Priority

[Figure: plot with axes "Schedule Pressure" and "Safety Priority", each ranging from Low to High.]

1. Overly aggressive schedule enforcement has little effect on
completion time (<2%) and cost, but has a large negative impact
on safety
2. Priority of safety activities has a large positive impact, including
a positive cost impact (less rework)

Conclusions
Future needs for safety in the process industry:
Differentiation between process safety and personal
(occupational) safety
Improved safety culture management
New approaches to handle
Advanced technology (particularly digital technology)
System accidents and complexity
New types of human error

Using a control-based (vs. failure-based) model of
causality expands our power to prevent process
accidents
