
ARTICLE IN PRESS

Reliability Engineering and System Safety 94 (2009) 1000–1018


Incorporating organizational factors into Probabilistic Risk Assessment (PRA) of complex socio-technical systems: A hybrid technique formalization
Zahra Mohaghegh*, Reza Kazemi, Ali Mosleh
Center for Risk and Reliability, University of Maryland, College Park, MD 20742, USA

Article info
Article history:
Received 25 July 2008
Received in revised form 16 November 2008
Accepted 21 November 2008
Available online 3 December 2008

Keywords:
Probabilistic Risk Assessment (PRA); Organizational factors; Safety culture; Socio-technical complex systems; System dynamics; Bayesian Belief Network (BBN); Safety management; Human Reliability Analysis (HRA)

Abstract
This paper is the result of research whose primary purpose is extending Probabilistic Risk Assessment (PRA) modeling frameworks to include the effects of organizational factors as the deeper, more fundamental causes of accidents and incidents. There have been significant improvements in the sophistication of quantitative methods of safety and risk assessment, but progress on techniques most suitable for organizational safety risk frameworks has been limited. The focus of this paper is on the choice of representational schemes and techniques. A methodology for selecting appropriate candidate techniques and their integration in the form of a hybrid approach is proposed. An example is then given through an integration of System Dynamics (SD), Bayesian Belief Network (BBN), Event Sequence Diagram (ESD), and Fault Tree (FT) in order to demonstrate the feasibility and value of hybrid techniques. The proposed hybrid approach integrates deterministic and probabilistic modeling perspectives, and provides a flexible risk management tool for complex socio-technical systems. An application of the hybrid technique is provided in the aviation safety domain, focusing on airline maintenance systems. The example demonstrates how the hybrid method can be used to analyze the dynamic effects of organizational factors on system risk.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

In the past 30 years, we have witnessed significant improvements in safety design concepts, as well as in methods and tools for safety risk analysis of complex technical systems. These improvements can be placed in three distinct phases, evolving from the early to the first and then to the second generation of conceptual theories and techniques, covering hardware, human, and organizational performance. The nature of this development has been similar to the shift in the human sciences (e.g. decision research, management, and organizational theory) from normative, prescriptive models to descriptive models, i.e. from a focus on deviation from rational performance toward modeling actual behavior, as described by Rasmussen [1]. The early phase is much more pronounced in the nuclear power industry, where the original safety design philosophy was defense-in-depth (use of multiple barriers against accidental release of radioactivity). The corresponding philosophy in aviation was the use of redundancies in critical systems, leading to conservative designs of engineering systems and stringent regulatory oversight, quality control, and inspection. This generation coincided with the phase of normative models in the human sciences, as mentioned by Rasmussen [1].

* Corresponding author.
E-mail address: mohagheg@umd.edu (Z. Mohaghegh).
0951-8320/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ress.2008.11.006

The next significant phase (first generation) is characterized by the introduction of formal risk analysis (e.g. classical PRA (WASH-1400; [2])) into regulatory systems (e.g. risk-informed regulation) and operation (e.g. risk-based maintenance outage planning). Initially, these methods were mostly hardware-driven; however, it was also recognized that major accidents often involved human error in addition to technical system failures. The first generation of Human Reliability Analysis (HRA) methods, such as the Technique for Human Error Rate Prediction (THERP) [3], was developed to predict the probability of human error in performing prescribed or procedural tasks, mainly Errors of Omission (EOO). The interest in extending safety risk models to include organizational behavior was in part motivated by the fact that investigations of major accidents continued to cite management and organizational factors as major root causes of human errors in operating and/or maintaining technical systems [4,5]. Reason's Swiss Cheese Model [4,5] is a well-known example of the use of first-generation organizational accident theories to describe the process of organizational effects on human errors and, consequently, on the rate of accidents. There are also a number of first-generation quantitative methods and techniques that attempt to quantify the impact of organizational factors on system risk. These include MACHINE [6], WPAM [7,8], SAM [9], the Omega Factor Model [10], ASRM [11], and Causal Modeling of Air Safety [12]. The nature


of first-generation safety risk analysis theories and techniques can be characterized in terms of deviations from normative performance [1]. The emerging second-generation theories and techniques are characterized by more realistic performance models of hardware, humans, and organizations. There is a gradual move from classical PRA toward dynamic PRA [13,14]. HRA models are becoming increasingly cognition based, and attempt to cover Errors of Commission (EOC) in addition to EOO. Examples are the Cognitive Reliability and Error Analysis Method (CREAM; [15]) and Information, Decision, and Action in Crew context (IDAC; [16]). Simulation-based techniques are being introduced to integrate cognition-based HRA methods with dynamic models of technical system behavior. An example is an integration of the Accident Dynamics Simulator (ADS; [17]) and IDA [18]. The second generation of safety risk analysis coincided with the phase of models of the actual behavior of individuals and organizations, as mentioned by Rasmussen [1]. However, second-generation organizational models of safety risk frameworks are still evolving. These models attempt to represent the underlying organizational mechanisms of accidents, focusing on the systemic and collective nature of organizational behavior. On the theoretical side, Rasmussen [1] cites the self-organizing nature of High Reliability Organizations [19] and learning organizations [20,21] as concepts useful in analyzing the managerial and organizational influences on risk. Normal Accident Theory [22], which views accidents as caused by interactive complexity and tight coupling, can also be considered a second-generation perspective on organizational safety. Meanwhile, second-generation quantitative techniques mostly tackle the dynamic aspects of organizational influences. For example, Biondi [23] uses the qualitative model developed by Bella [24] to describe changes in the reliability of a system due to organizational dynamics.
Other researchers, e.g. Cooke [25] and Leveson [26], have used the System Dynamics approach [27] to describe the dynamics of organizations, but these models do not include detailed, PRA-style models of the technical system. Yu et al. [28] also used the System Dynamics approach to assess the effects of organizational factors on nuclear power plant safety. Their work is an attempt to link System Dynamics and PRA; however, the interconnection between the two is not clarified. There are still a number of major challenges in developing second-generation theories and techniques for safety risk analysis in the areas of organizational models, human reliability, and PRA.

This paper is the result of research [29] focused on developing a second-generation organizational model of safety risk frameworks. Organizational models often direct the analysis of accidents and incidents to their deeper, more fundamental causes. The key questions in this line of research can be summarized as follows: (1) What are the organizational factors that affect risk? (2) How do these factors influence risk? (3) How much do they contribute to risk? From a broader perspective, all the efforts and studies in this research domain can be placed under the banner of Organizational Safety Risk Analysis. In the absence of a comprehensive theory, or at least a set of principles and modeling guidelines backed by theory, it is hard to assess the validity and quality of the proposed modeling techniques. In a multidisciplinary effort, we focused on improving the theoretical foundations and on introducing a set of modeling principles into the field of Organizational Safety Risk Analysis. A comprehensive review of relevant theories and technical domains was needed to address the inherently multidimensional nature of the problem. Most important among these domains were quality management [30], safety management [31], organizational culture and climate [32], safety culture [33,34], safety climate [35,36], human resource systems [37], human reliability (e.g. CREAM [15] and IDAC [16]), and organizational theory, such as socio-technical system theory [38], Lewinian field theory [39], Mintzberg's categorical theory [40], organizational performance and change models [41], and theories of the learning organization [21]. With a multidisciplinary perspective on the issue, a set of 13 principles for Organizational Safety Risk Analysis was proposed. These principles are described briefly in Section 2 in order to clarify the scope and goal of the research. More detailed discussions of the principles are given in the corresponding publications [29,42,43]. A new organizational safety risk framework, called Socio-Technical Risk Analysis (SoTeRiA),1 was then developed, based on these modeling principles. The framework formally integrates the technical system risk model with the social (safety culture and safety climate) and structural (safety practices) aspects of safety prediction models, and provides a theoretical basis for the integration. SoTeRiA is briefly described in Section 3 in order to facilitate the main discussion of the present paper. We refer the reader to [29,42,44] for a complete discussion of SoTeRiA. The next challenge was finding appropriate techniques to operationalize the proposed organizational safety risk theory. This is the main focus of the current paper. Section 4 provides a methodology for assessing and adapting appropriate modeling techniques, building proper interfaces, and creating a hybrid technique consistent with the principles and characteristics of organizational safety risk frameworks. In Section 5, an example of the application of the proposed hybrid technique in the aviation domain is presented through an integration of the system dynamics software STELLA [45,46] and a hybrid risk analysis software, the Integrated Risk Modeling System (IRIS; [47]).

2. Principles of organizational safety risk analysis: an overview

This section provides an overview of the work by two of the authors on exploring the theoretical foundations and a set of principles [29,42] for the field of Organizational Safety Risk Analysis. These principles are a series of testable propositions with supporting rationales, insights from other research efforts, and, in some cases, the integration of different theories from diverse disciplines. Table 1 provides a high-level classification of the 13 proposed principles. They are grouped in four categories and labeled alphabetically. This section only provides a brief description of the proposed principles in order to clarify the scope of the research. One of these principles, Principle M, is the main focus of the current paper. We refer the interested reader to Mohaghegh and Mosleh [29,42] for more detailed explanation of the rest of the principles.

Principle (A). Organizational Safety Risk (OSR) is the unknown of interest, or figure of merit, in Organizational Safety Risk Theory, and is a measure of the safety performance of the whole organization or of some sub-unit of it. It is formally expressed as OSR = f(F1, F2, ..., FN), where f stands for an explicit or implicit function or statement, and F1, F2, ..., FN are the predictors (independent variables).

Principle (B). Safety Risk is one of the organizational outputs that influences, and is influenced by, other organizational outputs, such as profit and quality.
1 Soteria was the Greek goddess of deliverance and preservation from harm.
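To make Principle A concrete, the relation OSR = f(F1, ..., FN) can be instantiated with an explicit toy form of f. The factor names, weights, and logistic link below are purely illustrative assumptions on our part, not part of the proposed theory:

```python
# Toy explicit form of Principle A's OSR = f(F1, ..., FN).
# Factor names, weights, and the logistic link are illustrative assumptions.
import math

def osr(factors, weights, bias=-3.0):
    """Map N predictor scores (each on a 0-1 scale) to a risk index in (0, 1)."""
    z = bias + sum(w * f for w, f in zip(weights, factors))
    return 1.0 / (1.0 + math.exp(-z))   # logistic link keeps OSR bounded

# F1 = schedule pressure, F2 = training deficiency, F3 = procedure quality gap
factors = [0.6, 0.3, 0.2]
weights = [2.0, 1.5, 1.0]
print(f"OSR = {osr(factors, weights):.3f}")
```

An implicit f, by contrast, would be defined only through a causal model such as the hybrid technique developed later in the paper.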


Table 1
Classification of the proposed principles.

  Category                                      Principles
  I. Designation and definition of objectives   (A) Unknown of interest; (B) Multidimensional performance objective
  II. Modeling perspective                      (C) Safety performance and deviation; (D) Multilevel framing; (E) Depth of causality and level of detail; (F) Model generality
  III. Building blocks                          (G) Basic unit of analysis; (H) Factor level and nature; (I) Factor selection; (J) Link level, nature, and structure; (K) Dynamic characteristics
  IV. Techniques                                (L) Measurement techniques; (M) Modeling techniques

[Figure omitted: organizational factors and safety risk shown at the organization, group, and individual levels, with relation A at the organization level and cross-level paths B–C and D–E–C.]
Fig. 1. Multi-level relations between organizational factors and organizational safety performance.

Principle (C1). Metrics of organizational safety risk can be defined in terms of the deviation of organizational safety output from a normative level. Similarly, the concepts of error and deviation can be clearly defined for the technical system components and the individuals directly operating and maintaining them. However, these concepts should not be extended to the organizational factors affecting the performance of the individuals.

Principle (C2). Modeling the effects of organizational factors on safety requires a theoretical understanding of organizational performance, capable of capturing its collective nature.

Principle (D1). A comprehensive organizational safety risk theory should combine macro- and micro-organizational perspectives, built within a multi-level framework.
Explanation: Since Organizational Safety Risk is an organizational outcome, the relation between organizational factors and organizational safety risk can be studied either at the organization level (A) or with a cross-level analysis (B–C or D–E–C) (see Fig. 1).

Principle (D2). When risk management is the objective, a cross-level organizational causation theory is needed.

Principle (E1). The decision about the depth of causality in the model depends on (1) the objectives of the model (e.g. the level of the decision variables), (2) the availability of data, and (3) the level of control (which factors can be changed or controlled).

Principle (E2). The decision about the level of detail in specifying the characteristics of a model element depends on the impacts of different dimensions of that element and the sensitivity of the model output to those dimensions.

Principle (F). Theorists need to specify the level of generality (all organizations, specific industry, etc.) and the scale and scope of safety concerns (occupational safety, public safety, etc.).

Principle (G). The basic unit of analysis includes two factors and a link.
Explanation: The first essential element of a theory is a factor, also referred to as a construct or variable [48,49]. In developing a theoretical understanding of complex phenomena, the contents of factors (or elements) and their links (relations, interactions) provide a powerful, almost universal, language.

Principle (H). Theorists must specify whether constructs are individual, global, shared, or configural. If a construct is shared or configural, the level of the construct, the level of its origin, and the nature of the corresponding emergent process (composition and compilation processes) should be specified.
Explanation: Constructs can be defined either at an individual level or at a unit level. Unit refers to any entity composed of two or more individuals, such as groups, divisions, and organizations. Three types of unit-level constructs are recognized: global, shared, and configural. Global constructs are single-level phenomena that originate and are revealed at the unit level. Organization size and organizational practices (e.g. human resources functions) are examples of global constructs. They represent the unit as a whole, but they have an identity (or objective) separate from unit members' social and psychological characteristics. In contrast, shared and configural constructs originate in individual-level perceptions, values, cognitions, and behaviors, and emerge at the higher levels. Shared unit constructs (e.g. group climate) describe the common characteristics of the unit members, while configural constructs (e.g. diversity, pattern of individual perceptions) show the pattern or variability of unit members' characteristics.

Principle (I1). An organizational safety causal model should integrate the relevant aspects of organization in both the social (e.g. safety culture and climate) and structural (organizational safety structure and practices) dimensions.

Principle (I2). Inclusion of factors in the theory should be in an optimum manner with respect to the two competing concepts of parsimony and comprehensiveness.

Principle (J1). Links should be specified according to all of the following dimensions: level (single- and cross-level), nature (antecedent, measurement, and association), and structure (factor-to-factor and factor-to-link).

Principle (J2). Cross-level influences can be either bottom-up or top-down. Depending on the conceptualization of higher-level phenomena, a bottom-up process can be either a composition or a compilation. In the case of a composition process, theorists must explain how within-unit agreement and consensus emerge from individual-level characteristics. In the case of a compilation, theorists must explain the theoretical process (nonlinear complex function) by which different individual contributions combine to produce the emergent phenomenon.
Explanation: An example of a single-level relation is the one between organizational structure and organizational outcome (a relation at the organization level). Each level of organization is embedded in a higher-level context. Individuals are placed in groups, groups in organizations, and organizations in industries. Often there are influences (either direct or moderating) from higher-level phenomena on lower-level constituent elements. These effects are called top-down influences in the model. For example, organizational safety practices (e.g. training, reporting system, etc.) are antecedents of the individual-level psychological safety climate (the perception of individuals about safety practices). This is the direct effect of organizational safety practices on the individual psychological safety climate. In contrast, top-down moderating effects are those where a higher-level factor moderates the relationships in a lower-level unit. For example, the effects of morale on individual safety performance would be different in different organizational structures. The bottom-up process describes the emergence of a phenomenon from lower-level to higher-level contexts. There are two types of emergent processes: composition and compilation. Composition describes phenomena that are essentially the same as they emerge upward across levels. For example, group safety climate emerges from the shared perceptions of group members about organizational safety practices (e.g. training, reporting system, etc.). Thus, the group and individual safety climates are the same constructs, although they are at different levels. In contrast, compilation describes phenomena that "comprise a common domain but are distinctively different as they emerge across levels" [50, p. 16]. For example, team performance can be the result of the pattern of individual team members' performances. In other words, team performance is a complex function of individuals' performances and their interactions with each other [51]. Thus, based on compilation models, a complex combination of diverse lower-level contributions forms the higher-level phenomena [50].

Principle (K1). Dynamic effects must be considered and explicitly modeled when deemed important. Static organizational safety frameworks cannot capture the risks originating from (1) delay in influences, (2) temporal changes in factors and links (e.g. temporal cycles), (3) composite time effects (e.g. time scale variation and feedback loops), or (4) changes in the direction and strength of links as a function of time.
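The dynamic effects named in Principle K1 (delays, feedback loops, time-varying link strengths) are exactly what System Dynamics simulation is designed to capture. A minimal stock-and-flow sketch of one such feedback loop is given below; all variable names and coefficients are illustrative assumptions of ours, not values from the SoTeRiA application:

```python
# Minimal stock-and-flow sketch (Euler integration) of one feedback loop:
# financial pressure erodes safety-practice quality, which raises incident
# likelihood, which in turn feeds back on financial pressure.
# All names and coefficients are illustrative assumptions.

DT = 0.25            # time step (quarters)
HORIZON = 40         # number of steps

quality = 0.9        # stock: safety-practice quality, on a 0-1 scale
pressure = 0.2       # stock: financial pressure, on a 0-1 scale
history = []

for step in range(HORIZON):
    incident_rate = 0.05 + 0.5 * (1.0 - quality)      # more incidents as quality drops
    erosion = 0.10 * pressure * quality               # pressure erodes quality
    improvement = 0.05 * (1.0 - quality)              # corrective programs rebuild it
    quality += DT * (improvement - erosion)
    pressure += DT * (0.3 * incident_rate - 0.1 * pressure)  # incidents raise costs/pressure
    history.append((round(quality, 3), round(incident_rate, 3)))

print(history[-1])
```

Even this toy loop shows behavior a static model misses: quality drifts toward a lower equilibrium as pressure accumulates, rather than jumping there instantly.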

Principle (K2). The time boundary and reference points of the theory must be explicitly specified.

Principle (L1). Assessment of organizational safety risk requires multi-dimensional coverage of measurement bases (what to measure) and measurement methods (how to measure). The selection of the measurement method needs to reflect (1) the type and level of the construct and its underlying theoretical model, (2) the required accuracy, and (3) the availability of information.

Principle (L2). There are three different measurement methods: objective (e.g. audit), subjective (e.g. perceptions/survey), and hybrid (a combination of objective and subjective).

Principle (L3). There are three kinds of measurement bases: direct (capturing organizational safety output, e.g. frequency of system accidents), indirect (accounting for safety enablers or safety causal factors, e.g. safety climate and safety practices), and hybrid (a combination of direct and indirect).

Principle (L4). The interdependencies of factors are strongly related to their measurement approaches. A factor assessed by different measurement approaches may have different paths of influence in the safety causal model.

Principle (M). Because of the multidisciplinary nature of the organizational safety framework, a comprehensive technique is a hybrid technique.

3. Socio-technical risk analysis (SoTeRiA) framework: an overview

This section briefly describes SoTeRiA, which is discussed in more detail in the corresponding publications [29,42,44]. The overview provided here facilitates the discussion on developing the hybrid modeling technique described in Section 4 and applied in Section 5. The development of SoTeRiA starts from the system risk model (the right side of Fig. 2) and moves to the organizational root causes (the left side of Fig. 2). The system risk model

[Figure omitted: the schematic links Industrial & Business Environment; Social and Political Culture and Climate; Regulatory Environment; Organizational Vision, Strategy and Goals; Financial Outcome; Organizational Culture / Safety Culture; Organizational Structure & Practices / Org. Safety Structure & Practices; Organizational Climate / Org. Safety Climate; Group Climate / Group Safety Climate; Individual PSFs / Psychological Safety Climate; Emergent Process (Leadership/Supervision, Social Interaction & Homogeneity); Unit Process Model; SCP; and System Risk.]
Fig. 2. Schematic representation of SoTeRiA.


delineates the possible risk or hazard scenarios and decomposes them into their contributing elements, including human, software, and hardware failures, as well as environmental factors. Safety Critical Performances (SCPs) are identified based on the risk scenarios. SCPs are the group or individual performances that have direct effects on the elements of technical system risk scenarios. For example, maintenance is a safety-critical task, since it directly affects hardware failure (an element of the technical system risk scenario). In general, SCPs can be events specified either explicitly in the accident scenarios (e.g. human actions) or implicitly through model parameters (e.g. equipment failure rate). SCPs help to focus attention on what matters most for safety among the many activities of the organization. The unit process model (e.g. for a maintenance or operations unit) includes the direct activities that affect the SCP, the unit output. In the unit process model, the direct activities are decomposed into their direct resources, procedures, and the performances of the involved individuals. The rest of the causal model describes how organizational factors affect the SCPs through their effects on the direct resources, procedures, and individuals' performances in the unit process model. All organizational practices that influence the resources (e.g. calibration and test activities), individuals (e.g. human resource practices), and procedures (e.g. alteration) in the unit process models are defined as organizational safety practices. The organizational safety practices are classified into four groups: (1) resource-related activities, (2) procedure-related activities, (3) human-related activities, and (4) common activities. The first three are supported by the fourth (common activities). Common activities include design, implementation, internal auditing, and the internal change system.
Internal auditing and the internal change system are part of the organizational learning process, and, more specifically, single-loop learning [52,53]. In single-loop learning, actions are modified based on the gap between actual and expected results. The four common activities (design, implementation, internal auditing, and change) relate to the Plan, Do, Check, and Act (PDCA) cycle, which has its roots in the quality management field [30]. Organizational safety practices affect resources and procedures in the unit process model through the direct link indicated in Fig. 2. Organizational Safety Structure and Practices also affect internal Performance Shaping Factors (PSFs), and, ultimately, individual safety performance through two different paths of influence: (1) Organizational safety practices collectively influence organizational safety climate, which is the shared perception of employees about actual organizational safety practices. Organizational safety climate affects group safety climate, which in turn influences psychological safety climate (an element of individual-level PSFs). Psychological safety climate is the perception of organizational safety practices at the individual level. It impacts individuals' motivation in the unit process model. For example, high-quality training programs and work conditions collectively create a climate in which employees believe in their managers' commitment to safety. This belief impacts the employees' motivation. This is the indirect influence of organizational safety practices on individual-level PSFs. (2) Different sub-factors in Organizational Safety Structure and Practices can also directly impact individuals' internal PSFs. The direct effects are caused by the influence of human-related activities on ability and opportunity, which are two different individual performance shaping factors. For example, a low-quality work environment (poor air conditioning and lighting) can affect physical opportunity, and training affects people's knowledge.

The strength of the shared climate depends on emergent processes, including the social interaction process, leadership/supervision, and homogeneity in the organization. As Fig. 2 shows, emergent processes are also affected by organizational practices. At the organization level, safety culture shapes managerial decisions regarding organizational safety practices and structural features. Culture is more stable, and is related to employees' ideologies, assumptions, and values. Climate is the perception of what happens in the organization and can be described as a temporary attribute of an organization. As Fig. 2 shows, organizational culture is influenced by the type of industry and business environment, social/national culture, and organizational vision, goals, and strategy. Safety culture is also affected by feedback effects from organizational safety and financial performance. These effects are part of organizational learning processes, specifically double-loop learning [52,53]. In double-loop learning, the underlying assumptions, values, and policies that have led to the specific performances are analyzed, questioned, and adapted (if needed). Regulations have two different effects on safety: first, through policies and rules on organizational practices, and second, through external auditing of organizational practices and unit process elements, such as maintenance procedures and resources. Financial performance affects safety performance both directly and indirectly. The indirect effects, for example, would be the feedback effects of financial stress on safety culture and, ultimately, on organizational practices, such as training, work environment, operational procedures, etc. An example of the direct effects is a collapse of morale in reaction to news of extreme financial distress (e.g. possible bankruptcy). On the other hand, safety performance may affect financial performance directly by increasing internal costs (e.g. higher insurance rates as a result of an accident) and indirectly through loss of goodwill, new regulations, and market value.

4. Use of hybrid technique to model organizational safety risk

4.1. Introduction

The modeling principles briefly presented in Section 2 and the elements of the proposed SoTeRiA framework are obviously still at a higher level of abstraction than a causal model for specific applications. Such causal models need to cover the broad range of causal factors and paths of influence outlined by SoTeRiA, from organizational conditions and characteristics to the behavior of individuals, as well as the performance of the technical system itself. An immediate question is which modeling language and representational scheme can accommodate this wide range. There are a number of options, and the choice undoubtedly impacts the explanatory and predictive power of the resulting model. This paper concentrates on the choice of representational schemes and techniques. Candidates are selected from techniques in PRA of technological systems, social and behavioral science, business process modeling, and dynamic modeling. These candidate methods fundamentally satisfy two criteria: (a) consistency with the proposed modeling principles [29,42] and (b) consistency with the SoTeRiA framework [29,42,44]. In Section 4.2, the most common techniques are briefly compared and discussed in order to provide a guide for informed choices. Section 4.3 provides a methodology for adapting appropriate techniques, converting them to possible common techniques, and creating a hybrid approach with capabilities that include:

1. Ability to capture the collective nature of organizational behavior, such as dynamic interactions (e.g. feedback loops, time lags) and emergence of abnormal behavior from the interactions of factors within their normal range of variability.
2. Ability to cover the broad range of causal factors of various natures, such as social aspects (e.g. safety climate and safety culture) and structural factors (e.g. human resource system, hardware failure).
3. Ability to cover different types of dependencies and relations (e.g. probabilistic vs. deterministic, causal vs. correlational).
4. Flexibility in accommodating a range of metrics, assessment methods, and limitations in the availability of required information (complementary material on measurement techniques for organizational safety risk analysis is covered in the corresponding publications [29,43]).
5. If possible, being well developed and supported by computational algorithms and tools.
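As a minimal illustration of capability 3 (probabilistic dependencies), a two-parent Bayesian Belief Network fragment can link organizational factors to a Safety Critical Performance (SCP). The structure and conditional probability table below are illustrative assumptions on our part, not values from the paper's application:

```python
# Minimal Bayesian Belief Network sketch: a Safety Critical Performance (SCP)
# node conditioned on two organizational parents. Structure and CPT numbers
# are illustrative assumptions, not values from the SoTeRiA application.

# Priors: P(training adequate), P(safety climate positive)
p_train = {True: 0.8, False: 0.2}
p_climate = {True: 0.7, False: 0.3}

# CPT: P(SCP degraded | training, climate)
p_scp_degraded = {
    (True, True): 0.02,
    (True, False): 0.10,
    (False, True): 0.15,
    (False, False): 0.40,
}

# Marginal P(SCP degraded) by enumeration over parent states
marginal = sum(
    p_train[t] * p_climate[c] * p_scp_degraded[(t, c)]
    for t in (True, False)
    for c in (True, False)
)
print(f"P(SCP degraded) = {marginal:.4f}")

# Diagnostic update: P(climate positive | SCP degraded) via Bayes' rule
joint_pos = sum(p_train[t] * p_climate[True] * p_scp_degraded[(t, True)]
                for t in (True, False))
posterior = joint_pos / marginal
print(f"P(climate positive | SCP degraded) = {posterior:.4f}")
```

The diagnostic update is the point of the exercise: observing a degraded SCP lowers the belief that the safety climate is positive from its prior of 0.7 to roughly 0.4, the kind of bidirectional probabilistic reasoning that deterministic techniques cannot express.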

4.2. Candidate techniques

4.2.1. Formal probabilistic risk analysis techniques for technical systems

Formal probabilistic risk analysis techniques refer to the class of methods that apply a logical construct to describe the system. They include classical Probabilistic Risk Assessment (PRA) techniques, such as Event Sequence Diagram (ESD), Event Tree (ET), and Fault Tree (FT). ESDs are used to define the system risk scenarios, the context within which various causal factors would be viewed as a hazard, a source of risk, or a safety issue. They have been used both qualitatively, to identify hazards and risk scenarios, and quantitatively, to find the probabilities of risk scenarios [54]. Other methods closely related to ESDs and used in risk and safety analysis of complex systems are Event Trees and Decision Trees. Both are inductive logic methods for identifying the various possible outcomes of a given initiating event. The initiating event in a decision tree is typically a particular business or risk acceptance decision, and the various outcomes depend upon subsequent decisions. In risk analysis applications, the initiating event is an event or condition (typically a component or subsystem failure) that starts the sequence of events, leading to various final states of the technical system and possible undesirable consequences (e.g. loss of life due to system failure). The subsequent events are determined by the system characteristics, environmental conditions, and human actions. Fault tree analysis is a technique by which combinations of basic events that produce other events can be related using simple logical relationships (AND, OR, etc.). These relationships permit methodical building of a causal model that relates system failure to its causes (often failures of components of the system). Fault tree analysis is the most popular technique used for quantitative risk and reliability studies [55].
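The AND/OR gate logic described above can be sketched in a few lines. The sketch assumes statistically independent basic events, and the component names and probabilities are purely illustrative:

```python
# Minimal fault-tree quantification sketch: top-event probability from
# AND/OR gates over independent basic events (illustrative names/numbers).

def and_gate(probs):
    """P(all inputs occur), assuming independence."""
    p = 1.0
    for q in probs:
        p *= q
    return p

def or_gate(probs):
    """P(at least one input occurs), assuming independence."""
    p = 1.0
    for q in probs:
        p *= (1.0 - q)
    return 1.0 - p

# Hypothetical system: top event = (pump A fails AND pump B fails) OR valve fails
p_pump_a, p_pump_b, p_valve = 0.01, 0.02, 0.001
p_top = or_gate([and_gate([p_pump_a, p_pump_b]), p_valve])
print(p_top)  # 1 - (1 - 0.0002) * (1 - 0.001) = 0.0011998
```

Real fault trees with repeated (shared) basic events require the BDD treatment discussed next; the naive gate-by-gate calculation above is exact only when no basic event feeds more than one gate.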
One of the most effective ways of modeling risks associated with technical systems is to use ESDs as the first layer of describing system behavior in case of anomalies, and then to provide a more detailed picture of the contributing causes (of the events in the ESD) through FTs, which can be converted into a Binary Decision Diagram (BDD). A BDD is a directed acyclic graph, first introduced by Lee [56], enhanced by Akers [57], utilized by Bryant [58], improved by Rauzy [59,60], and made more efficient and accurate for FT analysis by Sinnamon and Andrews [61]. Since ESDs in their most basic form are reducible to binary logic, the combination of ESDs and FTs can be converted into a BDD. The process, depicted in Fig. 3, was developed for large-scale combinations of ESD and FT models and used in NASA's risk analysis computer code QRAS [62].
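The principle behind BDD-based quantification, Shannon decomposition of the Boolean structure function over an ordered set of variables, can be illustrated as follows. The structure function and probabilities are assumptions for illustration, not taken from QRAS [62]; a real BDD additionally shares isomorphic sub-graphs rather than enumerating all branches:

```python
# Sketch of the idea underlying BDDs: evaluate the probability of a Boolean
# structure function by Shannon decomposition,
#   p(f) = p_x * p(f | x=1) + (1 - p_x) * p(f | x=0),
# recursing down a fixed variable ordering.

VARS = ["A", "B", "C"]          # ordered variables (the BDD ordering)
PROB = {"A": 0.01, "B": 0.02, "C": 0.001}

def top_event(assign):
    """Illustrative structure function: (A AND B) OR C."""
    return (assign["A"] and assign["B"]) or assign["C"]

def prob(i=0, assign=None):
    assign = assign or {}
    if i == len(VARS):
        return 1.0 if top_event(assign) else 0.0
    x = VARS[i]
    hi = prob(i + 1, {**assign, x: True})    # branch x = 1
    lo = prob(i + 1, {**assign, x: False})   # branch x = 0
    return PROB[x] * hi + (1.0 - PROB[x]) * lo

print(prob())  # P(A)P(B) + P(C) - P(A)P(B)P(C) = 0.0011998
```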

4.2.2. Process modeling techniques

One of the ingredients of the organizational safety causal model is a model of the primary production processes of the organization. Semi-formal techniques have been successfully applied for modeling business processes because their graphical notation can represent complex systems comprehensively [63]. Therefore, a semi-formal process technique should be adapted and applied to represent the various processes (e.g. work processes) in an organization. Then, for quantification purposes, it needs to be converted to a formal technique that is consistent with the other techniques in the framework. Section 4.3 describes one such conversion approach. Several semi-formal process modeling techniques can be found in the literature. These include the Flow Chart, State Chart Diagram, Event Driven Process Chain [63], Integrated Definition Methodology [64], and Structured Analysis and Design Technique (SADT) [65,66]. In order to determine an appropriate modeling technique, a few aspects need to be considered: (1) ease of conversion to a formal technique, (2) generality for use in different types of processes and organizations, and (3) effectiveness in communicating the model and results. SADT is a good candidate because it meets the above criteria. Originating in the field of software and knowledge engineering, it is used to model decision-making activities. Hale et al. [67] have also adopted this technique for modeling a safety management system. Fig. 4 shows the structure of SADT. The activity process transforms the inputs (I) into the outputs (O), given the resources (R) and the control/criteria (C) [63]. The inputs can be information, hardware, raw materials, people, etc. Outputs are the products of the process. Resources are the things needed to perform the activity, such as tools, equipment, and people. Controls/criteria include requirements, job control mechanisms, constraints, applicable rules and regulations, and standards that are used to direct, control, and judge the conduct of an activity.

Fig. 3. Use of BDD to solve combined ESD and FT models [62].

Fig. 4. Structured analysis and design technique (SADT).


The United States Air Force commissioned the developers of SADT to devise a process modeling method, named IDEF0, for analyzing and communicating the functional perspective of a system. IDEF0 is a graphical approach similar to SADT, with the elements of Input, Control, Output, and Mechanism (called ICOMs). In December 1993, the Computer Systems Laboratory of the National Institute of Standards and Technology (NIST) released IDEF0 as a standard for Function Modeling. An IDEF0 model represents the whole system as a single activity, named the context activity (A0). The context activity is decomposed into detail levels that include its sub-activities. This decomposition process continues until all relevant processes of the system are included, as seen in Fig. 5 [68].
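The hierarchical decomposition of Fig. 5 maps naturally onto a recursive data structure. A minimal sketch follows, with activity labels taken from Fig. 5 and the helper method assumed for illustration:

```python
# Illustrative nesting of IDEF0 activities: a context activity A0 decomposed
# into sub-activities, mirroring the hierarchy of Fig. 5.

from dataclasses import dataclass, field

@dataclass
class Activity:
    name: str
    children: list = field(default_factory=list)

    def leaves(self):
        """Activities with no further decomposition (depth-first order)."""
        if not self.children:
            return [self.name]
        out = []
        for c in self.children:
            out.extend(c.leaves())
        return out

a0 = Activity("A0", [
    Activity("A1"),
    Activity("A2", [Activity("A21"), Activity("A22"),
                    Activity("A23"), Activity("A24")]),
    Activity("A3", [Activity("A31"), Activity("A32"), Activity("A33")]),
])
print(a0.leaves())  # ['A1', 'A21', 'A22', 'A23', 'A24', 'A31', 'A32', 'A33']
```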

Fig. 6. Simple path model [70].

4.2.3. Regression-based techniques

Regression-based techniques are common in economics and the social sciences. Over the past 20 years, causal modeling [69] has become increasingly popular in organizational psychology [70]; generally speaking, it is used to distinguish true statistical causality from spurious correlation [71]. The process involves defining a set of variables and their relations, then testing all of the relations simultaneously. This is practiced by applying various techniques, such as Path Analysis or Structural Equation Modeling (SEM) [72]. Despite some differences among these techniques, the underlying concept is that the analyst calculates the covariance among the variables in the proposed model (using actual data) and compares it with the expected covariance (the restriction that the modeler places). The comparison indicates to what extent the model fits the actual data. In path analysis, the variables that make up the causal relations are the measured variables (A-D in Fig. 6). In contrast, SEM has two kinds of variables: measured (observed) and structural (latent) (see Fig. 7). Latent variables are hypothetical variables and are not measured directly; rather, they are estimated in the model from a number of measured variables. SEM is a combination of Path Analysis (between latent variables) and Factor Analysis (between measured variables and latent variables) [73]. The principal assumption in these techniques is that the variables that are connected by arcs are linearly related.

4.2.4. Bayesian belief net (BBN)

A recently popular methodology for representing causal connections that are soft, partial, or uncertain in nature is

Fig. 7. Structural equation modeling (SEM) [70].
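The linear-relations assumption underlying path analysis can be illustrated with a small simulation: path coefficients of an assumed linear model (not the model of Fig. 6 or Fig. 7) are recovered from synthetic data by least squares:

```python
# Path-analysis sketch: recover the path coefficients of an assumed linear
# model C = 0.5*A + 0.3*B + noise from simulated data via the normal equations.

import random

random.seed(1)
n = 20000
A = [random.gauss(0, 1) for _ in range(n)]
B = [random.gauss(0, 1) for _ in range(n)]
C = [0.5 * a + 0.3 * b + random.gauss(0, 0.5) for a, b in zip(A, B)]

def dot(u, v):
    """Average cross-product (variables are generated with zero mean)."""
    return sum(x * y for x, y in zip(u, v)) / len(u)

# Normal equations for two predictors
saa, sbb, sab = dot(A, A), dot(B, B), dot(A, B)
sac, sbc = dot(A, C), dot(B, C)
det = saa * sbb - sab * sab
beta_a = (sac * sbb - sbc * sab) / det
beta_b = (sbc * saa - sac * sab) / det
print(round(beta_a, 2), round(beta_b, 2))  # estimates close to 0.5 and 0.3
```

In an actual path-analytic study the estimated coefficients would then be compared against the covariance structure implied by the hypothesized model, as described in the text.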

Fig. 8. A Bayesian belief network.
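As a minimal numerical illustration of how conditional probability tables propagate belief in a network like Fig. 8 (node names, states, and all numbers are assumed):

```python
# BBN sketch: marginal probability of a target node A with two parents,
#   P(A) = sum over parent states of P(A | b1, b2) * P(b1) * P(b2).

p_b1 = {"good": 0.7, "poor": 0.3}
p_b2 = {"good": 0.6, "poor": 0.4}

# Conditional probability table: P(A = "adequate" | B1, B2)
cpt = {
    ("good", "good"): 0.95,
    ("good", "poor"): 0.70,
    ("poor", "good"): 0.60,
    ("poor", "poor"): 0.20,
}

p_a = sum(cpt[(s1, s2)] * p_b1[s1] * p_b2[s2]
          for s1 in p_b1 for s2 in p_b2)
print(round(p_a, 4))  # 0.727
```

General-purpose BBN engines additionally support evidence propagation in the reverse direction via Bayes' theorem; the forward marginalization above is the simplest case.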

Fig. 5. A typical decomposition hierarchy in IDEF0 (adopted from [68]).

the Influence Diagram (ID) family [74], particularly the Bayesian Belief Network (BBN) [75]. The applications of BBNs, also known as Bayesian Networks, Belief Nets, Causal Nets, or Probability Nets, have grown enormously over the past 20 years, with theoretical and computational development in many areas. During the 1990s, Bayesian Networks and Decision Graphs attracted a great deal of attention as a framework for artificial intelligence and decision analysis, not only in academia but also in industry. An example is the recent strong interest in the field of reliability analysis (see [76-79]). Bayesian Networks are a network-based framework for representing and analyzing models involving uncertainty. They handle probabilistic relations in a mathematically rigorous, yet efficient and simple way. A Belief Network consists of a set of variables (causes and effects) and a set of directed edges between variables (paths of influence). Each variable has a finite set of mutually exclusive states. The variables, together with the directed edges, form a Directed Acyclic Graph (DAG) (Fig. 8). Conditional probabilities carry the strength of the links between the causes and their potential effects. For example, for a given state Ai of a variable A (target node) with parents B1, ..., Bn, we have the conditional probability of the state (Ai) occurring, given the states of the contributing parent nodes: P(Ai | B1, ..., Bn), where the probability is assessed for all possible combinations of the states of the parents B1, ..., Bn. Bayes' theorem in the subjective theory of probability is at the core of the inference engine of BBNs. In the definition of BBNs, the DAG restriction is critical. Feedback cycles are difficult to model quantitatively, and no calculus has been developed for causal networks that can cope with feedback loops in a reasonably general way. The issue of feedback loops and dynamic aspects of candidate techniques will be revisited in Section 4.2.5.

The probabilities required in a BBN are quantified with data and expert opinion, mostly the latter. In some cases, expressing an opinion quantitatively may be too demanding. In other words, the experts may feel more comfortable with qualitative likelihood assessments on coarser scales. For example, based on Fig. 9, experts may know that it is highly likely that a high-quality factor C leads to a high-quality factor D, or that it is highly likely that a specific type of factor A leads to a specific type of factors B and E. In such cases, there is a need for a method capable of inference with qualitative scales. The Qualitative-Quantitative Bayesian Belief Network (QQ-BBN) [79,80] provides a solution for this case. According to this approach, the deeper parts of the BBN, which are farther from direct observation, are assessed using a qualitative scale (e.g. high, medium, and low) for both the node and conditional likelihoods. A qualitative likelihood calculus carries the inference from these deeper layers to points where observation-based assessments of probabilities are possible, or where the experts feel more comfortable expressing their beliefs on a numerical scale. This approach, however, requires a linkage between the two scales at the boundary between the qualitative and quantitative parts (i.e. an assessment of the probabilities of X and Y in Fig. 9 as a function of the qualitative likelihood scales of nodes B, D, and E; see [79,80] for more detail).

4.2.5. Deterministic dynamic techniques

When there is enough information to establish deterministic relations among the factors of the model, or some part of them, deterministic techniques can be used. The deterministic modeling technique can be either analytical or simulation based.
Simulation-based techniques, such as Agent-Based Modeling (ABM) [81] and System Dynamics (SD) [27], are usually the only solution if the formal model is complex and an analytical solution is either too time-consuming or simply not possible. System Dynamics [27] shows significant capabilities for modeling certain human behavior and decision-making processes, making it a good technique for modeling aspects of organizational behavior. System Dynamics was initially developed with the intention of studying the behavior of industrial systems to show the way policies, delays, and structures are related, and how they influence the stability of the system. As J.W. Forrester [82] explains, System Dynamics "integrates the separate functional areas of management, marketing, investment, research, personnel, production, and accounting. Each of these functions is reduced to a common basis by recognizing that any economic or corporate activity consists of flows of money, orders, materials, personnel, and capital equipment" (p. vii). System Dynamics' strength also lies in its ability to account for non-linearity in dynamics, feedback, and time delays. The quantification of System Dynamics models is done with the help of stock and flow diagrams (see Fig. 10). Stock (the population in

Fig. 10) represents the accumulation of some measurable entities (e.g. people, parts, money), or even some intangible ones, such as happiness [83], that are in the same state. Stocks characterize the state of the system and generate the information upon which decisions and actions are based [27]. A stock changes through an inflow or an outflow. Flows (birth and death in Fig. 10) are the physical or conceptual entities that leave the system state and move over time. Auxiliaries (birth rate and death rate in Fig. 10) help to describe the flow [83]. Mathematically, a System Dynamics model represents a system of differential equations:

dx/dt = f(x(t), u(t), t)
y(t) = g(x(t), u(t), t)    (5.1)
Here u(t) stands for the vector of input variables, y(t) is the vector of output variables, and x(t) is the vector of state variables.

4.3. Proposed hybrid technique

As stated earlier, the objective of the study is to make a case for the value of, and provide an example of, hybrid modeling environments that can capture different dimensions and objectives of the safety assessment of socio-technical systems. In this section, the architecture of a proposed hybrid technique, including its modules and interfaces, is described. It is also specified how this hybrid environment is capable of incorporating the features of the other candidate techniques described in Section 4.2.

4.3.1. The architecture of the hybrid technique

The proposed hybrid methodology (Fig. 11) has three modules: (1) ordinary BBN or QQ-BBN, (2) ESD_FT methods (probabilistic risk analysis techniques for technical systems), and (3) stock and flow diagrams (System Dynamics). The first interface type in this hybrid environment is the one between the QQ-BBN (or BBN) technique and the ESD_FT module, which has already been addressed with the Hybrid Causal Logic (HCL) methodology [79,80]. HCL is the underlying algorithm of the risk analysis software IRIS [47], which was employed in Section 5 of this study. The main layers of the HCL are Event Sequence Diagrams, Fault Trees, and Bayesian Belief Nets. As mentioned in Section 4.2.1, the combination of ESD and FT is in the Binary Decision Diagram (BDD) environment. HCL

Fig. 10. Stock and flow diagram.

Fig. 9. Qualitative-quantitative BBN concept [79,80].

Fig. 11. Hybrid modeling environment.


mathematical relationships between BBN and regression-based techniques. The underlying assumption in path analysis and SEM is that the connected variables have linear relationships, but it is possible to incorporate the multiplicative composite of variables as additional variables in the causal model. In addition, path analysis has traditionally been applied to continuous quantities, but most of the underlying theory can be applied to discrete variables as well. Roehrig [84] has derived the mathematical relation between the two techniques, creating a simple causal model where both nodes can only be true or false. His result showed the following relation:

r_xy = [Pr(y|x) - Pr(y|¬x)] (σ_x / σ_y)    (5.2)

where r_xy is the correlation coefficient between X and Y, and σ_x and σ_y stand for the standard deviations of variables X and Y, respectively. Pr(y|x) is the conditional probability of y given x, i.e. the type of probability that one needs in BBNs. We refer the readers to Roehrig [84] for more details of these relations.

Fig. 12. Connecting BBN to the technical system risk techniques (ET and FT) [79,80].
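Relation (5.2) for two binary nodes can be checked numerically; the probabilities below are arbitrary illustrative values:

```python
# Numerical check of relation (5.2) for two binary nodes:
#   r_xy = [Pr(y|x) - Pr(y|not x)] * (sigma_x / sigma_y)

import math

p_x = 0.4
p_y_given_x, p_y_given_notx = 0.8, 0.3

p_y = p_x * p_y_given_x + (1 - p_x) * p_y_given_notx
cov = p_x * p_y_given_x - p_x * p_y            # E[XY] - E[X]E[Y]
sx = math.sqrt(p_x * (1 - p_x))                # std. dev. of a Bernoulli X
sy = math.sqrt(p_y * (1 - p_y))

r_direct = cov / (sx * sy)                     # correlation from the joint
r_formula = (p_y_given_x - p_y_given_notx) * sx / sy   # relation (5.2)
print(round(r_direct, 6), round(r_formula, 6))  # both agree, ~0.489898
```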

mathematically links BDDs and BBNs in order to create the required interfaces (see Fig. 12). The SD module depicts dynamic deterministic relations in the proposed hybrid modeling environment and provides a dynamic integration between the other two modules. In the proposed integration, different modules with different modeling techniques can pass inputs and outputs to the System Dynamics module, allowing the entire hybrid environment to capture feedbacks and delays. The interface of SD with the QQ-BBN and technical system model (ESD_FT) techniques can be captured by importing and exporting data from the System Dynamics environment. SD software, such as STELLA, offers the capability of importing and exporting data. For example, the target node calculated from the BBN can be imported to SD and processed inside SD (with delays and feedbacks), and the estimated values from SD can be exported to the BBN environment. This process can integrate the effects of the cyclic interactions between various factors into the BBN. The combination of SD and HCL is illustrated in an example in Section 5.

4.3.2. BBN as a common modeling technique in the hybrid environment

BBN is a natural framework for explicit probabilistic relations among elements of the model, where objective data are lacking and the use of expert opinion and soft evidence is inevitable. This, of course, is the situation one faces in the quantification of organizational safety models. BBN can not only be mathematically linked to the technical system models (as described earlier), but is also capable of incorporating the positive features of regression-based techniques and process modeling methods, described below.

4.3.2.1. BBN and regression-based techniques. Since most social and psychological theories are in the form of regression-based techniques, these methods are introduced as candidate techniques for organizational safety risk theories.
By establishing the relationship between BBN and regression-based techniques, it is possible to explicitly (and quantitatively) include the psychological and social theories within the proposed hybrid environment. Regression-based techniques are statistical techniques, while BBN is a probabilistic method. These two groups of techniques have traditionally been used in very different areas of application. Path analysis and SEM have usually been applied for testing and understanding causal relationships. In contrast, BBN has been used as a method of knowledge representation and as a reasoning framework. Some references, such as Roehrig [84], have already shown the

4.3.2.2. BBN and process modeling techniques. In Section 4.2.2, SADT and IDEF0 are described as candidate approaches for representing processes within an organization. In this section, we show how such process modeling techniques (as semi-formal quantitative methods) can be converted into BBNs (as a formal quantitative method). Based on the hierarchical structure of IDEF0, the relation between any two activities is either sequential or hierarchical. For example, in Fig. 5, A2 and A3 are related sequentially, and A22 and A2 are related hierarchically (A22 is a sub-activity of A2). In reality, however, it is possible that the output performance of an organization is the result of two or more parallel activities, which are neither sequentially nor hierarchically related. This modification in process modeling is depicted in Fig. 13. In Fig. 13, the total output of the process model (O0) can be broken down into O1 to Ok, which are the outputs of the parallel activities A11 to A1k. Each of these activities (based on the SADT technique) has its own Resource (R), Input (I), and Control/Criteria (C). The second layer of activities is comprised of those that have the R, I, and C of layer one as their outputs. For example, R12 is the resource for activity A12 and the output of activity A22R; I12 is the input of activity A12 and the output of activity A22I; and C12 is the control for activity A12 and the output of activity A22C. The same logic holds for all the activities from layer 1 to layer N (the layer where the modeler stops process decomposition), as shown in Fig. 13. The process model can become more complex in several ways:

- Some supporting activities in the second layer can be common between two parallel activities in the first layer.
- Some I's, R's, and C's are common between two parallel activities.
- It is possible that more than one activity supports the elements in the higher level; for example, C12 can be supported by a couple of activities parallel to A22C.

This K x N process model can be converted to a BBN, as shown in Fig. 14. In this figure, the quality state of the total output (O0) would be a function of the quality states of activities A11 to A1k. Knowing the states of A11 to A1k, as well as the conditional probabilities for specific states of O0 given any states of A11 to A1k, we can reach the probability of specific states of the output. The states of A11 to A1k in turn depend on the states of their supporting I, R, and C, and the conditional probabilities of A11 to A1k given any states of I, R, and C.
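The quantification logic of the converted process model can be sketched by marginalizing over activity states, layer by layer. All conditional probabilities below are illustrative assumptions, not values from the paper:

```python
# Sketch of the quantification in Fig. 14: the state of output O0 is obtained
# by marginalizing over the states of the parallel activities, each of which
# is in turn conditioned on its Input, Resource, and Control supports.

from itertools import product

# P(activity adequate | number of adequate supports among I, R, C)
p_act = {3: 0.95, 2: 0.7, 1: 0.4, 0: 0.1}
p_support = 0.9      # P(each of I, R, C adequate), assumed equal for brevity

def p_activity_adequate():
    total = 0.0
    for states in product([1, 0], repeat=3):     # states of I, R, C
        w = 1.0
        for s in states:
            w *= p_support if s else 1 - p_support
        total += w * p_act[sum(states)]
    return total

p_a = p_activity_adequate()      # assumed identical for each parallel activity
k = 3                            # number of parallel activities A11..A1k
# Illustrative CPT: P(O0 adequate) = 0.99 if all k activities adequate, else 0.5
p_o0 = 0.99 * p_a**k + 0.5 * (1 - p_a**k)
print(round(p_a, 4), round(p_o0, 4))
```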


Fig. 13. Modified process modeling technique.

through SD, (b) Financial Stress through SD, (c) Organizational Safety Practices through HCL-SD (factors include Training modeled in SD, Hiring modeled in SD, and other organizational safety practices and regulatory auditing factors in HCL), (d) Individual-level (e.g. maintenance technicians) PSFs in SD, (e) Maintenance Unit Process Model in IRIS, and (f) Technical System Risk and its links to Aircraft Airworthiness in IRIS. As Fig. 15 describes, the output of the Maintenance Unit Process Model in the BBN environment is Aircraft Airworthiness. This output is fed to the FT and ESD in order to estimate the Technical System Risk. The Technical System Risk is then imported to the SD environment and processed inside SD (with delays and feedbacks). The output from SD, which in this case is the technicians' probability of error, is exported to the BBN environment. This process integrates the cyclic interactions into the BBN model.

Fig. 14. The conversion of process modeling technique to BBN.

5.1. Organizational safety culture (modeled in SD environment)

SoTeRiA (summarized in Section 3) describes the link between safety culture and other elements of the organizational safety framework. Based on SoTeRiA, safety culture shapes the managerial decisions regarding organizational safety practices. In the following, for simplification, management commitment is considered as a measure of safety culture. The management commitment module (see Fig. 16) of the model illustrates important feedback loops that rule the dynamics affecting management's commitment to safety. Cooke [25] has constructed a management commitment module designed for the mining industry, and the present module (Fig. 16) is a modified version that captures the effect of financial stress (Z score) of the airline as well. The basic modeling idea of management commitment to safety in SD follows the concept of anchoring and adjustment described by Sterman [27]. It assumes a certain level of commitment by managers as a starting point, which changes over time (i.e. 'time to change management commitment to safety' in Fig. 16) and according to the pressures, both safety- and finance-related, applied to the organization. Naturally, management should balance priorities between the safety and profitability of the airline. In this module, the safety-related pressure is measured by assessing the deviation of maintenance technicians' errors from the reference technician

5. An example application in aviation context

In this section, a simplified version of SoTeRiA (summarized in Section 3) is used to demonstrate how a hybrid technique can operationalize an organizational safety theory. In the described example (see Fig. 15), the integration of SD and HCL is achieved by utilizing STELLA (System Dynamics software) [45,46] and the Integrated Risk Modeling System (IRIS) (risk analysis software) [47]. IRIS solves HCL-based risk models, and is a combination of BBN, ESD, and FT techniques. STELLA has been linked to IRIS in order to empower it with System Dynamics capabilities. SD is added to the technical system risk models to help model some deterministic and dynamic causation mechanisms. The externally linked STELLA-IRIS environment can then be utilized to analyze the dynamic effects of organizational factors on system risk. The following briefly describes how the different elements of SoTeRiA are implemented in the STELLA and IRIS environments. Complete lists of the equations used in the following modules are provided by Mohaghegh [29]. The example concerns the impact of airline maintenance operations on aviation safety risk. For this purpose, we cover the following factors: (a) Safety Culture


Fig. 15. Integration of STELLA and IRIS.
Fig. 16. Management Commitment Module developed in the SD environment.

error probability and its exponential effect (referring to the 'safety priority exponent' in the figure) on management commitment. The normative level of safety (e.g. the reference technician error) is usually set either by regulators or by social and cultural standards.

The financial pressure also affects management commitment exponentially, according to the 'financial priority exponent'. Financial stress (Z score) is calculated through another module, which is explained in Section 5.2.
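The anchor-and-adjustment mechanism of the management commitment module can be caricatured in a few lines. The functional form, constants, and exponents below are illustrative assumptions, not the equations of [29]:

```python
# Sketch of the anchoring-and-adjustment idea behind Fig. 16: management
# commitment to safety adjusts toward a target that rises with safety
# pressure (relative technician error) and falls with financial pressure.

def step_commitment(c, rel_error, fin_pressure, dt=0.1, tau=2.0,
                    safety_exp=2.0, fin_exp=1.5):
    """One Euler step of dC/dt = (target - C) / tau, commitment capped at 1."""
    target = min(1.0, c * rel_error**safety_exp / fin_pressure**fin_exp)
    return c + dt * (target - c) / tau

c = 0.6
for _ in range(100):   # sustained error excess under mild financial pressure
    c = step_commitment(c, rel_error=1.2, fin_pressure=1.05)
print(round(c, 3))     # commitment drifts upward from its initial 0.6
```

With the chosen illustrative exponents, a 20% error excess outweighs a 5% financial pressure, so commitment rises; reversing the pressures would produce the erosion dynamic described in the text.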


Fig. 17. Financial pressure module in SD environment.

Fig. 18. Training module in SD environment.

5.2. Financial stress (modeled in SD environment)

The financial distress module (Fig. 17) has been constructed based on Altman's Z-score model [85]. Altman suggests that a Z score, which consists of a linear combination of a set of financial ratios available on a firm's balance sheet, can be representative of the firm's financial standing. This model is applicable to firms that are publicly traded; otherwise, a modified version of the model would need to be used. According to this model, if the Z score is less than 1.81, the firm has a 95 percent chance of financial distress in the coming year. Z scores between 1.81 and 2.67 are of concern but not threatening, and scores above 2.67 raise no concerns. The purpose of the financial module is to incorporate the way financial distress (low Z scores) may affect management commitment and how safety output impacts financial well-being. The stress will ultimately distract management from safety concerns, forcing them to concentrate on recovering balance sheet figures. Lower levels of managerial commitment to safety may result in lower technician commitment to safety, substandard training, and a higher technician error probability, which increases accident risk. This will affect consumers' perception of the airline's safety and directly affect its sales and market value, lowering the Z score and pushing the airline towards a more financially distressed situation. Accidents or

incidents damage and destroy the airline's assets as well. The structure of the equations that shape these changes is provided by Mohaghegh [29]. The multipliers in this module are assumed values; more realistic values require empirical studies.

5.3. Organizational safety practices (modeled in IRIS and SD environments)

In SoTeRiA, organizational safety practices are classified as human-related activities, procedure-related activities, resource-related activities, and common activities. In this example, we include training and hiring for human-related activities. The maintenance procedure- and resource-related activities are all covered in the example. From the common activities, we only include internal auditing factors in the model.

5.3.1. Modeling training in SD

This module (Fig. 18) is intended to capture the way the experience level in an airline changes. Technicians' attrition reduces the experience level of the maintenance workforce, while hiring and training add to it. Technicians also gain experience on the job. The level of training and its quality are also managerial decisions, and are affected by management's commitment to safety. The goal is to fill the gap that exists between the level of


experience needed in the organization ('target experience' in Fig. 18) and the level that exists at any time. A decrease in commitment to safety decreases the level of training and, consequently, the amount of experience in the airline. There is also a time lag involved in the training process, which has been considered in this module [25,27].

5.3.2. Modeling hiring in SD

Cooke [25] illustrated hiring relations in the mining industry, and McCabe [86] has presented similar relations for airline employees. Our module and its equations are modified forms of these two models. The hiring module describes the process of hiring rookie maintenance technicians and their transformation into experienced technicians. In this process, some technicians, both rookies and seniors, quit the organization for different reasons (poor work conditions, excessive workload resulting from financial pressure, and management's and technicians' inattention to safety). This has been represented as the quit rate in the module (see Fig. 19). Hiring has been defined as a process of trying to reduce the gap between the existing total number of technicians in the organization and the number of technicians required to fulfill the maintenance tasks, according to the market or regulatory demands on the airline ('target technician demand' in Fig. 19). At the same time, hiring is a managerial decision, and thus depends on the level of management's commitment to safety. It is management's decision whether to hire and train more people to meet the demand, or whether to place more pressure on the existing workforce. The time lag associated with the hiring process has also been considered in the sub-model. The outcome of the module is the total number of technicians at any given time, which is used in the training module [25,86].

5.3.3. Modeling other organizational safety practices and regulatory auditing factors in IRIS

This example demonstrates the feasibility of combining deterministic and probabilistic techniques (SD and BBN). In the organizational safety practices category, training and hiring are modeled in SD. The rest of the organizational safety practices (resource- and procedure-related activities, and internal auditing activities) are represented by a static probabilistic model in the form of a BBN. As explained in Section 4.2.5, this reflects the fact that our knowledge of the relations among these factors is limited. The internal auditing activities and external regulatory auditing factors are also entered into IRIS in the form of a BBN model. These factors are then connected to the BBN representation of the maintenance unit process model explained in Section 5.5.

5.4. Individual-level PSF in SD environment

The individual-level Performance Shaping Factors (PSFs) of the SoTeRiA framework, including psychological climate, motivation, ability, and opportunity, are described by Mohaghegh [29]. To quantitatively assess the technician's error probability as a function of the individual-level PSFs, a Human Reliability module (Fig. 20) has been developed. In this module, for ability, we considered knowledge to be equivalent to the level of experience, which is imported from the training module. For opportunity, we have selected time pressure, which is a function of demand and of the number of technicians taken from the hiring module. Based on SoTeRiA, motivation is influenced by the psychological and group climates. Here we have not modeled these climates explicitly. To execute the example model, we considered the term technician commitment, which is directly influenced

Fig. 19. Hiring module in SD environment.

Fig. 20. Human reliability module in SD environment.

Fig. 21. Technician's commitment module in SD environment.

by managerial commitment. Fig. 21 shows the technician commitment module, which is a modified version of the one Cooke used in his mining model [25]. The output of the technician commitment module is fed to the Human Reliability module (Fig. 20) as an individual-level PSF. The human error probability model used in the Human Reliability module is adopted from the Nuclear Action Reliability Assessment (NARA) method [87]. NARA uses a set of Generic Task Types (GTTs) to describe the various tasks modeled in Probabilistic Risk Assessment. GTT error probabilities are further modified by factors known as Error Producing Conditions or Performance Shaping Factors. The process is mathematically simple, but requires a great deal of judgment, especially when deciding which Error Producing Conditions (EPCs) are present and which Assessed Proportions of Affect (APOA) should be used. The output of the Human Reliability module, the technician error probability, is fed to the management commitment module (Fig. 16) to model the dynamics of the management's commitment to safety. The assessed human error probability is also used as an input to the maintenance unit process model (see Section 5.5) in order to obtain a maintenance quality index, which is then used to calculate the rate of hardware failures due to maintenance error. Like the Management Commitment to Safety module, the technician commitment to safety module utilizes the concept of anchoring and adjustment [27] in modeling commitment changes. As Fig. 21 shows, relative management commitment to safety influences the technicians' personal commitment to safety. Higher managerial commitment to safety pressures technicians to be more committed to safety, and to follow the standard maintenance procedures. Conversely, lower managerial attention to safety will eventually lead technicians to be less conscious of safety. Another factor that affects the technicians' level of commitment is the rate of incidents (i.e. the technician error probability in this example). Higher incident rates, which are correlated with higher risk, will raise their commitment to safety.

5.5. Maintenance unit process model in IRIS environment

The factors of the unit process model, as defined in SoTeRiA, are included in IRIS using the BBN technique. The target node of this BBN is Aircraft Airworthiness, which represents the quality of all necessary maintenance work, based on the manufacturer's instructions. The factors of this unit are affected by the factors of organizational safety practices (mentioned before) and by regulatory auditing factors (all in BBN form and modeled within the IRIS modeling environment). All the probabilities of the nodes and their related conditional probabilities are based on Eghbali's generic maintenance model [88], developed for airlines. We refer to Mohaghegh and Mosleh [29,44] for more details about the SoTeRiA application in the maintenance unit and its mapping to Eghbali's model.

5.6. Technical system risk in IRIS environment and its links to aircraft airworthiness

In order to run this example, we selected a specific airline accident scenario from Roelen [89]. The scenario consists of two ESDs. Aircraft system failure (ESD1) describes the accident category as an uncontrolled collision with the ground, with an aircraft system failure during the takeoff phase as the initiating event. Power loss of a single engine during landing (ESD28) describes the accident type as an uncontrolled collision with the ground, with single-engine failure as the initiating event during the landing phase. Both ESDs are linked to a set of fault trees for more details regarding system failure causes.
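The maintenance-unit BBN described in Section 5.5 can be illustrated with a minimal two-parent fragment that marginalizes a target node over its organizational parents. The node names, priors, and conditional probability table below are hypothetical placeholders, not values from Eghbali's model [88]:

```python
# Minimal sketch of a BBN fragment: a hypothetical "substandard maintenance"
# target node marginalized over procedure- and resource-related parent nodes.
# All probabilities are illustrative placeholders.

from itertools import product

# Prior probabilities that each organizational practice is "adequate".
priors = {"procedures": 0.9, "resources": 0.8}

# Hypothetical CPT: P(substandard maintenance | procedures, resources).
cpt = {
    (True, True): 0.01,
    (True, False): 0.05,
    (False, True): 0.10,
    (False, False): 0.30,
}

def marginal_substandard(priors, cpt):
    """Sum out the parent nodes to get P(substandard maintenance)."""
    total = 0.0
    for proc_ok, res_ok in product([True, False], repeat=2):
        p_parents = ((priors["procedures"] if proc_ok else 1 - priors["procedures"])
                     * (priors["resources"] if res_ok else 1 - priors["resources"]))
        total += p_parents * cpt[(proc_ok, res_ok)]
    return total

p1 = marginal_substandard(priors, cpt)
```

In the paper's model, this marginal plays the role of P1, the probability of aircraft non-airworthiness, which feeds the omega factor relation developed next.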
The following figure briefly illustrates how engine failure from an accident scenario is linked to the target node of the maintenance BBN model (i.e. Aircraft Airworthiness), using an omega factor parameter [10]. Fig. 22 shows a schematic scenario of an aircraft accident, which starts from the initiating event and continues with engine failure and pilot error. The engine failure rate (λ_Engine) can be divided into four contributors: the rate of inherent failures (λ_I), the rate of engine failures due to maintenance error (λ_MX), the rate of engine failures due to mismanagement by the crew (λ_C), and the rate of engine failures due to external factors (λ_EXT), as shown in Fig. 22. The inherent portion of the failure rate represents failure mechanisms that are beyond the control of the organization operating the airline. In its simplest form, λ_I represents the expected failure behavior specified by the manufacturer. If there are no additional influences by the organization, no adverse external factors, and no mismanagement by the crew, the component should perform according to the manufacturer's expected λ_I (an exponential, constant-failure-rate reliability model is assumed). According to the omega factor approach, a parameter ω is defined as

ω = λ_MX / λ_I = N_MX / N_I    (5.3)

where N_MX is the number of maintenance-related failures and N_I is the number of inherent failures for the engine. In order to establish a relation between the omega (ω) factor and maintenance performance, a term P1 is defined as the probability of aircraft non-airworthiness, which can be estimated from the target node of the maintenance BBN model. P1 can also be viewed as the probability of substandard maintenance and estimated as

P1 = N_subSTD / N_T-maint    (5.4)

where N_T-maint is the total number of maintenance actions performed on an aircraft and N_subSTD is the number of substandard maintenance actions. The total number of maintenance actions (N_T-maint) is the combination of required maintenance based on procedure (N_maint) and non-procedure (random) maintenance (N_random). Since N_random is much smaller than N_maint, N_T-maint is roughly equal to N_maint. We also define P as the probability that maintenance activities result in an engine failure:

P = N_MX / N_maint,    P = P1 · P2,    P2 = N_MX / N_subSTD    (5.5)

where P2 stands for the probability that substandard maintenance will result in engine failure. The following equations show how P1 and ω are proportional:

ω = P1 · P2 · K,    K = N_maint / N_I    (5.6)

N_maint and N_I in Eq. (5.6) should refer to the same operational exposure period (hours of flight). K is a constant design factor (e.g. for a particular aircraft and manufacturer) and is not related to the airline; P1 is related to the maintenance organization; and P2 represents the sensitivity of each main component to maintenance. However, as the value of P2 is different for different component categories, the value of ω is specific to each main component (e.g. engine, gear). Therefore, there is a constant value (K′) for each component category:

K′ = K · P2 = (1/P1G) · ω_Engine_G    (5.7)

Fig. 22. The link between aircraft airworthiness and the technical risk model, modeled with the omega factor approach.

where ω_Engine_G is the generic value of ω for the engine, which can be estimated from failure databases according to Eq. (5.3), and P1G is the generic value (based on data) for the probability of non-airworthiness. Based on SoTeRiA, the state of aircraft airworthiness (a safety-critical task) is affected by the root organizational factors. If we change (improve or worsen) the organizational factors, the new value of the engine failure rate due to maintenance (λ_MX_new) is estimated based on Eqs. (5.3) and (5.7) as follows:

λ_MX_new = ω_Engine_new · λ_I,    ω_Engine_new = P1_new · K′    (5.8)

where ω_Engine_new is the new value of the omega factor for the engine and P1_new stands for the new value of P1, estimated from the new state of organizational factors. The states of organizational factors impact the value of P1 through their effects on the factors of the safety causal model. λ_MX_new is used in the fault tree (in Fig. 22) to estimate the probability of engine failure, Pr(Engine). Considering the changes in organizational factors, the new value for the probability of engine failure, Pr(Engine_new), is calculated using the following equations:

λ_Engine_new = (1 + ω_Engine_new) · λ_I + λ_EXT + λ_C

Pr(Engine_new) = 1 − exp(−λ_Engine_new · T_mission)    (5.9)

where T_mission stands for the mission time. It should be mentioned that λ_C may also vary due to changes in organizational factors, but for simplicity's sake this is not considered here. The same approach can be used to assess the effects of organizational factors on pilot error (in Fig. 22), and to estimate the new value for the probability of pilot error under the condition of engine failure, Pr(Pilot Error_new | Engine failure). These probabilities are plugged into the ESD in order to estimate the new accident probability, Pr(F), based on the new states of organizational factors:

Pr(F) = Pr(Engine_new) · Pr(Pilot Error_new | Engine failure)    (5.10)

Fig. 23. Management commitment, technician commitment, and technician error probability over 15 years.

Fig. 24. A period of low human error as a trigger point for a gradual decline in management commitment to safety.

Fig. 25. Total system risk over 15 years.

5.7. Typical outputs

Fig. 23 displays the calculated trends of management commitment to safety, technician commitment to safety, and technician error probability over a period of 15 years. As shown in Fig. 23, the model predicts that an increase in technician error probability leads, with a time delay, to an increase in management commitment to safety, and eventually raises the technicians' commitment to safety. A change in the management's commitment to safety is a function of the deviation of the safety output from a normative safety level. The normative level of safety is usually set by regulators, or by social and cultural standards. In this example, only the technician error probability is taken as the measure of safety output. In reality, there are different measures of safety outputs (e.g. incidents and accidents, hardware faults, and operator errors) that may trigger management commitment to safety, and consideration of these safety outputs requires an integration of the airline's operations model and maintenance model. According to Fig. 23, the technicians' higher levels of commitment will eventually cause error probabilities to decline; management sensitivity and commitment to safety will then decrease, leading again to a rise in error probabilities. In a similar situation, Fig. 24 demonstrates a case where low error probabilities remain mostly constant for the first couple of years, possibly resulting in a more relaxed managerial attitude towards safety. As the managers' commitment (and, consequently, the technicians' commitment) declines, however, error probabilities increase and later reach higher peaks. In other words, a significant accident in the organization can happen after a period of very low incident rates. Familiarity with this type of organizational safety behavior can guide managers to schedule appropriate periods for internal auditing activities,

which should be treated as early warnings, for all organizational members, of a possible increase in incident/accident rates. Other outputs of the model (Fig. 25) include the time trend of system risk (as measured, for instance, by aircraft crash rates). The predicted time trends show possible future risk spikes that may exceed the acceptable threshold, even though the current levels are quite acceptable. Again, modeling the dynamics of the organization can provide early warning of an increased likelihood of accidents. Fig. 25 shows very small changes in total system safety risk, given all the existing fluctuations in organizational factors (e.g. managerial commitments). One possible explanation is that the results of this example are based on many simplifying assumptions (e.g. considering only two of many possible ESDs, modeling the organizational effects solely on airline maintenance activities while ignoring the important common impacts of the organization on flight operations) and on the use of limited actual data. On the other hand, a promising explanation of these small changes can be based on the discussion in Section 3 regarding high-reliability organizations. More specifically, barriers such as technological resilience, regulatory and industrial standards, and high levels of professionalism tend to dampen the propagation of organizational fluctuations to the system safety risk. If we expand the model and run it with more actual data, the time trend of system risk can reveal the organization's reliability, and indicate which organizational and managerial factors can result in future risk spikes. As an example, if a manager has to decide between reduced hiring and reduced training in an abnormal situation, two different runs of such a model can predict which decision would lead to riskier situations over the next couple of years.
Furthermore, if the results show that both decisions have very small effects on the total system safety risk, the manager(s) can more confidently prioritize the various financial concerns for that specific case. Fig. 26 also illustrates the effects of the dynamic interaction of financial and safety performance through the time traces of their respective metrics. This example shows that if, for any reason (e.g. a disaster that affects the entire industry, such as 9/11), an airline is subjected to financial distress (i.e. a decreasing Z-score over time, as in Fig. 26), management commitment to safety can decline, causing the technician error probability to increase. One possibility is that as an organization faces financial distress, the management intuitively concentrates on service and on achieving a higher turnover rate. This may translate into a lower level of safety commitment, manifested in lowered attention to training and more time pressure on maintenance technicians. Since technicians will try to meet the service expectations imposed by the management, they will respond to low management commitment to safety by performing their tasks (e.g. maintenance) in a

substandard manner, and by generally engaging in riskier behaviors in order to meet the deadlines and the schedule set by the management. This leads to higher error probabilities for the technicians and, consequently, higher risks. However, managers may ignore the fact that higher risks and higher incident rates in an airline will ultimately affect the market's perception of the airline and result in deeper financial distress. On the other hand, this may be countered by a subsequent increase in the management's commitment to safety, in response to increases in incident rates. In other words, observing higher error probabilities may force management to pay more attention to safety while financial pressures are in effect. The result of this competition is a gradual increase in the peak points of error probabilities and a gradual decline in the minimum levels of management commitment to safety, as shown in Fig. 26. Running such a model with the actual financial data of an organization can help managers to make better decisions about organizational safety practices (e.g. training and hiring), leading to less decline in organizational outputs (i.e. safety and financial outputs).
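Before concluding, the quantification chain of Sections 5.4-5.6 can be summarized in a short sketch that combines a NARA/HEART-style human error probability with the omega factor update of Eqs. (5.8)-(5.10). All numerical inputs (the GTT value, EPC/APOA pairs, failure rates, K′, and pilot error probability) are hypothetical placeholders:

```python
import math

def nara_hep(gtt_nominal, epc_apoa_pairs):
    """NARA/HEART-style HEP: GTT nominal value scaled by
    [(EPC - 1) * APOA + 1] for each applicable condition."""
    hep = gtt_nominal
    for epc, apoa in epc_apoa_pairs:
        hep *= (epc - 1.0) * apoa + 1.0
    return min(hep, 1.0)  # probabilities are capped at 1

def accident_probability(p1_new, k_prime, lam_i, lam_ext, lam_c,
                         t_mission, pr_pilot_error_given_engine):
    """Omega factor update, following Eqs. (5.8)-(5.10)."""
    omega_new = p1_new * k_prime                          # Eq. (5.8)
    lam_engine = (1.0 + omega_new) * lam_i + lam_ext + lam_c
    pr_engine = 1.0 - math.exp(-lam_engine * t_mission)   # Eq. (5.9)
    return pr_engine * pr_pilot_error_given_engine        # Eq. (5.10)

# Hypothetical technician PSFs: time pressure, low experience, low morale.
hep = nara_hep(0.003, [(11, 0.2), (3, 0.5), (2, 0.1)])

# Hypothetical failure rates (per flight hour) and a two-hour mission.
pr_f = accident_probability(p1_new=0.03, k_prime=2.0,
                            lam_i=1e-5, lam_ext=2e-6, lam_c=1e-6,
                            t_mission=2.0,
                            pr_pilot_error_given_engine=0.01)
```

In the full model, the HEP drives the maintenance quality index, which in turn moves P1_new, so a deterioration in organizational factors propagates through ω to the accident probability Pr(F).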

6. Concluding remarks

This paper reports on one of the results of a research effort whose primary purpose is to extend Probabilistic Risk Analysis (PRA) to include the organizational roots of risk. The research included the development of a set of principles [29,42] and a multidisciplinary theoretical framework, Socio-Technical Risk Analysis (SoTeRiA) [29,42,44], for the field of organizational safety risk analysis. The principles and the proposed safety framework are briefly described in this paper to clarify the scope of the research and facilitate the discussion. The main focus of this paper is on the choice of representational schemes and techniques to operationalize the SoTeRiA framework. Over the past three decades, there have been significant improvements in the sophistication of quantitative methods of safety and risk assessment, but progress on techniques most suitable for organizational safety risk frameworks has been limited. The effort documented in this paper is a step towards filling this gap, and its contributions can be summarized as follows:

1. Hybrid methods are recognized as an effective way to deal with the multidisciplinary nature of organizational safety and the corresponding assessment frameworks.

2. A rationale and a process for selecting appropriate techniques to create a hybrid approach are offered. Candidates are taken from PRA techniques for technological systems, Human Reliability Analysis, social and behavioral science, business process modeling, and dynamic modeling.

3. Methods are explored and proposed for converting candidate techniques into common techniques for modeling organizational effects, and for their integration into a hybrid approach.

4. An example of the proposed hybrid modeling environment, integrating SD, BBN, ESD, and FT, is applied in the civil aviation domain to demonstrate the feasibility and value of the hybrid framework.
The proposed hybrid technique integrates deterministic and probabilistic modeling perspectives, and offers a flexible risk-informed decision-making tool. The approach is generic and can be applied in different high-risk industries, such as nuclear power, aerospace, and healthcare. Since the variables in the model can be chosen to correspond to decision parameters set by managers (e.g. the rate of hiring and the frequency of training), the hybrid method can be

Fig. 26. Financial stresses as a trigger point for a gradual decline in management commitment to safety.

used by an organization's management to study the impact of their decisions on safety-related outcomes, such as the probability of human errors. This is done with explicit consideration of (a) dynamic effects, such as time lags between decisions and outcomes, and feedback effects (e.g. the impacts of incidents on workers' and managers' awareness of and attention to safety); (b) the uncertain nature of the relation between human performance and its organizational context; and (c) the impact of human performance on the systems and on the evolution of risk scenarios. The hybrid model provides the flexibility of using appropriate modeling techniques in each of the above areas: the SD method for (a), BBNs for (b), and PRA logic modeling techniques (ESD/FT) for (c). We used typical outputs of the example of Section 5 to show how the proposed hybrid approach can help to study the dynamic effects of organizational factors on technical system safety risk. We note that the purpose of the example is to explore the feasibility of the proposed hybrid methodology rather than to perform a comprehensive and realistic numerical estimation of aviation risks. An important open question is how sensitive the level of safety performance of an organization is to fluctuations in certain organizational factors. A more in-depth answer to this question requires future research to extend the model to include other aspects of operations (e.g. a flight crew model), and to analyze the common effects of organizational factors on both the operation and the maintenance of an airline (for example, the impact of cost-cutting measures on both maintenance personnel errors and flight crew errors). If we expand the model and run it with more actual data, the time trend of system risk can reveal how reliable the organization is, and which organizational and managerial factors can result in future risk spikes.
As an example, if a manager has to decide between reduced hiring and reduced training in an abnormal situation, two different runs of such a model can predict which decision would lead to riskier situations over the next couple of years. Furthermore, if the results show that both decisions have very small effects on the total system safety risk, managers can more confidently prioritize the various financial concerns for that specific case.

Acknowledgments

The work described in this paper was supported in part by the US Federal Aviation Administration. The authors are indebted to the FAA William J. Hughes Technical Center for their support. The opinions expressed in this paper are those of the authors and do not reflect any official position of the FAA.

References
[1] Rasmussen J. Risk management in a dynamic society: a modeling problem. Saf Sci 1997;27:183-213.
[2] Rasmussen N. Reactor safety study. WASH-1400. Washington, DC: US Nuclear Regulatory Commission; 1975.
[3] Swain AD, Guttmann HE. Handbook of human reliability analysis with emphasis on nuclear power plant applications. NUREG/CR-1278. US Nuclear Regulatory Commission; 1983.
[4] Reason J. Human error. New York: Cambridge University Press; 1990.
[5] Reason J. Managing the risks of organizational accidents. Aldershot, Hants, England; Brookfield, VT, USA: Ashgate; 1997.
[6] Embrey DE. Incorporating management and organizational factors into probabilistic safety assessment. Reliab Eng Syst Saf 1992;38:199-208.
[7] Davoudian K, et al. Incorporating organizational factors into risk assessment through the analysis of work processes. Reliab Eng Syst Saf 1994;45:85.
[8] Davoudian K, et al. The work process analysis model (WPAM). Reliab Eng Syst Saf 1994;45:107.
[9] Pate-Cornell ME, Murphy DM. Human and management factors in probabilistic risk analysis: the SAM approach and observations from recent applications. Reliab Eng Syst Saf 1996;53(2):115-26.
[10] Mosleh A, Golfeiz EB. An approach for assessing the impact of organizational factors on risk. Technical research report, Center for Technology Risk Studies, University of Maryland, College Park, 1999.

[11] Luxhoj J. Building a safety risk management system: a proof-of-concept prototype. FAA/NASA risk analysis workshop, 2004.
[12] Roelen ALC, Wever R, Hale AR, Goossens LHJ, Cooke RM, Lopuhaa R, et al. Causal modeling for integrated safety at airports. In: Safety and reliability. Lisse: Swets and Zeitlinger; 2003. ISBN 90-5809-551-7.
[13] Smidts C, Devooght J, Labeau PE. Dynamic reliability: future directions. International workshop on dynamic reliability, 2000.
[14] Mosleh A, Zhu D, Hu Y, Nejad H. SimPRA (simulation-based probabilistic risk assessment system). US patent pending, University of Maryland, 2007.
[15] Hollnagel E. Cognitive reliability and error analysis method (CREAM). Amsterdam: Elsevier; 1998.
[16] Mosleh A, Chang YH. Model-based human reliability analysis: prospects and requirements. Reliab Eng Syst Saf 2004;83(2):241-53.
[17] Siu N. Risk assessment for dynamic systems: an overview. Reliab Eng Syst Saf 1994;43:43-73.
[18] Chang YH, Mosleh A. Cognitive modeling and dynamic probabilistic simulation of operating crew response to complex system accidents. Part 5: dynamic probabilistic simulation of the IDAC model. Reliab Eng Syst Saf 2006;92:1076-101.
[19] Rochlin GI, La Porte TR, Roberts KH. The self-designing high-reliability organization: aircraft carrier flight operations at sea. Naval War College Rev 1987;Autumn.
[20] Weick K. Managing the unexpected: assuring high performance in an age of complexity. New York: Wiley; 1977.
[21] Senge PM. The fifth discipline: the art and practice of the learning organization. New York: Doubleday; 1990.
[22] Perrow C. Normal accidents. New York: Basic Books; 1984.
[23] Biondi EL. Organizational factors in the reliability assessment of offshore systems. Master's thesis, Ocean Engineering, Oregon State University, Corvallis, OR, 1998.
[24] Bella A. Organized complexity in human affairs: the tobacco industry. J Bus Ethics 1997;16:977-99.
[25] Cooke DL. The dynamics and control of operational risk. PhD
thesis, University of Calgary, 2004.
[26] Leveson N. A new accident model for engineering safer systems. Saf Sci 2004;42(4):237-70.
[27] Sterman J. Business dynamics: systems thinking and modeling for a complex world. McGraw-Hill; 2000.
[28] Yu J, Ahn N, Jae M. A quantitative assessment of organizational factors affecting safety using a system dynamics model. J Korean Nucl Soc 2004;36(1):64-72.
[29] Mohaghegh Z. On the theoretical foundations and principles of organizational safety risk analysis. PhD thesis, University of Maryland, 2007.
[30] Walton M. The Deming management method. New York: Putnam; 1986.
[31] Kennedy R, Kirwan B. Development of a hazard and operability-based method for identifying safety management vulnerabilities in high risk systems. Saf Sci 1998;30:249-74.
[32] Ostroff C, Kinicki A, Tamkins M. Organizational culture and climate. In: Borman WC, Ilgen DR, Klimoski RJ, editors. Comprehensive handbook of psychology, vol. 12: I/O psychology. New York: Wiley; 2003.
[33] Cooper MD. Towards a model of safety culture. Saf Sci 2000;36:111-36.
[34] Cox S, Cox T. The structure of employee attitudes to safety: a European example. Work Stress 1991;5:93-104.
[35] Zohar D, Luria G. A multilevel model of safety climate: cross-level relationships between organization and group-level climates. J Appl Psychol 2005;90(4):616-28.
[36] Griffin M, Neal A. Perceptions of safety at work: a framework for linking safety climate to safety performance, knowledge, and motivation. J Occup Health Psychol 2000;5:347-58.
[37] Ostroff C. Best practices. Human resource management: ideas and trends in personnel. Issue no. 356. Chicago: CCH Inc.; 1995. p. 112.
[38] Emery FE, Trist EL. Socio-technical systems. In: Management science models and techniques, vol. 2. London: Pergamon; 1960.
[39] Lewin K. Field theory in social science. New York: Harper and Row; 1951.
[40] Mintzberg HT. Structure in fives: designing effective organizations. Englewood Cliffs, NJ: Prentice-Hall; 1983.
[41] Burke W, Litwin G. A causal model of organizational performance and change. J Manage 1992;18(3):523-45.
[42] Mohaghegh Z, Mosleh A. Incorporating organizational factors into probabilistic risk assessment (PRA) of complex socio-technical systems: principles and theoretical foundations. Saf Sci, in press.
[43] Mohaghegh Z, Mosleh A. Multi-dimensional measurement perspective in modeling safety risk. In: Proceedings of the European Safety and Reliability Association, ESREL 2007.
[44] Mohaghegh Z, Mosleh A. Framework for incorporating organizational factors in safety causal models. Technical report prepared for the Federal Aviation Administration, Center for Risk and Reliability, University of Maryland, July 2007.
[45] Richmond B. An introduction to systems thinking: STELLA. High Performance Systems Inc; 2001.
[46] Hannon B, Ruth M. Dynamic modeling. 2nd ed. Berlin: Springer; 2001.
[47] IRIS: integrated risk information system software, developed at the Center for Risk and Reliability, University of Maryland, College Park, MD, 2007.
[48] Whetten DA. What constitutes a theoretical contribution? Acad Manage Rev 1989;14(4):490-5.
[49] Bacharach S. Organizational theories: some criteria for evaluation. Acad Manage Rev 1989;14:496-515.

ARTICLE IN PRESS
1018 Z. Mohaghegh et al. / Reliability Engineering and System Safety 94 (2009) 10001018

[50] Kozlowski S, Klein K. A multilevel approach to theory and research in organizations: contextual, temporal, and emergent processes. In: Klein KJ, Kozlowski SWJ, editors. Multilevel theory, research, and methods in organizations: foundations, extensions, and new directions. Jossey-Bass Inc.; 2000.
[51] Kozlowski S, Gully SM, Nason ER, Smith EM. Developing adaptive teams: a theory of compilation and performance across levels and time. In: Ilgen DR, Pulakos ED, editors. The changing nature of work performance: implications for staffing, personnel actions, and development. San Francisco: Jossey-Bass; 1999.
[52] Argyris C, Schon D. Organizational learning II: theory, method and practice. Reading, MA: Addison-Wesley; 1996.
[53] Carroll J, Rudolph J, Hatakenaka S. Learning from experience in high-hazard organizations. In: Staw B, Kramer R, editors. Research in organizational behavior: an annual series of essays and critical reviews, vol. 24, 2002. p. 87-137.
[54] Stamatelatos M. Probabilistic risk assessment procedures guide for NASA managers and practitioners. Washington, DC: Office of Safety and Mission Assurance, NASA Headquarters; 2002.
[55] Henley E, Kumamoto H. Reliability engineering and risk assessment. Englewood Cliffs, NJ: Prentice-Hall; 1981.
[56] Lee CY. Representation of switching circuits by binary-decision programs. Bell Syst Tech J 1959;38(4):985-99.
[57] Akers SB. Binary decision diagrams. IEEE Trans Comput 1978;C-27(6):509-16.
[58] Bryant R. Graph-based algorithms for Boolean function manipulation. IEEE Trans Comput 1987;35(8):677-91.
[59] Rauzy A. New algorithms for fault trees analysis. Reliab Eng Syst Saf 1993;40:203-11.
[60] Rauzy A, Dutuit Y. Exact and truncated computations of prime implicants of coherent and non-coherent fault trees within Aralia. Reliab Eng Syst Saf 1997;58:127-44.
[61] Sinnamon RM, Andrews JD. Improved efficiency in qualitative fault tree analysis. Qual Reliab Eng Int 1997;13:285-92.
[62] Groen F, Smidts C, Mosleh A, Swaminathan S. QRAS – the quantitative risk assessment system. In: IEEE proceedings of the annual reliability and maintainability symposium; 2002. p. 349–58.
[63] Diergardt M. Modeling scenarios for analyzing the risk of complex computer based information system. PhD thesis, Swiss Federal Institute of Technology, Zurich; 2005.
[64] Mayer RJ, Menzel CP, Painter MK, de Witte PS, Blinn T, Perakath B. Information integration for concurrent engineering (IICE) IDEF3 process description capture method report. Technical report, Knowledge Based Systems Inc., College Station, TX; 1995.
[65] Marca DA, McGowan CL. SADT: structured analysis and design technique. New York: McGraw-Hill; 1988.
[66] Heins W. Structured analysis and design technique (SADT): application on safety systems. Delft: TopTech Studies; 1993.
[67] Hale AR, Heming BHJ, Carthey J, Kirwan B. Modeling of safety management systems. Saf Sci 1997;26:121–40.
[68] Federal Aviation Administration. Air carrier operation system model. Report DOT/FAA/AR-0045; 2001.

[69] James LR, Mulaik S, Brett J. Causal analysis: assumptions, models, and data. Beverly Hills, CA: Sage; 1982.
[70] Jex S. Organizational psychology. New York: Wiley; 2002.
[71] Simon HA. Spurious correlation: a causal interpretation. J Am Stat Assoc 1954;49:469–92.
[72] Bollen K. Structural equations with latent variables. New York: Wiley; 1989.
[73] Gorsuch RL. Factor analysis. Hillsdale, NJ: Lawrence Erlbaum; 1983.
[74] Howard RA, Matheson JE. Influence diagrams. In: Howard RA, Matheson JE, editors. Readings on the principles and applications of decision analysis, vol. II. Menlo Park, CA: Strategic Decision Group; 1981.
[75] Pearl J. Bayesian networks: a model of self-activated memory for evidential reasoning. In: Proceedings of the 7th conference of the Cognitive Science Society, University of California, Irvine, CA; 1985. p. 329–34.
[76] Bobbio A, Portinale L, Minichino M, Ciancamerla E. Comparing fault trees and Bayesian networks for dependability analysis. In: Proceedings of the 18th international conference on computer safety, reliability and security, SAFECOMP'99, vol. 1698; 1999. p. 310–22.
[77] Torres-Toledano J, Sucar L. Bayesian networks for reliability analysis of complex systems. In: Lecture notes in artificial intelligence, vol. 1484. Berlin: Springer; 1998.
[78] Kim MC, Seong PH. Reliability graph with general gates: an intuitive and practical method for system reliability analysis. Reliab Eng Syst Saf 2002;78(3):239–46.
[79] Mosleh A, Wang C, Groth K, Mohaghegh Z. Integrated methodology for identification, classification and assessment of aviation system risk. Prepared for the Federal Aviation Administration (FAA). Center for Risk and Reliability; 2005.
[80] Wang C. Hybrid causal methodology for risk assessment. PhD thesis, University of Maryland, Center for Risk and Reliability; 2007.
[81] Wooldridge M. An introduction to multi-agent systems. New York: Wiley; 2002.
[82] Forrester J. Industrial dynamics. Cambridge, MA: The MIT Press; 1961.
[83] Ford A. Modeling the environment: an introduction to system dynamics modeling of environmental systems. Washington, DC: Island Press; 1999.
[84] Roehrig S. Path analysis and probabilistic networks: analogous concepts. IEEE Trans 1993;3:523–32.
[85] Altman E. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. J Finance 1968;XXIII(4):589–609.
[86] McCabe R. Why airlines succeed or fail: a system dynamics synthesis. PhD thesis, The Claremont Graduate University; 1998.
[87] Kirwan B, Gibson H, Kennedy R, Edmunds J, Cooksley G, Umbers I. Nuclear action reliability assessment (NARA): a data-based HRA tool. In: The 7th probabilistic safety assessment and management conference. Berlin: Springer; 2004.
[88] Eghbali H. Performance measures and risk indicators for Title 14 CFR Part 121 air carriers' maintenance operations. US Department of Transportation, Federal Aviation Administration, William J. Hughes Technical Center, Flight Safety Branch; 2006.
[89] Roelen A, Wever R. Accident scenarios for an integrated aviation safety model. NLR-CR-2005-560, NLR, Amsterdam; 2005.
