Reliability Block Diagrams

A reliability block diagram is a graphical procedure which describes the system operation in terms of successful "signal" transmission between the system units.
[Figure: reliability block diagrams of a two-unit series system and a two-unit active-parallel system]
Consider a system which consists of two units, both of which must function for the system to function (series system). Assume component failures are statistically independent and let

- $A_1$: Unit #1 functions at time $t$; $P(A_1) = R_1(t)$
- $A_2$: Unit #2 functions at time $t$; $P(A_2) = R_2(t)$
- $A$: system functions at time $t$; $P(A) = R(t)$

Then

$$P(A) = P(A_1 A_2) = P(A_1) P(A_2) \;\Rightarrow\; R(t) = R_1(t) R_2(t).$$

If the system functions when either Unit #1 or Unit #2 functions (active-parallel system), then

$$P(A) = P(A_1 + A_2) = P(A_1) + P(A_2) - P(A_1) P(A_2) \;\Rightarrow\; R(t) = R_1(t) + R_2(t) - R_1(t) R_2(t).$$

If the units are identical with constant failure rate $\lambda$,

$$R_{series}(t) = e^{-2\lambda t}, \qquad R_{parallel}(t) = 2e^{-\lambda t} - e^{-2\lambda t}.$$
[Figure: two panels, $R(t)$ and $F(t)$ versus $\lambda t$ (0 to 1.8), with curves for $R(t) = 2e^{-\lambda t} - e^{-2\lambda t}$, $R(t) = e^{-\lambda t}$ and $R(t) = e^{-2\lambda t}$]
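As a quick numerical check of the two-unit formulas, the following Python sketch evaluates the series and active-parallel reliabilities for identical units with constant failure rate $\lambda$. The function names and the sample values of $\lambda t$ are illustrative assumptions, not part of the notes.

```python
import math


def unit_reliability(lam: float, t: float) -> float:
    """R(t) = exp(-lam * t) for a unit with constant failure rate lam."""
    return math.exp(-lam * t)


def series_reliability(r1: float, r2: float) -> float:
    """Both independent units must function: R = R1 * R2."""
    return r1 * r2


def parallel_reliability(r1: float, r2: float) -> float:
    """At least one of two independent units must function: R = R1 + R2 - R1*R2."""
    return r1 + r2 - r1 * r2


if __name__ == "__main__":
    lam = 1.0  # assumed failure rate, arbitrary units
    for t in (0.2, 1.0, 1.8):
        r = unit_reliability(lam, t)
        print(f"lam*t={lam * t:.1f}  unit={r:.4f}  "
              f"series={series_reliability(r, r):.4f}  "
              f"parallel={parallel_reliability(r, r):.4f}")
```

For any $t > 0$ the series value lies below the single-unit reliability and the active-parallel value lies above it, which is the ordering of the three curves in the plot.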
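In general, for a series system of $N$ units,

$$R(t) = \prod_{n=1}^{N} R_n(t),$$

for an active-parallel system of $N$ units,

$$R(t) = 1 - \prod_{n=1}^{N} [1 - R_n(t)],$$

and for an $M$-out-of-$N$ system of identical units, each with reliability $R(t)$,

$$R_{(M)N}(t) = \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!}\,[R(t)]^{n}\,[1 - R(t)]^{N-n}.$$

For identical units with constant failure rate $\lambda$,

$$MTTF_{series} = \int_0^\infty dt\, e^{-N\lambda t} = \frac{1}{N\lambda}$$

and, with the substitution $y = 1 - e^{-\lambda t}$,

$$MTTF_{parallel} = \int_0^\infty dt\, \{1 - [1 - e^{-\lambda t}]^{N}\} = \frac{1}{\lambda}\int_0^1 dy\, \frac{1 - y^{N}}{1 - y} = \frac{1}{\lambda}\int_0^1 dy \sum_{n=0}^{N-1} y^{n} = \frac{1}{\lambda}\sum_{n=0}^{N-1} \frac{1}{n+1} = \frac{1}{\lambda}\sum_{n=1}^{N} \frac{1}{n}.$$

Similarly,

$$MTTF_{M\text{-out-of-}N} = \int_0^\infty dt \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!}\, e^{-n\lambda t}\,[1 - e^{-\lambda t}]^{N-n} = \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!} \int_0^\infty dt\, e^{-n\lambda t}\,[1 - e^{-\lambda t}]^{N-n} = \frac{1}{\lambda}\sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!} \int_0^1 dy\, (1-y)^{n-1}\, y^{N-n}$$

where, integrating by parts,

$$I_n \equiv \int_0^1 dy\, (1-y)^{n-1}\, y^{N-n} = \frac{n-1}{N-n+1}\int_0^1 dy\, (1-y)^{n-2}\, y^{N-n+1} = \frac{n-1}{N-n+1}\, I_{n-1} \quad \text{with} \quad I_1 = \int_0^1 dy\, y^{N-1} = \frac{1}{N}.$$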
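The $M$-out-of-$N$ expressions can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and the closed form it uses comes from unrolling the recursion to $I_n = (n-1)!\,(N-n)!/N!$, so that each term of the sum reduces to $1/(n\lambda)$. That last step is not written out in the notes. The closed form is compared with a trapezoidal integration of $R_{(M)N}(t)$.

```python
import math


def m_out_of_n_reliability(m: int, n_units: int, r: float) -> float:
    """Probability that at least m of n_units identical independent units work."""
    return sum(math.comb(n_units, n) * r**n * (1.0 - r)**(n_units - n)
               for n in range(m, n_units + 1))


def mttf_closed_form(m: int, n_units: int, lam: float) -> float:
    """MTTF = (1/lam) * sum_{n=m}^{N} 1/n, from unrolling the recursion for I_n."""
    return sum(1.0 / n for n in range(m, n_units + 1)) / lam


def mttf_numeric(m: int, n_units: int, lam: float, steps: int = 20000) -> float:
    """Trapezoidal estimate of the integral of R(t) from 0 to 40/lam."""
    t_max = 40.0 / lam  # the integrand is negligible beyond this point
    dt = t_max / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        r = math.exp(-lam * k * dt)  # unit reliability at time k*dt
        total += weight * m_out_of_n_reliability(m, n_units, r)
    return total * dt


if __name__ == "__main__":
    lam = 1.0  # assumed failure rate
    for m, n_units in ((1, 3), (2, 3), (3, 3)):
        print(m, n_units, mttf_closed_form(m, n_units, lam),
              mttf_numeric(m, n_units, lam))
```

Setting `m = 1` recovers the active-parallel result and `m = n_units` recovers the series result $1/(N\lambda)$.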
Example 1

A system consists of 7 units connected as shown in the following reliability block diagram. Units 1 through 4 are different (with 2, 3, and 4 in active-parallel) and 3 units of type 5 constitute a 2-out-of-3 system. If $R_i(t)$ ($i = 1, 2, \ldots, 5$) denotes the reliability function of each unit as a function of time, find the reliability function for the system.
[Figure: reliability block diagram for Example 1]
Solution

$$R_{234}(t) = R_2(t) + R_3(t) + R_4(t) - R_2(t) R_3(t) - R_2(t) R_4(t) - R_3(t) R_4(t) + R_2(t) R_3(t) R_4(t)$$

$$R_{1234}(t) = R_1(t) + R_{234}(t) - R_1(t) R_{234}(t)$$

Also, since for an $M$-out-of-$N$ (good) system with identical units

$$R_{(M)N}(t) = \sum_{n=M}^{N} \frac{N!}{n!\,(N-n)!}\,[R(t)]^{n}\,[1 - R(t)]^{N-n}$$

we have

$$R_{555}(t) = 3R_5^2(t)\,[1 - R_5(t)] + R_5^3(t) = 3R_5^2(t) - 2R_5^3(t)$$

and the reliability function for the system is

$$R_{sys}(t) = R_{1234}(t)\, R_{555}(t).$$
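The Example 1 solution can be written as a short Python sketch. The function names and the unit reliabilities in the usage lines are hypothetical. $R_{234}$ is computed in the equivalent product form $1 - (1-R_2)(1-R_3)(1-R_4)$ for three independent units in active-parallel.

```python
def parallel(*reliabilities: float) -> float:
    """Active-parallel block of independent units: 1 - prod(1 - R_i)."""
    unreliability = 1.0
    for r in reliabilities:
        unreliability *= 1.0 - r
    return 1.0 - unreliability


def two_out_of_three(r: float) -> float:
    """2-out-of-3 block of identical units: 3 R^2 (1 - R) + R^3 = 3 R^2 - 2 R^3."""
    return 3.0 * r**2 - 2.0 * r**3


def example1_system(r1: float, r2: float, r3: float, r4: float, r5: float) -> float:
    """R_sys = R_1234 * R_555, following the solution of Example 1."""
    r234 = parallel(r2, r3, r4)
    r1234 = r1 + r234 - r1 * r234
    return r1234 * two_out_of_three(r5)


if __name__ == "__main__":
    # Hypothetical unit reliabilities at some fixed time t.
    print(example1_system(0.9, 0.8, 0.7, 0.6, 0.95))
```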
Example 2

Find $R(t)$ and the MTTF for the system whose reliability diagram is given below. In calculating the MTTF, assume all components are identical and fail randomly with failure rate $\lambda$.

[Figure: reliability block diagram for Example 2]
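Solution

Conditioning on the state of unit 4, with $\bar{R}_4 = 1 - R_4$,

$$R_{sys} = R_4\, R_{(sys|4)} + \bar{R}_4\, R_{(sys|\bar{4})}$$

$$R_{(sys|4)} = 1 - (1 - R_2)(1 - R_3)(1 - R_5)$$

$$R_{(sys|\bar{4})} = R_1 (R_2 + R_3 - R_2 R_3)$$

$$R_{sys} = R_4 [1 - (1 - R_2)(1 - R_3)(1 - R_5)] + (1 - R_4)\, R_1 (R_2 + R_3 - R_2 R_3)$$

If all components are identical and fail randomly with failure rate $\lambda$,

$$R_{sys}(t) = e^{-\lambda t}\,[1 - (1 - e^{-\lambda t})^3] + (1 - e^{-\lambda t})\, e^{-\lambda t}\, (2e^{-\lambda t} - e^{-2\lambda t})$$

$$\Rightarrow R_{sys}(t) = 5e^{-2\lambda t} - 6e^{-3\lambda t} + 2e^{-4\lambda t}$$

$$\Rightarrow MTTF = \int_0^\infty dt\, R_{sys}(t) = \frac{5}{2\lambda} - \frac{6}{3\lambda} + \frac{2}{4\lambda} = \frac{1}{\lambda}$$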
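The decomposition on unit 4 and the resulting MTTF can be checked with a few lines of Python. This is a minimal sketch, assuming independent components; the function names are hypothetical. The MTTF helper uses the fact that a term $c_k e^{-k\lambda t}$ integrates to $c_k/(k\lambda)$.

```python
import math
from fractions import Fraction


def r_sys(r1: float, r2: float, r3: float, r4: float, r5: float) -> float:
    """System reliability of Example 2, conditioning on the state of unit 4."""
    given_4_works = 1.0 - (1.0 - r2) * (1.0 - r3) * (1.0 - r5)
    given_4_failed = r1 * (r2 + r3 - r2 * r3)
    return r4 * given_4_works + (1.0 - r4) * given_4_failed


def mttf_from_exponential_sum(coeffs, lam=1):
    """MTTF for R(t) = sum_k c_k exp(-k*lam*t), with coeffs given as {k: c_k}."""
    return sum(Fraction(c, k) for k, c in coeffs.items()) / lam


if __name__ == "__main__":
    x = math.exp(-0.7)  # unit reliability at an arbitrary value lam*t = 0.7
    print(r_sys(x, x, x, x, x), 5 * x**2 - 6 * x**3 + 2 * x**4)
    print(mttf_from_exponential_sum({2: 5, 3: -6, 4: 2}))  # in units of 1/lam
```

Exact fractions are used for the MTTF so that the coefficients of the exponential sum can be checked without rounding.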
We could have obtained the same results by choosing the "keystone element" as unit 3, i.e.

$$R_{sys} = R_3\, R_{(sys|3)} + \bar{R}_3\, R_{(sys|\bar{3})}$$

$$R_{(sys|3)} = R_1 + R_4 - R_1 R_4$$

[Figure: reliability block diagram for Example 2 with component 3 failed]

$$R_{(sys|\bar{3})} = R_2\, R_{(sys|2\bar{3})} + \bar{R}_2\, R_{(sys|\bar{2}\bar{3})} = R_2 (R_1 + R_4 - R_1 R_4) + \bar{R}_2 R_4 R_5$$

$$\Rightarrow R_{sys} = R_3 (R_1 + R_4 - R_1 R_4) + (1 - R_3)[R_2 (R_1 + R_4 - R_1 R_4) + (1 - R_2) R_4 R_5] = R_4 [1 - (1 - R_2)(1 - R_3)(1 - R_5)] + (1 - R_4)\, R_1 (R_2 + R_3 - R_2 R_3)$$

Note that if the link were not present

$$R_L = R_4 R_5 + R_1 (R_2 + R_3 - R_2 R_3)(1 - R_4 R_5)$$

and for all components identical and failing randomly with failure rate $\lambda$

$$R_L(t) = 3e^{-2\lambda t} - e^{-3\lambda t} - 2e^{-4\lambda t} + e^{-5\lambda t}$$

$$\Rightarrow MTTF_L = \int_0^\infty dt\, R_L(t) = \frac{13}{15\lambda}$$
which shows that the presence of the link improves the system reliability.

Failure Modes and Effects Analysis (FMEA)

The FMEA was first developed by the aerospace industry in the mid-1960s. The FMEA describes inherent causes of events that lead to system failure, determines their consequences, and devises methods to minimize their occurrence or recurrence.
There are basically two types of FMEA:

Design FMEA is used to evaluate the failure modes and their effects for a product before it is released to production and is normally applied at the component and subsystem levels. Its objectives are:
- Identify failure modes and rank them according to their effect on the product performance.
- Identify design actions to eliminate potential failure modes or reduce the occurrence of the respective failures.
- Document the rationale behind product design changes.

Process FMEA is used to analyze manufacturing and assembly processes. Its objectives are to identify:
- failure modes that can be associated with manufacturing and assembly process deficiencies,
- highly critical process characteristics that may cause the occurrence of particular failure modes,
- sources of manufacturing/assembly process variations.

An example of FMEA for transportation applications (using the SAE J1739 FMEA procedure) is given below. The design controls are:
1. Prevent the failure cause/mechanism or mode from occurring, or reduce its rate of occurrence.
2. Detect the failure cause/mechanism and lead to corrective actions.
3. Detect the failure mode.
[Table: FMEA example]

[Table: Severity Rating Scale]

[Table: Detection Rating Scale]
Some limitations of FMEA:
- Limited insight into probabilistic system behavior.
- FMEA is performed for only 1 failure at a time. There may be multiple failure modes with comparable likelihoods.
- Limited insight into the functional relationships between components.
- Time element in system operation cannot be represented.
Fault Tree/Event Tree Methodology

Fault-trees are logic diagrams that link primary or secondary faults (Basic Events) to an undesirable event (Top Event).

Example 1

Construct a fault-tree with Top Event "Circuit breaker does not open upon demand" for the system below:
[Figure: system diagram with Control Circuit A, Relay A, Control Circuit B, Relay B, Trip Coil and Circuit Breaker]

Solution

[Figure: fault tree for Top Event $a$, "Circuit Breaker Does Not Open"]
Example 2

Construct a fault-tree with Top Event "Latch does not trip" for the system below:

[Figure: system diagram including Actuator A]

Solution

[Figure: fault tree for Top Event $a$, "Latch Does Not Trip"]
Example 3

For the system of Example 1 find an expression which yields the probability of Top Event occurrence in terms of the probability of basic event occurrence.

Solution

Let
- $A$: Circuit breaker mechanism fails closed
- $B$: Relay A fails closed
- $C$: Control circuit A fails on
- $D$: Relay B fails closed
- $E$: Control circuit B fails on

[Figure: fault tree for the Top Event in terms of the basic events $A$, $B$, $C$, $D$, $E$]
From the rules of Boolean Algebra (or Event Algebra) given in Appendix B, the Top Event $a$ is

$$a = A + (B + C)(D + E) = A + B(D + E) + C(D + E) \quad \text{(Distributive Law)}$$

$$= A + BD + BE + CD + CE \quad \text{(Distributive Law)}$$

Each of $A$, $BD$, $BE$, $CD$, $CE$ is called a cut set (in this case also a minimal cut set). Then

$$P(a) = P[A] + P[BD] + P[CD] + P[BE] + P[CE] - P[BCE] - P[CD(BE + CE)] - P[BD(CD + BE + CE)] - P[A(BD + CD + BE + CE)]$$

(using the Commutative and Idempotent Laws)

$$= P[A] + P[BD] - P[ABD] + P[CD] - P[ACD] - P[BCD] + P[ABCD] + P[BE] - P[ABE] + P[CE] - P[ACE] - P[BCE] + P[ABCE] - P[BDE] + P[ABDE] - P[CDE] + P[ACDE] + P[BCDE] - P[ABCDE]$$

(using the Associative, Distributive and Idempotent Laws).

It is often reasonable to assume that $P[A]$, $P[BD]$, $P[CD]$, $P[BE]$ and $P[CE]$ are much larger than the other probabilities (i.e. the rare event approximation), which implies that the Top Event probability is the sum of the minimal cut set probabilities, i.e.

$$P(a) \approx P[A] + P[BD] + P[BE] + P[CD] + P[CE].$$
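To see how close the rare event approximation is to the full expansion, the following Python sketch evaluates both for the minimal cut sets $A$, $BD$, $BE$, $CD$, $CE$. It assumes the basic events are statistically independent, so that the probability of a product of events is the product of their probabilities. The function names and the basic-event probabilities are illustrative assumptions.

```python
import math
from itertools import combinations

# Minimal cut sets of the Top Event a = A + BD + BE + CD + CE.
MINIMAL_CUT_SETS = [frozenset("A"), frozenset("BD"), frozenset("BE"),
                    frozenset("CD"), frozenset("CE")]


def rare_event_probability(cut_sets, p):
    """Sum of the minimal cut set probabilities (rare event approximation)."""
    return sum(math.prod(p[e] for e in cs) for cs in cut_sets)


def exact_probability(cut_sets, p):
    """Inclusion-exclusion over all combinations of cut sets.

    Taking the set union of the cut sets in a combination applies the
    Idempotent Law (e.g. BD * BE = BDE), and independence turns each
    intersection probability into a product of basic-event probabilities.
    """
    total = 0.0
    for k in range(1, len(cut_sets) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for combo in combinations(cut_sets, k):
            events = frozenset().union(*combo)
            total += sign * math.prod(p[e] for e in events)
    return total


if __name__ == "__main__":
    # Illustrative per-demand probabilities (assumed values).
    p = {"A": 1e-3, "B": 1e-3, "C": 5e-3, "D": 1e-3, "E": 5e-3}
    print("exact      :", exact_probability(MINIMAL_CUT_SETS, p))
    print("rare event :", rare_event_probability(MINIMAL_CUT_SETS, p))
```

Because the approximation drops the subtracted overlap terms first, it gives a slightly conservative (larger) Top Event probability when the basic events are rare.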
Statistical Importance

Statistical importance is a measure of the significance of a given basic event to the Top Event. If $X$ is the event of interest, then one definition of statistical importance ($Im$) is

$$Im = \frac{\Pr(\text{Minimal Cut Sets Containing } X)}{\Pr(\text{Top Event})}$$
Example 4

If $P(A) = 0.001$/demand, $P(B) = P(D) = 0.001$/demand and $P(C) = P(E) = 0.005$/demand in Example 3, use the rare event approximation to identify the component that needs most frequent inspection to prevent the Top Event "Circuit breaker does not open upon demand".

Solution

This component can be identified as the one with the highest statistical importance to the Top Event. Then, using the rare event approximation from Example 3,

$$Im(A) = \frac{P(A)}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{0.001}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.9653$$

$$Im(B) = \frac{P(BD) + P(BE)}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{(0.001)^2 + (0.001)(0.005)}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.0058$$

$$Im(C) = \frac{P(CD) + P(CE)}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{(0.001)(0.005) + (0.005)^2}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.0290$$

$$Im(D) = \frac{P(BD) + P(CD)}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{(0.001)^2 + (0.001)(0.005)}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.0058$$

$$Im(E) = \frac{P(BE) + P(CE)}{P[A] + P[BD] + P[BE] + P[CD] + P[CE]} = \frac{(0.001)(0.005) + (0.005)^2}{0.001 + (0.001)^2 + 2(0.001)(0.005) + (0.005)^2} = 0.0290$$
The results show that the circuit breaker mechanism should be inspected most frequently. Note that we have assumed that the events B, C , D, E are statistically independent as per given data.
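The importance calculation is easy to script. The sketch below is one possible implementation under the same assumptions as the example (rare event approximation, independent basic events); the function and variable names are hypothetical.

```python
import math

# Minimal cut sets for "Circuit breaker does not open upon demand".
CUT_SETS = [("A",), ("B", "D"), ("B", "E"), ("C", "D"), ("C", "E")]

# Per-demand basic event probabilities given in the example.
P = {"A": 0.001, "B": 0.001, "C": 0.005, "D": 0.001, "E": 0.005}


def cut_set_probability(cut_set, p):
    """Probability of a cut set, assuming independent basic events."""
    return math.prod(p[e] for e in cut_set)


def statistical_importance(event, cut_sets, p):
    """Im = Pr(minimal cut sets containing event) / Pr(Top Event), rare event form."""
    top = sum(cut_set_probability(cs, p) for cs in cut_sets)
    containing = sum(cut_set_probability(cs, p) for cs in cut_sets if event in cs)
    return containing / top


if __name__ == "__main__":
    for event in sorted(P):
        print(event, round(statistical_importance(event, CUT_SETS, P), 4))
```

Ranking the events by this value reproduces the conclusion that event $A$, the circuit breaker mechanism, dominates.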
Event-Trees

Event-trees are used to identify the possible outcomes of a given initiating event and also to quantify the probability of their occurrence. Event-trees are often used in conjunction with fault-trees to quantify branch probabilities, as illustrated in the example below for fire readiness.
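[Figure: event tree for the Initiating Event $I$ with headings Evacuation (1), Fire Containment (2) and Fire Control (3); upper branches denote Success (S) and lower branches Failure (F)]

Sequence probabilities:
- $S_1 S_2 S_3$: $P(I)\,P(S_1|I)\,P(S_2|IS_1)\,P(S_3|IS_1S_2)$
- $S_1 S_2 F_3$: $P(I)\,P(S_1|I)\,P(S_2|IS_1)\,P(F_3|IS_1S_2)$
- $S_1 F_2 S_3$: $P(I)\,P(S_1|I)\,P(F_2|IS_1)\,P(S_3|IS_1F_2)$
- $S_1 F_2 F_3$: $P(I)\,P(S_1|I)\,P(F_2|IS_1)\,P(F_3|IS_1F_2)$
- $F_1 S_2 S_3$: $P(I)\,P(F_1|I)\,P(S_2|IF_1)\,P(S_3|IF_1S_2)$
- $F_1 S_2 F_3$: $P(I)\,P(F_1|I)\,P(S_2|IF_1)\,P(F_3|IF_1S_2)$
- $F_1 F_2 S_3$: $P(I)\,P(F_1|I)\,P(F_2|IF_1)\,P(S_3|IF_1F_2)$
- $F_1 F_2 F_3$: $P(I)\,P(F_1|I)\,P(F_2|IF_1)\,P(F_3|IF_1F_2)$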
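The sequence probabilities are products of conditional branch probabilities, which the following Python sketch makes explicit. All numerical values are hypothetical placeholders, and the function name and dictionary layout are assumptions for illustration. In practice the branch probabilities would come from fault-tree analysis of each safety function.

```python
from itertools import product


def sequence_probabilities(p_init, p_success, stages=3):
    """Probability of every success/failure sequence of an event tree.

    p_success[history] is P(S_k | I, history), where history is the tuple of
    'S'/'F' outcomes of the earlier stages; the failure branch is 1 minus it.
    """
    result = {}
    for outcome in product("SF", repeat=stages):
        prob = p_init
        for k, branch in enumerate(outcome):
            ps = p_success[outcome[:k]]
            prob *= ps if branch == "S" else 1.0 - ps
        result[outcome] = prob
    return result


if __name__ == "__main__":
    # Hypothetical numbers: initiating event frequency and conditional
    # success probabilities of evacuation, containment and control.
    p_init = 1e-2
    p_success = {
        (): 0.95,
        ("S",): 0.90, ("F",): 0.80,
        ("S", "S"): 0.99, ("S", "F"): 0.70,
        ("F", "S"): 0.95, ("F", "F"): 0.50,
    }
    for seq, prob in sequence_probabilities(p_init, p_success).items():
        print("".join(seq), prob)
```

Since the eight sequences are mutually exclusive and exhaustive given $I$, their probabilities sum to $P(I)$, which is a convenient consistency check on the branch data.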
Root Cause Analysis

Root causes are the most basic causes that can be reasonably identified by experts and can be corrected so as to minimize their recurrence. Several structured techniques are used for root cause analysis, including change analysis, barrier analysis, events and causal factors analysis, tree diagrams, management oversight and risk tree analysis (MORT) and fishbone diagrams. Some other less structured approaches are process control charts, trend analyses and Pareto diagrams. Root cause analysis consists of three steps:
1. Determine events and causal factors
2. Code and document root causes
3. Generate recommendations

An example using the tree approach to Step 1 is given below. Step 2 consists of following each path to the top event to determine its relevance for the particular incident (e.g. by asking "if not?"). Once root causes are identified, corrective and preventive recommendations are made. For more information on root cause analysis see Ref.[6].
[Figure: tree diagram for the event "Aerosol Inhalation While Spray Painting" with branches Personnel, Procedures, and Material or Equipment]
Statistically Dependent Failures

Statistically dependent failures are defined as events in which the probability of each failure is dependent on the occurrence of other failures. In general, statistically dependent failures are handled using Markov models, which we will discuss in Dynamic Methods. However, in systems with redundant identical components static techniques may be used. We will illustrate the $\beta$ factor method for a 2-component parallel system. For generalization of the $\beta$ factor method and other methods see Ref.[2].

Consider a 2-component parallel system where each component can individually fail with rate $\lambda_R$ or fail due to a common cause (e.g. loss of power) with rate $\lambda_C$. Then the reliability function for the system is

$$R(t) = e^{-\lambda_C t}\,\{1 - [1 - e^{-\lambda_R t}]^2\} = e^{-\lambda_C t}\,[2e^{-\lambda_R t} - e^{-2\lambda_R t}].$$

Let

$$\beta = \frac{\lambda_C}{\lambda_C + \lambda_R} \equiv \frac{\lambda_C}{\lambda}.$$

Then $\lambda_C = \beta\lambda$ and $\lambda_R = (1 - \beta)\lambda$ and

$$R(t) = e^{-\beta\lambda t}\,[2e^{-(1-\beta)\lambda t} - e^{-2(1-\beta)\lambda t}] = 2e^{-\lambda t} - e^{-(2-\beta)\lambda t} = e^{-\lambda t}\,[2 - e^{-(1-\beta)\lambda t}].$$

The $\beta$ factor method assumes that $\lambda t$ is small enough that $e^{-\lambda t} \approx 1 - \lambda t$ and $e^{-(1-\beta)\lambda t} \approx 1 - (1-\beta)\lambda t$. Then

$$R(t) = 1 - \beta\lambda t - (1-\beta)(\lambda t)^2$$

or

$$F(t) = 1 - R(t) = \beta\lambda t + (1-\beta)(\lambda t)^2.$$

Note that since $\beta$ can be interpreted as the probability that a component failure is due to the common cause event, the first term gives the probability of system failure due to the common cause event and the second term gives the probability of system failure due to the non-common cause failure of the components.

New Static Methods

While the fault-tree/event-tree approach is perhaps the most commonly used technique for system reliability modeling, construction of fault-trees is difficult when the system operation involves control loop action. Some alternative techniques that have been proposed include influence diagrams, directed graphs (digraphs) and the GO-FLOW methodology. Since the digraph approach can be used to simplify fault-tree construction, we will illustrate this technique through a simple example. For influence diagrams see Ref.[7] and for the GO-FLOW methodology see the Supplementary Material under Course Notes on the web.

Consider the pressure tank system shown below. The switch is normally closed and the motor drives a pump which feeds air into the tank. The air is discharged through the discharge valve at periodic intervals. A timer set to these intervals opens the contacts before an overpressure condition occurs and pumping stops. If the timer contacts fail closed, the operator observes from the pressure gauge that the tank pressure is high and manually opens the switch. There are 2 control loops:
- Loop 1: Tank, pressure gauge, operator, switch.
- Loop 2: Tank, relief valve.
The digraph is a tool to describe the cause-effect relationships between system components and variables. A digraph consists of nodes, which represent the system variables and components, and edges, which connect the nodes. The numbers on the edges represent the direction and the qualitative magnitude of the gains between the variables. The gains multiply at the nodes. For example, if the tank pressure $P_{tank}$ increases, the gauge pressure $P_{gauge}$ increases (+1 into $P_{gauge}$), which alerts the Operator (+1), who then opens the switch (+1) and reduces the current $I_{switch}$ to the switch ($-1 \times +1 = -1$). With the switch open, the current $I_{pump}$ to the pump motor decreases ($-1 \times +1 = -1$) and the tank pressure stops increasing ($-1 \times +1 = -1$). Subsequently, if everything works as designed, an increase in $P_{tank}$ leads to a decrease through the action of the feedback loop. The fault tree is constructed by considering the events that cause the loops to lead to the top event.
[Figure: fault tree for the Top Event "Pressure Tank Rupture" with basic events TFC, RVS, GS, OF and SFC]

TFC: Timer Contacts Fail Closed
GS: Gauge Stuck
RVS: Relief Valve Stuck
OF: Operator Fails
SFC: Switch Fails Closed
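The rule that gains multiply along a digraph path can be illustrated with a small Python sketch of Loop 1. The edge list below is an assumed reading of the gain description in the text (three +1 edges up to the switch, one $-1$ edge into $I_{switch}$, then two +1 edges back to the tank); the node and function names are hypothetical.

```python
# Loop 1 as (source, target, gain) edges, read from the text description.
LOOP_1 = [
    ("P_tank", "P_gauge", +1),
    ("P_gauge", "Operator", +1),
    ("Operator", "Switch", +1),
    ("Switch", "I_switch", -1),
    ("I_switch", "I_pump", +1),
    ("I_pump", "P_tank", +1),
]


def cumulative_gains(path):
    """Multiply the edge gains along a path; return the running gain at each node."""
    gain = 1
    running = []
    for _source, target, edge_gain in path:
        gain *= edge_gain
        running.append((target, gain))
    return running


def is_negative_feedback(loop):
    """A closed loop is a negative feedback loop if its overall gain is negative."""
    closed = loop[0][0] == loop[-1][1]
    return closed and cumulative_gains(loop)[-1][1] < 0


if __name__ == "__main__":
    for node, gain in cumulative_gains(LOOP_1):
        print(f"{node:9s} cumulative gain {gain:+d}")
    print("negative feedback:", is_negative_feedback(LOOP_1))
```

A basic event such as GS, OF or SFC can be thought of as setting one edge gain in this loop to zero, which removes the corrective action; this is why those events appear in the fault tree.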