Beruflich Dokumente
Kultur Dokumente
Primary source material: A. Hyland, and M. Rausand, 1994, System Reliability Theory:
Models and Statistical Methods, Wiley, New York. Chapter 2.
Notes to the Student:
These lecture notes are not a substitute for the thorough study of books. These notes
are no more than an aid in following the lectures.
Section 15 contains review exercises that will assist the student to master the material
in the lecture and are highly recommended for review and self-study. The student is directed
to the review exercises at selected places in the notes. They are not homework problems,
and they do not entitle the student to extra credit.
Contents
1 Overview 3
2 Time to Failure 4
6 Poisson Process 12
8 Gamma Distribution 16
10 Pareto Distribution 23
11 Weibull Distribution 25
0
\lectures\reltest\pfm.tex 7.6.2015
Yakov
c Ben-Haim 2014.
pfm.tex P ROBABILISTIC FAILURE M ODELS 2
12 Normal Distribution 27
13 Lognormal Distribution 28
15 Review Exercises 38
pfm.tex P ROBABILISTIC FAILURE M ODELS 3
1 Overview
This lecture develops 3 main ideas:
The probabilistic reliability function, R(t).
The probabilistic failure rate function z(t).
The mean time to failure MT T F .
2 Time to Failure
The time to failure (TTF):
time from start of operation to 1st failure.
Examples:
Hours of operation.
# of times a switch is operated.
Kilometers traveled.
# rotations of an axis.
Z t
F (t) = f ( ) d (1)
0
= P (T t) (2)
f (t)
6
- t
q1
Fig. 1 illustrates the 1 quantile of a pdf f (t). Formally, the 1 quantile is defined as
the value of the random variable for which the probability of non-exceedence is 1 :
Prob(T q1 ) = 1 (6)
Equivalently, this quantile is the value of the random variable for which the probability of
exceedence is :
Prob(T > q1 ) = (7)
Recall the reliability function in eq.(3):
Comparing eqs.(7) and (8) we see that t is the 1 R(t) quantile of the time-to-failure distri-
bution.
This is important and useful since much statistical theory relates to the quantile function.
If you live to be one hundred, youve got it made. Very few people die past that age.
The failure rate function z(t) is the conditional probability density of failure during (t, t +
dt), given no-failure up to time t:
We can understand this expression in terms of the basic definition of conditional probabil-
ity, as illustrated in fig. 2. Let A and B be two events with joint probability function p(A, B).
The marginal probability of B is p(B). The conditional probability of A given B is:
p(A, B)
p(A|B) = (10)
p(B)
In this case A entails B, so p(A, B) = p(A), and we obtain the final equation in (9).
pfm.tex P ROBABILISTIC FAILURE M ODELS 7
f (t) = et , t 0 (14)
Thus the failure rate function for the exponential density is constant. This means that the
probability of failure in the infinitesimal interval (t, t + dt) does not depend on how long the
component has survived. In other words, the item has neither increasing nor decreasing
risk of failure with age.
Note also that has units per time and thus represents a rate of failure.
Review exercise 4, p.38.
Related to this, consider the probability that the item, (still with an exponential density),
will survive an additional duration , given that it has survived a duration t. This is the
conditional reliability function:
1 F ( + t)
Prob(no failure in + t | no failure in t) = (18)
1 F (t)
e( +t)
= (19)
et
= e = R( ) (20)
In other words, the probability of survival for an additional duration , given survival for a
time t, precisely equals the survival probability for time . Since the failure rate function is
constant, the risk rate neither increases nor decreases with time.
Review exercise 5, p.38.
pfm.tex P ROBABILISTIC FAILURE M ODELS 8
1. Phase 1: high but decreasing failure rate. Early in the life of a population of compo-
nents or systems, the failure rate will tend to be high due to early failure of defective
items.
2. Phase 2: constant failure rate. Once the defective units have been selected out by
premature failure, the population dwindles due to random failures.
3. Phase 3: increasing failure rate. Late in the life of the population the units begin to
wear out, and the failure rate increases.
pfm.tex P ROBABILISTIC FAILURE M ODELS 9
Derive z(t):
f (t) f (t)
z(t) = = (22)
1 F (t) R(t)
Also:
dF (t) d(1 R(t))
f (t) = = = R (t) (23)
dt dt
So:
R (t) d ln R(t)
z(t) = = (24)
R(t) dt
Derive R(t):
Thus, because R(0) = 1 (Explanation: R(t) = Prob(T t)):
Z t
z(s) ds = ln R(t) (25)
0
or: Z
t
R(t) = exp z(s) ds (26)
0
Derive f (t):
Since:
f (t)
z(t) = (27)
R(t)
we see that: Z
t
f (t) = z(t)R(t) = z(t) exp z(s) ds (28)
0
Derive F (t):
Z t
F (t) = 1 R(t) = 1 exp z(s) ds (29)
0
pfm.tex P ROBABILISTIC FAILURE M ODELS 10
As N we find:
j t (33)
nj
f (t) dt (34)
N Z
t tf (t) dt = E(t) (35)
0
So, if tR(t)|
0 = 0, then Z
MT T F = R(t) dt (39)
0
Thus: Z
L(R)s=0 = R(t) dt = MT T F (41)
0
pfm.tex P ROBABILISTIC FAILURE M ODELS 12
6 Poisson Process
In this section we will derive the Poisson process for describing the probability distribution
of independent failures.
Assumptions:
= average number failures/sec = constant.
Failures occur independently of one another.
Probability balance:
Pn (t + dt) = Pn (t)(1 dt) + Pn1 (t)dt + Pn2 (t)O(dt)2 + (42)
Review exercise 10, p.38.
Consider n = 0:
P0 = P0 , P0 (0) = 1, = P0 (t) = et (47)
pfm.tex P ROBABILISTIC FAILURE M ODELS 13
Consider n = 1:
nglish Poisson
tician Probability mass function
en
ime
verage
vent.[1]
umber
nce,
ount
ct the
s of
ently The horizontal axis is the index k, the number of
at the occurrences. The function is only defined at integer
values of k. The connecting lines are only guides for the
ollow a eye.
call
Cumulative distribution function
cond
assing
Expectation of n:
X
E(n) = nPn (t) (50)
n=0
X (t)n
= et n (51)
n=0 n!
X (t)n
= et n (52)
n=1 n!
pfm.tex P ROBABILISTIC FAILURE M ODELS 14
X (t)n1
= tet (53)
n=1 (n 1)!
t
X (t)n
= te (54)
n=0 n!
| {z }
et
= tet et (55)
= t (56)
Clearly:
FT1 (t) = 0 for t < 0 (59)
Also, for t 0:
= 1 et (62)
dFT1 (t)
fT1 (t) = (63)
dt
(
et t 0
= (64)
0 t<0
Expectation of T1 : Z 1
E(T1 ) = tfT1 (t) dt = (65)
0
Review exercise 13, p.39.
8 Gamma Distribution
Now consider a generalization of FT1 (t).
Clearly:
FTk (t) = 0 for t < 0 (67)
Also, for t 0:
d k1
X et (t)n
= (74)
dt n=0 n!
k1
X (t)n t k1
X (t)n1 t
= e n e (75)
n=0 n! n=0 n!
" k1 #
t
X (t)n k1
X (t)n1
= e (76)
n=0 n! n=1 (n 1)!
" k1 #
t
X (t)n k2
X (t)n
= e (77)
n=0 n! n=0 n!
" #
t (t)k1
= e (78)
(k 1)!
pfm.tex P ROBABILISTIC FAILURE M ODELS 17
So:
(t)k1 et
fTk (t) = (79)
(k 1)!
which is known as the gamma distribution.
This name refers to the gamma function, (x).
For integer x: (x) = (x 1)! (transparency, SFM p.12.2)
1/,
metrics
quently
e until
gamma
d as a
i.e., the
f which
n for a
ero, and
n).[3] Parameters
Suppose also that failure occurs on the kth shock but not before.
(t)k1et
fTk (t) = , t0 (82)
(k 1)!
Consider a system containing two subunits, both of which must operate for the system to
be functional. We can think of this as a system with two serial subunits, as in fig. 7.
= S1 = S2 =
Let Pi be the probability that the ith subunit is operational, and assume that the subunits
fail independently. The probability that the system as a whole is functional is precisely the
reliability: the probability of no-failure:
R = P1 P2 (85)
FS = 1 R = 1 P1 P2 (86)
= S1 = = SN =
The immediate generalization to a system with N independent subunits, all of which are
essential for system operation, is:
N
Y
R= Pi (87)
i=1
Note that:
lim FS = 1 so lim R = 0 (89)
N N
Consider now a system with N essential subunits. That is, failure of any one subunit causes
failure of the entire system. We can model this system as N serial subunits as in fig. 8. Also,
suppose that failures in the individual subunits are distributed in time as Poisson random
variables. Let the rate constant for failure in the ith subunit be i , i = 1, . . . , N. Then the
probability of no-failure in duration t in the ith subunit is:
Pi = P0 (t; i ) = ei t (90)
Thus, using eq.(87), the probability that the system is operational for a duration t is its
reliability: " #
N
Y N
X
i t
R(t) = e = exp t i (91)
i=1 i=1
The probability that one or another of the subunits will fail during an infinitesimal interval
dt is dt. The probability that no failure occurred during the interval (0, t) is, as we saw
in eq.(91), et . Thus the probability of survival up to time t and then failure of the system
during (t, t + dt) is:
f (t) dt = et dt (94)
This is simply the pdf of FS (t) in eq.(92).
We thus find the mean time to failure is:
Z Z
E(t) = tf (t) dt = tet dt (95)
0 0
1 1
= = PN (96)
i=1 i
Note that we have shown that serial Poisson systems have exponential distribution of the
time to failure.
Review exercise 18, p.39.
pfm.tex P ROBABILISTIC FAILURE M ODELS 21
= S1 =
= .. =
.
= SN =
That is, the situation is just the reverse of the case with serial subunits, and eq.(97) is the
reverse of eq.(87).
The probability that the system is operational is its reliability:
N
Y N
Y
R = 1 FS = 1 Fi = 1 (1 Pi ) (98)
i=1 i=1
Note that:
lim R = 1 (99)
N
What does this imply about the reliability of complex systems? Compare with eq.(89).
Review exercise 19, p.39.
pfm.tex P ROBABILISTIC FAILURE M ODELS 22
The generic parts count method for reliability analysis of complex systems is a widely
used method, and is based on the following assumptions:
1. The system is made up of distinct subunits whose failure probabilities are statistically
independent.
2. The operation of the entire system is dependent on all subunits. Thus, the system is
modelled as though it were serial.
3. Each subunit has a constant failure rate. Thus, the distribution of failures in each
subunit is modelled as a Poisson process.
Example 1 (From Davidson 1988, chap. 12). The mean failure rates in units of failures per
million hours of operation per single device are shown in table 1. The total failure rate is
= 814 106 failures/hour. Thus the probability that the system will survive for one month
of continuous operation is:
R = et = e81410
6 2430
= 0.577 (100)
Thus the system has a 57.7% chance of operating without failure for one 30-day month.
pfm.tex P ROBABILISTIC FAILURE M ODELS 23
10 Pareto Distribution
Definition: a random variable X has a Pareto distribution if its cdf is:
!c
d
FX (x) = 1 , x > d, c > 0, d > 0 (101)
x
f (t|) = et , t0 (102)
d = 1, c=k (112)
11 Weibull Distribution
Definition: a random variable T has a Weibull distribution if its cdf is:
(
1 e(t) , t>0
FT (t) = Prob(T t) = (117)
0 t0
Weibull describes the distribution of the least of a large number of identically distributed
non-negative random variables.
The Weibull distribution is thus sometimes called the weakest link distribution.
Example. Consider a steel pipe with wall thickness which is exposed to corrosion.
12 Normal Distribution
Definition: t has a normal distribution if its pdf is:
!
1 (t )2
f (t) = exp , < t < (125)
2 2 2
We write:
t N (, 2 ) (126)
where and 2 are the mean and variance of t.
The standard normal distribution is t N (0, 1). Its pdf is:
!
1 t2
(t) = exp , < t < (127)
2 2
The cdf is denoted (t).
Standardization. If T N (, 2 ), then:
T t
FT (t) = Prob(T t) = Prob (129)
But:
T
N (0, 1) (130)
Thus:
t
FT (t) = (131)
2
If T N (, ), then the reliability function is:
t
R(t) = 1 FT (t) = 1 (132)
13 Lognormal Distribution
Definition. A random variable T is lognormally distributed if:
Y = ln T and Y N (, 2) (134)
because:
Y = ln T N (, 2 ) (141)
so:
ln T
N (0, 1) (142)
pfm.tex P ROBABILISTIC FAILURE M ODELS 29
f (R)
6
??? ???
- R
rare events rare events
The opportunity:
Fat tails: extreme favorable outcomes occur too.
Two sides to uncertainty:
Design robustly against failure: robust-satisficing.
Design opportunely for windfall: opportune-windfalling.
Related concepts:
Statistical quantiles.
Value at Risk in financial economics.1
1
Yakov Ben-Haim, 2005, Value at risk with info-gap uncertainty, Journal of Risk Finance, vol. 6, #5,
pp.388403.
Yakov Ben-Haim, 2010, Info-Gap Economics: An Operational Introduction, Palgrave-Macmillan.
pfm.tex P ROBABILISTIC FAILURE M ODELS 30
Info-gap uncertainty:
f (R) = unknown true pdf for R.
fe(R) = best estimate of f (R).
F (h, fe) = info-gap model for uncertainty in f (R), fig. 11.
Unbounded family of nested sets of pdfs.
E.g. Unknown fractional error in f (R):
Z f (R) fe(R)
F (h, fe) = f (R) : f (R) 0, f (R) dR = 1, h , h0 (143)
fe(R)
f (R)
6 fe(R)
(1 h)fe(R) (1 + h)fe(R)
@
@
R
@ - R
Failure probability:
Failure: response less than R : R < R .
Given R , the failure probability is (fig. 12):
Z R
Pf (f ) = f (R) dR (144)
f (R)
6
Pf (f )
@
@R
@
- R
R
Pf (f ) c (145)
E.g., c = 0.01.
Problem: f (R) highly uncertain, especially on its tails.
Estimated failure probability:
Z R
Pf (fe) = fe(R) dR (146)
Robustness question:
How wrong can fe(R) be, without failure probability exceeding c?
f (R)
6
c - R
q(c, f )
q(c, f ) R (148)
This is because:
Z R
q(c, f ) R c f (R) dR (149)
f (R)
6
c - R
q(c, f )
So, the min in eq.(151) occurs when the lower tail of f (R) is as fat as possible:2
2
This is true for very small c.
pfm.tex P ROBABILISTIC FAILURE M ODELS 34
f (R)
6
R R
f (R) dR c
@
@R
@ ? - R
R q(c, f )
b c
h(R , c) = Z R 1 (156)
f (R) dR
Trade-off:
Robustness decreases as aspiration increases (transparency):
b b
R1 < R2 < 0 implies h(R2 , c) h(R1 , c) (157)
Robustness
b
h(R , c)
High 12 60.01
0.03
10 c = 0.05
8
6
4
2
Low 0 -
0.25 0.20 0.15 0.10
Cutoff return, R
Poor Good
Robustness
b
h(R , c)
12 60.01
0.03
10 c = 0.05
8
6
4 q(c, ff)
2
/
0
0 u u u/
-
0.25 0.20 0.15 0.10
Cutoff return, R
pfm.tex P ROBABILISTIC FAILURE M ODELS 36
14.5 Opportuneness
Two faces of uncertainty:
Pernicious: threatening failure.
Propitious: enabling windfall.
Opportuneness:
Lowest horizon of uncertainty at which wonderful response is possible.
Windfall response: R , greater than R .
( ! )
b , c) = min h :
(R max q(c, f ) R (160)
f F (h,fe)
Immunity functions:
b is better.
Robustness: immunity against failure. Bigger h
Opportuneness: immunity against windfall. Big b is bad.
Summary
Quantile: statisical assessment of risk.
Foci of uncertainty:
Estimation error: randomness.
Info-gap uncertainty, surprise, past/future.
Info-gap robustness:
Managing info-gaps.
Supplementing statistical quantile.
Info-gap opportuneness:
pfm.tex P ROBABILISTIC FAILURE M ODELS 37
Exploiting info-gaps.
pfm.tex P ROBABILISTIC FAILURE M ODELS 38
15 Review Exercises
The exercises in this section are not homework problems, and they do not entitle the
student to credit. They will assist the student to master the material in the lecture and are
highly recommended for review and self-study.
1. (a) In analogy to eq.(1), p.4, write an integral expression for the probability of failure
in the time interval from t1 to t2 .
(b) Write an integral expression for the probability of failure in the interval [t1 , t2 ] or
[t3 , t4 ] where t2 < t3 .
(c) Write an integral expression for the probability of failure in the interval [t1 , t2 ] or
[t3 , t4 ] where t4 > t2 > t3 > t1 .
2. Section 3, p.5:
(a) For any random variable x, a median of the distribution of x is a value, q, such
that the following two conditions hold:3
1 1
Prob(x q) and Prob(x q) (161)
2 2
Is a median a quantile? If so, what is its 1 value?
(b) If f (t) is a symmetric probability density function on the interval [a, a], what is
the difference between its mean and its median?
(c) Compare the mean and the median of the exponential distribution, f (t) = et ,
t 0. Which is larger? Why?
3. Explain the denominator of eq.(10), p.6. Also consider the special case that B always
occurs.
9. Use eq.(39), p.11, to give a reliability explanation of why a large MTTF is desirable.
Alternatively, if the MTTF is large, use eq.(39) to explain the sense in which the system
is reliable, and the limits of this interpretation.
10. (a) Explain why the probability on the righthand side of eq.(42), p.12, balances the
probability on the left hand side.
3
DeGroot, Morris H., Probability and Statistics, 2nd ed., Addison-Wesley, Reading, MA, 1986, p.207.
pfm.tex P ROBABILISTIC FAILURE M ODELS 39
(b) In eq.(42) the value of n can either stay the same or increase; it cannot decrease.
Now suppose that it can also decrease due to repair as opposed to failure. In
analogy to , Let be the average number of repairs/sec. Thus the probability
of repair in duration dt is dt. Suppose that repair and failure are independent.
Derive a probability balance equation in analogy to eq.(42).
14. For what value of k does fTk in eq.(79), p.17, become the exponential distribution?
15. (a) The mean and variance in eqs.(80) and (81), p.17, increase as k increases. Does
this make sense? Why?
(b) How does the relative error, /MTTF, vary with k? Does this make sense? Why?
18. Following eq.(93), p.20, we said that the probability that one or another of the subunits
will fail during an infinitesimal interval dt is dt. Explain. What assumptions underlie
this assertion?
19. (a) Compare eqs.(97), p.21, and (88), p.19. Explain the difference.
(b) Compare eqs.(98), p.21, and (87), p.19. Explain the difference.
21. What is the meaning of the decreasing failure rate function in eq.(116), p.24?
22. What is the meaning of the decreasing or increasing failure rate function in eq.(120),
p.25? Why does this make the Weibull distribution useful for explaining a wide range
of phenomena?
23. Explain the weakest link metaphor as it is applied to the Weibull distribution on p.25.
24. Why does the central limit theorem, p.27, motivate calling the normal distribution nor-
mal?
25. Explain the similarity and difference between the two foci of uncertainty on p.29.
26. (a) Why is there no known worst case in the info-gap model of eq.(143), p.30? Could
there be a worst case which simply is not known?
(b) Explain that the sets of the info-gap model of eq.(143) become more inclusive as
h increases. Explain why this gives h its meaning as an horizon of uncertainty.
pfm.tex P ROBABILISTIC FAILURE M ODELS 40
27. Can we answer the robustness question on p.31 even if we dont know how wrong
fe(R) actually is?
30. Why does the solution in eq.(153), p.33, depend on c being very small?