
Basic Concepts

Reliability, MTTF, Availability, etc.

CprE 545: Fault Tolerant Systems (G. Manimaran)

Definitions
Reliability of a system is defined to be the probability
that the given system will perform its required function
under specified conditions for a specified period of
time.

MTBF (Mean Time Between Failures): Average time a
system will run between failures. The MTBF is usually
expressed in hours. This metric is more useful to the
user than the reliability measure.


Approaches to increase the reliability of a system

1. Worst case design                      1. Redundancy
2. Using high quality components          2. Typically employed
3. Strict quality control procedures      3. Less expensive


Reliability expressions
Exponential Failure Law:
Reliability of a system is often modeled as:
$R(t) = \exp(-\lambda t)$
where $\lambda$ is the failure rate, expressed as
percentage failures per 1000 hours or as failures
per hour.

When the product $\lambda t$ is small,
$R(t) \approx 1 - \lambda t$
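
As a quick numerical check of the exponential failure law and its small-product approximation, here is a minimal Python sketch. The function names and the sample failure rate are illustrative assumptions, not part of the course material.

```python
import math

def reliability(failure_rate, t):
    """Exponential failure law: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate * t)

def reliability_linear(failure_rate, t):
    """First-order approximation R(t) ~ 1 - lambda * t (small lambda * t only)."""
    return 1.0 - failure_rate * t

# Assumed sample value: 1e-4 failures per hour.
lam = 1e-4
for hours in (10, 100, 1000, 10000):
    exact = reliability(lam, hours)
    approx = reliability_linear(lam, hours)
    print(f"t={hours:>6} h  exact={exact:.6f}  approx={approx:.6f}")
```

The two values agree closely while the product of failure rate and time is well below 1, and they diverge as it approaches 1.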

Relation between MTBF and the Failure rate


MTBF is the average time a system will run between
failures and is given by:

$\mathrm{MTBF} = \int_0^\infty R(t)\,dt = \int_0^\infty \exp(-\lambda t)\,dt = 1 / \lambda$

In other words, the MTBF of a system is the
reciprocal of the failure rate.
If $\lambda$ is the number of failures per hour, the MTBF
is expressed in hours.
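
The integral can be evaluated in one step. The worked derivation below assumes only the exponential failure law, with $\lambda$ denoting a constant failure rate.

```latex
\begin{align*}
\mathrm{MTBF} &= \int_0^\infty R(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt \\
&= \left[-\frac{1}{\lambda}\, e^{-\lambda t}\right]_0^\infty
 = 0 - \left(-\frac{1}{\lambda}\right) = \frac{1}{\lambda}
\end{align*}
```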


A simple example
A system has 4000 components, each with a failure rate of
0.02% per 1000 hours. Calculate $\lambda$ and the MTBF.

$\lambda = (0.02 / 100) \times (1 / 1000) \times 4000 = 8 \times 10^{-4}$ failures/hour
$\mathrm{MTBF} = 1 / (8 \times 10^{-4}) = 1250$ hours
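
The same arithmetic can be sketched in Python. The helper name is hypothetical, and the sketch assumes that all components share one failure rate and that any component failure counts as a system failure, so the rates add up.

```python
def system_failure_rate(n_components, percent_per_1000_hours):
    """Failure rate of the whole system in failures per hour.

    Assumes identical components whose failure rates simply add.
    """
    per_component = (percent_per_1000_hours / 100.0) / 1000.0
    return n_components * per_component

lam = system_failure_rate(4000, 0.02)
mtbf = 1.0 / lam
print(f"lambda = {lam:.1e} failures/hour, MTBF = {mtbf:.0f} hours")
```

The printed values correspond to the 8 * 10^-4 failures/hour and 1250 hours obtained by hand.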


Relation between Reliability and MTBF


For small $\lambda t$:
$R(t) \approx 1 - \lambda t = 1 - t / \mathrm{MTBF}$
Therefore,
$\mathrm{MTBF} \approx t / (1 - R(t))$

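[Figure: reliability R(t) plotted against time t. The reliability axis runs from 0 to 1.0 with the value 0.36 marked; the time axis is marked at 1 MTBF and 2 MTBF.]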
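
To see where the linear relation holds, the sketch below compares the exact exponential reliability with the linear form at a few multiples of the MTBF. The function name and the sample points are illustrative assumptions.

```python
import math

def reliability_at(t_over_mtbf):
    """R(t) = exp(-t / MTBF), with time given in multiples of the MTBF."""
    return math.exp(-t_over_mtbf)

for k in (0.01, 0.1, 1.0, 2.0):
    exact = reliability_at(k)
    linear = 1.0 - k  # small-t approximation R(t) ~ 1 - t / MTBF
    print(f"t = {k:>4} MTBF  exact R = {exact:.3f}  linear = {linear:.3f}")
```

At t = 1 MTBF the exact reliability is exp(-1), roughly 0.37, so a system is more likely than not to have failed by the time it reaches its MTBF. The linear form is only meaningful for t much smaller than the MTBF.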


An example
A first generation computer contains 10000
components, each with $\lambda = 0.5\% / (1000 \text{ hours})$. What is
the period of 99% reliability?
$\mathrm{MTBF} = t / (1 - R(t)) = t / (1 - 0.99)$
$t = \mathrm{MTBF} \times 0.01 = 0.01 / \lambda_{av}$
where $\lambda_{av}$ is the average failure rate,
$N$ = number of components = 10000,
$\lambda$ = failure rate of a component
$= 0.5\% / (1000 \text{ hours}) = 0.005 / 1000 = 5 \times 10^{-6}$ per hour.
Therefore, $\lambda_{av} = N \lambda = 10000 \times 5 \times 10^{-6} = 5 \times 10^{-2}$ per hour.
Therefore, $t = 0.01 / (5 \times 10^{-2}) = 0.2$ hours $= 12$ minutes.
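
The example relies on the linear approximation. As a rough cross-check, the sketch below also inverts the exponential law exactly; the variable names are illustrative assumptions.

```python
import math

n_components = 10_000
lam_component = (0.5 / 100.0) / 1000.0   # 0.5% per 1000 hours, as a per-hour rate
lam_av = n_components * lam_component    # system failure rate per hour
target = 0.99                            # required reliability

# Linear approximation used in the example: t = (1 - R) / lambda_av
t_linear_h = (1.0 - target) / lam_av
# Exact inversion of R(t) = exp(-lambda_av * t): t = -ln(R) / lambda_av
t_exact_h = -math.log(target) / lam_av

print(f"linear: {t_linear_h * 60:.2f} min, exact: {t_exact_h * 60:.2f} min")
```

Both results come out at about 12 minutes, which suggests the approximation is adequate at this reliability level.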


Reliability for different configurations

1. Series Configuration

Overall reliability $= R_o = R \cdot R \cdots R = R^N$

2. Parallel Configuration

$R_o = 1 - (\text{probability that all of the components fail})$

$R_o = 1 - (1 - R)^N$
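
The two formulas generalise directly to components with different reliabilities. The sketch below assumes that component failures are independent; the function names and sample values are illustrative.

```python
def series(reliabilities):
    """Series configuration: the system works only if every component works."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result

def parallel(reliabilities):
    """Parallel configuration: the system fails only if every component fails."""
    all_fail = 1.0
    for r in reliabilities:
        all_fail *= (1.0 - r)
    return 1.0 - all_fail

# Assumed sample values: three identical components with R = 0.9.
print(f"series:   {series([0.9] * 3):.3f}")
print(f"parallel: {parallel([0.9] * 3):.3f}")
```

With R = 0.9 and N = 3, the series arrangement gives 0.9^3 = 0.729 and the parallel arrangement gives 1 - 0.1^3 = 0.999.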


Reliability for different configurations

3. Hybrid Configuration

[Figure: hybrid configuration of components, each with reliability R]

Overall reliability $= R_o = ?$


Reliability for different configurations

4. Triple Modular Redundancy (TMR)

[Figure: redundant modules feeding a voting element]

Overall reliability $= R_o = \binom{3}{2} R^2 (1 - R) + R^3$
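
The TMR expression counts the cases where exactly two modules work plus the case where all three work. A minimal sketch, assuming independent module failures and a perfectly reliable voter (the function name is hypothetical):

```python
from math import comb

def tmr_reliability(r):
    """At least 2 of 3 modules work; the voter is assumed perfect."""
    return comb(3, 2) * r**2 * (1.0 - r) + r**3

for r in (0.5, 0.9, 0.99):
    print(f"R = {r:.2f} -> Ro = {tmr_reliability(r):.6f}")
```

The expression simplifies to 3R^2 - 2R^3, which equals R at R = 0.5. Under these assumptions TMR therefore improves on a single module only when R is above 0.5.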


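Reliability calculation: a more complicated example

[Figure: block diagram of a system built from components including A, B, C, D, E and F, together with its successive reductions S1, S2, S3 and S4]

Condition on component C:
$R = R_C R_{S2} + (1 - R_C) R_{S1}$
where S1 is the reduced system assuming C is faulty
and S2 is the reduced system assuming C is fault free.

$R_{S1}$ can be calculated using the parallel and series formulae.
S2 needs further reduction. Condition on component E:
$R_{S2} = R_E R_{S3} + (1 - R_E) R_{S4}$
where S3 is the reduced system assuming E is fault free
and S4 is the reduced system assuming E is faulty.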
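
The conditioning technique can be illustrated on a small network. The sketch below uses a classic five-component bridge, which is a hypothetical stand-in and not the system from this slide: paths A-B and D-E run from input to output, and C links their midpoints. It conditions on C, then cross-checks the result by enumerating every component state. All names are assumptions, and failures are taken to be independent.

```python
from itertools import product

def condition_on(r_pivot, r_if_working, r_if_failed):
    """R = Rp * R(system | pivot works) + (1 - Rp) * R(system | pivot failed)."""
    return r_pivot * r_if_working + (1.0 - r_pivot) * r_if_failed

def par(x, y):
    """Two blocks in parallel."""
    return 1.0 - (1.0 - x) * (1.0 - y)

def bridge_reliability(ra, rb, rc, rd, re):
    """Hypothetical bridge: paths A-B and D-E, with C linking their midpoints."""
    c_works = par(ra, rd) * par(rb, re)   # (A || D) in series with (B || E)
    c_failed = par(ra * rb, rd * re)      # (A then B) || (D then E)
    return condition_on(rc, c_works, c_failed)

def bridge_brute_force(ra, rb, rc, rd, re):
    """Cross-check: sum the probability of every state in which a path exists."""
    rel = (ra, rb, rc, rd, re)
    total = 0.0
    for a, b, c, d, e in product((True, False), repeat=5):
        works = (a and b) or (d and e) or (a and c and e) or (d and c and b)
        if works:
            p = 1.0
            for up, r in zip((a, b, c, d, e), rel):
                p *= r if up else 1.0 - r
            total += p
    return total

print(f"{bridge_reliability(0.9, 0.9, 0.9, 0.9, 0.9):.5f}")
print(f"{bridge_brute_force(0.9, 0.9, 0.9, 0.9, 0.9):.5f}")
```

Conditioning on one component turns a network that is neither purely series nor purely parallel into two that are. When a reduced system is still not series-parallel, the same step is applied again to another component.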

Maintainability
Maintainability of a system is the probability of
isolating and repairing a fault in the system within a
given time.
Maintainability is given by:
$M(t) = 1 - \exp(-\mu t)$
where $\mu$ is the repair rate
and $t$ is the permissible time constraint for the
maintenance action.
$\mu = 1 / (\text{Mean Time To Repair}) = 1 / \mathrm{MTTR}$
$M(t) = 1 - \exp(-t / \mathrm{MTTR})$
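
A short sketch of the maintainability formula follows. The function name and the 2-hour MTTR are assumed values for illustration only.

```python
import math

def maintainability(t, mttr):
    """M(t) = 1 - exp(-t / MTTR): probability a repair finishes within time t."""
    return 1.0 - math.exp(-t / mttr)

# Assumed sample value: MTTR of 2 hours.
for allowed in (1, 2, 4, 8):
    print(f"t = {allowed} h -> M(t) = {maintainability(allowed, 2.0):.3f}")
```

When the permitted time equals the MTTR, the probability of completing the repair is 1 - exp(-1), about 0.63. Allowing exactly the mean repair time therefore still leaves a sizeable chance of overrunning.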

Availability
Availability of a system is the probability that the system will be
functioning according to expectations at any time during its
scheduled working period.

Availability = System up-time / (System up-time + System down-time)

System down-time = No. of failures * MTTR

System down-time = System up-time * $\lambda$ * MTTR

Therefore,
Availability = System up-time / (System up-time + (System up-time * $\lambda$ * MTTR))
= 1 / (1 + ($\lambda$ * MTTR))
Since $\lambda$ = 1 / MTBF,
Availability = MTBF / (MTBF + MTTR)
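
The two equivalent forms of the availability expression can be checked against each other with a small sketch. The function names are hypothetical. The MTBF of 1250 hours is borrowed from the 4000-component example, and the 10-hour MTTR is an assumed value.

```python
def availability(mtbf, mttr):
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)

def availability_from_rate(failure_rate, mttr):
    """Equivalent form 1 / (1 + lambda * MTTR), with lambda = 1 / MTBF."""
    return 1.0 / (1.0 + failure_rate * mttr)

# MTBF from the 4000-component example; MTTR is an assumed value.
print(f"{availability(1250.0, 10.0):.4f}")
print(f"{availability_from_rate(1.0 / 1250.0, 10.0):.4f}")
```

Both forms give the same value, 1250 / 1260, or roughly 0.992.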
