Reliability
Probabilistic Modeling
Introduction
We begin by defining a repairable system, after some
preliminary concepts and definitions.
1. Part: an item not subject to disassembly and
hence discarded when it fails.
2. Socket: a circuit or equipment position which, at
any given time, holds a part of a given type.
3. System: a collection of two or more sockets and
their associated parts, interconnected to perform
one or more functions.
4. Nonrepairable System: a system that is discarded the first time that it ceases to perform
satisfactorily.
5. Repairable System: a system which, after failing
to perform at least one of its required functions,
can be returned to performing all of its required
functions satisfactorily by any method other than
replacement of the entire system.
Three points must be made. First, since small
appliances are systems, many systems, perhaps even
a majority, are nonrepairable. Nevertheless, the overwhelming majority of systems of interest in reliability applications are designed to be repaired, rather
than discarded, after their first failure. Henceforth,
therefore, the term system will be used to denote a
repairable system.
Secondly, given that a system contains n parts,
the definition of a repairable system allows up to
n − 1 part replacements during a single repair. In
practice, however, when the repair would require
many new parts, it usually is more cost effective to
replace the entire system. Most repairs involve the
replacement of only a minute fraction of a system's
constituent parts, and this has major implications for
probabilistic modeling and, therefore, for statistical
analysis as well. Some repairs, e.g., cleaning contacts
or adjusting internal potentiometers, do not involve
replacement of any parts.
Thirdly, if a system can be returned to satisfactory
operation by occasional adjustments of external, i.e.,
front-panel, controls, it hasn't failed and, therefore,
doesn't require repair. However, too frequently
Most of the literature concerning methods for predicting the time to failure of a system assumes that
system failure is an absorbing state, i.e., that the system of interest is nonrepairable. In many cases, the
nonrepairable system can include one or more groups
of repairable redundant subsystems. Once a system-level failure occurs, however, the system is assumed
to be discarded. Such "reliability with repair" models
will not be considered in this article.
This section concentrates on black-box modeling and analysis. That is, models are postulated for
the pattern of system-level failures, regardless of the
system's design, and the postulated models can be
tested against even small datasets.
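The force of mortality (FOM) of a part whose time to failure X has density $f_X$ and distribution function $F_X$ is
$$h(x) = \frac{f_X(x)}{1 - F_X(x)} \qquad (2)$$
The rate of occurrence of failures (ROCOF) of a system with $N(t)$ failures in $(0, t]$ is
$$v(t) = \frac{d}{dt} E[N(t)] \qquad (3)$$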
Statistical Analysis
The interarrival times between failures of a system appear in natural order on a time line. The first
step in an analysis, therefore, is to test for trend. If
a trend toward larger interarrival times (reliability
growth or improvement) or toward smaller interarrival times (deterioration) exists, the interarrival
times are not identically distributed. Hence, little or
nothing in many reliability and statistics books is
applicable, since many books concentrate solely on
i.i.d. data. If an assumption is dropped, it usually
is that of independence, rather than the even more
important assumption of identically distributed interarrival times.
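The text does not name a particular trend test. As one illustration, the sketch below uses the Laplace test, a common choice that compares the mean arrival time with the midpoint of the observation interval. It assumes a single system copy observed over a fixed interval (0, end_time]; the function names and the sample data are hypothetical.

```python
import math


def laplace_trend_statistic(arrival_times, end_time):
    """Laplace statistic for failures observed over the fixed interval (0, end_time].

    Under an HPP (no trend) the statistic is approximately standard normal.
    Positive values point toward deterioration (failures bunch up late),
    negative values toward improvement (failures bunch up early).
    """
    m = len(arrival_times)
    if m == 0:
        raise ValueError("at least one failure is required")
    mean_arrival = sum(arrival_times) / m
    return (mean_arrival - end_time / 2.0) / (end_time * math.sqrt(1.0 / (12.0 * m)))


def two_sided_p_value(u):
    """Two-sided standard normal tail probability for the statistic u."""
    return math.erfc(abs(u) / math.sqrt(2.0))


# Hypothetical arrival times (hours) with shrinking gaps between failures.
arrivals = [200, 450, 600, 720, 810, 880, 940, 990]
u = laplace_trend_statistic(arrivals, end_time=1000)
print(f"U = {u:.3f}, two-sided p = {two_sided_p_value(u):.3f}")
```

With so few failures the normal approximation is rough, so the p-value is better read as a screening aid than as a precise significance level.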
Parametric Models
Given a trend, the NHPP is the first choice as
a model. Consider the power-law process where
$v(t) = \lambda \beta t^{\beta - 1}$.
Then if failures occurred at arrival times $T_1 = t_1$,
$T_2 = t_2$, ..., $T_m = t_m$, over an observation interval
$(0, t_0]$, the maximum-likelihood estimators (MLEs)
of $\beta$ and $\lambda$ are
$$\hat{\beta} = \frac{m}{\sum_{i=1}^{m} \ln(t_0 / t_i)} \qquad (4)$$
$$\hat{\lambda} = \frac{m}{t_0^{\hat{\beta}}} \qquad (5)$$
If observation is until time $T_m$, then $T_m$ is substituted for $t_0$ in both equations. Many other techniques
for the power-law process and other NHPPs are provided in [8] and the references therein. Lawless and
Thiagarajah [9, 10] show how to include explanatory factors or covariates, e.g., different environmental
stresses experienced by different system copies, in an
analysis.
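The power-law MLEs in equations (4) and (5) can be sketched in a few lines. This is a minimal illustration for a single system copy, not a full treatment; the function name, argument names, and sample data are hypothetical. When no end time is supplied, the last arrival time is used as the end of observation, as the text prescribes.

```python
import math


def power_law_mle(arrival_times, end_time=None):
    """Return (lambda_hat, beta_hat) for the power-law process.

    arrival_times: failure arrival times t_1 < ... < t_m of one system copy.
    end_time: end of the observation interval; if None, observation is
        taken to stop at the last failure and T_m is substituted.
    """
    times = sorted(arrival_times)
    m = len(times)
    if m == 0:
        raise ValueError("at least one failure is required")
    if end_time is None:
        end_time = times[-1]
    if times[0] <= 0 or times[-1] > end_time:
        raise ValueError("arrival times must lie in (0, end_time]")

    # Equation (4): beta_hat = m / sum of ln(end_time / t_i).
    log_sum = sum(math.log(end_time / t) for t in times)
    if log_sum <= 0:
        raise ValueError("estimator undefined: all failures at end_time")
    beta_hat = m / log_sum

    # Equation (5): lambda_hat = m / end_time ** beta_hat.
    lambda_hat = m / end_time ** beta_hat
    return lambda_hat, beta_hat


# Hypothetical arrival times (hours), observed over (0, 1000].
arrivals = [200, 450, 600, 720, 810, 880, 940, 990]
lam, beta = power_law_mle(arrivals, end_time=1000)
print(f"lambda_hat = {lam:.3e}, beta_hat = {beta:.3f}")
```

An estimate of beta above 1 corresponds to an increasing ROCOF (deterioration) and one below 1 to a decreasing ROCOF (improvement). By construction, the fitted expected number of failures over the observation interval equals the observed count m.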
If there is no evidence of trend, the data can
be considered to be identically distributed, but the
interarrival times of a copy may still be dependent.
In practice, however, one seldom has enough failures
to test the independence assumption [11].
If there is no evidence that the interarrival times
are not i.i.d., one fits an RP to them. If an exponential
distribution provides adequate fit, the resulting model
is the HPP; otherwise, a more general RP must be
selected. The techniques (a) for fitting an exponential distribution to times to failure of parts and (b) for
fitting an HPP to system interarrival times are interchangeable. The interpretations of the results, however, often are drastically different. This is because
a failed part is discarded at failure, whatever the
magnitude of its constant FOM, whereas a failed system modeled by an HPP is repaired to the ROCOF
it had when it was new, which may be very large.
For further information on the major differences in
interpretations, see [12, 13].
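The interchangeability of the two fitting techniques can be seen in a short sketch: the same estimator, number of failures divided by total time, is applied to part times to failure and to system interarrival times. It assumes complete data with no censoring; the function names and numbers are hypothetical.

```python
def exponential_rate_mle(durations):
    """MLE of the rate of an exponential distribution from complete durations."""
    if not durations:
        raise ValueError("at least one duration is required")
    return len(durations) / sum(durations)


def interarrival_times(arrival_times):
    """Convert arrival times of one system copy into interarrival times."""
    previous = [0.0] + list(arrival_times[:-1])
    return [t - p for p, t in zip(previous, arrival_times)]


# (a) Hypothetical times to failure (hours) of five separate parts.
part_lifetimes = [120.0, 340.0, 90.0, 410.0, 240.0]
constant_fom = exponential_rate_mle(part_lifetimes)

# (b) Hypothetical arrival times (hours) of failures of one repaired system.
system_arrivals = [120.0, 460.0, 550.0, 960.0, 1200.0]
constant_rocof = exponential_rate_mle(interarrival_times(system_arrivals))

print(constant_fom, constant_rocof)
```

The two datasets were chosen to give identical durations, so the estimates coincide. The first number is read as the constant FOM of a population of discarded parts, the second as the constant ROCOF of one system that keeps being repaired.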
Nonparametric Models
Cumulative Plots. If we plot cumulative failures
versus cumulative operating time, t, for a single system copy, we can perform trend testing visually. An
increasing, constant, or decreasing slope indicates deteriorating, constant, or improving
reliability, respectively, of that system copy. Deteriorating (improving) systems have become known as
sad (happy) systems to help distinguish these concepts from wear out (burn-in) of parts.
Mean Cumulative Function. In most applications,
data are available from two or more copies and
in many cases the number of copies is far greater
than two. The mean cumulative function (MCF) is
constructed incrementally at each instant a failure
occurs by considering the number of copies at risk
at that instant. Copies may not be in the risk set
because of left or right censoring. Left censoring
occurs when a copy is operated for some time without
knowledge of its failure history and then comes under
observation. Right censoring occurs when a copy is
no longer observed after a time t = t* for any reason,
ranging from the copy being totaled to simply not
accumulating more than t* hours at a time when the
number at risk is being calculated. The key reference
to this approach is [14].
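The incremental construction described above can be sketched as follows. This is a minimal illustration of one standard nonparametric estimate, in which each failure adds the number of failures at that age divided by the number of copies at risk; it is not claimed to reproduce the procedure in [14]. The CopyHistory layout, the function name, and the data are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CopyHistory:
    start: float           # age at which observation begins (> 0 if left censored)
    end: float             # age at which observation stops (right-censoring age)
    failures: List[float]  # failure ages recorded within (start, end]


def mean_cumulative_function(copies: List[CopyHistory]) -> List[Tuple[float, float]]:
    """Return (age, MCF estimate) pairs, one per distinct failure age."""
    failure_ages = sorted({age for c in copies for age in c.failures})
    points = []
    mcf = 0.0
    for age in failure_ages:
        # A copy is in the risk set only while it is under observation.
        at_risk = sum(1 for c in copies if c.start < age <= c.end)
        if at_risk == 0:
            raise ValueError(f"failure at age {age} lies outside every observation window")
        n_failures = sum(c.failures.count(age) for c in copies)
        mcf += n_failures / at_risk
        points.append((age, mcf))
    return points


# Hypothetical fleet: copy B is right censored at 500 h,
# copy C is left censored (observation starts at 200 h).
fleet = [
    CopyHistory(start=0.0, end=1000.0, failures=[300.0, 700.0]),
    CopyHistory(start=0.0, end=500.0, failures=[400.0]),
    CopyHistory(start=200.0, end=1000.0, failures=[600.0]),
]
for age, value in mean_cumulative_function(fleet):
    print(f"age {age:6.1f} h: MCF = {value:.3f}")
```

Because copy B leaves the risk set at 500 hours, each later failure is divided by two copies rather than three, so it contributes a larger increment to the estimate.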
References
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
Related Articles
Age-Dependent Minimal Repair and Maintenance; Analysis of Recurrent Events from Repairable Systems; General Minimal Repair Models;
Imperfect Repair, Counting Processes; Intensity
Functions for Nonhomogeneous Poisson Processes;
Multivariate Imperfect Repair Models; Nonparametric Methods for Analysis of Repair Data;
Reliability Growth Modeling; Repairable Systems: Statistical Inference; Repairable Systems:
Bayesian Analysis; Software Failure Data Analysis; System Availability.
HAROLD E. ASCHER