PQD 5 - Experimental Verification (2016/2017)

AUTOMOTIVE ENGINEERING
Mario Vianello
vianello.clm@tin.it
PRODUCT
QUALITY DESIGN
01OFHLO
These slides cannot be reproduced or distributed without written permission from the author.
COURSE CONTENTS
1. Design for Quality considering customer needs and product targets
2. Refresher on Applied Statistics Fundamentals
3. Fundamentals of Reliability and of Robust Design
4. Methodological instruments for Reliability measurement and prevention
5. Criteria and methods to plan reliability experimental verifications
6. Managerial considerations: notes on Problem solving, Lessons learned, Experience accumulation, Technical memory, and global approaches like Six Sigma
PRODUCT QUALITY DESIGN
5
CRITERIA AND METHODS
TO PLAN RELIABILITY
EXPERIMENTAL VERIFICATIONS
CHAPTER CONTENTS
• General criteria for designing experimental tests
  - Categories and purposes of tests
  - Concept of test “significance”
  - Overview of all experimental tests for a new car model in development
• Component bench testing (Success Run)
  - Basic Success Run
  - Prolonged Success Run
  - Extended Success Run
• Reliability tests on complete vehicles for design verification (Reliability Growth Testing for Useful Life)
  - Fundamentals of Reliability Growth Testing (Duane curve)
  - “New Content” as a main parameter for sizing test samples
  - Meaningfulness of tested vehicles in subsequent batches
  - Test design
  - Results monitoring
  - In-field prediction adjustment
• Reliability tests on complete vehicles for process verification (Reliability Growth Testing for Infant Mortality)
• Final summary on experimental tests
5. Criteria and methods to plan reliability experim. verifications
5.1
GENERAL
CRITERIA FOR DESIGNING
EXPERIMENTAL TESTS
5.1. General criteria for designing experimental tests
5.1.1
Categories and purposes
of experimental tests
5.1.1. Categories and purposes of experimental tests
AIMS OF EXPERIMENTAL RELIABILITY CERTIFICATION
Experimental testing has 2 main targets:
• Reliability verification according to statistical laws: we have to deal with problems such as the large sample sizes required by classical Statistics;
• Final tuning (= Reliability Growth): an option to be used sparingly, because changes at this stage are extremely expensive.
The emphasis is on prevention activities and less on improvement during the test phase.
Categories and purposes of Experimental Tests
OPERATING TEST: tests aimed at verifying whether the design
changes achieve the desired performance targets. Small samples
are sufficient.
DURABILITY TEST: tests aimed at avoiding durability failures (i.e. failures for which repair is not technically feasible or not economically convenient), and therefore:
• tests extended to the wear-out area and beyond;
• sample size intermediate between that required by operating tests and that required by reliability tests (but, given the burden of these tests, closer to the former than to the latter);
• test acceleration can be very high, because these tests are aimed at kinds of failure which have been substantially identified.
RELIABILITY TEST: tests aimed at detecting any kind of failure that has escaped the prevention checks, and therefore:
• large samples are required, because we have to be able to detect failures even with very low frequencies;
• acceleration cannot be very high, because a full relationship with the customer’s use must be maintained in order to detect any kind of failure.
Depending on the type of component, Durability tests may precede or follow Reliability tests.
5.1.2
Concept of
test “significance”
5.1.2. Concept of test “significance”
How many cars must be experimentally tested?
ONCE
For high-production models many cars were tested, while for low-production models only a few cars were tested.
But the low-production models are often the more expensive ones, whose reliability, in this way, is less well tested!
NOW
Sometimes one wonders: “How many cars do we need to test for the test to be significant?”
But what does “significant” mean?

Oversimplifying...
in Statistics, the term significance level means the probability of being wrong, due to the fact that deductions are made from a sample and not from the full population.
But “wrong” compared to what?!? ...
... compared to what you want to know!
Basic: the purpose of the tests
In general:
• the design of an experimental test depends on what its results will have to express and on what decision they will help us to take;
• if we cannot afford an adequate sample size to answer the question of interest, it is worth wondering whether it is better to eliminate the experimental test altogether (saving the expenditure!);
• alternatively, we must clearly highlight the (less important) questions which a reduced test (compared to the ideal one) is able to answer, and also highlight the related risks after a careful evaluation of costs and benefits.
5.1.3
Overview of
all experimental tests
for a new car model
in development
5.1.3. Overview of all experimental tests for a new car model in development
EXPERIMENTAL RELIABILITY CERTIFICATION
is generally divided into 3 phases

1. RELIABILITY DEMONSTRATION for Critical Components
   These are bench tests on components of major importance for a new car model.
   The sample size is set by the Success Run methodology (tests passed successfully), that is, assuming that no component fails during the tests: this permits minimal sampling.

2. RELIABILITY GROWTH for useful life
   These tests are run on the whole vehicle, with the purpose of verifying the reliability of the design. They can therefore start even if the production process is not yet the final one. As a guide, one can test 3 to 4 batches of 10 to 30 cars for 15,000 to 60,000 km, depending on the mission profile (e.g. petrol or diesel).

3. RELIABILITY GROWTH for infant mortality
   These tests are run on the car to check the reliability of the process, so they can begin only when the production process is final. As a guide: in the first tests, 7,000 to 12,000 cars were tested for 100 to 180 km, but today it is preferred not to exceed 5,000 cars.

The Reliability Growth Testing method (phases 2 and 3) prescribes a total amount of testing that increases:
• the more demanding the reliability target is;
• the greater the differences (New Content) between the new model and the reference archetype.
Its main advantages are:
• calculation of the theoretical model related to the specific co-intervention;
• management of the failure problem-solving process;
• monitoring of actual results.
PRELIMINARY CLASSIFICATION OF DETECTED FAILURES
Kind of problem (based on the kind of corrective action plausibly more appropriate/convenient):
  [ ] Design
  [ ] Process
Responsibility:
  [ ] SUPPLIER
      [ ] Product Specifications not respected
      [ ] Product Specifications to be improved (by FIAT)
  [ ] FIAT
Macro group of causes which produced the failure:
  [ ] Standards not respected
  [ ] Standards not yet fully adequate
  … for these reasons:
      [ ] ……………..
      [ ] ……………..
      [ ] ……………..
5.2
COMPONENT BENCH
TESTING
(SUCCESS RUN)
5.2. Component bench testing (and their limits)
5.2.1
Basic Success Run
5.2.1. Component bench testing: basic Success Run
SUCCESS RUN = tests passed successfully
These are tests during which not a single failure occurs (= success): therefore, it no longer makes sense to speak of an average number of failures.
What almost always matters, instead, is to quantify the trustworthiness (called confidence level) with which an experimental test with zero failures proves that the value assumed as the reliability target has been achieved.
Direct application
In this regard, the method provides the expression:
C = 1 - R^n
which expresses the Confidence Level, C, as a function of the reliability, R, and of the sample size, n.
The confidence level expresses here the detectability of failures. In fact, R is the reliability of a single component, i.e. its probability of not failing during the mission in question, and therefore the probability of finding no faults on n components in the same conditions is given by the product of n terms equal to R: R · R · R · ... · R = R^n. Ultimately, R^n is the probability of finding not a single failure (= zero failures), and C = 1 - R^n is the probability of finding at least one failure (one or more), namely the detectability. We can now proceed with reasoning similar to that of statistical hypothesis testing: if the reliability were equal to R or less, there would be a probability not less than 1 - R^n of having at least one failure; since there has been no failure, we are allowed to believe that the actual reliability is greater than R, with a confidence level equal to 1 - R^n and a risk of being wrong (the so-called Type I error) equal to R^n.
Suppose that we have set, for a particular component, a failure frequency target no greater than 3 failures per thousand pieces during the warranty period; the reliability target is given by the complement to 1 of the failure frequency, i.e.:
R = 1 - 3/1000 = 0.997 = 99.7%.
Imagine that we have tested 450 pieces for a period corresponding to the warranty period, without finding any failure; using the expression above we get a Confidence Level C = 1 - 0.997^450 = 0.74 = 74%.
That means that if we run tests of this type many times and always bet that we have achieved the target, we win 74% of the bets and lose the remaining 26%.
Inverse application
The same expression is commonly used to plan the minimum number of
units to be tested, in order to statistically demonstrate a predetermined
reliability target with an assigned confidence level.
It is enough to write the preceding expression in its inverse form:
n = log(1 - C) / log R + 0.499999
where the result is rounded to the nearest integer (in practice, the ratio is rounded up).
With the values of the preceding example, we obtain (as was logical to expect):
n = log(1 - 0.74) / log 0.997 = (-0.585) / (-0.0013) ≈ 450 units.
Since the market requires more and more demanding reliability targets,
the number of units to be sampled is usually very high.
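The direct and inverse applications of the Success Run expression can be sketched in a few lines of Python. This is only a minimal illustration: the function names are hypothetical, and rounding up with `math.ceil` is used in place of the "+ 0.499999" device, on the assumption that the two are meant to be equivalent.

```python
import math


def confidence_level(reliability: float, n_units: int) -> float:
    """Direct application: C = 1 - R^n for n units tested with zero failures."""
    return 1.0 - reliability ** n_units


def min_sample_size(reliability: float, confidence: float) -> int:
    """Inverse application: smallest n such that 1 - R^n >= C."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Worked example from the text: R = 0.997, 450 units, zero failures.
print(round(confidence_level(0.997, 450), 2))  # 0.74

# Table example: R = 0.995 demonstrated with C = 90%.
print(min_sample_size(0.995, 0.90))  # 460
```

The ratio of logarithms does not depend on the base, so natural logarithms give the same sample sizes as base-10 logarithms.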
MINIMUM SAMPLE SIZE as a function of the MINIMUM RELIABILITY, R, to be demonstrated and of its CONFIDENCE LEVEL, C
(rows: minimum reliability R; columns: confidence level C)
R \ C 0,500 0,550 0,600 0,650 0,700 0,750 0,800 0,850 0,900 0,950 0,960 0,970 0,980 0,990 0,995 0,999
0,500 1 2 2 2 2 2 3 3 4 5 5 6 6 7 8 10
0,550 2 2 2 2 3 3 3 4 4 6 6 6 7 8 9 12
0,600 2 2 2 3 3 3 4 4 5 6 7 7 8 10 11 14
0,650 2 2 3 3 3 4 4 5 6 7 8 9 10 11 13 17
0,700 2 3 3 3 4 4 5 6 7 9 10 10 11 13 15 20
0,750 3 3 4 4 5 5 6 7 9 11 12 13 14 17 19 25
0,800 4 4 5 5 6 7 8 9 11 14 15 16 18 21 24 31
0,850 5 5 6 7 8 9 10 12 15 19 20 22 25 29 33 43
0,900 7 8 9 10 12 14 16 19 22 29 31 34 38 44 51 66
0,950 14 16 18 21 24 28 32 37 45 59 63 69 77 90 104 135
0,960 17 20 23 26 30 34 40 47 57 74 79 86 96 113 130 170
0,970 23 27 31 35 40 46 53 63 76 99 106 116 129 152 174 227
0,980 35 40 46 52 60 69 80 94 114 149 160 174 194 228 263 342
0,990 69 80 92 105 120 138 161 189 230 299 321 349 390 459 528 688
0,995 139 160 183 210 241 277 322 379 460 598 643 700 781 919 1.058 1.379
0,999 693 799 916 1.050 1.204 1.386 1.609 1.897 2.302 2.995 3.218 3.505 3.911 4.603 5.296 6.905
Example - In order to demonstrate a minimum reliability of 0.995 (i.e. minimum reliability of 99.5% and maximum failure frequency of 5 ‰) with a confidence level of 90%, a sample of 460 units is needed.
Which value do we assume for the MINIMUM RELIABILITY ?
BASIC ASSUMPTIONS
Usual target for the failure frequency at the end of the warranty period (2 years) = 0.5 repair/car
Number of car’s subsystems/components influencing reliability ≈ 300
(which may cause about 1,200 different failure modes, as seen in the histogram of Par. 2.3.4.2 speaking of the binomial distribution)
Let’s suppose that each of the 300 subsystems/components has a failure frequency of 0.01 (= 1%) during the warranty period (2 years): this means that the production process produces 1 defective component in every 100 (not so bad!), which corresponds to a reliability of 99%.
Under this hypothesis, the number of repairs per car during the warranty period (2 years) is given by:
(1/100) · 300 = 3 repairs/car
This result (3 repairs/car) is 6 times greater than the usual target of 0.5 repairs/car!
In conclusion, in the previous Table we have to assume a MINIMUM RELIABILITY > 0.990.
And, since a value of 0.999 would require samples of impossible size, we usually limit ourselves to reliability values around 0.995.
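The arithmetic behind this conclusion can be checked with a short sketch. The function names are hypothetical, and the second function assumes that the vehicle target is split equally among all components, which is a simplification introduced here and not a method stated in the slides.

```python
def repairs_per_car(n_components: int, failure_frequency: float) -> float:
    """Expected repairs per car when every component has the same failure frequency."""
    return n_components * failure_frequency


def component_reliability_for_target(target_repairs_per_car: float,
                                     n_components: int) -> float:
    """Per-component reliability if the vehicle target were shared equally (assumption)."""
    return 1.0 - target_repairs_per_car / n_components


# 300 components at 1% failure frequency, as in the text.
print(repairs_per_car(300, 0.01))

# Equal split of the 0.5 repairs/car target over 300 components.
print(round(component_reliability_for_target(0.5, 300), 4))
```

Under the equal-split assumption the required component reliability falls between the 0.995 and 0.999 rows of the table, which is consistent with the need to assume a minimum reliability above 0.990.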
“Attainable” and “statistically provable” targets
[Figure: probability density versus in-field failure frequency [repairs/100 cars], axis from 30 to 200. Labels: archetype from field result, attainable target, statistically provable target. Values marked: 80, 105 and 170 repairs/100 cars. Gaps marked: d and D = 1/3 to 1/5 of d (1/3 only when target and experimental results are very close).]
With the same reliability value to be demonstrated,
the needed sample size greatly increases with the increasing of the confidence level.
[Figure: demonstrated reliability R [fraction], 0.90 to 1.00, versus number of tested units (with zero failures), 0 to 500, with curves for C.L. = 50%, 75% and 90%.]
The number of units required by the Success Run (i.e. without any failure) may be considered the minimum needed for a given statistical demonstration of reliability.
In fact, if a failure were detected, then in order to have the same statistical demonstration the test would have to be enlarged with a further quantity of failure-free units, large enough to make the single failure found negligible.
In practice, if a failure occurs, root causes are analyzed and suitable corrective actions are found (also taking economic aspects into account).
The effectiveness of such countermeasures is then verified by separate tests (in addition to those of the Success Run).
We then proceed according to the initial test plan, as if the failure had never happened, even if the same failure occurs in some other cases (but only on components not yet updated with the latest changes; otherwise all corrective actions should be reviewed).
5.2.2
Prolonged Success Run
5.2.2. Component bench testing: Prolonged Success Run
The method offers the possibility of savings when the failure law of the component is known and when the test can be extended beyond the time or mileage to which the reliability R (to be estimated by the test) refers.
This technique is called Prolonged Success Run.
What do we demonstrate by testing a component for more time than that at which we want to demonstrate the reliability?
It depends on the type of problem we want to analyze.
For example, if we want to demonstrate the reliability at one year: is there any advantage in prolonging bench testing to a time corresponding to two years of life?

In the case of …                                   ... then …
infant mortality failures,                          if we prolong the test time, we only waste time;
random failures (the probability of occurrence      it is the same as testing 2 components at 1 year;
is the same for the whole time),
wear-out failures,                                  the second year is obviously more stressing than the first one, and so it is “as if” we tested more than 2 components at 1 year.

Prolonging the test gives no advantage with infant mortality failures, while it reduces the number of units to be tested with random and wear-out failures.
To quantify these effects we use the Weibull law.
The ratio between the test time T_test and the time T_estimate at which we want to estimate reliability is called the prolonging factor and is usually indicated by L:
L = T_test / T_estimate
Based on the concepts outlined above, if T_test > T_estimate and we are not dealing with infant mortality failures, extending the test is "as if" we tested an “equivalent” number of units n_eq greater than the number n of units really tested. The relationship which links them is:
n_eq = n · L^b
where b is the shape parameter (exponent) of the Weibull distribution:
b < 1 for infant mortality failures;
b = 1 for random failures (useful life);
b > 1 for wear-out failures.
In general, it must always be: L ≤ 3

In the previous example, imagining ...
• that we know the failure law of the tested units and that we are dealing with wear-out failures with b = 1.64,
• that we test 450 units, but for a time 50% longer than the warranty time (at the end of which we want the estimate), i.e. with L = 1.5,
we obtain an equivalent number of units n_eq:
n_eq = n · L^b = 450 · (1.5)^1.64 = 875.
And the confidence level would be:
C = 1 - R^n_eq = 1 - 0.997^875 = 0.928 ≈ 93%
instead of the 74% found before.
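The Prolonged Success Run example can be reproduced with the following minimal sketch. Function names are hypothetical; the formulas are the ones given in the text (equivalent units n · L^b, then C = 1 - R^n_eq).

```python
def equivalent_units(n_units: float, prolonging_factor: float,
                     weibull_shape: float) -> float:
    """Equivalent number of units: n_eq = n * L^b."""
    return n_units * prolonging_factor ** weibull_shape


def confidence_level(reliability: float, n_equivalent: float) -> float:
    """Success Run confidence with the equivalent sample size: C = 1 - R^n_eq."""
    return 1.0 - reliability ** n_equivalent


# Values from the text: 450 units, L = 1.5, b = 1.64, R = 0.997.
n_eq = equivalent_units(450, 1.5, 1.64)
print(round(n_eq))                               # 875
print(round(confidence_level(0.997, n_eq), 3))   # 0.928
```

With L = 1 the function returns n itself, so the basic Success Run is recovered as a special case.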
Example of
SEMIAUTOMATIC PLANNING of RELIABILITY TESTS
Complete system: [car model]
Component: coolant pump
Shape parameter of the specific failure law (Weibull): b = 1,5
ASSIGNED TARGETS
At a distance of 50.000 km, corresponding to 2 years of use:
Specification target = 1,3 repairs/(100 units), i.e. reliability target = 0,9870
Required Confidence Level = 75%

Demonstrated reliability with the assigned Confidence Level:

Kind of test         | Prolonging factor     | Units actually | Equivalent units | R without prolonging | R with prolonging
                     | L = T_test/T_estimate | tested, n      | n_eq = n · L^b   | R = (1 - C)^(1/n)    | R = (1 - C)^(1/n_eq)
ENGINE BENCH         | 3,00                  | 8              | 41,57            | 0,8409               | 0,9672
TRACK TEST           | 3,00                  | 2              | 10,39            | 0,5000               | 0,8751
TEST ON MIXED GROUND | 2,16                  | 9              | 28,57            | 0,8572               | 0,9526
POLLUTION TEST       | 1,60                  | 15             | 30,36            | 0,9117               | 0,9554
Total                |                       | 34             | 110,89           | 0,9600               | 0,9876
CONCLUSIONS
• The reliability target required by the Specifications is 0.987 with a Confidence Level of 75%.
• The reliability we can demonstrate by bench test, with a Confidence Level of 75%, testing 8 units without prolonging the test, is 0.841.
• The reliability we can demonstrate, with a Confidence Level of 75%, with all units in all tests (8 units in bench tests and 26 prototype cars), without prolonging the test, is 0.960.
• The reliability we can demonstrate, with a Confidence Level of 75%, with all units in all tests (8 units in bench tests and 26 prototype cars) and prolonging the test, is 0.988 (even better than the Specification requirement).
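The planning table of the coolant pump example can be recomputed from its inputs (prolonging factors, units tested, b = 1.5, C = 75%). The sketch below is illustrative only; the variable and function names are hypothetical.

```python
WEIBULL_SHAPE = 1.5   # b for the coolant pump, from the example
CONFIDENCE = 0.75     # required confidence level, from the example

# (kind of test, prolonging factor L, units actually tested n)
TESTS = [
    ("engine bench", 3.00, 8),
    ("track test", 3.00, 2),
    ("test on mixed ground", 2.16, 9),
    ("pollution test", 1.60, 15),
]


def demonstrated_reliability(confidence: float, n_units: float) -> float:
    """Reliability demonstrated by n failure-free units: R = (1 - C)^(1/n)."""
    return (1.0 - confidence) ** (1.0 / n_units)


n_total = sum(n for _, _, n in TESTS)
n_eq_total = sum(n * L ** WEIBULL_SHAPE for _, L, n in TESTS)

for name, L, n in TESTS:
    n_eq = n * L ** WEIBULL_SHAPE
    print(f"{name:22s} n={n:2d} n_eq={n_eq:6.2f} "
          f"R={demonstrated_reliability(CONFIDENCE, n):.4f} "
          f"R_prolonged={demonstrated_reliability(CONFIDENCE, n_eq):.4f}")

print(n_total, round(n_eq_total, 2))
print(round(demonstrated_reliability(CONFIDENCE, n_total), 4))
print(round(demonstrated_reliability(CONFIDENCE, n_eq_total), 4))
```

The last line is the value to compare with the specification target of 0.987.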
5.2.3
Extended Success Run
5.2.3. Component bench testing: Extended Success Run
The Success Run method can also be applied allowing for 1 or 2 (or more) failures, instead of zero as done previously.
This technique is called Extended Success Run.
As is easy to understand, it is rarely used, because it requires samples much larger than those required by the classical Success Run with zero failures, which is already very demanding.
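The slides give no formula for the Extended Success Run. A common way to generalize C = 1 - R^n, assumed here, is the cumulative binomial: the confidence is one minus the probability of observing at most k failures in n units if the reliability were exactly R. The sketch below uses that assumption, with hypothetical function names, to show how quickly the sample grows with the number of failures allowed.

```python
import math


def confidence_level(reliability: float, n_units: int, max_failures: int = 0) -> float:
    """1 - P(at most max_failures failures in n_units), binomial model (assumption)."""
    p_fail = 1.0 - reliability
    p_at_most = sum(
        math.comb(n_units, i) * p_fail ** i * reliability ** (n_units - i)
        for i in range(max_failures + 1)
    )
    return 1.0 - p_at_most


def min_sample_size(reliability: float, confidence: float, max_failures: int = 0) -> int:
    """Smallest n reaching the required confidence with max_failures allowed."""
    n = max_failures + 1
    while confidence_level(reliability, n, max_failures) < confidence:
        n += 1
    return n


# R = 0.99 demonstrated with C = 90%, allowing 0, 1 or 2 failures.
for k in range(3):
    print(k, min_sample_size(0.99, 0.90, k))
```

With zero failures allowed the function reduces to the basic Success Run and returns the table value for R = 0,990 and C = 0,900.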
[Figure: demonstrated reliability R [fraction], 0.90 to 1.00, versus number of tested units, 0 to 500, at confidence level = 50%, with curves for 0, 1 and 2 failures.]
[Figure: demonstrated reliability R [fraction], 0.90 to 1.00, versus number of tested units, 0 to 500, at confidence level = 75%, with curves for 0, 1 and 2 failures.]
[Figure: demonstrated reliability R [fraction], 0.90 to 1.00, versus number of tested units, 0 to 500, at confidence level = 90%, with curves for 0, 1 and 2 failures.]
5.3
RELIABILITY TESTS
ON COMPLETE VEHICLES
FOR DESIGN VERIFICATION
(Reliability Growth Testing
for Useful Life)
5.3. Rel. tests on compl. vehicle for design verif. (useful life)
EXAMPLE OF RELIABILITY TEST FOR A WHOLE VEHICLE,
WITH ACCELERATION 2, KNOWN AS “DEMANDING CUSTOMER”

DISTRIBUTION OF AVERAGE ANNUAL DISTANCES IN CUSTOMER USE (average annual true distances)
Type of travel | [km]   | %
Downtown       | 4.500  | 30%
Main road      | 3.750  | 25%
Hill/mountain  | 2.250  | 15%
Motorway       | 4.500  | 30%
TOTAL          | 15.000 | 100%

CORRESPONDING DISTANCES DISTRIBUTION IN THE ACCELERATED TEST KNOWN AS "DEMANDING CUSTOMER" (equivalent distances in the test)
Customer-use travel | Type of travel in the test                                         | Reduction compared to customer use | [km]  | %
Downtown            | Downtown                                                           | 1/2                                | 2.250 | 30,00%
Downtown            | Test track: uneven road surfaces (derived from Chrysler-cycle (1)) | 1/10                               | 450   | 6,00%
Main road           | Main road: only for transfers and/or fittings                      | (1/3,57)                           | 1.050 | 14,00%
Hill/mountain       | Hill/mountain                                                      | 1/1                                | 2.250 | 30,00%
Motorway            | High-speed track (speeds close to the maximum)                     | 1/3                                | 1.500 | 20,00%
TOTAL               |                                                                    |                                    | 7.500 | 100,00%

Acceleration factor = 15.000 / 7.500 = 2,00
(1) The Chrysler-cycle includes: pavé, long and short waves, dirt road, ford, potholes, "chicken nests", ties (sleepers), etc.
The distances in the test (last two columns on the right of the table) obviously change with the mission profile of the car model (described in the first 3 columns on the left), but the reduction principles described in the two central columns (type of travel in the test and reduction compared to customer use) remain substantially valid. These tests with acceleration 2 are the basic reference in Reliability Growth Testing for useful life (discussed immediately below). However, test design takes no account of their acceleration, which plays a purely conservative role in reliability predictions/verifications: hence the name “demanding customer tests”. The only constraint that the acceleration imposes on these tests is that the tested vehicles may not exceed 60,000 km, in order to avoid the risk of encroaching on the wear-out area (since 2 × 60,000 = 120,000 km).
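The acceleration factor and the 60,000 km constraint follow directly from the distances in the table. A minimal sketch, with hypothetical dictionary keys and variable names:

```python
# Average annual distances in customer use [km], from the table.
customer_km = {
    "downtown": 4500,
    "main road": 3750,
    "hill/mountain": 2250,
    "motorway": 4500,
}

# Equivalent distances in the "demanding customer" test [km], from the table.
test_km = {
    "downtown": 2250,
    "test track (uneven surfaces)": 450,
    "main road (transfers/fittings)": 1050,
    "hill/mountain": 2250,
    "high-speed track": 1500,
}

acceleration = sum(customer_km.values()) / sum(test_km.values())
print(acceleration)  # 2.0

# Share of each type of travel in the test, in percent.
for name, km in test_km.items():
    print(f"{name:32s} {100 * km / sum(test_km.values()):5.1f}%")

# Customer-equivalent mileage reached at the 60,000 km test limit.
MAX_TEST_KM = 60_000
print(acceleration * MAX_TEST_KM)  # 120000.0
```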
5.3.1
Fundamentals of
Reliability Growth Testing
(Duane learning curve)
5.3.1. Fundamentals of Reliability Growth Testing
The Reliability Growth Testing methodology meets two basic requirements:
• to make available “objective” criteria for the design of test plans, able to take into account the type of project in development (characterized by set parameters such as: target, New Content, speed of problem solving, etc.);
• to achieve substantial savings in sampling, by evaluating the trustworthiness of the “reliability certification” also on the basis of the early batches of vehicles; passing from one batch to the next:
  - vehicles have changed (and generally have improved) with the implementation of corrective actions (Reliability Growth);
  - vehicles have gradually increased their meaningfulness, initially incomplete and reaching the full value (100%) in the last batch.
Duane’s learning curve
[Figure: Duane's learning curve. X axis: total cumulative testing time [working hours], 0 to 40,000. Y axes: speed of failure detection (failure detection rate) [failures / 100 hours], 0 to 0.004, and cumulative number of detected failures, 0 to 25.]
As testing hours increase, the failure detection rate becomes lower and lower.
Generally, during an experimental test, most failures occur in a relatively short time or, in other words, the failure detection rate (= number of failures detected per testing hour) decreases as test time increases. Based on this observation, Duane established a power-law relationship between the number of detected failures, N(t), and the test time t, called the learning curve:
N(t) = k · t^(1-a)    (1)
where k and a are appropriate constants. Differentiating (1) with respect to time, we get the instantaneous failure rate λ_i:
λ_i = dN(t)/dt = k · (1 - a) · t^(-a)    (2)
This relationship is a straight line on a log-log graph:
log λ_i = [log k + log(1 - a)] - a · log t    (3)
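Equations (1) to (3) can be checked numerically with the sketch below. The constant k = 0.5 is purely hypothetical; a = 0.40 is chosen only because a slope of 0.40 is quoted in the figure of the predictive model. Function names are also hypothetical.

```python
import math

K = 0.5   # hypothetical constant k, for illustration only
A = 0.40  # slope a; 0.40 is the value quoted in the predictive-model figure


def cumulative_failures(t: float, k: float = K, a: float = A) -> float:
    """Equation (1): N(t) = k * t^(1 - a)."""
    return k * t ** (1.0 - a)


def failure_rate(t: float, k: float = K, a: float = A) -> float:
    """Equation (2): lambda_i = dN/dt = k * (1 - a) * t^(-a)."""
    return k * (1.0 - a) * t ** (-a)


def loglog_slope(t1: float, t2: float) -> float:
    """Slope of log(lambda_i) versus log(t) between two test times, equation (3)."""
    return (math.log10(failure_rate(t2)) - math.log10(failure_rate(t1))) / (
        math.log10(t2) - math.log10(t1)
    )


# The failure rate decreases with test time...
print(failure_rate(100.0) > failure_rate(10_000.0))  # True

# ...and on log-log axes it is a straight line with slope -a.
print(round(loglog_slope(100.0, 10_000.0), 6))  # -0.4
```

Because the log-log slope is the same between any two test times, a straight-line fit to measured failure rates on log-log axes would give an estimate of a.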
DUANE’S LEARNING CURVE (in Cartesian coordinates)
[Figure: failure frequency density λ_i [failures / (vehicle × 100 hours)], 0 to 2.0, versus cumulative test time [working hours], 0 to 15,000, with the target level marked.]
In a log-log graph, Duane’s learning curve becomes a straight line
[Figure: failure frequency density λ_i [failures / (vehicle × 100 hours)] versus cumulative test time [working hours], 100 to 100,000, on log-log axes, with the target level marked.]
Confidence band for the experimental tests
PREDICTIVE MODEL AND CONFIDENCE INTERVAL
[Figure: log-log plot of failure frequency density λ_i [failures / (vehicle × 100 hours)], 0.1 to 10, versus cumulative test time [working hours], 100 to 100,000. Elements shown:
- Duane's straight line (mean values), slope = 0.40;
- Duane's curve for the 75% C.L. upper limit, mean slope > 0.40;
- confidence band (one-sided limit at 75% C.L.);
- STARTING-POINT = average initial defectiveness, conventionally calculated at 100 hours;
- STARTING-POINT 75% = one-sided upper limit with 75% C.L., conventionally calculated at 100 hours;
- expected problem-solving speed (slope of Duane's straight line);
- target for useful life (mean value) and target for useful life (75% C.L. upper limit);
- total test hours.
Note on the figure: the two targets can often be considered equal, because the confidence interval is small after a large number of test hours.]
Purpose of Reliability Growth Testing for “useful life”
[Figure: bathtub curve, as a result of the archetype's data. X axis: mileage [km] (or hours or cycles), 0 to 200,000. Y axis: hazard rate h(t) (failure frequency), 0 to 2.5E-5. Zones: infant mortality, useful life (random failures), wear-out. Marked: STARTING POINT (failure frequency of the first R.G. batch) and target λ (slope of “useful life”) of the new car model.]
Main goal of Reliability Growth Testing: Reliability Growth Testing for useful life merely estimates the ordinate representing the bottom of the bathtub curve and the slope of the linear part of the M(t) curve, and therefore it is not involved in the wear-out zone.
R.G. upper limit for car testing = 60,000 km (before the wear-out zone, when using tests with a coefficient of acceleration equal to 2).
5.3.2
“New Content “
as a main parameter
for sizing test samples
5.3.2. Reliability Growth for useful life: New Content
The New Content of a new product in development gives the percentage of
failure frequency (theoretically from 0% to 100% of the total failure frequency
corresponding to the whole vehicle) which is attributable to redesigned
elements and/or elements employed in a different operating environment
(e.g. a known gearbox applied to a new engine or a known suspension with adjustments aimed
to a different market with a greater percentage of dirt roads) with regard to products
already marketed and used as reference point (archetype).
In other words, this percentage is given by the failure frequency (e.g. measured
in “failures/100 units") corresponding to the subsystems/components (or portion
thereof) on which changes were made, divided by the total failure frequency of
the product (e.g. the whole vehicle).
Of course, it makes no sense (and is therefore forbidden!) to assume values exactly equal to 0% or 100%, because in any situation there is always something changed (e.g. the mission profile or the operating environment in which the subsystem/component is placed) and something already experienced (e.g. designers' virtuous traditions, standards, etc.).
It is worth highlighting that the failure frequency to be used (first column in the Table of the next slide) is “actual”, since detected in the field, for the archetype’s components, while it is “potential”, since estimated, for the “innovative” components of the new product in development.
As a merely illustrative simplification, useful for didactic purposes, we could say that New Content is given by the ratio between the sum of the failure frequencies(1) of the modified components only (modified to better suit the new vehicle model in development, and therefore with a “specific”(2) design) and the total failure frequency of the vehicle(3):
NC = 100 · (sum of the failure frequencies of the modified components) / (failure frequency at the whole vehicle level)
Of course, the professional calculation of New Content (as you can guess from the previous slide) is a bit more complicated than this, and the next slide gives a rough idea of how the calculations are carried out in practice.
(1) It would instead be totally wrong to consider New Content as the ratio between the number of modified components and the total number of components that make up the whole vehicle.
(2) To appear in the numerator of the New Content, components need not be "innovative" (i.e. substantially new inventions or, at least, very recent); it is enough that they are "specific", i.e. that their design has been modified, for example in order to better suit the needs of the new car model in development. Clearly, all innovative components belong to the category of specific ones, but not vice versa (most specific components are not innovative).
(3) The failure frequency of the whole vehicle is calculated: a) for the components already present in the archetype, on the basis of their failure frequencies from the field, subjectively corrected to take into account the differences presumably introduced by the planned modifications; b) for the new components (= not present in the archetype), on the basis of the best possible estimates (e.g. by the systems in series and in parallel).
SPREADSHEET for NEW CONTENT estimate
NEW CONTENT assessment - Car model: fictitious example of an old car model (production year around 2000)
(INPUT data in YELLOW cells)

Columns:
A = REPAIRS/(100 CARS) detected from field, TOTALS
B = of which CARRY-OVER(2) (official allocation(1))
C = of which SPECIFIC(3) (official allocation(1))
D = NEW CONTENT of CARRY-OVER components(4) (subjective assessment)
E = NEW CONTENT of SPECIFIC components(5) (subjective assessment)
F = NEW CONTENT, automatic calculation
G = NEW CONTENT, subjective refinement (if useful)

SUBSYSTEMS (working teams) | A      | B     | C      | D      | E      | F      | G
POWER UNIT                 | 70,93  | 9,87  | 61,06  | 10,00% | 75,00% | 11,82% | 12,0%
MECHANICAL
Inlet/exhaust              | 4,86   | 0,00  | 4,86   | 0,00%  | 90,00% | 1,11%  | 1,2%
Controls                   | 17,59  | 7,62  | 9,97   | 10,00% | 85,00% | 2,33%  | 2,5%
Fuel injection system      | 8,21   | 5,62  | 2,59   | 15,00% | 70,00% | 0,67%  | 0,7%
Pedals                     | 8,20   | 1,58  | 6,62   | 35,00% | 90,00% | 1,65%  | 1,7%
Engine cooling system      | 4,69   | 2,97  | 1,72   | 7,00%  | 70,00% | 0,36%  | 0,4%
Brake & fuel pipes         | 4,55   | 0,10  | 4,45   | 5,00%  | 85,00% | 0,96%  | 1,0%
Front/rear suspension      | 40,30  | 8,67  | 31,63  | 20,00% | 75,00% | 6,43%  | 6,5%
OTHER
Chassis                    | 36,83  | 20,79 | 16,04  | 12,00% | 70,00% | 3,47%  | 3,5%
Seats                      | 17,95  | 6,93  | 11,02  | 25,00% | 90,00% | 2,94%  | 3,0%
Sides                      | 39,19  | 1,83  | 37,36  | 10,00% | 80,00% | 7,60%  | 8,0%
Body mobile elements       | 66,64  | 2,59  | 64,05  | 35,00% | 90,00% | 14,79% | 15,0%
Dashboard                  | 42,44  | 20,27 | 22,17  | 15,00% | 90,00% | 5,81%  | 6,0%
Bumpers                    | 1,89   | 0,00  | 1,89   | 0,00%  | 90,00% | 0,43%  | 0,5%
Exterior/interior lighting | 9,70   | 1,13  | 8,57   | 25,00% | 90,00% | 2,02%  | 2,5%
Heaters & air-conditioner  | 21,79  | 4,12  | 17,67  | 30,00% | 90,00% | 4,33%  | 4,5%
POWER UNIT only            | 70,93  | 9,87  | 61,06  |        |        | 65,96% | 67,0%
MECHANICAL subs. only      | 159,33 | 36,43 | 122,90 |        |        | 62,90% | 64,6%
OVERALL TOTAL              | 395,76 | 94,09 | 301,67 |        |        | 66,72% | 69,0%
(1) Components that are totally "carry-over" or totally "specific" are very rare, because something (design and/or technology, environment, mission profile, etc.) inevitably changes between the archetype and the
new car model in development; likewise, a New Content of 100% is practically impossible even in completely new components, because of consolidated standards, traditions/habits, previous experience, etc.
Therefore the distinction between "carry-over" and "specific" components is always made subjectively and consequently has to be regarded as "formal" and/or "official".
(2) Components already used in marketed car models (archetype) and now essentially adopted in the new car model in development; of course, small changes (in design/technology and/or environment and/or
mission profile) are allowed (e.g. to meet more demanding reliability targets) and must be taken into account in the New Content assessment.
(3) Components with important changes compared to the corresponding ones in the archetype (but not necessarily "innovative"): changes may concern one or more of the following aspects: design/technology,
environment, mission profile.
(4) Values greater than 30÷40% have to be considered excessively conservative.
(5) Values greater than 80÷90% have to be considered excessively conservative.
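The "Automatic calculation" column can be reproduced with a few lines of code. The sketch below assumes, as the figures in the table suggest, that each subsystem contributes its carry-over and specific repair frequencies, each weighted by the subjective New Content share assigned to it, and that the result is divided by the failure frequency of the whole vehicle. The function and variable names are illustrative and do not come from the original spreadsheet.

```python
# Each row: (subsystem, carry-over repairs/100 cars, specific repairs/100 cars,
#            NC share of the carry-over part, NC share of the specific part)
ROWS = [
    ("Power unit",                  9.87, 61.06, 0.10, 0.75),
    ("Inlet/exhaust",               0.00,  4.86, 0.00, 0.90),
    ("Controls",                    7.62,  9.97, 0.10, 0.85),
    ("Fuel injection system",       5.62,  2.59, 0.15, 0.70),
    ("Pedals",                      1.58,  6.62, 0.35, 0.90),
    ("Engine cooling system",       2.97,  1.72, 0.07, 0.70),
    ("Brake & fuel pipes",          0.10,  4.45, 0.05, 0.85),
    ("Front/rear suspension",       8.67, 31.63, 0.20, 0.75),
    ("Chassis",                    20.79, 16.04, 0.12, 0.70),
    ("Seats",                       6.93, 11.02, 0.25, 0.90),
    ("Sides",                       1.83, 37.36, 0.10, 0.80),
    ("Body mobile elements",        2.59, 64.05, 0.35, 0.90),
    ("Dashboard",                  20.27, 22.17, 0.15, 0.90),
    ("Bumpers",                     0.00,  1.89, 0.00, 0.90),
    ("Exterior/interior lighting",  1.13,  8.57, 0.25, 0.90),
    ("Heaters & air-conditioner",   4.12, 17.67, 0.30, 0.90),
]


def new_content(rows, reference_total):
    """New Content in %: untested failure frequency over a reference total."""
    untested = sum(co * nc_co + sp * nc_sp for _, co, sp, nc_co, nc_sp in rows)
    return 100.0 * untested / reference_total


# Failure frequency of the whole vehicle (repairs/100 cars).
vehicle_total = sum(co + sp for _, co, sp, _, _ in ROWS)

# Contribution of each subsystem to the New Content of the whole vehicle.
per_subsystem = {row[0]: new_content([row], vehicle_total) for row in ROWS}
overall = new_content(ROWS, vehicle_total)

print(f"Power unit contribution: {per_subsystem['Power unit']:.2f}%")
print(f"Overall New Content:     {overall:.2f}%")
```

The "Subjective refinement" column is a manual rounding of these values by the working teams and is therefore not computed here.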
Method to plan EXPERIMENTAL TESTS
Initial failure frequency, called Starting Point (S.P.), evaluated on the basis of the Achievable Target
(i.e. the target recognized as “achievable”) and of the New Content:
S.P. = target + f(NC)
[Figure: planned growth line on a logarithmic time axis (Log t), from the Starting Point at 100 hours to the
Achievable Target for useful life at the total testing time.]
Based on historical data, Fiat established a semi-empirical relationship to evaluate the Starting Point
from the set reliability target and the specific New Content (this relationship is not shown for
reasons of business confidentiality): every company has to work out its own relationship.
The concept of New Content is the first strong point of Reliability Growth Testing, because it links the
total amount of testing time to the overall effect on reliability of all the differences between the new
car model and the reference one (archetype).
As testing proceeds, knowledge about the new product increases and, correspondingly, the New Content
decreases (by what has become "known"); this is why the New Content can also be seen as an indicator
of the amount of testing that remains to be completed.
Growth slope α      Type of Reliability Program
-0.01 to -0.10      Little or no attention to the reliability problems. No formal procedure.
-0.10 to -0.25      Simple reliability development program, primarily focused on the most important failure modes.
-0.25 to -0.40      Aggressive reliability development program, planned according to well-defined standards and
                    conducted by experienced specialists under the constant supervision of the top management.
-0.40 to -0.60      Highly aggressive reliability development program. Values of α greater than 0.60 (in absolute
                    value) are not realistic: in the automotive field, it is very difficult even to exceed 0.40
                    (in absolute value).
[Figure. Vertical axis: failure frequency density λi [failures / (vehicle x 100 hours)]; horizontal axis: cumulative time [operating hours].]
The input parameter to which Reliability Growth Testing is most sensitive is the slope of the Duane
straight line. However, if we incorrectly assume an optimistic value for it, this will not affect the
final reliability of the product: it will simply require a test time longer than planned.
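The sensitivity to the slope can be illustrated with a minimal sketch. It assumes that the planning line is a straight line in a log-log diagram, starting from the Starting Point at 100 hours and falling with slope -alpha; the starting point, the target and the function names below are hypothetical values chosen only for illustration, not data from the course.

```python
def planned_failure_frequency(t_hours, starting_point, alpha):
    """Failure frequency on the planning line after t_hours of cumulative testing."""
    return starting_point * (t_hours / 100.0) ** (-alpha)


def hours_to_target(starting_point, target, alpha):
    """Cumulative test hours at which the planning line reaches the target."""
    return 100.0 * (starting_point / target) ** (1.0 / alpha)


STARTING_POINT = 1.0   # hypothetical, failures / (vehicle x 100 hours) at 100 hours
TARGET = 0.30          # hypothetical target, same unit

for alpha in (0.25, 0.30, 0.40):
    hours = hours_to_target(STARTING_POINT, TARGET, alpha)
    print(f"slope -{alpha:.2f}: about {hours:,.0f} test hours to reach the target")
```

Under these assumptions a modest change in the assumed slope changes the required test time severalfold, which is why an optimistic slope mainly translates into tests that last longer than planned.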
5.3.3
Meaningfulness
of tested vehicles
in subsequent batches
5.3.3. R.G. for useful life: meaningfulness of vehicles
The vehicles of every tested batch are characterized by their meaningfulness, which indicates the degree
of similarity of the tested vehicles with those that will actually be marketed.
Like New Content, meaningfulness is measured by the ratio between the failure frequency due to the
subsystems/components that can be considered as “final” (= no need to be modified) and the failure
frequency of the overall system (e.g. the whole vehicle).
This is the second strong point of Reliability Growth Testing, because it allows us to make our
reliability predictions even with vehicle batches that are not fully meaningful.
Therefore, a single Reliability Growth phase is almost never planned; usually several (= 2÷4) phases
(i.e. several car batches) are used, because:
• doing so reduces the overall amount of testing, because the sample is renewed at each phase;
• in the automotive world, cars of subsequent batches have different meaningfulness: audit process,
pre-series, starting production, and so on; at each R.G. phase, the cars must have uniform
meaningfulness, gradually higher than that of the previous phases.
IMPORTANT REMARK
• Classical Statistics provides only predictions based on homogeneous batches (taken from the same
population) and therefore does not allow batches of vehicles with different meaningfulness to be put
together in reliability estimates. So we would be forced to base the estimate only on the results of the
last batch (the closest to the vehicles that will actually be sold, which are the subject of the
estimate) and thus the quality of the estimate (expressed by the Confidence Level and the Confidence
Interval) would be determined only by the size of the last batch.
• Instead, Reliability Growth Testing uses meaningfulness in order to empirically increase the sample
size of the last batch, by adding to the number of its vehicles the number of vehicles in each previous
batch multiplied by their specific meaningfulness. In this way, the quality of the estimate (expressed
by the Confidence Level and the Confidence Interval) is affected not only by the size of the last batch,
but also by that of all previous batches, although "weighted" by their meaningfulness, with evident
benefits (see slide 76).
5.3.4
Tests designing
5.3.4. R.G. for useful life: tests designing
MAIN INPUTS:
• Target to be demonstrated, corresponding to the achievable target recognized by Engineering;
• New Content of the new model, which measures how much of the car has never been tested, evaluated
as % of the reference failure frequency: redesigned elements and/or elements employed in a different
operating environment;
• Meaningfulness of cars, which measures how representative the tested cars are of a defined project
(weighting the assessment on the failure frequency of the archetype).
OPTIONAL ASSUMPTIONS (depending on what is available in the company for the new car model in development):
• Speed of Problem-Solving (slope of Duane’s straight line);
• Number of phases (batches).
OUTPUTS:
• Number of cars needed for each batch;
• TOTAL MILEAGE (and of each batch or phase);
• Test time (of every batch and total).
Phase 2 (and following) re-starting-point
Untested from Phase 1 + Specific New Content of Phase 2 = New Content of Phase 2
[Figure: growth lines of PHASE 1 and PHASE 2, each starting at 100 hours, with the Phase 1 Target and the
Phase 1 Ending Point marked.]
The percentage of UNTESTED from Phase 1 is given by a relationship among the following quantities:
U1 = Untested or Residual New Content from Phase 1
NC1 = New Content of Phase 1
λ100 = Starting Point or initial failure frequency (= at 100 hours)
λTep = Ending Point or final failure frequency
λTdt = Development Target
Impact of different meaningfulness of cars in the subsequent batch tests
• As is easy to understand, the closer the tested cars are to those that will be sold (meaningfulness)
and the sooner they are available to be tested, the smaller the size of the batches (samples) will be.
• Experimental tests will be divided into several phases (from 2 to 4), taking into account the
meaningfulness and the timing with which the vehicles will be available for reliability tests.
Reliability Growth Model (actual case)
[Figure: growth lines of PHASE 1, PHASE 2 and PHASE 3, each starting at 100 hours, with the Phase 1 Target
(Ph1T), the Phase 2 Target and the Final Target (FT). For Phase 1 the figure marks the end of Phase 1, the
actual length of Phase 1 and the theoretical length of Phase 1 to achieve the final target; the residual
parts are labelled UNTESTED FROM PHASE 1 and UNTESTED FROM PHASE 2.]
Example of calculation of Specific New Content for each phase
NCtot = 40% (yellow cells highlight initial assumptions)

                                          GROWTH PHASES                                                   Verification
                                          1             2                      3 (last)                   phase
CAR MEANINGFULNESS
  Estimated at the planning of testing    85%           95%                    100%                       100%
  Detected at the end of each phase       85%           95%                    98% (1)                    100% (2)
SPECIFIC NEW CONTENT OF EACH PHASE
  Formula (NCspec =)                      0,40 . 0,85   0,40 . (0,95 - 0,85)   0,40 . (1,00 - 0,95)       Conventionally = 36% of the (total)
                                                                                                          NC of the previous (last) phase
  Result of the formula                   34%           4%                     2%                         0,72%
Check: 34 + 4 + 2 = 40 = NCtot

(1) Before starting to test the 3rd batch (3rd phase), a full meaningfulness of 100% was expected but, during the 3rd phase, some new failure modes were
detected and then solved by implementing suitable corrective actions. The "New Content" due to these corrective actions (defined but never tested on the cars)
has been estimated at around 2%, so that the actual meaningfulness of the 3rd phase becomes equal to 98%. This requires an additional phase to get the actual 100%:
this phase is called the "verification phase". Conventionally, its New Content is assumed equal to 36% of the (total) New Content of the previous (last) phase.
(2) The actual car meaningfulness remains 100% at the end of the "verification phase" (only) if, as usually happens, no corrective actions have been implemented during
the verification phase.
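The formulas in the table above can be sketched as a short calculation. The function name and the constant name are illustrative; the 36% factor is the conventional value quoted in note (1).

```python
def specific_new_content(nc_total, planned_meaningfulness):
    """Split the total New Content among the growth phases.

    Each phase receives the total New Content multiplied by the increase
    in planned car meaningfulness with respect to the previous phase.
    """
    shares = []
    previous = 0.0
    for meaningfulness in planned_meaningfulness:
        shares.append(nc_total * (meaningfulness - previous))
        previous = meaningfulness
    return shares


NC_TOTAL = 0.40
VERIFICATION_FACTOR = 0.36  # conventional share of the last phase's New Content

growth_phases = specific_new_content(NC_TOTAL, [0.85, 0.95, 1.00])
verification_phase = VERIFICATION_FACTOR * growth_phases[-1]

for number, share in enumerate(growth_phases, start=1):
    print(f"Phase {number}: {100 * share:.2f}%")
print(f"Verification phase: {100 * verification_phase:.2f}%")
print(f"Check: {100 * sum(growth_phases):.2f}% = NCtot")
```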
Example of an EXCEL spreadsheet to plan the phases (by trial and error)
CAR MODEL Date
RELEASE FOR MANUFACTURE: 14.000 32
PRODUCTION START-UP: May 1, 2003
COMMERCIAL LAUNCH: June 15, 2003
TOTAL NEW-CONTENT = 55,00% FINAL TARGET = 0,2907 repairs/100 hours
Columns: PHASE 1, PHASE 2, PHASE 3, PHASE 4, PHASE 5, PHASE 6

Reference Information (OMITTED IN THIS CASE)
  DATE OF MANUFACTURE OF TESTED CARS
  PHASE's STARTING DATE
  PHASE's ENDING DATE 10,63
  PHASE's LENGTH [months]
  OVERLAPPING WITH SUBSEQUENT [months]
  MEANINGFULNESS OF TESTED CARS
  NUMBER OF TESTED CARS 52
  AVERAGE DISTANCE for each car [km] 130.000
  TOTAL DISTANCE [km]

Input data for log-log diagrams
  PHASE's SPECIFIC NEW CONTENT 35,75% 2,75% 5,50% 5,50% 5,50% 55,00%
  RESIDUAL NEW CONTENT from the PREVIOUS PHASE
  PHASE's TOTAL NEW CONTENT [%]
  SLOPE OF DUANE's STRAIGHT LINE
  CONFIDENCE LEVEL
  Actual TOTAL TEST HOURS 42.813

Automatic basic outputs
  HOURS to reach the FINAL TARGET with C.L. = 75%
  STARTING POINT (100 hours) with C.L. = 50%
  STARTING POINT (100 hours) with C.L. = 75%
  Attained TARGET at the end of PHASE with C.L. = 50%
  Attained TARGET at the end of PHASE with C.L. = 75%
  RESIDUAL NEW CONTENT at the end of phase [%]

Point A estimated (CL = 50%) at the planning: roughly around 2500 hours

Number of repairs foreseen in each phase
  Based on the Starting Point with C.L. 50%
  Based on the END-PHASE TARGET with C.L. 50% (min. = 1)
5.3.5
Results monitoring
5.3.5. R.G. for useful life: results monitoring
[Figure: planning line and test results (thick broken line). Vertical axis: failure frequency density λ [failures / (vehicle x 100 hours)]; horizontal axis: required test time.]
Monitoring the progress of testing is particularly simple, since the test results (thick broken line in
the figure) are displayed on the same chart that contains the planning line. This allows a continuous and
immediate comparison of the results with the intermediate targets defined by the growth model.
Each of the two curves (planned and resulting) is the upper limit of a confidence band, because it
identifies, point by point, the upper limit of the specific confidence interval.
Usually the broken line is updated weekly. Thus, the 1st point of the broken line is the number of
failures detected by the end of the first week divided by the total test hours accumulated during the
first week; the 2nd point is calculated with the same number of failures, but divided by the sum of the
hours accumulated during both the first and the second week; the 3rd point, positioned vertically above
the 2nd, is calculated by dividing the total number of failures of the two weeks by the total hours
accumulated in the two weeks. We continue this way until we get to point A, at which (for the first time)
the broken line has a vertical drop due to the benefits of the implemented corrective actions.
Therefore, point A represents the simplest graphical estimate of the Starting Point value, deduced from
experimental results.
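The weekly update of the broken line described above can be sketched as follows. The sketch covers only the part of the line before point A (no failures are removed for effective corrective actions), assumes that the weekly hours are the test hours accumulated by all vehicles together, and uses hypothetical weekly figures.

```python
def monitoring_points(weekly_failures, weekly_hours):
    """Vertices of the broken line: (cumulative hours, failures per 100 hours)."""
    points = []
    total_failures = 0
    total_hours = 0.0
    for failures, hours in zip(weekly_failures, weekly_hours):
        total_hours += hours
        if points:
            # Same failure count as before over more hours: the line slopes down.
            points.append((total_hours, 100.0 * total_failures / total_hours))
        total_failures += failures
        # The failures of the week are added at the same abscissa:
        # vertical step up (when the week had failures).
        points.append((total_hours, 100.0 * total_failures / total_hours))
    return points


# Hypothetical data: 4 failures in 500 test hours, then 3 failures in 700 hours.
points = monitoring_points([4, 3], [500.0, 700.0])
for hours, frequency in points:
    print(f"{hours:7.0f} h  {frequency:.3f} failures / (vehicle x 100 hours)")
```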
Failures attributable to the production process are excluded from this analysis and accounting.
A specific analysis is devoted to them, to be carried out at the beginning of actual production, which
will be discussed below (see Paragraph 5.4.).
The fact that the experimental curve falls towards the target only at the end of the graph is due to two
systematic causes:
• To eliminate a specific kind of failure from the graph, we first need to have demonstrated (e.g. by
accelerated tests aimed at the specific kind of failure and carried out separately from the R.G. ones)
the effectiveness of the adopted corrective actions, and that takes time!
• The graph is logarithmic and thus each successive logarithmic decade (from the left) covers 10 times
as many hours as the previous one.
• Test results are independent of the Test Plan. It is always indicated whether the target has been
reached or not. The damage of a wrong plan is having to extend the experimental tests beyond
expectations (time, mileage, etc.).
• As the cars of the first batches (= first phases) of Reliability Growth have a meaningfulness of less
than 100%, achieving the phase target implies only that Reliability is increasing according to the
planned roadmap. Only the last phase, in which there must be a meaningfulness equal to 100%, is able to
certify the achievement of the Reliability target.
• If any batch/phase, during its testing, presents cars with meaningfulness lower than planned or fails
to achieve its phase target, it is advisable to reschedule the subsequent phases.
In assessing the confidence interval of the final estimate of the failure frequency (of a new car model)
after testing the last lot, Reliability Growth Testing offers a further advantage, because it takes into
account not only the size of the last batch (as classical Statistics suggests), but also that of the
previous batches.
4 BATCHES OR PHASES
                                                      GROWTH PHASES                 Verification phase
                                                      1        2        3 (last)    actual data    virtual sample
Car meaningfulness                                    85%      95%      98%         100%
Actual number of cars (of each batch)                 20       40       30          10             -
Virtual number of cars counted for the final sample   17,00    38,00    29,40       10,00          94,40
Because the cars of the batches preceding the last one do not have full meaningfulness, their size is
"weighted" by the meaningfulness of each batch, as shown in the table above.
In practice, in the above example, the confidence interval is not calculated on the basis of only the 10
cars of the last lot, but also taking into account the cars of all previous batches (albeit "weighted" by
their meaningfulness). So 94.4 cars are counted (in a virtual way). Therefore, with a larger number of
cars, we obtain a smaller confidence interval.
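The virtual sample of the example is simply a meaningfulness-weighted sum, as the following minimal sketch shows (variable names are illustrative):

```python
# (actual number of cars, car meaningfulness) for each batch of the example
batches = [
    (20, 0.85),  # growth phase 1
    (40, 0.95),  # growth phase 2
    (30, 0.98),  # growth phase 3 (last)
    (10, 1.00),  # verification phase
]

# Every batch counts for its number of cars weighted by its meaningfulness.
virtual_cars = [cars * meaningfulness for cars, meaningfulness in batches]
virtual_sample = sum(virtual_cars)

print("Virtual cars per batch:", [f"{v:.2f}" for v in virtual_cars])
print(f"Virtual sample size: {virtual_sample:.2f} cars instead of {batches[-1][0]}")
```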
The certified value of the reliability target is certainly achieved by the cars of the last phase, once
all corrective actions have been implemented (because these cars have the detected problems and
substantially no others).
In the field, instead, there will also be all the failures which, because of the randomness of sampling,
did not occur in the tested sample.
This requires some adjustment to refine the prediction of the in field reliability: the in field
prediction adjustment.
5.3.6
In field prediction adjustment
5.3.6. R.G. for useful life: in field prediction adjustment
• We have already mentioned that a vehicle model has about 1000÷1200 different kinds of failure, almost
half of which have a frequency of less than 0.5‰: in these cases it is generally better to avoid
introducing improvements.
• We cannot expect all kinds of defects to be present in a single sample of vehicles: e.g., with 100
cars, we can see about 17% of the different kinds of failure, corresponding to about 65% of the total
failures in the field.
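A rough feeling for these percentages comes from the binomial distribution. The sketch below assumes that each car independently shows a given kind of failure with probability p, so that the kind is seen at least once in n cars with probability 1 - (1 - p)^n; the function name and the list of frequencies are illustrative.

```python
def visibility(p, n):
    """Probability that a failure kind with per-car frequency p shows up
    at least once in a sample of n cars (independent cars assumed)."""
    return 1.0 - (1.0 - p) ** n


SAMPLE_SIZE = 100
for per_car_frequency in (0.0005, 0.001, 0.01, 0.05):
    seen = visibility(per_car_frequency, SAMPLE_SIZE)
    print(f"frequency {1000 * per_car_frequency:5.1f} per thousand: "
          f"seen with probability {100 * seen:.1f}% in {SAMPLE_SIZE} cars")
```

Sporadic kinds of failure are therefore very unlikely to appear in a sample of 100 cars, while frequent ones almost certainly do.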
An example to better capture the concept
Suppose there are 1000 balls in an urn, with the characteristics shown in the table:
Number of balls   Color                                           Meaning in the analogy
500               white                                           No failures
100               blue                                            Failures of kind A
100               red                                             Failures of kind B
50                green                                           Failures of kind C
50                yellow                                          Failures of kind D
200               metallic (each one different from the others)   Other kinds of failures (sporadic)
Suppose we extract a sample of 20 balls from the urn. Without resorting to Statistics ...
• … it is reasonable to expect that the white balls make up, more or less, one half of the sample and
that the other half is made up of balls of various colors;
• … it cannot be taken for granted that the green/yellow balls are present in the sample in a quantity
(even approximately) halved compared to the blue/red ones: this can be generalized by saying that usually
the frequency of the colors of the whole population is not observed in the sample;
• … indeed, in a sample of 20 balls, it is even possible that a color is completely missing: of course
it is more likely that green or yellow is missing, but it is not impossible that blue and/or red are
missing;
• … the presence of some metallic balls (all of different colors) in the sample is plausible: based on
the sample, one is led to attribute to each of them a frequency of 1/20, while we know that their true
frequency, in this example, is 1/1000.
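For readers who do want to put numbers on the urn, the probability that a color is completely missing from the sample follows from counting the samples that avoid it (drawing without replacement). This is a small illustrative sketch, not part of the original slides; the function name is an assumption.

```python
from math import comb


def prob_color_missing(population, color_count, sample_size):
    """Probability that none of the balls of a given color is drawn
    when sample_size balls are extracted without replacement."""
    return comb(population - color_count, sample_size) / comb(population, sample_size)


POPULATION = 1000
SAMPLE = 20
urn = {"white": 500, "blue": 100, "red": 100, "green": 50, "yellow": 50}

for color, count in urn.items():
    missing = prob_color_missing(POPULATION, count, SAMPLE)
    print(f"{color:6s}: missing from the sample with probability {100 * missing:.2f}%")

# Each metallic ball is unique: probability that one specific ball is drawn.
print(f"a given metallic ball is drawn with probability {100 * SAMPLE / POPULATION:.1f}%")
```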
Generalized deductions
1. The overall failure frequency (i.e. the failure frequency at the level of the whole vehicle) varies
from sample to sample much less than that of a single kind of failure, and the latter varies all the more
the lower the frequency of the kind of failure.
2. It can be practically ruled out that a single sample, however large, contains at least one case of
every potentially possible kind of failure and, since we cannot act on what we have not seen in the
sample, we must remember this when we estimate the in field reliability.
3. In general, we should not act on sporadic failures (with a frequency of less than 1‰), but there is no
sure way to distinguish them from the others in a sample. The only reference is the experience (or
intuition!) of Technicians.
• If we want to make a reliable forecast of what will happen in the field, we need to take the above
into account.
• We do it by the “in field adjustment” described below.
• Referring to the next slide, we can see that, until the recording of the first corrective action
recognized as effective (Point A in the graph), we have made a classic failure frequency estimate, whose
uncertainty is defined by the confidence interval.
• After point A, we eliminate the failure frequencies related to the solved problems (namely those for
which corrective actions have been identified and recognized as effective) in order to get the final
result on the sample, corresponding to the target or very close to it.
• This result is correct for the cars of the sample, which have virtually no problems other than those
identified during the test (regardless of whether they have been eliminated with corrective actions or
not),
• but it is not true for the cars in the field (sold to customers), which are also affected by all the
kinds of failure which did not occur in the sample, and therefore could not be taken into account.
We can use the “detectability” in the sample in order to adjust the in field reliability prediction,
based on the sample results, starting from the last completely meaningful car batches.
[Figure: % OF FAILURES DETECTABLE IN THE SAMPLE versus SAMPLE SIZE [number of cars].]
SAMPLE SIZE [number of cars]           10      20      50      100     200     300     500     1.000
No. failure kinds mildly detectable    2,8%    5,0%    10,3%   16,6%   25,5%   32,1%   41,9%   58,0%
Repairs/100 cars mildly detectable     19,5%   31,6%   50,5%   64,1%   75,4%   80,7%   86,1%   91,6%
[Figure: failure frequency density λ [failures / (vehicle x 100 hours)], supposing a total of 100 cars.
The interval between Point A (credible estimate of failure frequency before last corrective actions) and
the improvement detected on the sample is split according to the detectability in a sample (e.g. 64%) and
the % of failures not detectable in a sample (e.g. 36%); the dividing point is the in field prediction.]
• At this point, it is clear that the true value of the in field failure frequency lies between point A
and the value of the result (near the target) reached on the sample.
• Taking the value of Point A as a prediction of the in field failure frequency would be pessimistic,
because a number of corrective actions have been implemented. On the other hand, assuming the value
provided by the result on the sample would be optimistic, because it does not take into account the kinds
of failure that have not appeared in the sample.
• So, very empirically, we split the interval between the previous two alternative estimates into two
parts proportional to the visibility and to the non-visibility on the sample, as shown in the previous
slide.
• We have already seen in Chapter 2 - Statistics, speaking about the binomial distribution, the meaning
of "detectability" (or, rather, visibility) on a sample of size n: the non-visibility quantifies, in some
way, the inadequacy of the sample for achieving a very precise estimate.
• In this way, with very large samples (practically unfeasible!), the in field estimate would be close to
the result on the sample, while, with small samples, the in field estimate differs only slightly from the
value of Point A.
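One way to read the adjustment just described is as a linear interpolation between the sample result and Point A, weighted by the non-visibility. The sketch below follows that reading; the function name and the two failure frequencies are hypothetical, while the 64% detectability is the example value quoted for 100 cars.

```python
def in_field_prediction(point_a, sample_result, detectability):
    """Split the interval between Point A and the sample result.

    Only the share of the improvement corresponding to the detectability
    is credited to the in field prediction (assumed reading of the method).
    """
    if not 0.0 <= detectability <= 1.0:
        raise ValueError("detectability must be between 0 and 1")
    return sample_result + (1.0 - detectability) * (point_a - sample_result)


POINT_A = 0.60        # hypothetical, failures / (vehicle x 100 hours)
SAMPLE_RESULT = 0.30  # hypothetical result reached on the sample
DETECTABILITY = 0.64  # example value for a sample of 100 cars

prediction = in_field_prediction(POINT_A, SAMPLE_RESULT, DETECTABILITY)
print(f"In field prediction: {prediction:.3f} failures / (vehicle x 100 hours)")
```

With a detectability close to 1 the prediction tends to the sample result, and with a detectability close to 0 it tends to Point A, as stated in the last bullet above.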
This very simple semi-empirical technique has always provided good results and appears to be the minimum
countermeasure to avoid wrong predictions!
5.4
RELIABILITY TESTS
ON COMPLETE VEHICLES
FOR PROCESS VERIFICATION
(Reliability Growth Testing
for Infant Mortality)
5.4. R.G. for infant mortality: process verification
To capture as much as possible of the actual variability of the production process, we must examine a
very large number of vehicles, which, for this reason, will undergo a minimum-distance test, so that they
can be reconditioned and then sold as new (or even with a special Quality mark!).
Fortunately, the production process mainly creates infant mortality problems (i.e. they happen
immediately or never) and therefore the choice of very small mileages is justified, because the design
problems have already been detected and removed with Reliability Growth Testing for useful life.
But how much can these distances be limited? Even infant mortality failures often require some time to
"mature" (i.e. to be perceived by the driver). So if we reduce the distance too much, we risk not
detecting some kinds of failure (probably not very frequent, but able to annoy customers).
Usually, we consider the following classes of mileage, hoping that they suffice to bring out all possible
kinds of infant mortality failure (due to the production process):
from 0 to 100 km;
from 0 to 180 km;
from 0 to 1.500 km (rarely).
In practice, Reliability Growth Testing for infant mortality lasts 6 months (a little more for car models
with low production):
• in the first 3 months, the main purpose is to bring out as many of the potential failures as possible,
• while during the next 3 months, the aim of verifying the effectiveness of the corrective actions
(identified and implemented in the previous three months) is added.
The main difficulty in applying the Reliability Growth Testing method in the area of infant mortality is
that, in the bathtub curve, the failure frequency density is not constant, but decreases with distance.
As a rough approximation, we consider the values of failure frequency density within each of the already
mentioned classes as constant, and equal to their average values:
from 0 to 100 km;
from 0 to 180 km;
from 0 to 1.500 km (if used);
and we operate separately for each class.
[Figure: failure frequency density versus mileage. Vertical axis: failure frequency density λ [failures / (cars x 100 km)]; horizontal axis: mileage [km], with marks at 180 km, 1.500 km and the annual average mileage; the levels "λ at 180 km" and "λ at 1.500 km" are indicated.]
After that, the application of Reliability Growth Testing for infant mortality, i.e. for the verification
of the production process, can be developed in a similar way to that seen for useful life and design
verification.
However (instead of using Reliability Growth Testing for infant mortality), the production process
validation is often conducted by analysing and comparing the failure frequencies of (small) car batches,
taken weekly from production (which is still under development).
Given the high number of cars involved in the sampling, the Reliability Growth for infant mortality does
not require the in field prediction adjustment.
5.5
FINAL SUMMARY ON
EXPERIMENTAL TESTS
5.5. Final summary on experimental tests
Different categories of experimental tests and current standards to define sample size:
OPERATING tests: at least 1 failure in the sample (with assigned probability): binomial distribution
RELIABILITY tests:
  Reliability Verification: Success-Run; M(t) curve with (one-sided) confidence band
  Reliability Growth Testing: Design (useful life); Process (infant mortality)
DURABILITY tests
The predictions obtained using the Reliability Growth Testing methodology, by combining tests for useful
life and those for infant mortality, proved to be particularly accurate.
The forecast error on the failure frequency at 12 months of use, which can be measured only after about a
year and a half, was around 15%, which can be considered a very good result, considering how far in
advance the forecast is given.
The estimate is generally a little optimistic, because it cannot take into account possible process
drifts. It can become pessimistic for those models (e.g. niche models) for which smaller samples are
taken: the smaller sample size widens the confidence interval of the failure frequency estimate.
EXAMPLES (only qualitative) of comparisons between:
• PREDICTION at the end of Reliability Growth Testing and
• RESULT IN FIELD after one year of use of the cars sold in the 2nd, 3rd and 4th month after the Commercial Launch.
[Figure: FAILURES / 100 CARS for several CAR MODELS (first applications of the Reliability Growth Testing
methodology), comparing RG PREDICTION with IN FIELD RESULT. Annotation on one model: unusual conspicuous
process drifts occurred immediately after the Commercial Launch (causes have been identified).]
Those who want to go deeper into the topics concerning “Reliability Growth Testing” may usefully read the
1st Chapter (Classical Reliability Growth Testing) of the Thesis of Ma Xiao
