manifestation of the random variability inherent in the data. If this is true, the value should be retained and processed in the same manner as the other observations in the sample.
1.1.2 On the other hand, an outlying observation may be the result of gross deviation from prescribed experimental procedure or an error in calculating or recording the numerical value. In such cases, it may be desirable to institute an investigation to ascertain the reason for the aberrant value. The observation may even actually be rejected as a result of the investigation, though not necessarily so. At any rate, in subsequent data analysis the outlier or outliers will be recognized as probably being from a different population than that of the other sample values.
1.2 It is our purpose here to provide statistical rules that will lead the experimenter almost unerringly to look for causes of outliers when they really exist, and hence to decide whether alternative 1.1.1 above, is not the more plausible hypothesis to accept, as compared to alternative 1.1.2, in order that the most appropriate action in further data analysis may be taken. The procedures covered herein apply primarily to the simplest kind of experimental data, that is, replicate measurements of some property of a given material, or observations in a supposedly single random sample. Nevertheless, the tests suggested do cover a wide enough range of cases in practice to have broad utility.
3.1.1 outlier—see outlying observation.
3.1.2 outlying observation, n—an observation that appears to deviate markedly in value from other members of the sample in which it appears.
4. Significance and Use
4.1 When the experimenter is clearly aware that a gross deviation from prescribed experimental procedure has taken place, the resultant observation should be discarded, whether or not it agrees with the rest of the data and without recourse to statistical tests for outliers. If a reliable correction procedure, for example, for temperature, is available, the observation may sometimes be corrected and retained.
4.2 In many cases evidence for deviation from prescribed procedure will consist primarily of the discordant value itself. In such cases it is advisable to adopt a cautious attitude. Use of one of the criteria discussed below will sometimes permit a clear-cut decision to be made. In doubtful cases the experimenter’s judgment will have considerable influence. When the experimenter cannot identify abnormal conditions, he should at least report the discordant values and indicate to what extent they have been used in the analysis of the data.
4.3 Thus, for purposes of orientation relative to the over-all problem of experimentation, our position on the matter of screening samples for outlying observations is precisely the following:
4.3.1 Physical Reason Known or Discovered for Outlier(s):
1 This practice is under the jurisdiction of ASTM Committee E11 on Quality and Statistics and is the direct responsibility of Subcommittee E11.10 on Sampling / Statistics. Current edition approved Oct. 1, 2008. Published November 2008. Originally approved in 1961. Last previous edition approved in 2002 as E178 – 02. DOI: 10.1520/E0178-08.
2 For referenced ASTM standards, visit the ASTM website, www.astm.org, or contact ASTM Customer Service at service@astm.org. For Annual Book of ASTM Standards volume information, refer to the standard’s Document Summary page on the ASTM website.
Copyright © ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959. United States
E178 − 08
4.3.1.1 Reject observation(s).
4.3.1.2 Correct observation(s) on physical grounds.
4.3.1.3 Reject it (them) and possibly take additional observation(s).
4.3.2 Physical Reason Unknown—Use Statistical Test:
4.3.2.1 Reject observation(s).
4.3.2.2 Correct observation(s) statistically.
4.3.2.3 Reject it (them) and possibly take additional observation(s).
4.3.2.4 Employ truncated-sample theory for censored observations.
4.4 The statistical test may always be used to support a judgment that a physical reason does actually exist for an outlier, or the statistical criterion may be used routinely as a basis to initiate action to find a physical cause.
5. Basis of Statistical Criteria for Outliers
5.1 There are a number of criteria for testing outliers. In all of these, the doubtful observation is included in the calculation of the numerical value of a sample criterion (or statistic), which is then compared with a critical value based on the theory of random sampling to determine whether the doubtful observation is to be retained or rejected. The critical value is that value [...] nonrandom causes (human error, loss of calibration of instrument, change of measuring instrument, or even change of time of measurements, etc.), then the observed value of the sample criterion used would exceed the “critical value” based on random-sampling theory. Tables of critical values are usually given for several different significance levels, for example, 5 %, 1 %. For statistical tests of outlying observations, it is generally recommended that a low significance level, such as 1 %, be used and that significance levels greater than 5 % should not be common practice.
NOTE 1—In this practice, we will usually illustrate the use of the 5 % significance level. Proper choice of level in probability depends on the particular problem and just what may be involved, along with the risk that one is willing to take in rejecting a good observation, that is, if the null-hypothesis stating “all observations in the sample come from the same normal population” may be assumed correct.
5.2 It should be pointed out that almost all criteria for outliers are based on an assumed underlying normal (Gaussian) population or distribution. When the data are not normally or approximately normally distributed, the probabilities associated with these tests will be different. Until such time as criteria not sensitive to the normality assumption are developed, the experimenter is cautioned against interpreting the probabilities too literally.
5.3 Although our primary interest here is that of detecting outlying observations, we remark that some of the statistical criteria presented may also be used to test the hypothesis of normality or that the random sample taken did come from a normal or Gaussian population. The end result is for all practical purposes the same, that is, we really wish to know whether we ought to proceed as if we have in hand a sample of homogeneous normal observations.
6. Recommended Criteria for Single Samples
6.1 Let the sample of n observations be denoted in order of increasing magnitude by x1 ≤ x2 ≤ x3 ≤ ... ≤ xn. Let xn be the doubtful value, that is the largest value. The test criterion, Tn, recommended here for a single outlier is as follows:
$$T_n = (x_n - \bar{x})/s \qquad (1)$$
where:
x̄ = arithmetic average of all n values, and
s = estimate of the population standard deviation based on the sample data, calculated as follows:
$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}x_i^2 - n\bar{x}^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}x_i^2 - \left(\sum_{i=1}^{n}x_i\right)^2/n}{n-1}}$$
[...] tables give the “one-sided” significance levels. In the previous tentative recommended practice (1961), the tables listed values of significance levels double those in the present practice, since it was considered that the experimenter would test either the lowest or the highest observation (or both) for statistical significance. However, to be consistent with actual practice and in an attempt to avoid further misunderstanding, single-sided significance levels are tabulated here so that both viewpoints can be represented.
6.2 The hypothesis that we are testing in every case is that all observations in the sample come from the same normal population. Let us adopt, for example, a significance level of 0.05. If we are interested only in outliers that occur on the high side, we should always use the statistic Tn = (xn − x̄)/s and take as critical value the 0.05 point of Table 1. On the other hand, if we are interested only in outliers occurring on the low side, we would always use the statistic T1 = (x̄ − x1)/s and again take as a critical value the 0.05 point of Table 1. Suppose, however, that we are interested in outliers occurring on either side, but do not believe that outliers can occur on both sides simultaneously. We might, for example, believe that at some time during the experiment something possibly happened to cause an extraneous variation on the high side or on the low side, but that it was very unlikely that two or more such events could have occurred, one being an extraneous variation on the high side and the other an extraneous variation on the low side. With
TABLE 1 Critical Values for T (One-Sided Test) When Standard Deviation is Calculated from the Same Sample (Note A)
Number of       Upper 0.1 %    Upper 0.5 %    Upper 1 %      Upper 2.5 %    Upper 5 %      Upper 10 %
Observations,   Significance   Significance   Significance   Significance   Significance   Significance
n               Level          Level          Level          Level          Level          Level
3               1.155          1.155          1.155          1.155          1.153          1.148
4               1.499          1.496          1.492          1.481          1.463          1.425
5               1.780          1.764          1.749          1.715          1.672          1.602
...
21              3.266          3.031          2.912          2.733          2.580          2.408
22              3.300          3.060          2.939          2.758          2.603          2.429
23              3.332          3.087          2.963          2.781          2.624          2.448
24              3.362          3.112          2.987          2.802          2.644          2.467
25              3.389          3.135          3.009          2.822          2.663          2.486
...
64              3.903          3.586          3.437          3.224          3.049          2.860
65              3.910          3.592          3.442          3.230          3.055          2.866
...
81              4.002          3.677          3.525          3.309          3.134          2.945
82              4.007          3.682          3.529          3.315          3.139          2.949
83              4.012          3.687          3.534          3.319          3.143          2.953
84              4.017          3.691          3.539          3.323          3.147          2.957
85              4.021          3.695          3.543          3.327          3.151          2.961
...
125             4.164          3.831          3.675          3.457          3.281          3.092
126             4.166          3.833          3.677          3.460          3.284          3.095
127             4.169          3.836          3.680          3.462          3.286          3.097
128             4.173          3.838          3.683          3.465          3.289          3.100
129             4.175          3.840          3.686          3.467          3.291          3.102
130             4.178          3.843          3.688          3.470          3.294          3.104
...
143             4.209          3.874          3.719          3.501          3.324          3.135
144             4.212          3.876          3.721          3.503          3.326          3.138
145             4.214          3.879          3.723          3.505          3.328          3.140
146             4.216          3.881          3.725          3.507          3.331          3.142
147             4.219          3.883          3.727          3.509          3.334          3.144
$$T_n = (x_n - \bar{x})/s$$
$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}x_i^2 - n\bar{x}^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}x_i^2 - \left(\sum_{i=1}^{n}x_i\right)^2/n}{n-1}}$$
$$T_1 = (\bar{x} - x_1)/s, \qquad x_1 \le x_2 \le \dots \le x_n$$
A Values of T are taken from Ref (1). All values have been adjusted for division by n – 1 instead of n in calculating s.
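The three expressions for s printed beneath Table 1 are algebraically identical. The following is a minimal sketch, not part of the practice, that checks this numerically on the ten breaking-strength values quoted in Example 1 (6.2.1). The function names are illustrative assumptions. Note that the two shortcut forms subtract large, nearly equal numbers, so in floating-point arithmetic the definitional form is usually the safer choice.

```python
import math

def s_definition(x):
    """s from the sum of squared deviations about the mean, divisor n - 1."""
    n = len(x)
    mean = sum(x) / n
    return math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))

def s_mean_shortcut(x):
    """s from (sum of x_i^2 - n * mean^2) / (n - 1)."""
    n = len(x)
    mean = sum(x) / n
    return math.sqrt((sum(xi * xi for xi in x) - n * mean * mean) / (n - 1))

def s_sum_shortcut(x):
    """s from (sum of x_i^2 - (sum of x_i)^2 / n) / (n - 1)."""
    n = len(x)
    total = sum(x)
    return math.sqrt((sum(xi * xi for xi in x) - total * total / n) / (n - 1))

# Breaking strengths (pounds) quoted in Example 1.
sample = [568, 570, 570, 570, 572, 572, 572, 578, 584, 596]

for form in (s_definition, s_mean_shortcut, s_sum_shortcut):
    print(form.__name__, round(form(sample), 4))
```

All three calls should agree to well within rounding, and each rounds to the s = 8.70 used in Example 1.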
this point of view we should use the statistic Tn = (xn − x̄)/s or the statistic T1 = (x̄ − x1)/s whichever is larger. If in this instance we use the 0.05 point of Table 1 as our critical value, the true significance level would be twice 0.05 or 0.10. If we wish a significance level of 0.05 and not 0.10, we must in this case use as a critical value the 0.025 point of Table 1. Similar considerations apply to the other tests given below.
6.2.1 Example 1—As an illustration of the use of Tn and Table 1, consider the following ten observations on breaking strength (in pounds) of 0.104-in. hard-drawn copper wire: 568, 570, 570, 570, 572, 572, 572, 578, 584, 596. See Fig. 1. The doubtful observation is the high value, x10 = 596. Is the value of 596 significantly high? The mean is x̄ = 575.2 and the estimated standard deviation is s = 8.70. We compute
$$T_{10} = (596 - 575.2)/8.70 = 2.39 \qquad (3)$$
From Table 1, for n = 10, note that a T10 as large as 2.39 would occur by chance with probability less than 0.05. In fact, so large a value would occur by chance not much more often than 1 % of the time. Thus, the weight of the evidence is against the doubtful value having come from the same population as the others (assuming the population is normally distributed). Investigation of the doubtful value is therefore indicated.
6.3 An alternative system, the Dixon criteria, based entirely on ratios of differences between the observations is described in the literature (2)³ and may be used in cases where it is desirable to avoid calculation of s or where quick judgment is called for. For the Dixon test, the sample criterion or statistic changes with sample size. Table 2 gives the appropriate statistic to calculate and also gives the critical values of the statistic for the 1, 5, and 10 % levels of significance.
6.3.1 Example 2—As an illustration of the use of Dixon’s test, consider again the observations on breaking strength given in Example 1, and suppose that a large number of such samples had to be screened quickly for outliers and it was judged too time-consuming to compute s. Table 2 indicates use of
$$r_{11} = (x_n - x_{n-1})/(x_n - x_2) \qquad (4)$$
Thus, for n = 10,
$$r_{11} = (x_{10} - x_9)/(x_{10} - x_2) \qquad (5)$$
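The arithmetic of Example 1 and the either-side rule of 6.2 can be sketched as follows. This is an illustrative sketch only: the function names `t_statistics` and `table_level_for_either_side` are assumptions, and critical values must still be read from Table 1.

```python
import math

def t_statistics(sample):
    """Return (mean, s, T1, Tn) for a single sample, with s using divisor n - 1."""
    x = sorted(sample)
    n = len(x)
    mean = sum(x) / n
    s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))
    t_low = (mean - x[0]) / s    # T1, for a suspected low outlier
    t_high = (x[-1] - mean) / s  # Tn, for a suspected high outlier
    return mean, s, t_low, t_high

def table_level_for_either_side(alpha):
    """One-sided Table 1 level to look up when the larger of T1 and Tn is tested."""
    return alpha / 2

# Breaking strengths (pounds) from Example 1.
wire = [568, 570, 570, 570, 572, 572, 572, 578, 584, 596]
mean, s, t_low, t_high = t_statistics(wire)

print(f"mean = {mean:.1f}, s = {s:.2f}, T10 = {t_high:.2f}")
print("either-side test at 0.05 uses the", table_level_for_either_side(0.05), "point")
```

When only the high side is of interest, `t_high` is compared with the 0.05 point of Table 1 directly. When the larger of `t_low` and `t_high` is tested, the 0.025 point is the one that keeps the overall level at 0.05.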
TABLE 2 Dixon Criteria for Testing of Extreme Observation (Single Sample)A
Significance Level (One-Sided Test)
n Criterion
10 percent 5 percent 1 percent
3 r10 = (x2 − x1)/(xn − x1) if smallest value is suspected; 0.886 0.941 0.988
4 = (xn − xn−1)/(xn − x1) if largest value is suspected 0.679 0.765 0.889
5 0.557 0.642 0.780
6 0.482 0.560 0.698
7 0.434 0.507 0.637
8 r11 = (x2 − x1)/(xn−1 − x1) if smallest value is suspected; 0.479 0.554 0.683
9 = (xn − xn−1)/(xn − x2) if largest value is suspected. 0.441 0.512 0.635
10 0.409 0.477 0.597
11 r21 = (x3 − x1)/(xn−1 − x1) if smallest value is suspected; 0.517 0.576 0.679
12 = (xn − xn−2)/(xn − x2) if largest value is suspected. 0.490 0.546 0.642
13 0.467 0.521 0.615
14 r22 = (x3 − x1)/(xn−2 − x1) if smallest value is suspected; 0.492 0.546 0.641
15 = (xn − xn−2)/(xn − x3) if largest value is suspected. 0.472 0.525 0.616
16 0.454 0.507 0.595
17 0.438 0.490 0.577
18 0.424 0.475 0.561
19 0.412 0.462 0.547
20 0.401 0.450 0.535
21 0.391 0.440 0.524
22 0.382 0.430 0.514
23 0.374 0.421 0.505
24 0.367 0.413 0.497
25 0.360 0.406 0.489
26 0.354 0.399 0.486
27 0.348 0.393 0.475
28 0.342 0.387 0.469
29 0.337 0.381 0.463
30 0.332 0.376 0.457
A x1 ≤ x2 ≤ ... ≤ xn. (See Ref (2), Appendix.)
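Table 2 switches criterion as n grows. A minimal sketch of that selection is given below, assuming a hypothetical helper `dixon_ratio` that is not part of the practice. A suspected low value is handled by mirroring the sample, which turns each "smallest value" formula of Table 2 into its "largest value" counterpart.

```python
def dixon_ratio(sample, suspect="high"):
    """Return (criterion name, ratio) using the sample-size ranges of Table 2."""
    x = sorted(sample)
    n = len(x)
    if not 3 <= n <= 30:
        raise ValueError("Table 2 covers 3 <= n <= 30 only")
    if suspect not in ("high", "low"):
        raise ValueError("suspect must be 'high' or 'low'")
    # gap: how far below the largest value the numerator reaches
    # trim: how many of the smallest values the denominator skips
    if n <= 7:
        name, gap, trim = "r10", 1, 0
    elif n <= 10:
        name, gap, trim = "r11", 1, 1
    elif n <= 13:
        name, gap, trim = "r21", 2, 1
    else:
        name, gap, trim = "r22", 2, 2
    if suspect == "low":
        # Mirror the sample so that the suspected low value becomes the largest.
        x = [-v for v in reversed(x)]
    largest = x[-1]
    return name, (largest - x[-1 - gap]) / (largest - x[trim])

# Breaking strengths (pounds) from Example 1; the high value 596 is suspected.
wire = [568, 570, 570, 570, 572, 572, 572, 578, 584, 596]
name, ratio = dixon_ratio(wire, "high")
print(name, round(ratio, 3))
```

For these data the function selects r11, and the ratio is then compared with the n = 10 row of Table 2 (0.477 at the 5 percent level).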
the best one to use for the single-outlier case, and final statistical judgment should be based on it. See Ferguson (3,4).
6.3.2 Further examination of the sample observations on breaking strength of hard-drawn copper wire indicates that none of the other values need testing.
NOTE 2—With experience we may usually just look at the sample values to observe if an outlier is present. However, strictly speaking the statistical test should be applied to all samples to guarantee the significance levels used. Concerning “multiple” tests on a single sample, we comment on this below.
6.4 A test equivalent to Tn (or T1) based on the sample sum of squared deviations from the mean for all the observations and the sum of squared deviations omitting the “outlier” is given by Grubbs (5).
6.5 The next type of problem to consider is the case where we have the possibility of two outlying observations, the least and the greatest observation in a sample. (The problem of testing the two highest or the two lowest observations is considered below.) In testing the least and the greatest observations simultaneously as probable outliers in a sample, we use the ratio of sample range to sample standard deviation test of David, Hartley, and Pearson (6). The significance levels for this sample criterion are given in Table 3. Alternatively, the largest [...]
[...] observations of the vertical semidiameters of Venus made by Lieutenant Herndon in 1846 (8). In the reduction of the observations, Prof. Pierce assumed two unknown quantities and found the following residuals which have been arranged in ascending order of magnitude:
−1.40 in.   −0.24   −0.05   0.18   0.48
−0.44       −0.22    0.06   0.20   0.63
−0.30       −0.13    0.10   0.39   1.01
See Fig. 2.
The deviations −1.40 and 1.01 appear to be outliers. Here the suspected observations lie at each end of the sample. Much less work has been accomplished for the case of outliers at both ends of the sample than for the case of one or more outliers at only one end of the sample. This is not necessarily because the “one-sided” case occurs more frequently in practice but because “two-sided” tests are much more difficult to deal with. For a high and a low outlier in a single sample, we give two procedures below, the first being a combination of tests, and the second a single test of Tietjen and Moore (7) which may have nearly optimum properties. For optimum procedures when there is an independent estimate at hand, s² or σ², see (9).
6.6 For the observations on the semi-diameter of Venus given above, all the information on the measurement error is [...]
TABLE 3 Critical Values (One-Sided Test) for w/s (Ratio of Range to Sample Standard Deviation) (Note A)
Number of       5 Percent      1 Percent      0.5 Percent
Observations,   Significance   Significance   Significance
n               Level          Level          Level
3               2.00           2.00           2.00
4               2.43           2.44           2.45
5               2.75           2.80           2.81
6               3.01           3.10           3.12
7               3.22           3.34           3.37
8               3.40           3.54           3.58
9               3.55           3.72           3.77
10              3.68           3.88           3.94
11              3.80           4.01           4.08
12              3.91           4.13           4.21
13              4.00           4.24           4.32
14              4.09           4.34           4.43
15              4.17           4.43           4.53
16              4.24           4.51           4.62
17              4.31           4.59           4.69
18              4.38           4.66           4.77
19              4.43           4.73           4.84
20              4.49           4.79           4.91
30              4.89           5.25           5.39
40              5.15           5.54           5.69
50              5.35           5.77           5.91
60              5.50           5.93           6.09
80              5.73           6.18           6.35
100             5.90           6.36           6.54
150             6.18           6.64           6.84
200             6.38           6.85           7.03
500             6.94           7.42           7.60
1000            7.33           7.80           7.99
A See Ref (6), where:
$$w = x_n - x_1, \qquad x_1 \le x_2 \le \dots \le x_n$$
$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}x_i^2 - n\bar{x}^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}x_i^2 - \left(\sum_{i=1}^{n}x_i\right)^2/n}{n-1}}$$
$$w/s = (x_n - x_1)/s \qquad (7)$$
where:
$$s = \sqrt{\sum (x_i - \bar{x})^2/(n-1)} \qquad (8)$$
If xn is about as far above the mean, x̄, as x1 is below x̄, and if w/s exceeds some chosen critical value, then one would conclude that both the doubtful values are outliers. If, however, x1 and xn are displaced from the mean by different amounts, some further test would have to be made to decide whether to reject as outlying only the lowest value or only the highest value or both the lowest and highest values.
6.7 For this example the mean of the deviations is x̄ = 0.018, s = 0.551, and
$$w/s = [1.01 - (-1.40)]/0.551 = 2.41/0.551 = 4.374 \qquad (9)$$
From Table 3 for n = 15, we see that the value of w/s = 4.374 falls between the critical values for the 1 and 5 % levels, so if the test were being run at the 5 % level of significance, we would conclude that this sample contains one or more outliers. The lowest measurement, −1.40 in., is 1.418 below the sample mean, and the highest measurement, 1.01 in., is 0.992 above the mean. Since these extremes are not symmetric about the mean, either both extremes are outliers, or else only −1.40 is an outlier. That −1.40 is an outlier can be verified by use of the T1 statistic. We have
$$T_1 = (\bar{x} - x_1)/s = [0.018 - (-1.40)]/0.551 = 2.574 \qquad (10)$$
This value is greater than the critical value for the 5 % level, 2.409 from Table 1, so we reject −1.40. Since we have decided that −1.40 should be rejected, we use the remaining 14 observations and test the upper extreme 1.01, either with the criterion
$$T_n = (x_n - \bar{x})/s \qquad (11)$$
or with Dixon’s r22. Omitting −1.40 and renumbering the observations, we compute
$$\bar{x} = 1.67/14 = 0.119, \qquad s = 0.401, \qquad (12)$$
and
$$T_{14} = (1.01 - 0.119)/0.401 = 2.22 \qquad (13)$$
From Table 1, for n = 14, we find that a value as large as 2.22 would occur by chance more than 5 % of the time, so we should retain the value 1.01 in further calculations. We next calculate
$$r_{22} = (x_{14} - x_{12})/(x_{14} - x_3) = (1.01 - 0.48)/(1.01 + 0.24) = 0.53/1.25 = 0.424 \qquad (14)$$
From Table 2 for n = 14, we see that the 5 % critical value for r22 is 0.546. Since our calculated value (0.424) is less than [...]
6.8 For suspected observations on both the high and low sides in the sample, and to deal with the situation in which some of k ≥ 2 suspected outliers are larger and some smaller than the remaining values in the sample, Tietjen and Moore (7) suggest the following statistic. Let the sample values be x1, x2, x3, ... xn and compute the sample mean, x̄. Then compute the n absolute residuals
$$r_1 = |x_1 - \bar{x}|, \quad r_2 = |x_2 - \bar{x}|, \quad \dots, \quad r_n = |x_n - \bar{x}| \qquad (15)$$
Now relabel the original observations x1, x2, ... , xn as z’s in such a manner that zi is that x whose ri is the ith smallest absolute residual above. This now means that z1 is that observation x which is closest to the mean and that zn is the observation x which is farthest from the mean. The Tietjen-Moore statistic for testing the significance of the k largest residuals is then
$$E_k = \left[\sum_{i=1}^{n-k}(z_i - \bar{z}_k)^2 \Big/ \sum_{i=1}^{n}(z_i - \bar{z})^2\right] \qquad (16)$$
where:
$$\bar{z}_k = \sum_{i=1}^{n-k} z_i/(n-k) \qquad (17)$$
is the mean of the (n − k) least extreme observations and z̄ is the mean of the full sample.
6.8.1 Applying this test to the above data, we find that the total sum of squares of deviations for the entire sample is 4.24964. Omitting −1.40 and 1.01, the suspected two outliers, we find that the sum of squares of deviations for the reduced sample of 13 observations is 1.24089. Then E2 = 1.24089/4.24964 = 0.292, and by using Table 4, we find that this observed E2 is slightly smaller than the 5 % critical value of 0.317, so that the E2 test would reject both of the observations, −1.40 and 1.01. We would probably take this latter recommendation, since the level of significance for the E2 test is precisely 0.05 whereas that for the double application of a test for a single outlier cannot be guaranteed to be smaller than 1 − (0.95)² = 0.0975. The table of percentage points of Ek was computed by Monte Carlo methods on a high-speed electronic calculator.
6.9 We next turn to the case where we may have the two largest or the two smallest observations as probable outliers. Here, we employ a test provided by Grubbs (5, 10) which is based on the ratio of the sample sum of squares when the two doubtful values are omitted to the sample sum of squares when the two doubtful values are included. If simplicity in calculation is the prime requirement, then the Dixon type of test (actually omitting one observation in the sample) might be used for this case. In illustrating the test procedure, we give the following Examples 4 and 5.
6.9.1 Example 4—In a comparison of strength of various plastic materials, one characteristic studied was the percentage elongation at break. Before comparison of the average elongation of the several materials, it was desirable to isolate for further study any pieces of a given material which gave very small elongation at breakage compared with the rest of the pieces in the sample. In this example, one might have primary interest only in outliers to the left of the mean for study, since very high readings indicate exceeding plasticity, a desirable characteristic.
6.9.1.1 Ten measurements of percentage elongation at break made on material No. 23 follow: 3.73, 3.59, 3.94, 4.13, 3.04, 2.22, 3.23, 4.05, 4.11, and 2.02. See Fig. 3. Arranged in ascending order of magnitude, these measurements are: 2.02, 2.22, 3.04, 3.23, 3.59, 3.73, 3.94, 4.05, 4.11, 4.13. The
TABLE 4 Percentage Points of Ek
         n
k  α       50   45   40   35   30   25   20   19   18   17   16   15   14   13   12   11   10    9    8    7    6    5    4    3
1A 0.01  .748 .728 .704 .669 .624 .571 .499 .484 .459 .440 .422 .404 .374 .337 .311 .274 .235 .197 .156 .110 .068 .029 .004  ...
   0.05  .796 .776 .756 .732 .698 .654 .594 .579 .562 .544 .525 .503 .479 .453 .423 .390 .353 .310 .262 .207 .145 .081 .025 .001
   0.10  .820 .802 .784 .762 .730 .692 .638 .624 .610 .593 .576 .556 .534 .510 .482 .451 .415 .374 .326 .270 .203 .127 .049 .003
2  0.01  .636 .607 .574 .533 .482 .418 .339 .323 .306 .290 .263 .238 .207 .181 .159 .134 .101 .078 .050 .028 .012 .002  ...  ...
   0.05  .684 .658 .629 .596 .549 .493 .416 .398 .382 .362 .340 .317 .293 .262 .234 .204 .172 .137 .099 .065 .034 .010 .001  ...
   0.10  .708 .684 .657 .624 .582 .528 .460 .442 .424 .406 .384 .360 .337 .309 .278 .250 .214 .175 .137 .094 .056 .022 .002  ...
3  0.01  .550 .518 .480 .435 .386 .320 .236 .219 .206 .188 .166 .146 .123 .103 .083 .064 .044 .026 .014 .006 .001  ...  ...  ...
   0.05  .599 .567 .534 .495 .443 .381 .302 .287 .267 .248 .227 .206 .179 .156 .133 .107 .083 .057 .034 .016 .004  ...  ...  ...
   0.10  .622 .593 .562 .523 .475 .417 .338 .322 .304 .284 .263 .240 .216 .189 .162 .138 .108 .080 .053 .027 .009  ...  ...  ...
4  0.01  .482 .446 .408 .364 .308 .245 .170 .156 .141 .122 .107 .090 .072 .056 .042 .030 .018 .009 .004  ...  ...  ...  ...  ...
   0.05  .529 .492 .458 .417 .364 .298 .221 .203 .187 .170 .153 .134 .112 .092 .073 .055 .037 .021 .010  ...  ...  ...  ...  ...
   0.10  .552 .522 .486 .443 .391 .331 .252 .234 .217 .198 .182 .160 .138 .116 .094 .073 .052 .032 .016  ...  ...  ...  ...  ...
5  0.01  .424 .386 .347 .299 .250 .188 .121 .108 .094 .079 .068 .054 .042 .031 .020 .012 .006  ...  ...  ...  ...  ...  ...  ...
   0.05  .468 .433 .395 .351 .298 .236 .163 .146 .132 .116 .102 .084 .068 .053 .039 .026 .014  ...  ...  ...  ...  ...  ...  ...
   0.10  .492 .459 .422 .379 .325 .264 .188 .172 .156 .140 .122 .105 .086 .068 .052 .036 .022  ...  ...  ...  ...  ...  ...  ...
6  0.01  .376 .336 .298 .252 .204 .146 .086 .074 .062 .052 .040 .032 .022 .014 .008  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.05  .417 .381 .343 .298 .246 .186 .119 .105 .091 .078 .067 .052 .039 .028 .018  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.10  .440 .406 .367 .324 .270 .210 .138 .124 .110 .095 .082 .067 .052 .038 .026  ...  ...  ...  ...  ...  ...  ...  ...  ...
7  0.01  .334 .294 .258 .211 .166 .110 .058 .050 .041 .032 .024 .018 .012  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.05  .373 .337 .297 .254 .203 .146 .085 .074 .062 .050 .041 .030 .021  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.10  .396 .360 .320 .276 .224 .168 .102 .089 .076 .064 .053 .040 .029  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
8  0.01  .297 .258 .220 .177 .132 .087 .040 .032 .026 .018 .014  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.05  .334 .299 .259 .214 .166 .114 .059 .050 .041 .032 .024  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.10  .355 .320 .278 .236 .186 .132 .072 .062 .051 .042 .032  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
9  0.01  .264 .228 .190 .149 .108 .066 .026 .020 .014  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.05  .299 .263 .223 .181 .137 .089 .041 .033 .026  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.10  .319 .284 .243 .202 .154 .103 .051 .042 .034  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
10 0.01  .235 .200 .164 .124 .087 .050 .017  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.05  .268 .233 .195 .154 .112 .068 .028  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
   0.10  .287 .252 .212 .172 .126 .080 .035  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
A From Grubbs (1950, Table 1) for n ≤ 25.
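The calculations for the Venus residuals in 6.7 and 6.8.1 can be reproduced with a short sketch. The helper names below are illustrative assumptions, and critical values still come from Tables 1, 3, and 4. Small differences in the last digit can arise because the worked example rounds x̄ and s before dividing.

```python
import math

# The 15 residuals quoted in the example, in ascending order.
residuals = [-1.40, -0.44, -0.30, -0.24, -0.22, -0.13, -0.05, 0.06,
             0.10, 0.18, 0.20, 0.39, 0.48, 0.63, 1.01]

def mean(v):
    return sum(v) / len(v)

def sum_sq_dev(v):
    """Sum of squared deviations about the mean of v."""
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v)

def std(v):
    """Sample standard deviation with divisor n - 1."""
    return math.sqrt(sum_sq_dev(v) / (len(v) - 1))

def range_over_s(v):
    """w/s, the ratio of the sample range to the sample standard deviation."""
    return (max(v) - min(v)) / std(v)

def tietjen_moore_e(v, k):
    """E_k: sum of squares without the k most extreme values over the full sum of squares."""
    m = mean(v)
    z = sorted(v, key=lambda vi: abs(vi - m))  # z[0] is closest to the mean
    return sum_sq_dev(z[:len(z) - k]) / sum_sq_dev(z)

w_over_s = range_over_s(residuals)
t_low = (mean(residuals) - min(residuals)) / std(residuals)
e2 = tietjen_moore_e(residuals, 2)

print(f"w/s = {w_over_s:.3f}  (Table 3, n = 15: 4.17 at 5 %, 4.43 at 1 %)")
print(f"T1  = {t_low:.3f}  (Table 1, n = 15: 2.409 at 5 %)")
print(f"E2  = {e2:.3f}  (Table 4, k = 2, n = 15: 0.317 at 5 %)")
```

Note the direction of each comparison: w/s and T1 are significant when they exceed the tabled value, whereas E2 is significant when it falls below it.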
FIG. 3 Ten Measurements of Percentage Elongation at Break from Example 4 in 6.9.1
questionable readings are the two lowest, 2.02 and 2.22. We can test these two low readings simultaneously by using the following criterion of Table 5:
$$S_{1,2}^2/S^2 \qquad (18)$$
For the above measurements:
$$S^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2 = \left[n\sum_{i=1}^{n}x_i^2 - \left(\sum_{i=1}^{n}x_i\right)^2\right]/n = [10(121.3594) - (34.06)^2]/10 = 5.351 \qquad (19)$$
and
$$S_{1,2}^2 = \sum_{i=3}^{n}(x_i-\bar{x}_{1,2})^2 = \left[(n-2)\sum_{i=3}^{n}x_i^2 - \left(\sum_{i=3}^{n}x_i\right)^2\right]/(n-2) = [8(112.3506) - (29.82)^2]/8 = 9.5724/8 = 1.197 \qquad (20)$$
where $\bar{x}_{1,2} = \sum_{i=3}^{n} x_i/(n-2)$.
[...] $S_{1,2}^2/S^2$ is 0.2305. Since the calculated value is less than the critical value, we should conclude that both 2.02 and 2.22 are outliers. In a situation such as the one described in this example, where the outliers are to be isolated for further analysis, a significance level as high as 5 % or perhaps even 10 % would probably be used in order to get a reasonable size of sample for additional study. The problem may really be one of economics, and we use probability as a sensible basis for action.
6.9.2 Example 5—The following ranges (horizontal distances in yards from gun muzzle to point of impact of a projectile) were obtained in firings from a weapon at a constant angle of elevation and at the same weight of charge of propellant powder.
Distances in Yards
4782   4420
4838   4803
4765   4730
4549   4833
6.9.2.1 It is desired to make a judgment on whether the projectiles exhibit uniformity in ballistic behavior or if some of the ranges are inconsistent with the others. The doubtful values are the two smallest ranges, 4420 and 4549. For testing these two suspected outliers, the statistic $S_{1,2}^2/S^2$ of Table 5 is probably the best to use.
NOTE 4—Kudo (11) indicates that if the two outliers are due to a shift in location or level, as compared to the scale σ, then the optimum sample criterion for testing should be of the type:
min (2x̄ − xi − xj)/s = (2x̄ − x1 − x2)/s in our Example 5.
6.9.2.2 The distances arranged in increasing order of magnitude are:
4420   4782
4549   4803
4730   4833
4765   4838
The value of $S^2$ is 158 592. Omission of the two shortest ranges, 4420 and 4549, and recalculation, gives $S_{1,2}^2$ equal to 8590.8. Thus,
$$S_{1,2}^2/S^2 = 8590.8/158\,592 = 0.054 \qquad (23)$$
which is significant at the 0.01 level (See Table 5). It is thus highly unlikely that the two shortest ranges (occurring actually from excessive yaw) could have come from the same population as that represented by the other six ranges. It should be noted that the critical values in Table 5 for the 1 % level of significance are smaller than those for the 5 % level. So for this particular test, the calculated value is significant if it is less than the chosen critical value.
6.10 By Monte Carlo methods using an electronic calculator, Tietjen and Moore (7) extended the tables of percentage points for the two highest or the two lowest observations to k > 2 highest or lowest sample values. Their results are given in Table 6 where
$$L_k = \sum_{i=1}^{n-k}(x_i-\bar{x}_k)^2 \Big/ \sum_{i=1}^{n}(x_i-\bar{x})^2 \quad \text{and} \quad \bar{x}_k = \sum_{i=1}^{n-k} x_i/(n-k).$$
[...] large number of samples must be examined individually for outliers, the questionable observations may be tested with the application of Dixon’s criteria. Disregarding the lowest range, 4420, we test if the next lowest range, 4549, is outlying. With n = 7, we see from Table 2 that r10 is the appropriate statistic. Renumbering the ranges as x1 to x7, beginning with 4549, we find:
$$r_{10} = (x_2 - x_1)/(x_7 - x_1) = (4730 - 4549)/(4838 - 4549) = 181/289 = 0.626 \qquad (24)$$
which is only a little less than the 1 % critical value, 0.637, for n = 7. So, if the test is being conducted at any significance level greater than a 1 % level, we would conclude that 4549 is an outlier. Since the lowest of the original set of ranges, 4420, is even more outlying than the one we have just tested, it can be classified as an outlier without further testing. We note here, however, that this test did not use all of the sample observations.
2
TABLE 5 Critical Values for S n−1,n/ S2, orS21,2/S2 for Simultaneously Testing the Two Largest or Two Smallest ObservationsA
Number of Lower 0.1 % Lower 0.5 % Lower 1 % Lower 2.5 % Lower 5 % Lower 10 %
Observations, Significance Significance Significance Significance Significance Significance
n Level Level Level Level Level Level
4 0.0000 0.0000 0.0000 0.0002 0.0008 0.0031
5 0.0003 0.0018 0.0035 0.0090 0.0183 0.0376
ir
22 0.3288 0.3927 0.4245 0.4711 0.5107 0.5550
23 0.3450 0.4085 0.4398 0.4857 0.5244 0.5677
24 0.3605 0.4234 0.4543 0.4994 0.5373 0.5795
a.
25 0.3752 0.4376 0.4680 0.5123 0.5495 0.5906
31
0.4397
0.4510
0.4985
0.5091
di
0.5268
0.5369
0.5672
0.5766
0.6008
0.6095
0.6375
0.6455
32 0.4618 0.5192 0.5465 0.5856 0.6178 0.6530
33 0.4722 0.5288 0.5557 0.5941 0.6257 0.6602
e
34 0.4821 0.5381 0.5646 0.6023 0.6333 0.6671
35 0.4917 0.5469 0.5730 0.6101 0.6405 0.6737
gP
10
E178 − 08
TABLE 5 Continued
Number of Lower 0.1 % Lower 0.5 % Lower 1 % Lower 2.5 % Lower 5 % Lower 10 %
Observations, Significance Significance Significance Significance Significance Significance
n Level Level Level Level Level Level
65 0.6696 0.7079 0.7253 0.7496 0.7690 0.7898
66 0.6733 0.7112 0.7284 0.7524 0.7716 0.7921
67 0.6770 0.7144 0.7314 0.7551 0.7741 0.7944
68 0.6805 0.7175 0.7344 0.7578 0.7766 0.7966
69 0.6839 0.7206 0.7373 0.7604 0.7790 0.7988
70 0.6873 0.7236 0.7401 0.7630 0.7813 0.8009
83 0.7250 0.7570 0.7715 0.7915 0.8075 0.8245
84 0.7275 0.7592 0.7736 0.7934 0.8093 0.8261
85 0.7300 0.7614 0.7756 0.7953 0.8109 0.8276
86 0.7324 0.7635 0.7776 0.7971 0.8126 0.8291
87 0.7348 0.7656 0.7796 0.7989 0.8142 0.8306
88 0.7371 0.7677 0.7815 0.8006 0.8158 0.8321
89 0.7394 0.7697 0.7834 0.8023 0.8174 0.8335
90 0.7416 0.7717 0.7853 0.8040 0.8190 0.8349
126 0.8016 0.8245 0.8348 0.8490 0.8602 0.8721
127 0.8028 0.8256 0.8359 0.8499 0.8611 0.8729
128 0.8041 0.8267 0.8369 0.8508 0.8619 0.8737
129 0.8053 0.8278 0.8379 0.8517 0.8627 0.8744
130 0.8065 0.8288 0.8389 0.8526 0.8636 0.8752
144 0.8218 0.8423 0.8515 0.8641 0.8741 0.8846
145 0.8227 0.8431 0.8523 0.8648 0.8747 0.8853
147 0.8247 0.8449 0.8539 0.8663 0.8761 0.8865
148 0.8256 0.8457 0.8547 0.8670 0.8767 0.8871
149 0.8266 0.8465 0.8555 0.8677 0.8774 0.8877
$$S^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad x_1 \le x_2 \le \dots \le x_n$$

$$S^2_{1,2} = \sum_{i=3}^{n} (x_i - \bar{x}_{1,2})^2, \qquad \bar{x}_{1,2} = \frac{1}{n-2} \sum_{i=3}^{n} x_i$$

$$S^2_{n-1,n} = \sum_{i=1}^{n-2} (x_i - \bar{x}_{n-1,n})^2, \qquad \bar{x}_{n-1,n} = \frac{1}{n-2} \sum_{i=1}^{n-2} x_i$$

A These significance levels are taken from Table 11, Ref (1). An observed ratio less than the appropriate critical ratio in this table calls for rejection of the null hypothesis.
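As a hedged illustration of how the ratio tabulated in Table 5 is formed, the Python sketch below evaluates $S^2_{1,2}/S^2$ (two smallest values set aside) or $S^2_{n-1,n}/S^2$ (two largest values set aside) from the definitions in the table footnote. The function name `two_extreme_ratio` and its `side` parameter are hypothetical, and applying it to the eight distances listed earlier in this standard is an assumption made only to have concrete numbers.

```python
def two_extreme_ratio(values, side="low"):
    """Ratio of the reduced sum of squares to the full sum of squares.

    side="low"  -> S^2_{1,2} / S^2   (two smallest observations removed)
    side="high" -> S^2_{n-1,n} / S^2 (two largest observations removed)
    """
    x = sorted(values)
    n = len(x)
    if n < 4:
        raise ValueError("Table 5 starts at n = 4")
    if side not in ("low", "high"):
        raise ValueError("side must be 'low' or 'high'")

    mean_all = sum(x) / n
    s2_all = sum((v - mean_all) ** 2 for v in x)
    if s2_all == 0:
        raise ValueError("all observations are identical")

    kept = x[2:] if side == "low" else x[:-2]
    mean_kept = sum(kept) / len(kept)
    s2_kept = sum((v - mean_kept) ** 2 for v in kept)
    return s2_kept / s2_all


# Illustrative input: the eight distances (yards) from the earlier example.
distances = [4782, 4838, 4765, 4549, 4420, 4803, 4730, 4833]
ratio_low = two_extreme_ratio(distances, side="low")
print(f"S^2_1,2 / S^2 = {ratio_low:.4f}")
```

A ratio smaller than the Table 5 entry for the sample size and chosen significance level calls for rejection of the null hypothesis, as stated in Footnote A.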
6.12 Rejection of Several Outliers—So far we have discussed procedures for detecting one or two outliers in the same sample, but these techniques are not generally recommended for repeated rejection, since if several outliers are present in the sample the detection of one or two spurious values may be “masked” by the presence of other anomalous observations. Outlying observations occur due to a shift in level (or mean),

the various rejection rules relative to changes in level or scale. For several outliers and repeated rejection of observations, Ferguson points out that the sample coefficient of skewness,

$$\sqrt{b_1} = \frac{\sqrt{n} \sum_{i=1}^{n} (x_i - \bar{x})^3}{(n-1)^{3/2} s^3} \qquad (25)$$
TABLE 6 1000 X Tietjen-Moore Critical Values (One-Sided Test) for Lk
k n
α 50 45 40 35 30 25 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3
1A 0.01 .768 .745 .722 .690 .650 .607 .539 .522 .504 .485 .463 .440 .414 .386 .355 .321 .283 .241 .195 .145 .93 .44 .10 ...
0.025 .796 .776 .756 .732 .699 .654 .594 .579 .562 .544 .525 .503 .479 .453 .423 .390 .353 .310 .262 .207 .145 .81 .25 .1
0.05 .820 .802 .784 .762 .730 .692 .638 .624 .610 .593 .576 .556 .534 .510 .482 .451 .415 .374 .326 .270 .203 .127 .49 3
0.10 .840 .826 .812 .792 .766 .732 .685 .673 .660 .646 .631 .613 .594 .573 .548 .520 .488 .450 .405 .350 .283 .199 .98 .11
2B 0.01 .667 .641 .610 .573 .527 .468 .391 .373 .353 .332 .310 .286 .261 .233 .204 .174 .141 .108 .75 .44 .19 .4 ... ...
0.025 .697 .667 .644 .610 .567 .512 .439 .421 .403 .382 .360 .337 .311 .284 .254 .221 .186 .149 .110 .71 .35 .9 ... ...
0.05 .720 .698 .673 .641 .601 .550 .480 .464 .446 .426 .405 .382 .357 .330 .300 .267 .230 .191 .148 .102 .56 .18 .1 ...
0.10 .746 .726 .702 .674 .637 .591 .527 .511 .494 .476 .456 .435 .411 .384 .355 .323 .286 .245 .199 .148 .92 .38 .3 ...
3 0.01 .592 .558 .522 .484 .434 .377 .300 .272 .260 .237 .219 .194 .172 .147 .120 .98 .70 .48 .28 .10 .2 ... ... ...
0.025 .622 .592 .561 .527 .479 .417 .341 .321 .299 .282 .261 .239 .214 .184 .162 .129 .100 .73 .45 .21 .5 ... ... ...
0.05 .646 .618 .588 .554 .506 .450 .377 .354 .337 .322 .300 .276 .250 .224 .196 .162 .129 .99 .64 .32 .10 ... ... ...
0.10 .673 .648 .622 .586 .523 .489 .420 .398 .384 .364 .342 .322 .298 .270 .240 .208 .170 .134 .95 .56 .20 ... ... ...
4 0.01 .531 .498 .460 .418 .369 .308 .231 .211 .192 .171 .151 .132 .113 .94 .70 .52 .32 .18 .8 ... ... ... ... ...
0.025 .559 .529 .491 .455 .408 .342 .265 .243 .226 .208 .185 .167 .145 .122 .96 .74 .52 .30 .13 ... ... ... ... ...
0.05 .588 .556 .523 .482 .434 .374 .299 .277 .259 .240 .219 .197 .174 .150 .125 .98 .70 .45 .22 ... ... ... ... ...
0.10 .614 .586 .554 .516 .472 .412 .339 .316 .302 .282 .260 .236 .212 .186 .159 .128 .98 .68 .38 ... ... ... ... ...
5 0.01 .483 .444 .408 .364 .312 .246 .175 .154 .140 .126 .108 .90 .72 .56 .38 .26 .12 ... ... ... ... ... ... ...
0.025 .510 .473 .433 .398 .352 .282 .209 .189 .171 .151 .135 .113 .95 .77 .57 .40 .23 ... ... ... ... ... ... ...
0.05 .535 .502 .468 .424 .376 .312 .238 .217 .200 .181 .159 .140 .122 .98 .76 .54 .34 ... ... ... ... ... ... ...
0.10 .562 .533 .499 .458 .411 .350 .273 .251 .236 .216 .194 .172 .150 .126 .103 .74 .51 ... ... ... ... ... ... ...
6 0.01 .438 .399 .364 .321 .268 .204 .136 .118 .104 .91 .72 .57 .46 .33 .19 ... ... ... ... ... ... ... ... ...
0.025 .466 .430 .387 .348 .302 .233 .165 .145 .129 .117 .96 .78 .63 .47 .31 ... ... ... ... ... ... ... ... ...
0.05 .490 .456 .421 .376 .327 .262 .188 .168 .154 .136 .115 .97 .79 .60 .42 ... ... ... ... ... ... ... ... ...
0.10 .518 .488 .451 .410 .359 .296 .220 .199 .184 .165 .144 .124 .104 .82 .62 ... ... ... ... ... ... ... ... ...
7 0.01 .400 .361 .324 .282 .229 .168 .104 .88 .76 .64 .49 .37 .27 ... ... ... ... ... ... ... ... ... ... ...
0.025 .428 .391 .348 .308 .261 .192 .128 .108 .95 .82 .65 .51 .38 ... ... ... ... ... ... ... ... ... ... ...
0.05 .450 .417 .378 .334 .283 .222 .150 .130 .116 .100 .82 .66 .50 ... ... ... ... ... ... ... ... ... ... ...
0.10 .477 .447 .408 .365 .316 .251 .176 .158 .142 .125 .104 .86 .68 ... ... ... ... ... ... ... ... ... ... ...
8 0.01 .368 .328 .292 .250 .196 .144 .78 .64 .53 .44 .30 ... ... ... ... ... ... ... ... ... ... ... ... ...
0.025 .392 .356 .314 .274 .226 .159 .98 .80 .68 .58 .45 ... ... ... ... ... ... ... ... ... ... ... ... ...
0.05 .414 .382 .342 .297 .245 .184 .115 .99 .86 .72 .55 ... ... ... ... ... ... ... ... ... ... ... ... ...
0.10 .442 .410 .372 .328 .276 .213 .140 .124 .108 .92 .73 ... ... ... ... ... ... ... ... ... ... ... ... ...
9 0.01 .336 .296 .262 .220 .166 .112 .58 .46 .36 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
0.025 .363 .325 .283 .242 .193 .132 .73 .59 .48 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
0.05 .383 .350 .310 .264 .212 .154 .88 .74 .62 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
0.10 .410 .378 .338 .294 .240 .180 .110 .94 .80 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
10 0.01 .308 .270 .234 .194 .142 .92 .42 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
0.025 .334 .295 .257 .213 .165 .108 .54 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
0.05 .356 .320 .280 .235 .183 .126 .66 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
0.10 .380 .348 .307 .262 .210 .152 .85 ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
A From Grubbs (1950, Table I) for n ≤ 25.
B From Grubbs (1972, Table II).
should be used for “one-sided” tests (change in level of several observations in the same direction), and the sample coefficient of kurtosis,

$$b_2 = \frac{n \sum_{i=1}^{n} (x_i - \bar{x})^4}{(n-1)^2 s^4} = \frac{n \sum_{i=1}^{n} (x_i - \bar{x})^4}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^2} \qquad (26)$$

is recommended for “two-sided” tests (change in level to higher and lower values) and also for changes in scale (variance) (see Note 5). In applying the above tests, the $\sqrt{b_1}$ or the $b_2$, or both, are computed and if their observed values exceed those for significance levels given in Table 7 and Table 8, then the observation farthest from the mean is rejected and the same procedure repeated until no further sample values are judged as outliers. (As is well known, $\sqrt{b_1}$ and $b_2$ are also used as tests of normality.)

NOTE 5—In the above equations for $\sqrt{b_1}$ and $b_2$, s is defined as used in this standard:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 / n}{n-1}}$$

6.12.1 The significance levels in Table 7 and Table 8 for sample sizes of 5, 10, 15, and 20 (and 25 for $b_2$) were obtained by Ferguson on an IBM 704 computer using a sampling
TABLE 7 Significance Levels (One-Sided Test) for $\sqrt{b_1}$

Significance                                n
Level, percent   5A    10A   15A   20A   25    30    35    40    50    60
1                1.34  1.31  1.20  1.11  1.06  0.98  0.92  0.87  0.79  0.72
5                1.05  0.92  0.84  0.79  0.71  0.66  0.62  0.59  0.53  0.49

A These values were obtained by Ferguson, using a Monte Carlo procedure.

TABLE 8 Significance Levels (One-Sided Test) for $b_2$

Significance                            n
Level, percent   5A    10A   15A   20A   25A   50    75    100
1                3.11  4.83  5.08  5.23  5.00  4.88  4.59  4.39
5                2.89  3.85  4.07  4.15  4.00  3.99  3.87  3.77

A These values were obtained by Ferguson, using a Monte Carlo procedure. For n = 25, Ferguson’s Monte Carlo values of $b_2$ agree with Pearson’s computed values.
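Equations (25) and (26) can be evaluated directly. The sketch below is one possible Python rendering, with hypothetical function names and a made-up ten-value sample; the critical values are copied from the n = 5, 10, 15, and 20 columns of Table 7 and Table 8. It performs a single comparison only and leaves the repeated rejection step to the reader, since critical values are tabulated only for selected n.

```python
import math


def skewness_sqrt_b1(values):
    """Sample coefficient of skewness, Eq (25)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    m3 = sum((v - mean) ** 3 for v in values)
    # (n - 1)**1.5 * s**3 equals m2**1.5 because s**2 = m2 / (n - 1).
    return math.sqrt(n) * m3 / m2 ** 1.5


def kurtosis_b2(values):
    """Sample coefficient of kurtosis, Eq (26)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    m4 = sum((v - mean) ** 4 for v in values)
    # (n - 1)**2 * s**4 equals m2**2.
    return n * m4 / m2 ** 2


# Entries copied from Table 7 and Table 8: n -> (5 % level, 1 % level).
SQRT_B1_CRIT = {5: (1.05, 1.34), 10: (0.92, 1.31), 15: (0.84, 1.20), 20: (0.79, 1.11)}
B2_CRIT = {5: (2.89, 3.11), 10: (3.85, 4.83), 15: (4.07, 5.08), 20: (4.15, 5.23)}

# Hypothetical sample (not from this standard) with one high value.
sample = [2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 1.8, 2.0, 1.9, 3.5]
n = len(sample)
skew = skewness_sqrt_b1(sample)
kurt = kurtosis_b2(sample)

if skew > SQRT_B1_CRIT[n][0]:
    mean = sum(sample) / n
    suspect = max(sample, key=lambda v: abs(v - mean))
    print(f"sqrt(b1) = {skew:.2f} exceeds the 5 % level; farthest value is {suspect}")
print(f"b2 = {kurt:.2f}, 5 % level for n = {n} is {B2_CRIT[n][0]}")
```

The simplification inside each function relies on Note 5: because $s^2 = \sum (x_i - \bar{x})^2 / (n-1)$, the factors of $(n-1)$ cancel and only the sums of powers of the deviations are needed.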
experiment or “Monte Carlo” procedure. The significance levels for the other sample sizes are from Pearson, E. S., “Table of Percentage Points of $\sqrt{b_1}$ and $b_2$ in Normal Samples; a Rounding Off,” Biometrika, Vol 52, 1965, pp. 282–285.

6.12.2 The $\sqrt{b_1}$ and $b_2$ statistics have the optimum property of being “locally” best against one-sided and two-sided alternatives, respectively. The $\sqrt{b_1}$ test is good for up to 50 % spurious observations in the sample for the one-sided case, and the $b_2$ test is optimum in the two-sided alternatives case for up to 21 % “contamination” of sample values. For only one or two outliers the sample statistics of the previous paragraphs are recommended, and Ferguson (3) discusses in detail their optimum properties of pointing out one or two outliers.

6.12.2.1 Instead of the more complicated $\sqrt{b_1}$ and $b_2$ statistics, one can use Table 4 and Table 6 (7) for sample sizes and percentage points given.

7. Recommended Criterion Using Independent Standard Deviation

7.1 Suppose that an independent estimate of the standard deviation is available from previous data. This estimate may be from a single sample of previous similar data or may be the result of combining estimates from several such previous sets of data. In any event, each estimate is said to have degrees of freedom equal to one less than the sample size that it is based on. The proper combined estimate is a weighted average of the several values of $s^2$, the weights being proportional to the respective degrees of freedom. The total degrees of freedom in the combined estimate is then the sum of the individual degrees of freedom. When one uses an independent estimate of the standard deviation, $s_v$, the test criterion recommended here for

Potassium acid phthalate (P.A.P.), obtained from the National Institute of Standards and Technology, was used as the test standard.

7.3.1 Test data by the twelve laboratories are given in Table 10. The P.A.P. readings have been coded to simplify the calculations. The variances between the three readings within all laboratories were found to be homogeneous. A one-way classification in the analysis of variance was first analyzed to determine if the variation in laboratory results (averages) was statistically significant. This variation was significant and indicated a need for action, so tests for outliers were then applied to isolate the particular laboratories whose results gave rise to the significant variation.

7.3.2 Table 11 shows that the variation between laboratories is highly significant. To test if this (very significant) variation is due to one (or perhaps two) laboratories that obtained “outlying” results (that is, perhaps showing nonstandard technique), we can test the laboratory averages for outliers. From the analysis of variance, we have an estimate of the variance of an individual reading as 0.008793, based on 24 degrees of freedom. The estimated standard deviation of an individual measurement is $\sqrt{0.008793} = 0.094$ and the estimated standard deviation of the average of three readings is therefore $0.094/\sqrt{3} = 0.054$.

7.3.3 Since the estimate of within-laboratory variation is independent of any difference between laboratories, we can use the statistic $T'_1$ of 7.1 to test for outliers. An examination of the deviations of the laboratory averages from the grand average indicates that Laboratory 10 obtained an average reading much lower than the grand average, and that Laboratory 12 obtained a high average compared to the over-all average. To first test if Laboratory 10 is an outlier, we compute
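A minimal sketch of the Laboratory 10 computation, assuming the criterion is $T'_1 = (\bar{x} - x_1)/s_v$ with $s_v$ taken as the standard deviation of a three-reading average. The inputs are the Table 10 and Table 11 figures quoted in 7.3.2, and the 5 % critical value 2.74 is read from Table 9 at n = 12, v = 24. Variable names are illustrative only.

```python
import math

# Figures quoted in this standard.
within_mean_square = 0.008793   # Table 11, within laboratories, 24 d.f.
readings_per_lab = 3
grand_average = 1.871           # Table 10
lab10_average = 0.745           # Table 10, lowest laboratory average

# Standard deviation of a single reading, then of a three-reading average.
s_single = math.sqrt(within_mean_square)
s_average = s_single / math.sqrt(readings_per_lab)

# T'_1 = (grand average - lowest average) / s_v
t_prime_low = (grand_average - lab10_average) / s_average

critical_5pct = 2.74            # Table 9, 5 percentage points, n = 12, v = 24
print(f"s (single reading)   = {s_single:.3f}")
print(f"s (average of three) = {s_average:.3f}")
print(f"T'_1 for Laboratory 10 = {t_prime_low:.1f} (5 % critical value {critical_5pct})")
```

With these inputs the statistic is many times larger than the tabulated critical value, which is consistent with the later conclusion in 7.3.6.2 that Laboratory 10 warrants investigation.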
TABLE 9 Critical Values (One-Sided Test) for $T'$ When Standard Deviation $s_v$ is Independent of Present Sample (see Footnote A)

$$T' = \frac{x_n - \bar{x}}{s_v}, \quad \text{or} \quad \frac{\bar{x} - x_1}{s_v}$$
n
v = d.f.
3 4 5 6 7 8 9 10 12
1 percentage point
10 2.78 3.10 3.32 3.48 3.62 3.73 3.82 3.90 4.04
11 2.72 3.02 3.24 3.39 3.52 3.63 3.72 3.79 3.93
12 2.67 2.96 3.17 3.32 3.45 3.55 3.64 3.71 3.84
13 2.63 2.92 3.12 3.27 3.38 3.48 3.57 3.64 3.76
14 2.60 2.88 3.07 3.22 3.33 3.43 3.51 3.58 3.70
120 2.25 2.48 2.62 2.73 2.82 2.89 2.95 3.00 3.08
∞ 2.22 2.43 2.57 2.68 2.76 2.83 2.88 2.93 3.01
5 percentage points
10 2.01 2.27 2.46 2.60 2.72 2.81 2.89 2.96 3.08
11 1.98 2.24 2.42 2.56 2.67 2.76 2.84 2.91 3.03
12 1.96 2.21 2.39 2.52 2.63 2.72 2.80 2.87 2.98
13 1.94 2.19 2.36 2.50 2.60 2.69 2.76 2.83 2.94
14 1.93 2.17 2.34 2.47 2.57 2.66 2.74 2.80 2.91
15 1.91 2.15 2.32 2.45 2.55 2.64 2.71 2.77 2.88
16 1.90 2.14 2.31 2.43 2.53 2.62 2.69 2.75 2.86
17 1.89 2.13 2.29 2.42 2.52 2.60 2.67 2.73 2.84
18 1.88 2.11 2.28 2.40 2.50 2.58 2.65 2.71 2.82
19 1.87 2.11 2.27 2.39 2.49 2.57 2.64 2.70 2.80
20 1.87 2.10 2.26 2.38 2.47 2.56 2.63 2.68 2.78
24 1.84 2.07 2.23 2.34 2.44 2.52 2.58 2.64 2.74
30 1.82 2.04 2.20 2.31 2.40 2.48 2.54 2.60 2.69
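The degrees of freedom v that index Table 9 belong to the independent estimate $s_v$. In the interlaboratory example of 7.3, that estimate is the within-laboratory mean square, which is the degrees-of-freedom-weighted average of the twelve within-laboratory variances described in 7.1. The Python sketch below re-derives the Table 11 and Table 12 figures from the coded readings in Table 10; the function name `one_way_anova` and the dictionary layout are assumptions for illustration, not part of this standard.

```python
# Coded readings, (P.A.P. - 0.096000) x 10^3, from Table 10.
readings = {
    1: [1.893, 1.972, 1.876],
    2: [2.046, 1.851, 1.949],
    3: [1.874, 1.792, 1.829],
    4: [1.861, 1.998, 1.983],
    5: [1.922, 1.881, 1.850],
    6: [2.082, 1.958, 2.029],
    7: [1.992, 1.980, 2.066],
    8: [2.050, 2.181, 1.903],
    9: [1.831, 1.883, 1.855],
    10: [0.735, 0.722, 0.777],
    11: [2.064, 1.794, 1.891],
    12: [2.475, 2.403, 2.102],
}


def one_way_anova(groups):
    """One-way analysis of variance for a list of groups of readings."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within   # pooled within-group variance
    return {
        "ss_between": ss_between, "df_between": df_between,
        "ss_within": ss_within, "df_within": df_within,
        "ms_within": ms_within, "f_ratio": ms_between / ms_within,
    }


all_labs = one_way_anova(list(readings.values()))
without_10_12 = one_way_anova([r for lab, r in readings.items() if lab not in (10, 12)])

for label, res in (("all labs", all_labs), ("omitting 10 and 12", without_10_12)):
    print(f"{label}: SS between = {res['ss_between']:.5f}, "
          f"SS within = {res['ss_within']:.5f}, F = {res['f_ratio']:.2f}")
```

The within-laboratory mean square returned for all twelve laboratories is the variance estimate with 24 degrees of freedom used in 7.3.2.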
7.3.6.2 In conclusion, there should be a systematic investigation of test methods for Laboratories No. 10 and No. 12 to determine why their test procedures are apparently different from the other ten laboratories. (In this type of problem, the tables of Greenhouse, Halperin, and Cornfield (13) could also be used for testing outlying laboratory averages.)

7.3.7 Cautionary Remarks—In the use of the tests for outliers as given above, our interest was to direct the statistical tests of significance toward picking out those laboratories which have different levels of measurement than the others. Thus, we have assumed that there should not exist any component of variance among the laboratory true means of measurement. On the other hand, it is well known that in practically all interlaboratory tests one does indeed find a nonzero component of variance among the laboratory levels. Often the variance among the laboratory means may be several times that within individual laboratories. Thus, if we knew the size of the actual component of variance among laboratories we must live with, or guard against, then the observed F ratio could be multiplied by the within variance of a sample mean and divided by this quantity plus the among laboratory variance, in order to adjust the F test to detect the undesirable deviations of those laboratories which departed in average level from measurements of the common or acceptable level of the closely agreeing laboratories. Also, a somewhat similar adjustment, if desired, could be applied to the tests for isolated outliers. In our particular example, however, we desired to detect those particular laboratories which departed in average level from that of the closely agreeing laboratories. In fact, this should be the aim of many interlaboratory testing programs, if we are to seek high precision and accuracy of measurement.

8. Recommended Criteria for Known Standard Deviation

8.1 Frequently the population standard deviation σ may be known accurately. In such cases, Table 13 may be used for single outliers and we illustrate with the following example:
TABLE 10 Standardization of Sodium Hydroxide Solutions as Determined by Plant Laboratories
Standard used: Potassium Acid Phthalate (P.A.P.)

Laboratory   (P.A.P. − 0.096000) × 10^3   Sums     Averages   Deviation of Average from Grand Average
1            1.893  1.972  1.876          5.741    1.914      +0.043
2            2.046  1.851  1.949          5.846    1.949      +0.078
3            1.874  1.792  1.829          5.495    1.832      −0.039
4            1.861  1.998  1.983          5.842    1.947      +0.076
5            1.922  1.881  1.850          5.653    1.884      +0.013
6            2.082  1.958  2.029          6.069    2.023      +0.152
7            1.992  1.980  2.066          6.038    2.013      +0.142
8            2.050  2.181  1.903          6.134    2.045      +0.174
9            1.831  1.883  1.855          5.569    1.856      −0.015
10           0.735  0.722  0.777          2.234    0.745      −1.126
11           2.064  1.794  1.891          5.749    1.916      +0.045
12           2.475  2.403  2.102          6.980    2.327      +0.456
Grand sum                                 67.350
Grand average                                      1.871

TABLE 11 Analysis of Variance

Source of Variation     Degrees of Freedom (d.f.)   Sum of Squares (SS)   Mean Square (MS)   F-ratio
Between laboratories    11                          4.70180               0.4274             F = 48.61 (highly significant)
Within laboratories     24                          0.21103               0.008793
Total                   35                          4.91283

TABLE 12 Analysis of Variance (Omitting Labs 10 and 12)

Source of Variation     d.f.   SS        MS         F-ratio
Between laboratories    9      0.13889   0.015430   F = 2.36
Within laboratories     20     0.13107   0.00655    F0.05(9, 20) = 2.40; F0.01(9, 20) = 3.45
Total                   29     0.26996

TABLE 13 Critical Values (One-Sided Test) of $T'_{1\infty}$ and $T'_{n\infty}$ When the Population Standard Deviation σ is Known (see Footnote A)

Number of         5 Percent      1 Percent      0.5 Percent
Observations,     Significance   Significance   Significance
n                 Level          Level          Level
2                 1.39           1.82           1.99
3                 1.74           2.22           2.40
4                 1.94           2.43           2.62
5                 2.08           2.57           2.76
6                 2.18           2.68           2.87
7                 2.27           2.76           2.95
8                 2.33           2.83           3.02
9                 2.39           2.88           3.07
10                2.44           2.93           3.12
11                2.48           2.97           3.16
12                2.52           3.01           3.20
13                2.56           3.04           3.23
14                2.59           3.07           3.26
15                2.62           3.10           3.29
16                2.64           3.12           3.31
17                2.67           3.15           3.33
18                2.69           3.17           3.36
19                2.71           3.19           3.38
20                2.73           3.21           3.39
21                2.75           3.22           3.41
22                2.77           3.24           3.42
23                2.78           3.26           3.44
24                2.80           3.27           3.45
25                2.81           3.28           3.46

$x_1 \le x_2 \le x_3 \le \dots \le x_n$
$T'_1 = (\bar{x} - x_1)/\sigma$; $T'_n = (x_n - \bar{x})/\sigma$
A This table is taken from Ref (13).

8.2 Example 7 (σ known)—Passage of the Echo I (Balloon) Satellite was recorded on star-plates when it was visible. Photographs were made by means of a camera with shutter automatically timed to obtain a series of points for the Echo path. Since the stars were also photographed at the same times as the Satellite, all the pictures show star-trails and are thus called “star-plates.”

8.2.1 The x- and y-coordinate of each point on the Echo path are read from a photograph, using a stereo-comparator. To eliminate bias of the reader, the photograph is placed in one position and the coordinates are read; then the photograph is rotated 180 deg and the coordinates reread. The average of the two readings is taken as the final reading. Before any further calculations are made, the readings must be “screened” for gross reading or tabulation errors. This is done by examining the difference in the readings taken at the two positions of the photograph.

8.2.2 Table 14 records a sample of six readings made at the two positions and the differences in these readings. On the third reading, the differences are rather large. Has the operator made an error in placing the cross hair on the point?

TABLE 14 Measurements, µm

            x-Coordinate                               y-Coordinate
Position 1   Position 1 + 180 deg   ∆x     Position 1   Position 1 + 180 deg   ∆y
−53011       −53004                 −7     70263        70258                  +5
−38112       −38103                 −9     −39739       −39729                 −6
−2804        −2828                  +24    81162        81140                  +22
18473        18467                  +6     41477        41485                  −8
25507        25497                  +10    1082         1076                   +6
87736        87739                  −3     −7442        −7434                  −8

8.2.3 For this example, an independent estimate of σ is available since extensive tests on the stereo-comparator have shown that the standard deviation in reader’s error is about 4 µm. The determination of this standard error was based on such
a large sample that we can assume σ = 4 µm. The standard deviation of the difference in two readings is therefore

$$\sqrt{4^2 + 4^2} = \sqrt{32} \text{ or } 5.7\ \mu\text{m} \qquad (31)$$

8.2.4 For the six readings above, the mean difference in the x-coordinates is $\overline{\Delta x} = 3.5$ and the mean difference in the y-coordinates is $\overline{\Delta y} = 1.8$. For the questionable third reading, we have

$$T'_x = (24 - 3.5)/5.7 = 3.60 \qquad (32)$$

$$T'_y = (22 - 1.8)/5.7 = 3.54 \qquad (33)$$

From Table 13 we see that for n = 6, values of $T'_{n\infty}$ as large as the calculated values would occur by chance less than 1 % of the time so that a significant reading error seems to have been made on the third point.

8.3 A great number of points are read and automatically tabulated on star-plates. Here we have chosen a very small sample of these points. In actual practice, the tabulations would probably be scanned quickly for very large errors such as tabulator errors; then some rule-of-thumb such as ±3 standard deviations of reader’s error might be used to scan for outliers due to operator error (Note 6). In other words, the data are probably too extensive to allow repeated use of precise tests like those described above (especially for varying sample size), but this example does illustrate the case where σ is assumed

9.1 In the above, we have covered only that part of screening samples to detect outliers statistically. However, a large area remains after the decision has been reached that outliers are present in data. Once some of the sample observations are branded as “outliers,” then a thorough investigation should be initiated to determine the cause. In particular, one should look for gross errors, personal errors, errors of measurement, errors in calibration, etc. If reasons are found for aberrant observations, then one should act accordingly and perhaps scrutinize also the other observations. Finally, if one reaches the point that some observations are to be discarded or treated in a special manner based solely on statistical judgment, then it must be decided what action should be taken in the further analysis of the data. We do not propose to cover this problem here, since in many cases it will depend greatly on the particular case in hand. However, we do remark that there could be the outright rejection of aberrant observations once and for all on physical grounds (and preferably not on statistical grounds generally) and only the remaining observations would be used in further analyses or in estimation problems. On the other hand, some may want to replace aberrant values with newly taken observations and others may want to “Winsorize” the outliers, that is, replace them with the next closest values in the sample. Also with outliers in a sample, some may wish to use the median instead of the mean, and so on. Finally, we remark that perhaps a fair or appropriate practice might be that of using truncated-sample theory (11) for cases of samples where we have “censored” or rejected some of the observations. We cannot go further into these problems here. For additional reading on outliers, see Refs (12,14,15,16,17,18,19).

9.2 A sample test criterion for non-normality, and hence possibly for outliers, not covered above is the Wilk-Shapiro W statistic for a sample of size n given by

$$W = \frac{\left[ \sum_{i=1}^{[n/2]} a_{n-i+1} \left( x_{n-i+1} - x_i \right) \right]^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (34)$$

sensitive to departures from normality and generally may compare most favorably with the $\sqrt{b_1}$ and $b_2$ tests discussed above. In addition, therefore, the W statistic may also be used as a test for outliers, or otherwise as a general test for heterogeneity of sample values. Our significance tests given above have been selected and recommended since they specifically point out particular suspected outliers in the sample. We therefore are inclined to favor the above tests for specific outliers in samples for the case where they will be used routinely, for example, by engineers.

10. Keywords

10.1 Dixon test; gross deviation; Grubbs test; outlier
REFERENCES

(1) Grubbs, F. E., and Beck, G., “Extension of Sample Sizes and Percentage Points for Significance Tests of Outlying Observations,” Technometrics, TCMTA, Vol 14, No. 4, November 1972, pp. 847–854.
(2) Dixon, W. J., “Processing Data for Outliers,” Biometrics, BIOMA, Vol 9, No. 1, March 1953, pp. 74–89.
(3) Ferguson, T. S., On the Rejection of Outliers, Fourth Berkeley Symposium on Mathematical Statistics and Probability, edited by Jerzy Neyman, University of California Press, Berkeley and Los Angeles, Calif., 1961.
(4) Ferguson, T. S., “Rules for Rejection of Outliers,” Revue Inst. Int. de Stat., RINSA, Vol 29, Issue 3, 1961, pp. 29–43.
(5) Grubbs, F. E., “Sample Criteria for Testing Outlying Observations,” Annals of Mathematical Statistics, AASTA, Vol 21, March 1950, pp. 27–58.
(6) David, H. A., Hartley, H. O., and Pearson, E. S., “The Distribution of the Ratio in a Single Sample of Range to Standard Deviation,” Biometrika, BIOKA, Vol 41, 1954, pp. 482–493.
(7) Tietjen, G. L., and Moore, R. H., “Some Grubbs-Type Statistics for the Detection of Several Outliers,” Technometrics, TCMTA, Vol 14, No. 3, August 1972, pp. 583–597.
(8) Chauvenet, W., A Manual of Spherical and Practical Astronomy, Vol 2, Fifth Edition.
(9) David, H. A., and Quesenberry, C. P., “Some Tests for Outliers,” Technical Report No. 47, OOR(ARO) Project No. 1166, Virginia Polytechnic Inst., Blacksburg, Va.
(10) Grubbs, F. E., “Procedures for Detecting Outlying Observations in Samples,” Technometrics, TCMTA, Vol 11, No. 4, February 1969, pp. 1–21.
(11) Kudo, A., “On the Testing of Outlying Observations,” Sankhyā, The Indian Journal of Statistics, SNKYA, Vol 17, Part 1, June 1956, pp. 67–76.
(12) David, H. A., “Revised Upper Percentage Points of the Extreme Studentized Deviate from the Sample Mean,” Biometrika, BIOKA, Vol 43, 1956, pp. 449–451.
(13) Greenhouse, S. W., Halperin, M., and Cornfield, J., “Tables of Percentage Points for the Studentized Maximum Absolute Deviation in Normal Samples,” Journal of the American Statistical Association, JSTNA, Vol 50, No. 269, 1955, pp. 185–195.
(14) Anscombe, F. J., “Rejection of Outliers,” Technometrics, TCMTA, Vol 2, No. 2, 1960, pp. 123–147.
(15) Chew, Victor, “Tests for the Rejection of Outlying Observations,” RCA Systems Analysis Technical Memorandum No. 64-7, 31 Dec. 1964, Patrick Air Force Base, Fla.
(16) Kruskal, W. H., “Some Remarks on Wild Observations,” Technometrics, TCMTA, Vol 2, No. 1, 1960, pp. 1–3.
(17) Proschan, F., “Testing Suspected Observations,” Industrial Quality Control, IQCOA, Vol XIII, No. 7, January 1957, pp. 14–19.
(18) Sarhan, A. E., and Greenberg, B. G., Editors, Contributions to Order Statistics, John Wiley and Sons, Inc., New York, 1962.
(19) Thompson, W. R., “On a Criterion for the Rejection of Observations and the Distribution of the Ratio of the Deviation to the Sample Standard Deviation,” The Annals of Mathematical Statistics, AASTA, Vol 6, 1935, pp. 214–219.
(20) Shapiro, S. S., and Wilk, M. B., “An Analysis of Variance Test for Non-Normality (Complete Samples),” Biometrika, BIOKA, Vol 52, 1965, pp. 591–611.
ASTM International takes no position respecting the validity of any patent rights asserted in connection with any item mentioned
in this standard. Users of this standard are expressly advised that determination of the validity of any such patent rights, and the risk
of infringement of such rights, are entirely their own responsibility.
This standard is subject to revision at any time by the responsible technical committee and must be reviewed every five years and
if not revised, either reapproved or withdrawn. Your comments are invited either for revision of this standard or for additional standards
and should be addressed to ASTM International Headquarters. Your comments will receive careful consideration at a meeting of the
responsible technical committee, which you may attend. If you feel that your comments have not received a fair hearing you should
make your views known to the ASTM Committee on Standards, at the address shown below.
This standard is copyrighted by ASTM International, 100 Barr Harbor Drive, PO Box C700, West Conshohocken, PA 19428-2959,
United States. Individual reprints (single or multiple copies) of this standard may be obtained by contacting ASTM at the above
address or at 610-832-9585 (phone), 610-832-9555 (fax), or service@astm.org (e-mail); or through the ASTM website
(www.astm.org). Permission rights to photocopy the standard may also be secured from the Copyright Clearance Center, 222