
Control Charts for Multivariate Processes
Author(s): Regina Y. Liu
Source: Journal of the American Statistical Association, Vol. 90, No. 432 (Dec., 1995), pp. 1380-1387
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2291529

Control Charts for Multivariate Processes


Regina Y. Liu

This article uses the concept of data depth to introduce several new control charts for monitoring processes of multivariate quality measurements. For any dimension of the measurements, these charts are in the form of two-dimensional graphs that can be visualized and interpreted just as easily as the well-known univariate $X$, $\bar{X}$, and CUSUM charts. Moreover, they have several significant advantages. First, they can detect simultaneously the location shift and scale increase of the process, unlike the existing methods, which can detect only the location shift. Second, their construction is completely nonparametric; in particular, it does not require the assumption of normality for the quality distribution, which is needed in standard approaches such as the $\chi^2$ and Hotelling's $T^2$ charts. Thus these new charts generalize the principle of control charts to multivariate settings and apply to a much broader class of quality distributions.

KEY WORDS: Control charts; Q chart; Quality control; r chart; S chart; Statistical process control.

Regina Y. Liu is Professor, Department of Statistics, Rutgers University, New Brunswick, NJ 08903. The author gratefully acknowledges support from National Science Foundation Grants DMS-90-04658 and DMS 9022126. The author thanks Kay Tatsuoka for his computing assistance and the referees, associate editor, and editor for their helpful comments.

© 1995 American Statistical Association. Journal of the American Statistical Association, December 1995, Vol. 90, No. 432, Theory and Methods.

1. INTRODUCTION

Control charts are useful tools for monitoring/controlling a manufacturing process. With properly chosen control limits, a control chart can detect a shift from a "good" quality distribution to a "bad" one. When the measurement, denoted by $X$, of a particular characteristic of a product is used to gauge the quality of the product, the most commonly used charts are the $X$ chart, the $\bar{X}$ chart (or Shewhart chart), and the cumulative sum (CUSUM) chart. These charts are easy to construct, visualize, and interpret, and most important, have been proven effective in practice. However, they are usually suitable only when the observation $X$ is univariate, and their validity often relies on the assumption of normality of $X$, which is not always realistic.

In real life, we often encounter multivariate quality measurements rather than univariate, since the overall quality of a product is usually determined by more than one quality characteristic. For example, the quality of a certain type of tablets may be determined by weight, degree of hardness, thickness, width, and length. These quality characteristics are clearly correlated, and control charts for monitoring individual quality characteristics may not be adequate for detecting changes in the overall quality of the product. Thus it is desirable to have control charts that can monitor multivariate measurements directly.

There are some methods for constructing multivariate control charts in the literature (see, for example, Alt and Smith 1988 for a thorough survey and for further references). However, these methods are usually restricted to the case of normal distributions and are difficult to visualize and interpret. The main idea behind our control charts is to reduce each multivariate measurement to a univariate index, namely its relative center-outward ranking induced by a data depth (cf. Sec. 2). Representing the original quality measurements by their corresponding univariate ranks, we are able to develop control charts based on these ranks following the same principles for the univariate $X$, $\bar{X}$, and CUSUM charts. The geometric nature of the notion of data depth makes it easy to interpret the values of statistics derived from those ranks and to visualize their plots. This approach is completely nonparametric, and thus the resulting charts are valid without parametric assumptions on the process model. Moreover, these charts allow us to detect simultaneously the location change and the scale increase in a process.

In Section 3 three types of control charts (the r, Q, and S charts) are proposed and justified. They can be viewed as data-depth-based multivariate generalizations of the univariate $X$, $\bar{X}$, and CUSUM charts. Their names are suggested respectively by the relative ranks of sample points with respect to a reference sample, by the quality index introduced in Liu and Singh (1993), and by a plot of sums of deviations. In Section 2 a brief description of data depths and definitions of the relevant statistics suitable for plotting in control charts are presented. A simulated bivariate data set is used to demonstrate the construction of the proposed charts. The results, presented in Figures 1-5, appear to support our methods. A detailed discussion of the simulation is given in Section 4, and some concluding remarks are presented in Section 5.

2. SOME STATISTICS DERIVED FROM DATA DEPTH

Assume that $k$ ($k \ge 1$) characteristics of each product are used to determine the quality of the product. The process is considered to be in control if the measurements are following a prescribed quality distribution (required by customers or designing engineers). Let $G$ denote the prescribed $k$-dimensional distribution, and let $Y_1, \ldots, Y_m$ be $m$ random observations from $G$. The sample $Y_1, \ldots, Y_m$ is generally referred to as a reference sample in the context of quality control, and considered as the measurements of products produced by an in-control process. Let $X_1, X_2, \ldots$ be the new observations from the manufacturing process. Assume that the $X_i$'s follow a distribution $F$. Based on the observations $X_i$'s, we would like to determine whether the quality of the product has deteriorated or whether the process is out of control. This would mean that the $X_i$'s are not meeting the prescribed $G(\cdot)$ in a certain sense.

Thus we need to compare $F$ with $G$. The statistics that we use to characterize certain aspects of the difference between $G$ and $F$ are based on the notion of data depth, so we begin by describing some concepts of data depth.

For any point $y$ in $R^k$, the simplicial depth (Liu 1990) of $y$ with respect to $G$ is defined to be

$$SD_G(y) = P_G\{y \in s[Y_1, \ldots, Y_{k+1}]\}, \tag{1}$$

where $s[Y_1, \ldots, Y_{k+1}]$ is the open simplex whose vertices $Y_1, \ldots, Y_{k+1}$ are $(k + 1)$ random observations from $G$. The value of $SD_G$ is a measure of how "deep," or how "central," $y$ is with respect to $G$. When $G$ is unknown and only a sample $\{Y_1, \ldots, Y_m\}$ is given, the sample simplicial depth of $y$ is defined as

$$SD_{G_m}(y) = \binom{m}{k+1}^{-1} \sum_{(*)} I\big(y \in s[Y_{i_1}, \ldots, Y_{i_{k+1}}]\big), \tag{2}$$

which measures how deep $y$ is within the data cloud $\{Y_1, \ldots, Y_m\}$. Here $I(\cdot)$ is the indicator function; that is, $I(A) = 1$ if $A$ occurs and $I(A) = 0$ otherwise. The function $G_m(\cdot)$ denotes the empirical distribution of $\{Y_1, \ldots, Y_m\}$, and $(*)$ runs over all possible subsets of $\{Y_1, \ldots, Y_m\}$ of size $(k + 1)$. A fuller motivation together with the basic properties of $SD_G(\cdot)$ can be found in an earlier work (Liu 1990), where it was shown in particular that $SD_G(\cdot)$ is affine invariant and that $SD_{G_m}(\cdot)$ converges uniformly and strongly to $SD_G(\cdot)$. The affine invariance will ensure that our proposed control charts are coordinate free, and the convergence of $SD_{G_m}$ to $SD_G$ will allow us to approximate $SD_G(\cdot)$ by $SD_{G_m}(\cdot)$ when $G$ is not specified.

[Figure 1. r Chart.]

Another notion of depth is based on the Mahalanobis distance. Here how deep a point $y$ is with respect to a given distribution $G$ is measured by how small its quadratic distance is to the mean:

$$MD_G(y) = 1/[1 + (y - \mu_G)' \Sigma_G^{-1} (y - \mu_G)], \tag{3}$$

where $\mu_G$ and $\Sigma_G$ denote the mean and the covariance matrix of $G$, "$'$" denotes the transpose of a $(k \times 1)$ vector, and "$-1$" denotes the inverse of a matrix. The empirical version of $MD_G(y)$ is

$$MD_{G_m}(y) = 1/[1 + (y - \bar{Y})' S^{-1} (y - \bar{Y})], \tag{4}$$

where $\bar{Y}$ is the sample mean of $Y_1, \ldots, Y_m$ and $S$ is the sample covariance matrix. We observe that $MD_G(\cdot)$ is also affine invariant.

There are several other affine-invariant notions of data depth, including Tukey's depth (Tukey 1975) and the majority depth of Singh (Liu and Singh 1993). As a matter of fact, all control charts proposed herein are also valid for these two depths. (See Liu and Singh 1993 for a fuller discussion of various notions of data depth.) The simplicial depth and the Mahalanobis depth suffice for our purposes, because they illustrate well the contrasting properties of probabilistic geometry and metric distances. Henceforth we use the same notation $D_G(\cdot)$ to denote either notion of depth, unless indicated otherwise. We also assume that $G$ and $F$ are two absolutely continuous distributions.

[Figure 2. Q Chart ($n = 4$).]

Clearly, a data depth induces a center-outward ordering of the sample points if depth values for all points are computed and compared. More specifically, if we arrange all $D_G(Y_i)$'s in an ascending order and use $Y_{[j]}$ to denote the sample point associated with the $j$th smallest depth value, then $Y_{[1]}, Y_{[2]}, \ldots, Y_{[m]}$ are the order statistics of the $Y_i$'s.
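The two sample depths in (2) and (4) can be sketched in a few lines for the bivariate case ($k = 2$). This is only an illustration, not the article's implementation: the function names are hypothetical, the simplicial depth is computed by brute force over all triangles rather than by an efficient algorithm, and the sample covariance is assumed to use the $(m - 1)$ divisor.

```python
from itertools import combinations


def _cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_open_triangle(y, a, b, c):
    """True if y lies strictly inside the triangle with vertices a, b, c."""
    d1, d2, d3 = _cross(a, b, y), _cross(b, c, y), _cross(c, a, y)
    return (d1 > 0 and d2 > 0 and d3 > 0) or (d1 < 0 and d2 < 0 and d3 < 0)


def simplicial_depth(y, sample):
    """Sample simplicial depth (2) for k = 2: fraction of triangles containing y."""
    triangles = list(combinations(sample, 3))
    hits = sum(in_open_triangle(y, a, b, c) for a, b, c in triangles)
    return hits / len(triangles)


def mahalanobis_depth(y, sample):
    """Sample Mahalanobis depth (4) for k = 2 (assumption: (m - 1)-divisor covariance)."""
    m = len(sample)
    mx = sum(p[0] for p in sample) / m
    my = sum(p[1] for p in sample) / m
    sxx = sum((p[0] - mx) ** 2 for p in sample) / (m - 1)
    syy = sum((p[1] - my) ** 2 for p in sample) / (m - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in sample) / (m - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = y[0] - mx, y[1] - my
    # Quadratic form (y - mean)' S^{-1} (y - mean) with the 2 x 2 inverse written out.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return 1.0 / (1.0 + q)


# Toy reference sample (hypothetical data): the four corners of a square.
square = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(simplicial_depth((0.5, 0.25), square))   # 2 of the 4 triangles contain the point
print(mahalanobis_depth((3, 1), square))
```

Sorting the reference points by either depth value gives the center-outward ordering described above; the brute-force triangle count is practical only for small reference samples.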

In this ordering, $Y_{[m]}$ is the most central point. The smaller the order (or the rank) of a point, the more outlying that point is with respect to the underlying distribution $G(\cdot)$.

We now proceed to list some statistics derived from data depth that are used in the next section to construct control charts. We write $Y \sim G$ to indicate that the random variable $Y$ follows the distribution $G$, and set

$$r_G(y) = P\{D_G(Y) \le D_G(y) \mid Y \sim G\} \tag{5}$$

and

$$r_{G_m}(y) = \#\{Y_j \mid D_{G_m}(Y_j) \le D_{G_m}(y),\ j = 1, \ldots, m\}/m. \tag{6}$$

Let $F_n(\cdot)$ denote the empirical distribution of the sample $\{X_1, \ldots, X_n\}$. We can now define

$$Q(G, F) = P\{D_G(Y) \le D_G(X) \mid Y \sim G, X \sim F\} \quad (= E_F[r_G(X)]), \tag{7}$$

$$Q(G, F_n) = \frac{1}{n} \sum_{i=1}^{n} r_G(X_i), \tag{8}$$

and

$$Q(G_m, F_n) = \frac{1}{n} \sum_{i=1}^{n} r_{G_m}(X_i). \tag{9}$$

[Figure 3. Q Chart ($n = 10$).]

3. CONTROL CHARTS BASED ON DATA DEPTH

We now introduce three control charts (the r chart, Q chart, and S chart), which can be viewed as the $X$ chart, $\bar{X}$ chart, and CUSUM chart after the multivariate data have been transformed into univariate data by data depth. In principle, a control chart consists of critical values, the upper control limit (UCL) and the lower control limit (LCL), for a sample quality measurement. Between the two control limits is the center line (CL), which represents no deviation from the prescribed distribution. Samples from the manufacturing process are recorded in time order, and their measurements are plotted on the chart. By convention, those sample points are connected by a straight line, so that the sequence of activities over time can be easily visualized. The region above UCL or below LCL is termed the out-of-control region. A sample point falling in the out-of-control region is interpreted as evidence that the process is out of control, and a proper corrective action is sought. If the process is declared out-of-control when in fact it is not, we say that we have a "false alarm." The UCL and LCL are chosen so that the false alarm rate is small, say $\alpha$. Thus a control chart at every plotted point is a visualization of an $\alpha$-level test with the null hypothesis $H_0$: $G = F$. The rejection region in this test corresponds to the out-of-control region in the control chart. (A more detailed discussion of control charts can be found in, for example, Banks 1989 and Wadsworth, Stephen, and Godfrey 1986.)

3.1 The r Charts

The r chart introduced in this section is similar to the $X$ chart for univariate data. It is based on the statistics $r_G(\cdot)$ and $r_{G_m}(\cdot)$ of (5) and (6). First we discuss the $X$ chart. Assume that the observations $Y_1, \ldots, Y_m$ and $X_1, \ldots, X_n$ are univariate and that our main concern is a possible shift in the mean in the $X_i$'s. If $G$ is a normal distribution with mean $\mu$ and standard deviation $\sigma$, then the following is a typical $X$ chart of $X_i$'s:

[Schematic $X$ chart: $X_i$ plotted against time, with horizontal lines at UCL, CL, and LCL.]

In this example, UCL $=$ CL $+ z_{\alpha/2}\sigma$, LCL $=$ CL $- z_{\alpha/2}\sigma$, and CL $= \mu$ if $\mu$ is known and $= \bar{Y}$ otherwise. Here $z_\alpha$ indicates the upper $\alpha$ critical value of the standard normal distribution; that is, $\alpha = P(Z > z_\alpha)$, where $Z \sim N(0, 1)$. The $X$ chart allows us to detect a possible mean shift from the prescribed value $\mu$ or the existence of any trend or pattern in the sequence of observations. It is a simple but effective tool for monitoring a univariate process; however, it does not generalize easily to the multivariate case. For bivariate normal $G$, a bivariate $X$ chart with elliptical contours as control limits, also called control ellipses, was studied by Alt and Smith (1988). Besides the restriction of normality, it is also difficult to visualize and detect any pattern or trend in such a plot.
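As a small illustration of the univariate $X$ chart limits just described, the following sketch computes LCL, CL, and UCL for known $\mu$ and $\sigma$ and flags points outside the limits. The function names and the numbers in the example call are hypothetical; `statistics.NormalDist` from the Python standard library supplies the normal quantile.

```python
from statistics import NormalDist


def x_chart_limits(mu, sigma, alpha):
    """Return (LCL, CL, UCL) for a univariate X chart with false alarm rate alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value z_{alpha/2}
    return mu - z * sigma, mu, mu + z * sigma


def out_of_control(xs, mu, sigma, alpha):
    """0-based indices of observations falling outside the control limits."""
    lcl, _, ucl = x_chart_limits(mu, sigma, alpha)
    return [i for i, x in enumerate(xs) if x < lcl or x > ucl]


# Hypothetical observations from a process with prescribed mean 10 and sd 2.
print(x_chart_limits(10, 2, 0.05))
print(out_of_control([10, 14.5, 9, 5.9], 10, 2, 0.05))
```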

The reason is that the chronological order of the observations is lost in a control-ellipse plot. Furthermore, when the dimension $k$ goes beyond 3, it does not seem possible to follow the same idea to construct charts that are easy to visualize.

Our r chart is constructed as follows. Compute $\{r_G(X_1), r_G(X_2), \ldots\}$ (or $\{r_{G_m}(X_1), r_{G_m}(X_2), \ldots\}$ if only $Y_1, \ldots, Y_m$ are available, but not $G$), following (5) (or (6)). The r chart is the plot of the $r_G(X_i)$'s (or $r_{G_m}(X_i)$'s) against time $i$, with CL $= .5$ and the control limit $\alpha$. The process is declared out-of-control if $r_G(\cdot)$ falls below $\alpha$. Recall that $\alpha$ is the false alarm rate, which generally is close to zero, so the r chart only has LCL $= \alpha$ but no UCL. The motivation and justification of the r chart as a control chart are given next.

The expression (6) shows that $r_{G_m}(X)$ is an indication of how outlying $X$ is with respect to the data cloud $Y_i$'s. A very small value of $r_{G_m}(X)$ means that only a very small proportion of $Y_i$'s are more outlying than $X$. Thus $X$ is at the "outskirt" and is not conforming to most of the central part of the good data set. Assuming that $X \sim F$, a small value of $r_{G_m}(\cdot)$ then suggests a possible deviation from $G$ to $F$. Since $r_{G_m}(\cdot)$ is defined according to data depth, the possible deviation here can be a shift in "center" and/or an increase in scale. (A detailed mathematical justification of this interpretation can be derived from Liu and Singh 1993, sec. 3.) Thus the r chart with LCL $= \alpha$ corresponds to an $\alpha$-level test of the following hypotheses:

$$H_0: F = G \quad \text{vs.} \quad H_a: \text{there is a location shift and/or a scale increase from } G \text{ to } F. \tag{10}$$

We observe that the alternative hypothesis is particularly suitable for detecting quality deterioration in quality control, as it presents a loss of accuracy and/or a loss of precision. This also justifies viewing the process as out-of-control when $H_0$ is rejected or, equivalently, when an observation falls below $\alpha$ in the r chart. To explain the choice of CL $= .5$ and LCL $= \alpha$ in the r chart, we require the properties of $r_G(X)$ and $r_{G_m}(X)$ established by Liu and Singh (1993) and listed in Proposition 3.1.

Proposition 3.1. Assume that $F = G$ and $X \sim F$. Let $U[0, 1]$ denote a uniform distribution supported on $[0, 1]$, and let the notation $\xrightarrow{L}$ stand for convergence in law. If $D_G(X)$ has a continuous distribution, then
a. $r_G(X) \sim U[0, 1]$, and
b. as $m \to \infty$, $r_{G_m}(X) \xrightarrow{L} U[0, 1]$ along almost all $\{Y_1, \ldots, Y_m\}$ sequences, provided that $D_{G_m}(\cdot)$ converges to $D_G(\cdot)$ uniformly as $m \to \infty$.

Remark 3.1. The uniform convergence of $D_{G_m}(\cdot)$ holds for the simplicial depth if $G$ is absolutely continuous, and for the Mahalanobis depth if $G$ has a bounded second absolute moment.

Under $H_0$: $F = G$, Proposition 3.1 implies that the expected value of $r_G(X)$ is .5 and that of $r_{G_m}(X)$ is .5 almost surely for all sequences $\{Y_1, \ldots, Y_m\}$ for large $m$. This justifies choosing .5 to be CL of the r chart. When $r_G(X)$ (or $r_{G_m}(X)$) is much smaller than .5, there is doubt for $H_0$ and evidence to support $H_a$, signaling a possible quality deterioration. When $r_G(X)$ (or $r_{G_m}(X)$) is larger than .5, there is indication of a decrease in scale with perhaps a negligible location shift. This is seen as an improvement in quality, termed a gain in precision, and thus the process should not be viewed as out-of-control. Therefore, there is only an LCL in the r chart. The uniform distribution of $r_G(X)$ (or $r_{G_m}(X)$) implies clearly that LCL should be $\alpha$.

[Figure 4. S Chart.]

[Figure 5. S* Chart.]
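The rank statistic (6) and the r chart decision rule can be sketched as follows. The depth values are assumed to have been computed already by whichever depth is in use; the function names and the toy reference depths are hypothetical and only illustrate the counting in (6) and the comparison with LCL $= \alpha$.

```python
def rank_statistic(depth_x, ref_depths):
    """r_{G_m}(x) of (6): fraction of reference depths not exceeding the depth of x."""
    return sum(d <= depth_x for d in ref_depths) / len(ref_depths)


def r_chart_signals(r_values, alpha):
    """1-based time indices at which the r chart falls below LCL = alpha."""
    return [i for i, r in enumerate(r_values, start=1) if r < alpha]


# Hypothetical depths of a four-point reference sample.
ref_depths = [0.05, 0.10, 0.20, 0.30]
print(rank_statistic(0.15, ref_depths))                       # two of four depths are <= 0.15
print(r_chart_signals([0.082, 0.948, 0.022, 0.256], 0.025))   # only the third value is below alpha
```

Because only an LCL is used, large values of $r$ never trigger a signal, in line with the one-sided alternative in (10).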

Table 1. Simplicial Depth Values and Ranks

  i   D(X_i)  r(X_i)      i   D(X_i)  r(X_i)
  1   .0028   .082       41   0       .022
  2   .2263   .948       42   0       .022
  3   .1794   .840       43   0       .022
  4   .0196   .256       44   .0107   .194
  5   .1144   .670       45   0       .022
  6   .0025   .074       46   .0041   .100
  7   .0115   .196       47   0       .022
  8   .0443   .392       48   0       .022
  9   .0389   .358       49   .0111   .194
 10   .0268   .296       50   .0261   .290
 11   0       .022       51   0       .022
 12   .1962   .888       52   0       .022
 13   .1651   .812       53   0       .022
 14   .1835   .852       54   0       .022
 15   .0249   .280       55   0       .022
 16   .0583   .446       56   0       .022
 17   .1106   .658       57   0       .022
 18   .0022   .068       58   0       .022
 19   .2315   .962       59   0       .022
 20   .0366   .348       60   0       .022
 21   .0711   .502       61   .0932   .588
 22   .0645   .472       62   0       .022
 23   .0103   .186       63   0       .022
 24   .0797   .542       64   0       .022
 25   .0870   .566       65   0       .022
 26   .0051   .114       66   0       .022
 27   .0518   .424       67   0       .022
 28   0       .022       68   .0123   .202
 29   .0044   .102       69   0       .022
 30   .0903   .576       70   .1984   .896
 31   .1900   .866       71   .0250   .280
 32   .1621   .800       72   .0087   .160
 33   .1499   .768       73   0       .022
 34   .0757   .528       74   0       .022
 35   .0514   .420       75   0       .022
 36   .0581   .444       76   0       .022
 37   .1096   .656       77   0       .022
 38   .0570   .436       78   0       .022
 39   .2082   .920       79   0       .022
 40   .1927   .876       80   0       .022

Table 2. Q-values (n = 4)

 Groups  1-10:  .5315  .3330  .3910  .5975  .5090  .4255  .2815  .5860  .5400  .7220
 Groups 11-20:  .0650  .0415  .1320  .0220  .0220  .1635  .0670  .3395  .0220  .0220

Remark 3.2. Even though the r chart does not have the UCL to make its CL the center line of the in-control region, the CL here does serve as a reference point to allow us to observe whether a pattern or trend is developing in a sequence of samples.

3.2 The Q Charts

The idea behind the Q chart is similar to that of the univariate $\bar{X}$ chart. When $X_1, X_2, \ldots$ are univariate and $G$ is normal, the $\bar{X}$ chart plots the averages of consecutive subsets of the $X_i$'s. The $\bar{X}$ chart may prevent a false alarm when the process is actually in control but some individual sample point falls outside the control limits merely due to random fluctuations. This is an advantage over the $X$ chart. In the multivariate setting we propose to plot the averages of subsets of the $r_G(X_i)$'s (or $r_{G_m}(X_i)$'s). Assume that each subset has size $n$. In the notation of (8) and (9), the averages of the $r_G(X_i)$'s and $r_{G_m}(X_i)$'s are given by $Q(G, F_n^j)$ and $Q(G_m, F_n^j)$. Here $F_n^j$ is the empirical distribution of the $X_i$'s in the $j$th subset, $j = 1, 2, \ldots$. The Q chart plots

$$\{Q(G, F_n^1), Q(G, F_n^2), \ldots\}$$

or

$$\{Q(G_m, F_n^1), Q(G_m, F_n^2), \ldots\}$$

if only $Y_1, \ldots, Y_m$ are available. The main issue now is to set the correct values for CL and LCL in this Q chart. This depends on the choice of $n$. We shall see that when $n$ is large, in view of the approximations described in Proposition 3.2, CL should be .5, whereas LCL should be $(.5 - z_\alpha (12n)^{-1/2})$ for plotting the $\{Q(G, F_n^j)\}$'s and $\{.5 - z_\alpha \sqrt{(1/12)[(1/m) + (1/n)]}\}$ for plotting the $\{Q(G_m, F_n^j)\}$'s (cf. Fig. 3). This approximation seems to be quite reasonable even when $n$ is as small as 5. In practice, however, $n$ can be even smaller, say 3 or 4. In this case, we may use the exact distributions for $Q(G, F_n)$ given in Proposition 3.3. It turns out that for a small $\alpha$ value the Q chart should have CL $= .5$ and LCL $= (n!\alpha)^{1/n}/n$.

First we describe the large $n$ asymptotics. The Q chart corresponds to the $\alpha$-level test based on $Q(G, F_n)$ (or $Q(G_m, F_n)$) for testing the same set of hypotheses in (10). These are actually two of the several multivariate rank tests studied by Liu (1992) and Liu and Singh (1993). Their main asymptotic properties are as follows.

Proposition 3.2. Assume that the conditions in Proposition 3.1 hold. Then
a. as $n \to \infty$, $[Q(G, F_n) - \tfrac{1}{2}] \xrightarrow{L} N(0, 1/(12n))$; and
b. as $\min(m, n) \to \infty$, $[Q(G_m, F_n) - \tfrac{1}{2}] \xrightarrow{L} N\{0, [(1/m) + (1/n)]/12\}$, under the following additional condition: if $MD(\cdot)$ is used to define $Q(\cdot, \cdot)$, $G$ has a bounded fourth absolute moment; if $SD(\cdot)$ is used to define $Q(\cdot, \cdot)$, $G$ is a one-dimensional distribution and its density is bounded above and below in a neighborhood of the median (or center).

The statement (a) is a straightforward application of the central limit theorem, because $Q(G, F_n)$ is just the average of $n$ iid uniform random variables. The statement (b) has been established by Liu and Singh (1993). Although (b) has been proven only for $R^1$ in the case of $SD$, it was conjectured by Liu and Singh (1993) with the support of simulation results that it actually holds for any $k$-dimensional $G$. It is now evident that CL and LCL should be set to the values indicated earlier when $n$ is large.

When $n$ is small, the foregoing asymptotic results may not be applicable. Since LCL in this case is the $\alpha$th quantile of the distribution of $Q(G, F_n) = (1/n) \sum_{i=1}^{n} r_G(X_i)$, we need the distribution of the average of uniform random variables (cf. Prop. 3.1). This follows directly from the formula for the distribution of the sum of uniform random variables provided in Proposition 3.3.

Proposition 3.3. Let $\{U_1, \ldots, U_n\}$ be an iid sample from $U[0, 1]$, and let $H_n(t)$ be the distribution function of $\sum_{i=1}^{n} U_i$; that is, $H_n(t) = P\{\sum_{i=1}^{n} U_i \le t\}$. Then for each $n = 1, 2, \ldots$, $H_n(t) = 0$ for $t < 0$ and

$$H_n(t) = \frac{1}{n!} \sum_{j=0}^{n} (-1)^j \binom{n}{j} (t - j)_+^n \quad \text{for } t \ge 0, \tag{11}$$

where

$$(x)_+^n = \begin{cases} 0, & \text{if } x < 0; \\ x^n, & \text{if } x \ge 0. \end{cases}$$
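Both choices of LCL for the Q chart can be checked numerically. The sketch below evaluates (11) directly, solves $H_n(nw) = \alpha$ for $w$ by bisection, and also computes the normal-approximation limit of Proposition 3.2. The function names are hypothetical, and bisection is simply one convenient root finder chosen for this illustration.

```python
from math import comb, factorial, floor, sqrt
from statistics import NormalDist


def irwin_hall_cdf(t, n):
    """H_n(t) of (11): distribution function of the sum of n iid U[0, 1] variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    # Terms with j > t vanish because (t - j)_+ = 0.
    total = sum((-1) ** j * comb(n, j) * (t - j) ** n for j in range(floor(t) + 1))
    return total / factorial(n)


def q_lcl_exact(n, alpha, tol=1e-12):
    """Solve H_n(n * w) = alpha for w by bisection (small-n Q chart LCL)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if irwin_hall_cdf(n * mid, n) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def q_lcl_large_n(n, alpha, m=None):
    """Normal-approximation LCL from Proposition 3.2; pass m when G_m replaces G."""
    z = NormalDist().inv_cdf(1 - alpha)
    var = 1 / (12 * n) if m is None else (1 / m + 1 / n) / 12
    return 0.5 - z * sqrt(var)


print(round(q_lcl_exact(4, 0.025), 3))             # compare with (n! alpha)^(1/n) / n
print(round(q_lcl_large_n(10, 0.025, m=500), 4))   # compare with the LCL used for n = 10
```

For $n = 4$ and $\alpha = .025$ the bisection result agrees with the closed form $(n!\alpha)^{1/n}/n$, and for $n = 10$, $m = 500$ the approximation reproduces the value .3193 quoted in Section 4.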

Formula (11) has been derived by Feller (1971). The expression (11) shows that $H_n(\cdot)$ is a piecewise polynomial. For our purpose, the most relevant part of the polynomial is

$$H_n(t) = \begin{cases} \dfrac{1}{n!}\, t^n, & \text{if } 0 \le t \le 1; \\[1ex] \dfrac{1}{n!} \left( t^n - n (t - 1)^n \right), & \text{if } 1 \le t \le 2; \\[1ex] \dfrac{1}{n!} \left( t^n - n (t - 1)^n + \dbinom{n}{2} (t - 2)^n \right), & \text{if } 2 \le t \le 3. \end{cases} \tag{12}$$

To determine LCL for our Q chart for small $n$, we need to find the value $w_\alpha$ such that $P\big((1/n) \sum_{i=1}^{n} U_i \le w_\alpha\big) = \alpha$ or, equivalently, $H_n(n w_\alpha) = \alpha$. Formula (12) implies that for $\alpha \le 1/n!$, $(n w_\alpha)^n / n! = \alpha$. Consequently, $w_\alpha = (n!\alpha)^{1/n}/n$. This justifies our choice of LCL for the Q chart. For example, when $n = 4$ and $\alpha = .025$, then $w_{.025} = [24(.025)]^{1/4}/4 = .220$. This value is used as the LCL for the Q chart in Figure 2, where the $X_i$'s are grouped in sets of 4. It is also clear that CL here should be .5, because it is the expected value of the average of $n$ iid $U[0, 1]$ random variables. Note that in practical situations in quality control, $\alpha$ is usually chosen to be .0027 or smaller. Thus when $n$ is not greater than 4, the LCL $w_\alpha$ is given by $(n!\alpha)^{1/n}/n$ as shown earlier. However, if for whatever reasons $\alpha$ is chosen to be greater than $1/n!$, then the proper piecewise formula in (12) should be used to determine the value for $w_\alpha$. For example, for $n = 4$ and $\alpha = .1$, we would need to solve the equation $(1/4!)\big[(4w_{.1})^4 - 4(4w_{.1} - 1)^4\big] = .1$. The solution is unique, because $H_n(\cdot)$ is a strictly increasing function. In general, there are no convenient closed forms for solutions of polynomial equations of high orders. However, they can be easily obtained by using Newton's method or by using computer algorithms in, say, Mathematica.

3.3 The S Charts

We shall use the univariate CUSUM chart to motivate the S chart. When the $X_i$'s are univariate, the simplest CUSUM chart is basically the plot of $\sum_{i=1}^{n} (X_i - \mu)$, which reflects the pattern of the total deviation from the expected value. It is more effective than the $X$ chart or the $\bar{X}$ chart in detecting small process change and is perhaps the most used chart. In the multivariate setting, the idea of the CUSUM chart naturally suggests plotting the values $S_n(G)$ and $S_n(G_m)$ defined by

$$S_n(G) = \sum_{i=1}^{n} \left[ r_G(X_i) - \tfrac{1}{2} \right] \tag{13}$$

and

$$S_n(G_m) = \sum_{i=1}^{n} \left[ r_{G_m}(X_i) - \tfrac{1}{2} \right]. \tag{14}$$

Since $S_n(G) = n[Q(G, F_n) - 1/2]$ and $S_n(G_m) = n[Q(G_m, F_n) - 1/2]$, we can immediately deduce the following from Proposition 3.2.

Proposition 3.4. Under the conditions described in Proposition 3.2, we have
a. $S_n(G) \xrightarrow{L} N(0, n/12)$ as $n \to \infty$, and
b. $S_n(G_m) \xrightarrow{L} N(0, n^2[(1/m) + (1/n)]/12)$ as $\min(m, n) \to \infty$.

Proposition 3.4 implies that the LCL for the S chart based on $S_n(G)$ is $-z_\alpha (n/12)^{1/2}$ and the LCL for the S chart based on $S_n(G_m)$ is $-z_\alpha \sqrt{n^2[(1/m) + (1/n)]/12}$. We observe that the control limit here is a curve rather than a line, as shown in Figure 4. In fact, the control limit curves downward as $n$ increases. When $n$ is large, the S chart can easily exceed the standard paper size, which is impractical. Thus it is convenient to standardize all the CUSUM's to have a straight line control limit (see Fig. 5). This means plotting $S_n^*(G) = S_n(G)/\sqrt{n/12}$ or $S_n^*(G_m) = S_n(G_m)/\sqrt{n^2[(1/m) + (1/n)]/12}$ for $n = 1, 2, \ldots$. This S* chart has CL $= 0$ and LCL $= -z_\alpha$.

4. SIMULATION RESULTS

In this section we use a bivariate data set to illustrate the construction of the control charts discussed earlier. The simulation is carried out using S language on a SUN workstation. The data set is obtained as follows. Let $G \sim N\big(\binom{0}{0}, \big(\begin{smallmatrix} 1 & 0 \\ 0 & 1 \end{smallmatrix}\big)\big)$. We generate 540 sample points from $G$, labeling the first 500 as $Y_1, \ldots, Y_{500}$ and the last 40 as $X_1, \ldots, X_{40}$. We also generate 40 sample points from the distribution $N\big(\binom{2}{2}, \big(\begin{smallmatrix} 4 & 0 \\ 0 & 4 \end{smallmatrix}\big)\big)$ and label these 40 sample points as $X_{41}, \ldots, X_{80}$. The distributions here have been chosen to be normal just to make the evaluation of the outcome easier. Normality is not required for the applicability of the charts. Note that there is a clear mean shift and a scale increase in the distribution for the last 40 $X_i$'s. In principle, we should expect all our charts to detect this change. As Figures 1-5 show, this is indeed the case.

For each $X_i$, we compute its simplicial depth, using the FORTRAN algorithm developed by Rousseeuw and Ruts (1992).
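Once the ranks of the simulated observations are in hand, the S and S* statistics of Section 3.3 reduce to a cumulative sum and a rescaling. The sketch below is a minimal illustration with hypothetical function names; the five input values are the first five $r(X_i)$ entries of Table 1, and $m = 500$ is the reference sample size of the simulation.

```python
from itertools import accumulate
from math import sqrt


def s_values(r_values):
    """S_n of (13)/(14): running sum of (r_i - 1/2) for n = 1, 2, ..."""
    return list(accumulate(r - 0.5 for r in r_values))


def s_star_values(r_values, m=None):
    """Standardized S*_n; pass the reference sample size m when G_m replaces G."""
    out = []
    for n, s in enumerate(s_values(r_values), start=1):
        var = n / 12 if m is None else n * n * (1 / m + 1 / n) / 12
        out.append(s / sqrt(var))
    return out


r_first_five = [0.082, 0.948, 0.840, 0.256, 0.670]   # r(X_1), ..., r(X_5) from Table 1
print([round(s, 3) for s in s_values(r_first_five)])
print([round(s, 3) for s in s_star_values(r_first_five, m=500)])
```

The printed values agree with the first five entries of Tables 4 and 5 ($-.418$, $.030$, $.370$, $.126$, $.296$ and $-1.447$, $.073$, $.738$, $.217$, $.456$).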
The algorithm of Rousseeuw and Ruts (1992) is highly efficient, because it requires only $O(m \log m)$ steps in computing the simplicial depths for $m$ data points, instead of $O(m^4)$ steps as required by direct computation based on solving systems of linear equations. The simplicial depth values of the $X_i$'s are recorded in the first column of Table 1. Based on these values we can compute all $r_{G_m}(X_i)$ using (6), and record them in the second column of Table 1. Figure 1 gives the plot of the $r_{G_m}(X_i)$'s with CL $= .5$ and LCL $= .025$, which is the $\alpha$ value that we choose for all five charts. It clearly shows that the process is out-of-control in the second half, with most of the $r_{G_m}(X_i)$'s falling below LCL. The few false alarms in the first half of the $X_i$'s should be attributed to random fluctuations in the same manner that false alarms are characterized in a univariate $X$ chart.

Figures 2 and 3 show the Q charts with the group size $n = 4$ and $n = 10$. The $\{Q(G_m, F_n^j), j = 1, 2, \ldots\}$ are computed according to the definition (9) and are recorded in Tables 2 and 3. For Figure 2, the CL has been set to .5 and the LCL has been set to .220, following Proposition 3.3. In Figure 3, the results in Proposition 3.2 lead to the choice of CL $= .5$ and LCL $= \{.5 - z_\alpha \sqrt{(1/12)[(1/m) + (1/n)]}\}$, which turns out to be .3193 when $\alpha = .025$. Both plots clearly show that the process is out-of-control in the second half. We also observe that the averaging of the $r_{G_m}(\cdot)$'s in $Q$ has eliminated the random fluctuations appearing in the first half of the r chart in Figure 1. In principle, because the underlying distribution here is specified, we can use for example the computing package Mathematica to compute the exact values of the $D_G(\cdot)$'s and hence $Q(G, F_n^j)$, $j = 1, 2, \ldots$, and give the corresponding Q chart. The difference of this chart and our Figure 2 appears to be negligible.

Figure 4 illustrates the S chart of the $S_n(G_m)$ values in Table 4. Since the S values are not standardized here, the LCL is $-z_\alpha \sqrt{(n^2/12)[(1/m) + (1/n)]}$. To keep the chart within standard paper size, we need to adopt a much smaller scale for the S axis. By contrast, in Figure 5, the S values have been standardized, and hence no severe rescaling is needed. The standardized S values are recorded in Table 5, labeled as S*. The control limit LCL is a straight line $-z_\alpha$, which is $-1.96$ in this case. For both figures, CL equals zero.

In the simulation here, we have chosen $m = 500$. Clearly, larger values of $m$ give better approximations to the limiting distributions stated in Propositions 3.1, 3.2, and 3.4 and to LCL's for the r, Q, and S charts. Our experience shows that the approximation results are reasonable when $m$ is as small as 50 in the bivariate case. We would recommend larger values for higher-dimensional observations.

Table 3. Q-values (n = 10)

 Groups 1-4:  .4112  .5336  .3506  .6714
 Groups 5-8:  .0910  .0220  .1840  .0616

Table 4. S-values (read across each row; row for n = 1-10 first)

 n =  1-10:    -.418    .030    .370    .126    .296   -.130   -.434   -.542   -.684   -.888
 n = 11-20:   -1.366   -.978   -.666   -.314   -.534   -.588   -.430   -.862   -.400   -.552
 n = 21-30:    -.550   -.578   -.892   -.850   -.784  -1.170  -1.246  -1.724  -2.122  -2.046
 n = 31-40:   -1.680  -1.380  -1.112  -1.084  -1.164  -1.220  -1.064  -1.128   -.708   -.332
 n = 41-50:    -.810  -1.288  -1.766  -2.072  -2.550  -2.950  -3.428  -3.906  -4.212  -4.422
 n = 51-60:   -4.900  -5.378  -5.856  -6.334  -6.812  -7.290  -7.768  -8.246  -8.724  -9.202
 n = 61-70:   -9.114  -9.592 -10.070 -10.548 -11.026 -11.504 -11.982 -12.280 -12.758 -12.362
 n = 71-80:  -12.582 -12.922 -13.400 -13.878 -14.356 -14.834 -15.312 -15.790 -16.268 -16.746

Table 5. S*-values (read across each row; row for n = 1-10 first)

 n =  1-10:   -1.447    .073    .738    .217    .456   -.183   -.564   -.659   -.783   -.963
 n = 11-20:   -1.411   -.966   -.632   -.287   -.471   -.501   -.355   -.691   -.312   -.419
 n = 21-30:    -.407   -.418   -.630   -.587   -.530   -.775   -.809  -1.098  -1.327  -1.257
 n = 31-40:   -1.014   -.819   -.649   -.623   -.659   -.680   -.585   -.611   -.378   -.175
 n = 41-50:    -.421   -.661   -.895  -1.037  -1.261  -1.442  -1.656  -1.866  -1.989  -2.066
 n = 51-60:   -2.264  -2.459  -2.650  -2.837  -3.020  -3.200  -3.377  -3.550  -3.721  -3.889
 n = 61-70:   -3.816  -3.980  -4.142  -4.300  -4.457  -4.610  -4.762  -4.840  -4.987  -4.794
 n = 71-80:   -4.840  -4.932  -5.075  -5.216  -5.355  -5.492  -5.627  -5.760  -5.892  -6.022

5. CONCLUDING REMARKS

In addition to the $X$, $\bar{X}$, and CUSUM charts, there are more complicated control charts for monitoring a univariate process mean change, such as the moving average chart, the EWMA chart, and the CUSUM chart with a V mask (cf. Wetherill 1977). It would be interesting to develop our charts further along these lines. For example, a moving average chart based on the $r$ values in (5) or (6) can be readily constructed. To obtain proper control limits for this chart, one may apply the moving blocks bootstrap techniques of Liu and Singh (1992) to develop the distributions of the moving averages.

As discussed by Alt and Smith (1988), the classical multivariate control charts based on the $\chi^2$ or Hotelling's $T^2$ statistics (Hotelling 1949) are valid only when the process follows a normal distribution and can be used to detect a mean shift only. When the process is bivariate, a control ellipse may be used instead of the foregoing two charts. The control ellipse approach also requires the normality assumption for the underlying process, and it loses the chronological order of the plotted observations. In a different direction, one may use separate $X$ charts for individual component variables and then apply Bonferroni's inequality to provide a bound for the level of the combined test. As pointed out by Alt (1982), this inequality is not sharp enough to give an accurate level unless the component variables are independent. More precisely, this approach tends to overestimate the probability for asserting that the process is in control.

The sample Mahalanobis depth defined in (4) and Hotelling's $T^2$ both measure the quadratic distance of a point to its mean.

One may therefore attempt to equate Hotelling's $T^2$ chart to our r or Q charts when Mahalanobis depth is used. Note that in our approach, Mahalanobis depth serves only as a stepping stone to reduce the observations to "ranks." What we chart here are the "ranks" but not the Mahalanobis depth values themselves. The determination of the control limit in Hotelling's $T^2$ plot requires the exact sampling distribution of Hotelling's $T^2$ statistic, whereas this is not needed in our charts due to the further transformation of statistics into ranks. Consequently, our charts based on Mahalanobis depth are different from the Hotelling $T^2$ plots. Regarding the choice of data depth for our charts, we note that if the underlying distribution is close to elliptical, then it is more efficient to use Mahalanobis depth. Otherwise, the more geometric types of depth, such as majority depth, simplicial depth, and Tukey's depth, may be more desirable, because they do not require moment conditions.

[Received September 1993. Revised January 1995.]

REFERENCES

Alt, F. (1982), "Multivariate Quality Control: State of the Art," ASQC Annual Quality Congress Transactions, pp. 886-893.
Alt, F., and Smith, N. (1988), "Multivariate Process Control," in Handbook of Statistics, 7, eds. P. R. Krishnaiah and C. R. Rao, Amsterdam: Elsevier, pp. 333-351.
Banks, J. (1989), Principles of Quality Control, New York: John Wiley.
Feller, W. (1971), Introduction to Probability Theory and Its Applications (2nd ed.), New York: John Wiley.
Hotelling, H. (1949), "Multivariate Quality Control," in Techniques in Statistical Analysis, eds. C. Eisenhart, M. W. Hastay, and W. A. Wallis, New York: McGraw-Hill.
Liu, R. (1990), "On a Notion of Data Depth Based on Random Simplices," The Annals of Statistics, 18, 405-414.
Liu, R. (1992), "Data Depth and Multivariate Rank Tests," in L1-Statistical Analysis and Related Methods, ed. Y. Dodge, Amsterdam: Elsevier, pp. 279-294.
Liu, R., and Singh, K. (1992), "Moving Blocks Bootstrap and Jackknife Capture Weak Dependence," in Exploring the Limits of Bootstrap, eds. R. LePage and L. Billard, New York: John Wiley, pp. 225-248.
Liu, R., and Singh, K. (1993), "A Quality Index Based on Data Depth and Multivariate Rank Tests," Journal of the American Statistical Association, 88, 252-260.
Mahalanobis, P. C. (1936), "On the Generalized Distance in Statistics," Proceedings of the National Academy India, 12, 49-55.
Rousseeuw, P. J., and Ruts, I. (1992), "Bivariate Simplicial Depth," technical report, University of Antwerp, Dept. of Mathematics and Computer Science.
Tukey, J. W. (1975), "Mathematics and Picturing Data," Proceedings of the 1975 International Congress of Mathematics, 2, 523-531.
Wadsworth, H., Stephen, K. S., and Godfrey, A. B. (1986), Modern Methods for Quality Control and Improvement, New York: John Wiley.
Wetherill, G. B. (1977), Sampling Inspection and Quality Control (2nd ed.), New York: Chapman and Hall.