
ANALYSIS OF PROPORTIONS (ANOP)

In many situations, you need to examine differences among several proportions. The ANOP provides a confidence-interval type of approach that allows you to determine which, if any, of the c groups has a proportion significantly different from the overall average of all the group proportions combined. The ANOP is an extension of the ANOM procedure developed in Section 11.1. In that section, you focused on evaluating several groups of numerical data. In this section, you will learn how to evaluate several groups of categorical data.

Instead of looking at the lower and upper limits of a confidence interval, you study which of the c group proportions are not contained in the interval formed between a lower decision line and an upper decision line. Any group proportion that lies above the upper decision line is deemed significantly larger than the overall average of all group proportions. Similarly, any group proportion that falls below the lower decision line is declared significantly smaller than the overall average of all group proportions.

As with confidence interval estimation, to compute the upper and lower decision lines (UDL and LDL) for ANOP, you must add and subtract a measure of sampling error around the statistic of interest. That is,

FORMAT FOR ALL CONFIDENCE INTERVALS

$$\text{statistic} \pm \text{sampling error}$$

$$\text{statistic} \pm (\text{critical value})(\text{standard error of the statistic})$$

For ANOP this is demonstrated in the following equations:

OBTAINING THE UDL AND LDL

$$\bar{p} \pm h_{c,\infty}\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}}\sqrt{\frac{c-1}{c}}$$

where
c = number of groups in the study
j = representation for a particular group; j = 1, 2, …, c
$n_j$ = sample size for group j
n = total number of observations (the $n_j$ sample sizes are equal); $n = n_1 + n_2 + \cdots + n_c$
$\bar{n}$ = average sample size over all c groups; $\bar{n} = n/c$
$\bar{p}$ = pooled proportion; the overall average of all c sample proportions;

$$\bar{p} = \frac{X_1 + X_2 + \cdots + X_c}{n_1 + n_2 + \cdots + n_c} = \frac{X}{n}$$

$h_{c,\infty}$ = critical value of Nelson's h statistic with c groups and large sample sizes, $n_j$, per group, obtained using the infinity row in the table

so that

UPPER DECISION LINE FOR ANOP

$$UDL = \bar{p} + h_{c,\infty}\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}}\sqrt{\frac{c-1}{c}}$$


CHAPTER 12 Chi-Square and Nonparametric Tests

and

LOWER DECISION LINE FOR ANOP

$$LDL = \bar{p} - h_{c,\infty}\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}}\sqrt{\frac{c-1}{c}}$$

Note that Nelson's h statistic is found from the combination of c groups with the infinity row for common group sample sizes. This is because the ANOP procedure expects the sample sizes to be large. The major assumption for applying the ANOP procedure is that the c group sample sizes be sufficiently large to enable the expected number of items of interest in each group to be at least 5. That is, $n_j p_j \ge 5$ and also $n_j(1 - p_j) \ge 5$.

To demonstrate how the ANOP procedure is used, suppose that the Rosen & Berg food processing plant funnels its baked cookies to sealing machines used for product packaging. In monitoring the process, Livia Salvador, the product manager, finds that the quality of the seals, as defined by the proportion of defective seals, is at an unacceptably high level. Because she feels that the temperature setting on the sealing machine may affect the quality of the seals, she designs an experiment in which five different temperature settings are evaluated. At each temperature setting, 500 boxes are sealed and examined for their seal quality. The following table presents the number of boxes with defective and non-defective package seals at each of the five temperature settings.

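Cross-Classification of Observed Frequencies from the Sealing Machine Temperature-Setting Experiment

| Packaging Result    | Setting No. 1 | Setting No. 2 | Setting No. 3 | Setting No. 4 | Setting No. 5 | Totals |
|---------------------|---------------|---------------|---------------|---------------|---------------|--------|
| Defective seals     | 22            | 20            | 34            | 23            | 41            | 140    |
| Non-defective seals | 478           | 480           | 466           | 477           | 459           | 2,360  |
| Totals              | 500           | 500           | 500           | 500           | 500           | 2,500  |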

From the data you observe that with five temperature settings used to evaluate the sealing process for samples of 500 different boxes per temperature setting, there is no problem with the ANOP procedure assumption regarding expected numbers of defective and non-defective seals. To apply the ANOP procedure, you first need to compute the key statistics from the experiment as well as the critical value of Nelson's h statistic. From the data displayed in the table above, you compute the following:

$$\bar{p} = \frac{X_1 + X_2 + \cdots + X_c}{n_1 + n_2 + \cdots + n_c} = \frac{X}{n} = \frac{22 + 20 + 34 + 23 + 41}{500 + 500 + 500 + 500 + 500} = \frac{140}{2{,}500} = 0.056$$
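The assumption check and the pooled proportion can be sketched in a few lines of Python. This is a minimal illustration, not part of the original procedure: the function name `anop_assumption_ok` is hypothetical, and the sketch assumes that $n_j p_j$ is evaluated with the observed sample proportion, so that it equals the observed count $X_j$.

```python
# Observed defective counts and sample sizes from the cross-classification table.
defective = [22, 20, 34, 23, 41]
sizes = [500, 500, 500, 500, 500]


def anop_assumption_ok(x_counts, n_sizes, minimum=5):
    """Check n_j * p_j >= minimum and n_j * (1 - p_j) >= minimum for every group.

    Assumption of this sketch: p_j is the observed sample proportion, so
    n_j * p_j is the observed count x_j and n_j * (1 - p_j) is n_j - x_j.
    """
    return all(x >= minimum and (n - x) >= minimum
               for x, n in zip(x_counts, n_sizes))


# Pooled proportion: total items of interest divided by total sample size.
pooled = sum(defective) / sum(sizes)
assumption_ok = anop_assumption_ok(defective, sizes)

print(f"pooled proportion = {pooled:.3f}")
print(f"large-sample assumption satisfied: {assumption_ok}")
```

The smallest count in the table is 20 defective seals (setting No. 2), well above the minimum of 5, which is why the text can say the assumption poses no problem here.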

Because the estimated overall proportion of defective seals is 0.056, its complement, $(1 - \bar{p})$, or 0.944, is the estimated proportion of non-defective or conforming seals. The critical values of Nelson's h statistic for obtaining the 95% UDL and LDL are found in the following table.


Selected Critical Values of Nelson's h Statistic for Obtaining 95% Upper and Lower Decision Lines (UDL and LDL)

| Sample Size per Group, $n_j$ | c = 3 | c = 4 | c = 5 | c = 6 | c = 7 |
|------------------------------|-------|-------|-------|-------|-------|
| 4                            | 2.79  | 2.85  | 2.88  | 2.91  | 2.94  |
| 5                            | 2.67  | 2.74  | 2.79  | 2.83  | 2.87  |
| 6                            | 2.60  | 2.68  | 2.74  | 2.79  | 2.83  |
| 7                            | 2.55  | 2.65  | 2.71  | 2.77  | 2.80  |
| 8                            | 2.52  | 2.62  | 2.69  | 2.74  | 2.79  |
| 10                           | 2.49  | 2.59  | 2.66  | 2.72  | 2.76  |
| 12                           | 2.46  | 2.56  | 2.64  | 2.70  | 2.75  |
| 15                           | 2.43  | 2.55  | 2.62  | 2.68  | 2.73  |
| 20                           | 2.40  | 2.53  | 2.61  | 2.66  | 2.71  |
| infinity                     | 2.34  | 2.47  | 2.56  | 2.62  | 2.68  |

The columns give the number of groups, c.

Note: Values in italics were obtained through linear interpolation. Source: Table 2 of L. S. Nelson, "Exact Critical Values for Use with the Analysis of Means," Journal of Quality Technology, 15(1), 1983. Reprinted with permission of the American Society for Quality.

Given that you have five temperature settings (c = 5) and samples of 500 boxes per setting ($n_j = 500$), the critical value of Nelson's h statistic for the sealing machine temperature-setting experiment is $h_{5,\infty} = 2.56$. You now compute the UDL and LDL as follows:

$$UDL = \bar{p} + h_{c,\infty}\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}}\sqrt{\frac{c-1}{c}} = 0.056 + (2.56)\sqrt{\frac{(0.056)(0.944)}{500}}\sqrt{\frac{4}{5}} = 0.056 + 0.0235 = 0.0795$$

and

$$LDL = \bar{p} - h_{c,\infty}\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}}\sqrt{\frac{c-1}{c}} = 0.056 - (2.56)\sqrt{\frac{(0.056)(0.944)}{500}}\sqrt{\frac{4}{5}} = 0.056 - 0.0235 = 0.0325$$

From the cross-classification table, the proportions of defective seals out of 500 boxes sampled for each of the five sealing machine temperature settings are

$$p_1 = 0.044 \qquad p_2 = 0.040 \qquad p_3 = 0.068 \qquad p_4 = 0.046 \qquad p_5 = 0.082$$

The following is a graphical display for the ANOP.


The proportions of defective box seals for each of the five sealing machine temperature settings are plotted on the vertical axis. From top to bottom, the three horizontal lines represent the UDL, $\bar{p}$ (the pooled proportion or overall average of all five proportions of defective box seals combined), and the LDL. From this figure, you observe that the proportion of defective boxes produced by temperature setting 5 is significantly higher than the average proportion, based on all five temperature settings combined. You would suggest to the product manager, Livia Salvador, that temperature setting 5 should not be used.
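The whole ANOP calculation for the sealing experiment can be reproduced with a short Python sketch. The function name `anop_decision_lines` and the variable names are illustrative choices, not part of the text; the critical value 2.56 is simply read from the table of Nelson's h statistic (c = 5, infinity row) rather than computed.

```python
import math


def anop_decision_lines(x_counts, n_sizes, h_critical):
    """Return (LDL, p_bar, UDL) for an analysis of proportions.

    x_counts   -- number of items of interest in each group
    n_sizes    -- sample size of each group (assumed equal, as in the text)
    h_critical -- Nelson's h statistic looked up from the table
    """
    c = len(x_counts)
    n_total = sum(n_sizes)
    p_bar = sum(x_counts) / n_total          # pooled proportion
    n_bar = n_total / c                      # average sample size
    margin = (h_critical
              * math.sqrt(p_bar * (1 - p_bar) / n_bar)
              * math.sqrt((c - 1) / c))
    return p_bar - margin, p_bar, p_bar + margin


defective = [22, 20, 34, 23, 41]
sizes = [500] * 5
H_CRITICAL = 2.56  # table value for c = 5 groups, infinity row

ldl, p_bar, udl = anop_decision_lines(defective, sizes, H_CRITICAL)

# Flag every temperature setting whose proportion falls outside the decision lines.
flags = {}
for setting, (x, n) in enumerate(zip(defective, sizes), start=1):
    p_j = x / n
    if p_j > udl:
        flags[setting] = "above UDL"
    elif p_j < ldl:
        flags[setting] = "below LDL"

print(f"LDL = {ldl:.4f}, p_bar = {p_bar:.4f}, UDL = {udl:.4f}")
print("settings outside the decision lines:", flags)
```

Only setting 5 is flagged, in agreement with the conclusion drawn from the chart.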

12.6 McNemar Test for the Difference Between Two Proportions (Related Samples)

In Section 10.3, you used the Z test, and in Section 12.1, you used the chi-square test to examine whether there was a difference in the proportion of items of interest between two populations. These tests require independent samples from each population. However, sometimes when you are testing differences between the proportion of items of interest, the data are collected from repeated measurements or matched samples. For example, in marketing, these situations can occur when you want to determine whether there has been a change in attitude, perception, or behavior from one time period to another. To test whether there is evidence of a difference between the proportions when the data have been collected from two related samples, you can use the McNemar test. Table 12.16 presents the 2 × 2 table needed for the McNemar test.

TABLE 12.16 2 × 2 Contingency Table for the McNemar Test

| Condition (Group) 1 | Condition (Group) 2: Yes | Condition (Group) 2: No | Totals |
|---------------------|--------------------------|-------------------------|--------|
| Yes                 | A                        | B                       | A + B  |
| No                  | C                        | D                       | C + D  |
| Totals              | A + C                    | B + D                   | n      |

where
A = number of respondents who answer yes to condition 1 and yes to condition 2
B = number of respondents who answer yes to condition 1 and no to condition 2
C = number of respondents who answer no to condition 1 and yes to condition 2
D = number of respondents who answer no to condition 1 and no to condition 2
n = number of respondents in the sample

The sample proportions are

$p_1 = \frac{A + B}{n}$ = proportion of respondents in the sample who answer yes to condition 1
$p_2 = \frac{A + C}{n}$ = proportion of respondents in the sample who answer yes to condition 2

The population proportions are

$\pi_1$ = proportion in the population who would answer yes to condition 1
$\pi_2$ = proportion in the population who would answer yes to condition 2

When testing differences between the proportions, you can use a two-tail test or a one-tail test. In both cases, you use a test statistic that approximately follows the normal distribution. Equation (12.9) presents the McNemar test statistic used to test $H_0\colon \pi_1 = \pi_2$.

McNEMAR TEST STATISTIC

$$Z_{STAT} = \frac{B - C}{\sqrt{B + C}} \qquad (12.9)$$

where the $Z_{STAT}$ test statistic is approximately normally distributed.
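A short derivation, added here as an illustration rather than taken from the text, suggests why only the cells B and C appear in Equation (12.9). Subtracting the two sample proportions defined from the 2 × 2 table gives

$$p_1 - p_2 = \frac{A + B}{n} - \frac{A + C}{n} = \frac{B - C}{n}$$

The counts A and D (respondents who give the same answer under both conditions) cancel, so any evidence of a difference between the two proportions comes entirely from the respondents who changed their answer.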

To illustrate the McNemar test, suppose that the business problem facing a cell phone provider was to determine the effect of a marketing campaign on the brand loyalty of cell phone customers.


Data were collected from n = 600 participants. In the study, the participants were initially asked to state their preferences for two competing cell phone providers, Sprint and Verizon. Initially, 282 panelists said they preferred Sprint and 318 said they preferred Verizon. After exposing the set of participants to an intensive marketing campaign strategy for Verizon, the same 600 participants were again asked to state their preferences. Of the 282 panelists who previously preferred Sprint, 246 maintained their brand loyalty, but 36 switched their preference to Verizon. Of the 318 participants who initially preferred Verizon, 306 remained brand loyal, but 12 switched their preference to Sprint. These results are organized into the contingency table presented in Table 12.17.

TABLE 12.17 Brand Loyalty of Cell Phone Providers

| Before Marketing Campaign | After Marketing Campaign: Sprint | After Marketing Campaign: Verizon | Total |
|---------------------------|----------------------------------|-----------------------------------|-------|
| Sprint                    | 246                              | 36                                | 282   |
| Verizon                   | 12                               | 306                               | 318   |
| Total                     | 258                              | 342                               | 600   |

You use the McNemar test for these data because you have repeated measurements from the same set of panelists. Each participant gave a response about whether he or she preferred Sprint or Verizon before exposure to the intensive marketing campaign and then again after exposure to the campaign.

To determine whether the intensive marketing campaign was effective, you want to investigate whether there is a difference between the population proportion who favor Sprint before the campaign, $\pi_1$, versus the proportion who favor Sprint after the campaign, $\pi_2$. The null and alternative hypotheses are

$$H_0\colon \pi_1 = \pi_2$$
$$H_1\colon \pi_1 \ne \pi_2$$

Using a 0.05 level of significance, the critical values are −1.96 and +1.96 (see Figure 12.16), and the decision rule is

Reject $H_0$ if $Z_{STAT} < -1.96$ or if $Z_{STAT} > +1.96$; otherwise, do not reject $H_0$.
FIGURE 12.16 Two-tail McNemar test at the 0.05 level of significance

[Figure: standard normal curve with regions of rejection below −1.96 and above +1.96, each with area .025, and the region of nonrejection between them.]

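For the data in Table 12.17,

A = 246, B = 36, C = 12, D = 306

so that

$$p_1 = \frac{A + B}{n} = \frac{246 + 36}{600} = \frac{282}{600} = 0.47 \quad \text{and} \quad p_2 = \frac{A + C}{n} = \frac{246 + 12}{600} = \frac{258}{600} = 0.43$$

Using Equation (12.9),

$$Z_{STAT} = \frac{B - C}{\sqrt{B + C}} = \frac{36 - 12}{\sqrt{36 + 12}} = \frac{24}{\sqrt{48}} = 3.4641$$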


Because $Z_{STAT} = 3.4641 > 1.96$, you reject $H_0$. Using the p-value approach (see Figure 12.17), the p-value is 0.0005. Because 0.0005 < 0.05, you reject $H_0$. You can conclude that the population proportion who prefer Sprint before the intensive marketing campaign is different from the population proportion who prefer Sprint after exposure to the intensive Verizon marketing campaign. In fact, from Figure 12.17, observe that preference for Verizon increased after exposure to the intensive marketing campaign.
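The test statistic and the two-tail p-value for the brand loyalty data can be checked with the Python standard library alone. The sketch below is an illustration under stated assumptions: the helper names `mcnemar_z` and `normal_upper_tail` are hypothetical, and the normal tail area is obtained from the complementary error function `math.erfc` rather than from the Excel worksheet shown in Figure 12.17.

```python
import math


def mcnemar_z(b, c):
    """McNemar test statistic of Equation (12.9): (B - C) / sqrt(B + C)."""
    return (b - c) / math.sqrt(b + c)


def normal_upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Discordant cells from Table 12.17:
# B = preferred Sprint before and Verizon after,
# C = preferred Verizon before and Sprint after.
B, C = 36, 12

z_stat = mcnemar_z(B, C)
p_two_tail = 2 * normal_upper_tail(abs(z_stat))
reject_h0 = abs(z_stat) > 1.96  # two-tail decision rule at the 0.05 level

print(f"Z_STAT = {z_stat:.4f}")
print(f"two-tail p-value = {p_two_tail:.4f}")
print(f"reject H0: {reject_h0}")
```

For a one-tail test, as several of the problems for this section ask, the same helper can be used without doubling the tail area; which tail to use depends on how the alternative hypothesis is stated.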

FIGURE 12.17 Excel results for the McNemar test for brand loyalty of cell phone providers

Problems for Section 12.6


LEARNING THE BASICS

12.60 Given the following table for two related samples:

| Group 1 | Group 2: Yes | Group 2: No | Total |
|---------|--------------|-------------|-------|
| Yes     | 46           | 25          | 71    |
| No      | 16           | 59          | 75    |
| Total   | 62           | 84          | 146   |

a. Compute the McNemar test statistic.
b. At the 0.05 level of significance, is there evidence of a difference between group 1 and group 2?

APPLYING THE CONCEPTS

12.61 (SELF Test) A market researcher wanted to determine whether the proportion of coffee drinkers who preferred Brand A increased as a result of an advertising campaign. A random sample of 200 coffee drinkers was selected. The results indicating preference for Brand A or Brand B prior to the beginning of the advertising campaign and after its completion are shown in the following table:

| Preference Prior to Advertising Campaign | After Campaign: Brand A | After Campaign: Brand B | Total |
|------------------------------------------|-------------------------|-------------------------|-------|
| Brand A                                  | 101                     | 9                       | 110   |
| Brand B                                  | 22                      | 68                      | 90    |
| Total                                    | 123                     | 77                      | 200   |

a. At the 0.05 level of significance, is there evidence that the proportion of coffee drinkers who prefer Brand A is lower at the beginning of the advertising campaign than at the end of the advertising campaign?
b. Compute the p-value in (a) and interpret its meaning.

12.62 Two candidates for governor participated in a televised debate. A political pollster recorded the preferences of 500 registered voters in a random sample prior to and after the debate:

| Preference Prior to Debate | After Debate: Candidate A | After Debate: Candidate B | Total |
|----------------------------|---------------------------|---------------------------|-------|
| Candidate A                | 269                       | 21                        | 290   |
| Candidate B                | 36                        | 174                       | 210   |
| Total                      | 305                       | 195                       | 500   |

a. At the 0.01 level of significance, is there evidence of a difference in the proportion of voters who favored Candidate A prior to and after the debate?
b. Compute the p-value in (a) and interpret its meaning.

12.63 A taste-testing experiment compared two brands of Chilean merlot wines. After the initial comparison, 60 preferred Brand A, and 40 preferred Brand B. The 100 respondents were then exposed to a very professional and powerful advertisement promoting Brand A. The 100 respondents were then asked to taste the two wines again and declare which brand they preferred. The results are shown in the following table:

| Preference Prior to Advertising | After Advertising: Brand A | After Advertising: Brand B | Total |
|---------------------------------|----------------------------|----------------------------|-------|
| Brand A                         | 55                         | 5                          | 60    |
| Brand B                         | 15                         | 25                         | 40    |
| Total                           | 70                         | 30                         | 100   |

a. At the 0.05 level of significance, is there evidence that the proportion who preferred Brand A was lower before the advertising than after the advertising?
b. Compute the p-value in (a) and interpret its meaning.

12.64 The CEO of a large metropolitan health-care facility would like to assess the effect of the recent implementation of the Six Sigma management approach on customer satisfaction. A random sample of 100 patients is selected from a list of patients who were at the facility the past week and also a year ago:

| Satisfied Last Year | Satisfied Now: Yes | Satisfied Now: No | Total |
|---------------------|--------------------|-------------------|-------|
| Yes                 | 67                 | 5                 | 72    |
| No                  | 20                 | 8                 | 28    |
| Total               | 87                 | 13                | 100   |

a. At the 0.05 level of significance, is there evidence that satisfaction was lower last year, prior to introduction of Six Sigma management?
b. Compute the p-value in (a) and interpret its meaning.

12.65 The personnel director of a large department store wants to reduce absenteeism among sales associates. She decides to institute an incentive plan that provides financial rewards for sales associates who are absent fewer than five days in a given calendar year. A sample of 100 sales associates selected at the end of the second year reveals the following:

| Year 1          | Year 2: < 5 Days Absent | Year 2: ≥ 5 Days Absent | Total |
|-----------------|-------------------------|-------------------------|-------|
| < 5 Days Absent | 32                      | 4                       | 36    |
| ≥ 5 Days Absent | 25                      | 39                      | 64    |
| Total           | 57                      | 43                      | 100   |

a. At the 0.05 level of significance, is there evidence that the proportion of employees absent fewer than five days was lower in year 1 than in year 2?
b. Compute the p-value in (a) and interpret its meaning.

12.7 Chi-Square Test for the Variance or Standard Deviation



When analyzing numerical data, sometimes you need to draw conclusions about the population variance or standard deviation. For example, recall that in the cereal-filling process described in Section 9.1, you assumed that the population standard deviation, $\sigma$, was equal to 15 grams. To see if the variability of the process has changed, you need to test whether the standard deviation has changed from the previously specified level of 15 grams. Assuming that the data are normally distributed, you use the $\chi^2$ test for the variance or standard deviation defined in Equation (12.10) to test whether the population variance or standard deviation is equal to a specified value.

$\chi^2$ TEST FOR THE VARIANCE OR STANDARD DEVIATION

$$\chi^2_{STAT} = \frac{(n - 1)S^2}{\sigma^2} \qquad (12.10)$$

where
n = sample size
$S^2$ = sample variance
$\sigma^2$ = hypothesized population variance

The test statistic $\chi^2_{STAT}$ follows a chi-square distribution with n − 1 degrees of freedom.

To apply the test of hypothesis, return to the cereal-filling example described in Section 9.1. You are interested in determining whether the standard deviation has changed from the previously specified level of 15 grams. Thus, you use a two-tail test with the following null and alternative hypotheses:

$$H_0\colon \sigma^2 = 225 \quad (\text{that is, } \sigma = 15 \text{ grams})$$
$$H_1\colon \sigma^2 \ne 225 \quad (\text{that is, } \sigma \ne 15 \text{ grams})$$

If you select a sample of 25 cereal boxes, you reject the null hypothesis if the computed $\chi^2_{STAT}$ test statistic falls into either the lower or upper tail of a chi-square distribution with 25 − 1 = 24 degrees of freedom, as shown in Figure 12.18. From Equation (12.10), observe that the $\chi^2_{STAT}$ test statistic falls into the lower tail of the chi-square distribution if the sample standard deviation (S) is sufficiently smaller than the hypothesized $\sigma$ of 15 grams, and it falls into the upper tail if S is sufficiently larger than 15 grams. From Table 12.18 (extracted from Table E.4), if you select a level of significance of 0.05, the lower and upper critical values are 12.401 and 39.364, respectively. Therefore, the decision rule is

Reject $H_0$ if $\chi^2_{STAT} < \chi^2_{\alpha/2} = 12.401$ or if $\chi^2_{STAT} > \chi^2_{1-\alpha/2} = 39.364$;
otherwise, do not reject $H_0$.

FIGURE 12.18 Determining the lower and upper critical values of a chi-square distribution with 24 degrees of freedom corresponding to a 0.05 level of significance for a two-tail test of hypothesis about a population variance or standard deviation

[Figure: chi-square distribution starting at 0, with regions of rejection below 12.401 and above 39.364, each with area .025, and a region of nonrejection between them with area .95.]


TABLE 12.18 Finding the Critical Values Corresponding to a 0.05 Level of Significance for a Two-Tail Test from the Chi-Square Distribution with 24 Degrees of Freedom

| Cumulative Area    | .005   | .01    | .025   | .05    | .10    | .90    | .95    | .975   |
|--------------------|--------|--------|--------|--------|--------|--------|--------|--------|
| Upper-Tail Area    | .995   | .99    | .975   | .95    | .90    | .10    | .05    | .025   |
| Degrees of Freedom |        |        |        |        |        |        |        |        |
| 1                  | …      | …      | 0.001  | 0.004  | 0.016  | 2.706  | 3.841  | 5.024  |
| 2                  | 0.010  | 0.020  | 0.051  | 0.103  | 0.211  | 4.605  | 5.991  | 7.378  |
| 3                  | 0.072  | 0.115  | 0.216  | 0.352  | 0.584  | 6.251  | 7.815  | 9.348  |
| .                  | .      | .      | .      | .      | .      | .      | .      | .      |
| 23                 | 9.260  | 10.196 | 11.689 | 13.091 | 14.848 | 32.007 | 35.172 | 38.076 |
| 24                 | 9.886  | 10.856 | 12.401 | 13.848 | 15.659 | 33.196 | 36.415 | 39.364 |
| 25                 | 10.520 | 11.524 | 13.120 | 14.611 | 16.473 | 34.382 | 37.652 | 40.646 |

Source: Extracted from Table E.4.

Suppose that in the sample of 25 cereal boxes, the standard deviation, S, is 17.7 grams. Using Equation (12.10),

$$\chi^2_{STAT} = \frac{(n - 1)S^2}{\sigma^2} = \frac{(25 - 1)(17.7)^2}{(15)^2} = 33.42$$

Because $\chi^2_{0.025} = 12.401 < \chi^2_{STAT} = 33.42 < \chi^2_{0.975} = 39.364$, or because the p-value = 0.0956 > 0.05 (see Figure 12.19), you do not reject $H_0$. You conclude that there is insufficient evidence that the population standard deviation is different from 15 grams.
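The cereal-filling calculation can be reproduced without Excel or statistical tables. The sketch below is only an illustration and uses the Python standard library: `chi2_cdf` and `chi2_quantile` are hypothetical helper names, the cumulative area is computed from a series expansion of the incomplete gamma function, and the critical values are found by bisection. The last quantity computed is the upper-tail area beyond the test statistic, which is the value the text reports as the p-value.

```python
import math


def chi2_cdf(x, df):
    """Cumulative area below x for a chi-square distribution with df degrees
    of freedom (series expansion of the regularized lower incomplete gamma)."""
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15 and k < 10_000:
        term *= t / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(t) - t - math.lgamma(a))


def chi2_quantile(cum_area, df, lo=0.0, hi=200.0):
    """Value whose cumulative area equals cum_area, found by bisection.
    The search interval [lo, hi] is an assumption suited to moderate df."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < cum_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


n, s, sigma_0, alpha = 25, 17.7, 15.0, 0.05
df = n - 1

chi2_stat = (n - 1) * s**2 / sigma_0**2           # Equation (12.10)
lower_cv = chi2_quantile(alpha / 2, df)           # lower critical value
upper_cv = chi2_quantile(1 - alpha / 2, df)       # upper critical value
upper_tail_area = 1 - chi2_cdf(chi2_stat, df)     # area to the right of the statistic
reject_h0 = chi2_stat < lower_cv or chi2_stat > upper_cv

print(f"chi-square statistic = {chi2_stat:.2f}")
print(f"critical values = {lower_cv:.3f}, {upper_cv:.3f}")
print(f"upper-tail area = {upper_tail_area:.4f}")
print(f"reject H0: {reject_h0}")
```

To adapt the sketch to the problems at the end of this section, change `n`, `s`, `sigma_0`, and `alpha`; for a one-tail test, compare the statistic with a single critical value that puts the whole of `alpha` in the relevant tail.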

FIGURE 12.19 Worksheet for testing the variance in the cereal-filling process

Figure 12.19 displays the COMPUTE worksheet of the Chi-Square Variance workbook. Create this worksheet using the instructions in Section EG12.7.

In testing a hypothesis about a population variance or standard deviation, you assume that the values in the population are normally distributed. Unfortunately, the test statistic discussed in this section is very sensitive to departures from this assumption (i.e., it is not a robust test). Thus, if the population is not normally distributed, particularly for small sample sizes, the accuracy of the test can be seriously affected.


Problems for Section 12.7


LEARNING THE BASICS

12.66 Determine the lower- and upper-tail critical values of $\chi^2$ for each of the following two-tail tests:
a. $\alpha$ = 0.01, n = 26
b. $\alpha$ = 0.05, n = 17
c. $\alpha$ = 0.10, n = 14

12.67 Determine the lower- and upper-tail critical values of $\chi^2$ for each of the following two-tail tests:
a. $\alpha$ = 0.01, n = 24
b. $\alpha$ = 0.05, n = 20
c. $\alpha$ = 0.10, n = 16

12.68 In a sample of n = 25 selected from an underlying normal population, S = 150. What is the value of $\chi^2_{STAT}$ if you are testing the null hypothesis $H_0\colon \sigma = 100$?

12.69 In a sample of n = 16 selected from an underlying normal population, S = 10. What is the value of $\chi^2_{STAT}$ if you are testing the null hypothesis $H_0\colon \sigma = 12$?

12.70 In Problem 12.69, how many degrees of freedom are there in the hypothesis test?

12.71 In Problems 12.69 and 12.70, what are the critical values from Table E.4 if the level of significance is $\alpha$ = 0.05 and $H_1$ is as follows:
a. $\sigma \ne 12$?
b. $\sigma < 12$?

12.72 In Problems 12.69, 12.70, and 12.71, what is your statistical decision if $H_1$ is
a. $\sigma \ne 12$?
b. $\sigma < 12$?

12.73 If, in a sample of size n = 16 selected from a very left-skewed population, the sample standard deviation is S = 24, would you use the hypothesis test given in Equation (12.10) to test $H_0\colon \sigma = 20$? Discuss.

APPLYING THE CONCEPTS

12.74 A manufacturer of candy must monitor the temperature at which the candies are baked. Too much variation will cause inconsistency in the taste of the candy. Past records show that the standard deviation of the temperature has been 1.2°F. A random sample of 30 batches of candy is selected, and the sample standard deviation of the temperature is 2.1°F.
a. At the 0.05 level of significance, is there evidence that the population standard deviation has increased above 1.2°F?
b. What assumption do you need to make in order to perform this test?
c. Compute the p-value in (a) and interpret its meaning.

12.75 A market researcher for an automobile dealer intends to conduct a nationwide survey concerning car repairs. Among the questions included in the survey is the following: What was the cost of all repairs performed on your car last year? In order to determine the sample size necessary, the researcher needs to provide an estimate of the standard deviation. Using his past experience and judgment, he estimates that the standard deviation of the amount of repairs is $200. Suppose that a small-scale study of 25 auto owners selected at random indicates a sample standard deviation of $237.52.
a. At the 0.05 level of significance, is there evidence that the population standard deviation is different from $200?
b. What assumption do you need to make in order to perform this test?
c. Compute the p-value in (a) and interpret its meaning.

12.76 The marketing manager of a branch office of a local telephone operating company wants to study characteristics of residential customers served by her office. In particular, she wants to estimate the mean monthly cost of calls within the local calling region. In order to determine the sample size necessary, she needs an estimate of the standard deviation. On the basis of her past experience and judgment, she estimates that the standard deviation is equal to $12. Suppose that a small-scale study of 15 residential customers indicates a sample standard deviation of $9.25.
a. At the 0.10 level of significance, is there evidence that the population standard deviation is different from $12?
b. What assumption do you need to make in order to perform this test?
c. Compute the p-value in (a) and interpret its meaning.

12.77 A manufacturer of doorknobs has a production process that is designed to provide a doorknob with a target diameter of 2.5 inches. In the past, the standard deviation of the diameter has been 0.035 inch. In an effort to reduce the variation in the process, various studies have resulted in a redesigned process. A sample of 25 doorknobs produced under the new process indicates a sample standard deviation of 0.025 inch.
a. At the 0.05 level of significance, is there evidence that the population standard deviation is less than 0.035 inch in the new process?
b. What assumption do you need to make in order to perform this test?
c. Compute the p-value in (a) and interpret its meaning.

EXCEL GUIDE

EG12.7 CHI-SQUARE TEST for the VARIANCE or STANDARD DEVIATION


PHStat2 Use the Chi-Square Test for the Variance procedure to perform this chi-square test. For example, to perform the test for the Section 12.7 cereal-filling process example, select PHStat → One-Sample Tests → Chi-Square Test for the Variance. In the procedure's dialog box (shown below):
1. Enter 225 as the Null Hypothesis.
2. Enter 0.05 as the Level of Significance.
3. Enter 25 as the Sample Size.
4. Enter 17.7 as the Sample Standard Deviation.
5. Select Two-Tail Test.
6. Enter a Title and click OK.

The procedure creates a worksheet similar to Figure 12.19.

In-Depth Excel Use the CHISQ.INV.RT and CHISQ.DIST.RT functions to help perform the chi-square test for the variance or standard deviation. Enter CHISQ.INV.RT(1 − half area, degrees of freedom) and enter CHISQ.INV.RT(half area, degrees of freedom) to compute the lower and upper critical values. Enter CHISQ.DIST.RT(χ² test statistic, degrees of freedom) to compute the p-value.

Use the COMPUTE worksheet of the Chi-Square Variance workbook, shown in Figure 12.19, as a template for performing the chi-square test. The worksheet contains the data for the cereal-filling process example. To perform the test for other problems, change the null hypothesis, level of significance, sample size, and sample standard deviation in the cell range B4:B7. (Open to the COMPUTE_FORMULAS worksheet to examine the details of all formulas used in the COMPUTE worksheet.)
