
# CHAPTER TWELVE THE KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BY RANKS TEST (H TEST)

Introduction; Method; Example for Small Samples; Example for Large Samples ($n_j > 5$); The Problem with Tied Ranks; Multiple Comparisons Following a Significant H Test; Chapter Summary; Worked Example on the Kruskal-Wallis H Test; Exercises on the Kruskal-Wallis H Test

## 12.1: Introduction

The Kruskal-Wallis One-Way Analysis of Variance by Ranks test is the nonparametric equivalent of the parametric One-Way Analysis of Variance (F) test that we discussed in Chapter 6. This means that the test is used when we are dealing with three or more independent samples. It requires that measurement be at least on an ordinal scale. The Kruskal-Wallis technique tests the null hypothesis that all the $k$ independent samples ($k > 2$) come from identical populations with respect to averages.

## 12.2: Method

In the computation of the statistic in the Kruskal-Wallis test, all the observations from the $k$ samples are combined and ranked in a single series. The lowest score is given a rank of 1, the next lowest score a rank of 2, etc., and the highest score is given a rank of $N$, the total number of observations. The sum of the ranks for each group is then determined. When the sample sizes are equal, the Kruskal-Wallis test determines whether these sums of ranks are so different from each other that they are not likely to have come from the same population or from populations with equal averages. When the sample sizes are not equal, the test asks the same question of the average ranks in the $k$ groups. If the average ranks in the $k$ groups are about the same, then the different ranks are about equally distributed among the $k$ groups and $H_0$ is more likely to be true. Rejection of $H_0$ means that the average ranks are not all equal, i.e., at least two of them differ from one another.
It can be shown mathematically that if the $k$ samples actually come from the same population or from identical populations (with respect to ranks), i.e., if $H_0$ is true, then H, the statistic used in the Kruskal-Wallis test, is distributed approximately as a Chi Square ($\chi^2$) with $k - 1$ df, where $k$ = number of samples or groups involved in the study, provided that each sample size is greater than 5. The Kruskal-Wallis H is defined as:

$$H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1),$$

where $k$ is as defined above; $n_j$ is the number of cases in the $j$th sample; $N$ is the total sample size, i.e., $n_1 + n_2 + n_3 + \dots + n_{k-1} + n_k$; and $R_j$ is the sum of the ranks in the $j$th sample.

When each $n_j > 5$, the value of H may be referred to the $\chi^2$ tables (Appendix 2.6) with $k - 1$ df under a two-tailed test. If the observed value of H is greater than or equal to the critical value of $\chi^2$ in the tables at a specified level of significance, $H_0$ is rejected in favour of $H_1$, and we conclude that a significant difference exists among the ranked averages in the $k$ groups, i.e., at least one ranked average is larger or smaller than another.

In the special case where $k = 3$ and each sample size ($n_j$) in the three samples is less than or equal to 5, the $\chi^2$ approximation is not sufficiently exact to be used. In such a situation, the values of $n_1$, $n_2$, $n_3$, and H may be referred to the Kruskal-Wallis probability tables (Appendix 2.9) to determine the probability associated with an observed value of H, given the sample sizes of the three samples. The first column in the table gives values of $n_1$, $n_2$, $n_3$. The second column gives various values of H, and the third column gives the exact probability, under $H_0$, of values as large as an observed H. For $H_0$ to be rejected, $H_{obs}$ must be greater than or equal to $H_{crit}$ at a given level of significance. Note that the value of H cannot be negative.

## 12.3: Example for Small Samples

Let the data in Table 12.1 represent the scores obtained by three groups of subjects on a sub-scale of an aptitude test where scores ranged between 1 and 15. To compute the value of H we first combine the scores from the three groups and rank them, while retaining the identity of each group. This results in Table 12.2.

Table 12.1: Scores obtained by three groups of subjects on a sub-scale of an aptitude test

| Sub. No. | Group 1 Score | Group 2 Score | Group 3 Score |
|---|---|---|---|
| 1 | 5 | 6 | 9 |
| 2 | 4 | 7 | 10 |
| 3 | 3 | 2 | 11 |
| 4 | 12 | 8 | |

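Table 12.2: Scores obtained by three groups of subjects on a sub-scale of an aptitude test and the ranks assigned to these scores (Case 1).

| Sub. No. | Group 1 Score | Group 1 Rank | Group 2 Score | Group 2 Rank | Group 3 Score | Group 3 Rank |
|---|---|---|---|---|---|---|
| 1 | 5 | 4 | 6 | 5 | 9 | 8 |
| 2 | 4 | 3 | 7 | 6 | 10 | 9 |
| 3 | 3 | 2 | 2 | 1 | 11 | 10 |
| 4 | 12 | 11 | 8 | 7 | | |

$n_1 = 4$, $R_1 = 20$, $\bar{R}_1 = 5.00$; $n_2 = 4$, $R_2 = 19$, $\bar{R}_2 = 4.75$; $n_3 = 3$, $R_3 = 27$, $\bar{R}_3 = 9.00$, where $\bar{R}_j = R_j / n_j$ is the average rank in the $j$th group.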
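The pooling and ranking step behind Table 12.2 can be sketched in Python. This is a minimal illustration and not part of the original text: the function names `average_ranks` and `rank_sums` are hypothetical helpers introduced here, and tied scores are assumed to receive the average of the ranks they occupy, the convention described in Section 12.5.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) to highest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position of a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2.0  # positions are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sums(groups):
    """Pool all groups, rank the pooled scores, return each group's rank sum."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    sums, start = [], 0
    for g in groups:
        sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    return sums


# Scores from Table 12.1 (Group 1, Group 2, Group 3)
table_12_1 = [[5, 4, 3, 12], [6, 7, 2, 8], [9, 10, 11]]
print(rank_sums(table_12_1))
```

Run on the Table 12.1 scores, the sketch gives rank sums of 20, 19 and 27, matching $R_1$, $R_2$ and $R_3$ in Table 12.2.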

Substituting the above values into the Kruskal-Wallis H formula and noting that $N = n_1 + n_2 + n_3 = 4 + 4 + 3 = 11$, we get

$$
\begin{aligned}
H &= \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1) \\
  &= \frac{12}{11(12)} \left[ \frac{(20)^2}{4} + \frac{(19)^2}{4} + \frac{(27)^2}{3} \right] - 3(12) \\
  &= \frac{1}{11}(100 + 90.25 + 243) - 36 \\
  &= \frac{433.25}{11} - 36 = 39.386363 - 36 = 3.386363,
\end{aligned}
$$

i.e., $H_{obs} \approx 3.3864$ (correct to 4 decimal places).

Referring $H_{obs} = 3.3864$ to the Kruskal-Wallis probability tables (Appendix 2.9) with $n_1 = 4$, $n_2 = 4$, and $n_3 = 3$, we find that at the 0.049 level of significance ($p = 0.049$) we need an H value of 5.5985 to reject $H_0$. At the 0.051 level of significance ($p = 0.051$) the critical value of H is 5.5758. Because $0.049 < 0.05 < 0.051$, the critical value of H at the 0.05 level of significance lies between 5.5758 and 5.5985. Since our $H_{obs} = 3.3864$ is less than $H_{crit} = 5.5758$ even at the 0.051 level of significance, $H_0$ should be retained and we conclude that no significant difference exists between the three groups of subjects in performance on the sub-scale of the aptitude test.

Note that when each $n_j > 5$, H, as we observed earlier, is distributed as a Chi Square ($\chi^2$), and therefore for $H_0$ to be rejected in the H test, $H_{obs}$ must be greater than or equal to $H_{crit}$. Also note from Appendix 2.9 that, as with the $\chi^2$ test, the smaller the value of H, the more likely it is that $H_0$ is true and should therefore be retained.

Now let us assume that in the study, instead of the scores and ranks presented in Table 12.2, Table 12.3 represents the scores and ranks obtained by subjects in the three groups.

Table 12.3: Ranks assigned to scores obtained by three groups of subjects on a sub-scale of an aptitude test (Case 2)

| Sub. No. | Group 1 Score | Group 1 Rank | Group 2 Score | Group 2 Rank | Group 3 Score | Group 3 Rank |
|---|---|---|---|---|---|---|
| 1 | 5 | 4 | 7 | 6 | 9 | 8 |
| 2 | 4 | 3 | 10 | 9 | 12 | 10 |
| 3 | 2 | 1 | 3 | 2 | 14 | 11 |
| 4 | 6 | 5 | 8 | 7 | | |

$n_1 = 4$, $R_1 = 13$, $\bar{R}_1 = 3.25$; $n_2 = 4$, $R_2 = 24$, $\bar{R}_2 = 6.00$; $n_3 = 3$, $R_3 = 29$, $\bar{R}_3 = 9.67$.

Computing the value of H from the above data, we get

$$H = \frac{12}{11(12)} \left[ \frac{(13)^2}{4} + \frac{(24)^2}{4} + \frac{(29)^2}{3} \right] - 3(12).$$

If we simplify the above, we find that $H_{obs} \approx 6.4167$ (correct to 4 decimal places). Referring $H_{obs} = 6.4167$ to the Kruskal-Wallis probability tables with $n_1 = 4$, $n_2 = 4$, and $n_3 = 3$, we find that our $H_{obs}$ value of 6.4167 is not listed in the tables. However, we can deduce that the probability associated with this observed value lies between 0.011 and 0.049, because at $p = 0.049$ the associated H value is 5.5985 and at $p = 0.011$ the associated H value is 6.8727. Since our computed $H_{obs} = 6.4167$ lies between 5.5985 and 6.8727, i.e., $5.5985 < 6.4167 < 6.8727$, the probability associated with it is less than 0.05 (both 0.049 and 0.011 are less than 0.05). Therefore, $H_0$ is rejected in this instance and we conclude that at $\alpha = 0.05$, a significant difference exists between the scores obtained by the three groups of subjects on the sub-scale of the aptitude test.

## 12.4: Example for Large Samples (each $n_j > 5$)

Suppose that in the example given for small sample sizes (Section 12.3), the sample sizes were increased to 6, 7, and 8 respectively for $n_1$, $n_2$ and $n_3$, with the data as in Table 12.4.

Table 12.4: Scores obtained by three groups of subjects on a sub-scale of an aptitude test and the ranks assigned to these scores (Case 3).

| Sub. No. | Group 1 Score | Group 1 Rank | Group 2 Score | Group 2 Rank | Group 3 Score | Group 3 Rank |
|---|---|---|---|---|---|---|
| 1 | 5 | 5 | 6 | 6.5 | 9 | 11.5 |
| 2 | 4 | 4 | 7 | 8 | 10 | 14 |
| 3 | 3 | 3 | 2 | 1.5 | 11 | 16 |
| 4 | 12 | 17 | 8 | 9.5 | 14 | 20 |
| 5 | 6 | 6.5 | 9 | 11.5 | 15 | 21 |
| 6 | 2 | 1.5 | 13 | 18.5 | 13 | 18.5 |
| 7 | | | 10 | 14 | 8 | 9.5 |
| 8 | | | | | 10 | 14 |

$n_1 = 6$, $R_1 = 37.00$, $\bar{R}_1 = 6.17$; $n_2 = 7$, $R_2 = 69.50$, $\bar{R}_2 = 9.93$; $n_3 = 8$, $R_3 = 124.50$, $\bar{R}_3 = 15.56$.

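Substituting the relevant values into the H formula,

$$H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1),$$

we get

$$
\begin{aligned}
H &= \frac{12}{21(22)} \left[ \frac{(37)^2}{6} + \frac{(69.50)^2}{7} + \frac{(124.50)^2}{8} \right] - 3(22) \\
  &= \frac{2}{77}\,[228.16667 + 690.03571 + 1937.53125] - 66 \\
  &= \frac{2(2855.73363)}{77} - 66 = 74.1749 - 66 = 8.1749,
\end{aligned}
$$

i.e., $H_{obs} \approx 8.17$ (correct to 2 decimal places).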
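The arithmetic in this section and in Section 12.3 can be checked with a short Python sketch of the H formula. The function name `kruskal_wallis_h` is an illustrative choice rather than anything from the text, and the sketch applies no correction for ties.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1), uncorrected for ties."""
    n_total = sum(sizes)
    weighted = sum(r * r / float(n) for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Rank sums and sample sizes taken from Tables 12.2, 12.3 and 12.4
h_case1 = kruskal_wallis_h([20, 19, 27], [4, 4, 3])
h_case2 = kruskal_wallis_h([13, 24, 29], [4, 4, 3])
h_case3 = kruskal_wallis_h([37, 69.5, 124.5], [6, 7, 8])

print(round(h_case1, 4), round(h_case2, 4), round(h_case3, 2))
```

The three values agree with the hand computations: about 3.3864 for Case 1, 6.4167 for Case 2 and 8.17 for Case 3.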

Referring $H_{obs} = 8.17$ to the $\chi^2$ tables with $k - 1 = 3 - 1 = 2$ df, we note that the critical value of $\chi^2$ at the 0.05 level of significance under a nondirectional test is 5.99. Therefore, $H_{crit} = 5.99$. Since $(H_{obs} = 8.17) > (H_{crit} = 5.99)$, $H_0$ is rejected and we conclude that a significant difference exists between the three groups in performance on the sub-scale of the aptitude test.
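For the special case of 2 df, which applies whenever $k = 3$, the upper-tail probability of $\chi^2$ has the closed form $e^{-x/2}$, so the table lookup can be checked without statistical tables. The sketch below is valid only for 2 df, and the function names are hypothetical helpers introduced for illustration.

```python
import math


def chi2_sf_2df(x):
    """Upper-tail probability P(chi-square >= x) for 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def chi2_critical_2df(alpha):
    """Value exceeded with probability alpha under chi-square with 2 df."""
    return -2.0 * math.log(alpha)


h_obs = 8.17
h_crit = chi2_critical_2df(0.05)
p_value = chi2_sf_2df(h_obs)

print(round(h_crit, 2))   # critical value at the 0.05 level
print(p_value < 0.05)     # True means H0 is rejected at the 0.05 level
```

The critical value rounds to 5.99, the figure read from the $\chi^2$ tables, and the tail probability for $H_{obs} = 8.17$ is below 0.05, which is consistent with rejecting $H_0$.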

## 12.5: The Problem with Tied Ranks

When there are tied ranks, either within or between groups, the usual procedure of assigning the average of the ranks to the tied observations is used. This procedure was used in the computation in Section 12.4 above. The value of H is somewhat influenced by ties, and when many ties are involved, a formula for correction for ties is used. The effect of correcting for ties is to increase the value of H and thus make it more likely for $H_0$ to be rejected. In most cases, however, the effect of the correction is negligible, and it may be used when $H_{obs}$ is quite close in value to $H_{crit}$ but does not reach significance. The interested reader is referred to: Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill Book Company, Inc., pp. 188-192.

## 12.6: Multiple Comparisons Following a Significant H Test

As we noted in Chapter 6 under the One-Way ANOVA, the rejection of the ANOVA $H_0$ does not necessarily mean that all the means being compared differ from each other. Like the parametric F test, the nonparametric H test will dictate that $H_0$ should be rejected when an inequality exists between any two or more of the $k$ average ranks being compared. Unfortunately, unlike the F test, there is no known method of pairwise comparisons between average ranks based on data from the H test when H is found to be significant. We may, however, use the Mann-Whitney U test to make such pairwise comparisons. But, as we noted in Chapter 6, the difficulty with many pairwise comparisons is that we risk committing an experimentwise error, i.e., we increase the chances of any of the comparisons being significant by chance. However, when we have a directional hypothesis and $H_0$ is rejected as in the example under Section 12.4, then we have no choice but to use the Mann-Whitney U test to make any relevant pairwise comparisons. Whenever we are compelled to make such pairwise comparisons with the Mann-Whitney U test, we must bear in mind the possibility of committing an experimentwise error.

Let us now go back to the example on the H test presented under Section 12.4. In the example, we found that $H_{obs} = 8.17$ was significant at the 0.05 level. If we wish to make pairwise comparisons between the three groups using the Mann-Whitney U test, then we have to re-rank the data for any two groups being compared, as shown in Table 12.5. The table also shows the computations of the U values. Referring the smaller of the two U values computed ($U'$ in all cases) to the Mann-Whitney exact probability tables (Appendix 2.7.1), we observe that between groups 1 and 2 ($n_1 = 6$, $n_2 = 7$) the exact probability associated with the computed value of $U' = 11$ is equal to 0.090. This probability should be multiplied by 2 to obtain a two-tailed probability, because in multiple comparisons the direction of the difference could be opposite to the predicted direction. Thus, the exact probability associated with $U' = 11$ in a two-tailed test is $0.090 \times 2 = 0.180$. Since $p = 0.180$ is greater than the desired level of 0.05, $H_0$ is retained and we conclude that no significant difference exists between groups 1 and 2 in performance on the sub-scale of the aptitude test. In a similar manner, comparing groups 1 and 3, $p = 0.006 \times 2 = 0.012$. Since $0.012 < 0.05$, $H_0$ is rejected, meaning that a significant difference exists between groups 1 and 3 in performance on the sub-scale of the aptitude test.


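Table 12.5: Pairwise comparisons among three groups using the Mann-Whitney U test.

Group 1 vs. Group 2

| Group 1 Score | Group 1 Rank | Group 2 Score | Group 2 Rank |
|---|---|---|---|
| 5 | 5 | 6 | 6.5 |
| 4 | 4 | 7 | 8 |
| 3 | 3 | 2 | 1.5 |
| 12 | 12 | 8 | 9 |
| 6 | 6.5 | 9 | 10 |
| 2 | 1.5 | 13 | 13 |
| | | 10 | 11 |

$n_1 = 6$, $R_1 = 32$, $\bar{R}_1 = 5.33$; $n_2 = 7$, $R_2 = 59$, $\bar{R}_2 = 8.43$.

$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = (6 \times 7) + \frac{6(7)}{2} - 32 = 42 + 21 - 32 = 31$; $U' = n_1 n_2 - U = 42 - 31 = 11$.

Group 1 vs. Group 3*

| Group 1 Score | Group 1 Rank | Group 3 Score | Group 3 Rank |
|---|---|---|---|
| 5 | 4 | 9 | 7 |
| 4 | 3 | 10 | 8.5 |
| 3 | 2 | 11 | 10 |
| 12 | 11 | 14 | 13 |
| 6 | 5 | 15 | 14 |
| 2 | 1 | 13 | 12 |
| | | 8 | 6 |
| | | 10 | 8.5 |

$n_1 = 6$, $R_1 = 26$, $\bar{R}_1 = 4.33$; $n_2 = 8$, $R_2 = 79$, $\bar{R}_2 = 9.88$.

$U = (6 \times 8) + \frac{6(7)}{2} - 26 = 48 + 21 - 26 = 43$; $U' = 48 - 43 = 5$.

Group 2 vs. Group 3*

| Group 2 Score | Group 2 Rank | Group 3 Score | Group 3 Rank |
|---|---|---|---|
| 6 | 2 | 9 | 6.5 |
| 7 | 3 | 10 | 9 |
| 2 | 1 | 11 | 11 |
| 8 | 4.5 | 14 | 14 |
| 9 | 6.5 | 15 | 15 |
| 13 | 12.5 | 13 | 12.5 |
| 10 | 9 | 8 | 4.5 |
| | | 10 | 9 |

$n_1 = 7$, $R_1 = 38.50$, $\bar{R}_1 = 5.50$; $n_2 = 8$, $R_2 = 81.50$, $\bar{R}_2 = 10.19$.

$U = (7 \times 8) + \frac{7(8)}{2} - 38.50 = 56 + 28 - 38.50 = 45.50$; $U' = 56 - 45.50 = 10.50$.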
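The U values in Table 12.5 can also be obtained without re-ranking, by counting pairs of scores. The sketch below is an illustration rather than the text's method: it relies on the fact that $U'$ equals the number of pairs in which a score from the first group exceeds a score from the second group (a tie counting one half), which is algebraically the same as the rank-sum formula used in the table. The function name `mann_whitney_u` is a hypothetical helper.

```python
def mann_whitney_u(group_1, group_2):
    """Return (U, U') for two samples by counting pairs.

    U' counts pairs where a group_1 score exceeds a group_2 score
    (a tie counts one half); U = n1 * n2 - U'. This matches the
    rank-sum form U = n1*n2 + n1(n1 + 1)/2 - R1 used in Table 12.5.
    """
    u_prime = 0.0
    for a in group_1:
        for b in group_2:
            if a > b:
                u_prime += 1.0
            elif a == b:
                u_prime += 0.5
    u = len(group_1) * len(group_2) - u_prime
    return u, u_prime


# Scores from Table 12.4
group_1 = [5, 4, 3, 12, 6, 2]
group_2 = [6, 7, 2, 8, 9, 13, 10]
group_3 = [9, 10, 11, 14, 15, 13, 8, 10]

for label, a, b in [("1 vs 2", group_1, group_2),
                    ("1 vs 3", group_1, group_3),
                    ("2 vs 3", group_2, group_3)]:
    u, u_prime = mann_whitney_u(a, b)
    print(label, "U =", u, "U' =", u_prime, "smaller =", min(u, u_prime))
```

The smaller value in each comparison (11, 5 and 10.5) is the one referred to the Mann-Whitney exact probability tables.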

Comparing groups 2 and 3 ($n_1 = 7$, $n_2 = 8$, $U' = 10.5$), we find that the U value of 10.5 is not listed in the Mann-Whitney exact probability tables. However, we can obtain the approximate p-value by interpolation as follows: for $U = 10$, $p = 0.020$; for $U = 11$, $p = 0.027$ (see Appendix 2.7.1). Assuming that p changes evenly between these two values, a U value of 10.5 lies about midway between the p-values of 0.020 and 0.027. Thus, the p-value associated with the U value of 10.5 is $(0.020 + 0.027) / 2 = 0.0235$. Therefore, for a two-tailed test, the p-value associated with the U value of 10.5 is $2 \times 0.0235 = 0.047$. Since $p = 0.047 < 0.05$, $H_0$ may be rejected and we conclude that a significant difference exists between groups 2 and 3 in performance on the sub-scale of the aptitude test.

*Please note that in the Mann-Whitney U test, $n_1$ and $n_2$ represent the smaller and larger sample sizes respectively. Therefore $n_1$ and $n_2$ are used to designate the sample sizes of any two groups we are comparing. The designations $n_3$ and $R_3$ are irrelevant here.

By inspecting the averages of the ranked data for any two groups we are comparing, we should be able to tell which average rank is greater than the other in all situations where $H_0$ is rejected, and decide whether or not our directional prediction is supported. Thus, the data in Table 12.5 show that between groups 1 and 3, $\bar{R}_1 = 4.33$ and $\bar{R}_2 = 9.88$, and we may conclude that the scores in group 3 were generally higher than the scores in group 1. Similarly, between groups 2 and 3, $\bar{R}_1 = 5.50$ and $\bar{R}_2 = 10.19$, and we may conclude that the scores in group 3 were generally higher than the scores in group 2. We must remember that no significant difference existed between groups 1 and 2 and therefore that the scores in these two groups were about the same. Thus, it is reasonable to say that group 3 performed better on the aptitude test than groups 1 and 2. The problem of experimentwise error should always be at the back of our minds when we are making these pairwise comparisons, especially when the p-value is quite close to the 0.05 decision rule, as in the comparison between groups 2 and 3 where the p-value was found to be 0.047.

## 12.7: Chapter Summary

The Kruskal-Wallis One-Way Analysis of Variance by Ranks (H) Test is the nonparametric equivalent of the parametric One-Way ANOVA (F) Test. The test is used when 3 or more independent groups are being compared on a dependent variable measure. The use of the Kruskal-Wallis H Test requires that measurement on the dependent variable be at least on an ordinal scale. The Kruskal-Wallis H tests the null hypothesis that all the $k$ independent samples ($k > 2$) come from identical populations with respect to averages.
In the computation of H, all the observations from the $k$ independent samples are combined and ranked in a single series from the lowest to the highest score. The sum of ranks for each group is then determined. The Kruskal-Wallis test determines whether the sums of the ranks in the $k$ groups are so different from each other that they are not likely to have come from the same population (in the case when all the sample sizes are equal), or whether the average ranks are so different that the groups are likely to have come from populations with unequal averages (in the case when the sample sizes are not all equal). It can be shown mathematically that if the $k$ groups all come from the same population or from identical populations with respect to average ranks, then H, the statistic employed in the Kruskal-Wallis test, is distributed as a Chi Square ($\chi^2$) with $k - 1$ df, where $k$ = number of samples or groups, provided that all the sample sizes are each greater than 5. H is defined as:

$$H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1),$$

where $k$ = number of samples, $n_j$ = number of cases in the $j$th sample, $N$ = total sample size, i.e., $n_1 + n_2 + n_3 + \dots + n_{k-1} + n_k$, and $R_j$ = sum of the ranks in the $j$th sample. The computed value of H is referred to the $\chi^2$ tables with the associated df, and a decision is taken as to whether or not $H_0$ should be rejected under a specified decision rule. Like the value of $\chi^2$, the value of H cannot be negative.

In the special case where $k = 3$ and each $n_j \le 5$, the $\chi^2$ approximation cannot be used. In such cases, the values of $n_1$, $n_2$ and $n_3$ (the three sample sizes) and the computed value of H may be referred to the Kruskal-Wallis probability tables to determine the probability associated with the observed value of H. For $H_0$ to be rejected, $H_{obs}$ must be greater than or equal to $H_{crit}$ in the tables at a given level of significance.

When there are tied ranks, the usual method of assigning the average rank to the various tied positions is used. The value of H is somewhat influenced by ties, and when many ties are involved, a correction for ties is used. The effect of correcting for ties is to increase the value of H and thus make it more likely for $H_0$ to be rejected. In most cases, however, the effect of the correction is negligible. When H is significant, the Mann-Whitney U test may be used to make multiple comparisons where necessary, but at the risk of committing an experimentwise error.

## 12.8: Worked Example on the Kruskal-Wallis H Test

Question

A manufacturing company used three different methods of selection in a particular year to employ twenty-four (24) workers. One group of workers was selected through interviews only, a second group through a screening (aptitude) test only, and a third group through both interviews and a screening test. After one year on the job, the performance of the three groups of workers was evaluated by different supervisors, one supervisor each for the Interviews only group, the Screening test only group, and the Interviews and screening test group, on a scale ranging from 10 to 100. The following data were obtained (higher scores reflect better performance on the job).

Job performance rating by method of selection

| Employee No. | Interviews only | Screening Test only | Interviews & Screening Test |
|---|---|---|---|
| 1 | 55 | 50 | 65 |
| 2 | 35 | 65 | 65 |
| 3 | 45 | 70 | 70 |
| 4 | 60 | 65 | 55 |
| 5 | 60 | 55 | 50 |
| 6 | 40 | 45 | 75 |
| 7 | 30 | 60 | 60 |
| 8 | 50 | | 60 |
| 9 | | | 45 |

(a) All other things being equal, determine whether or not the three methods of selection led to differences in performance among the employees in the company. [Show full working steps.]

(b) Assuming that the first four employees in each group were females, what conclusions may we draw concerning levels of job performance in the female population using the three selection methods? [No working steps needed.]

Solution

(a)

Step 1: Choice of Statistical Test

We are given three independent groups of employees who have been randomly and independently sampled from their respective populations and whose job performance has been evaluated by supervisors. Since different supervisors rated the employees in the different groups, we cannot be sure that the ratings assigned by the three supervisors were objective, and in any case rating scales are subjective by nature. The data may therefore be converted to ranks (an ordinal scale) and the Kruskal-Wallis H test used to determine whether or not the three different methods of employee selection led to differences in job performance among the three groups of employees.

Step 2: Statement of Hypotheses

We are merely asked to determine whether or not the three selection methods led to differences in performance among the three groups of employees. The research hypothesis is thus non-directional. Let $\bar{R}_1$, $\bar{R}_2$, and $\bar{R}_3$ represent the average ranks in the population of employees selected through interviews only, screening test only, and interviews and screening test respectively. Then the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$) may be stated as follows:

$H_0$: No significant difference exists between the three groups of employees on job performance (i.e., $\bar{R}_1 = \bar{R}_2 = \bar{R}_3$).

$H_1$: A significant difference exists between the three groups of employees on job performance (i.e., not $H_0$; at least two of the groups differ on job performance).

Step 3: Decision Rules

Given: 0.05 level of significance, a Kruskal-Wallis H test, $n_1 = 8$, $n_2 = 7$, and $n_3 = 9$. Since each $n_j > 5$, H is distributed as a $\chi^2$ with $k - 1 = 3 - 1 = 2$ df. From the $\chi^2$ tables, the critical value of $\chi^2$ with 2 df is 5.99. Therefore if $H_{obs} < 5.99$, retain $H_0$, and if $H_{obs} \ge 5.99$, reject $H_0$.

Step 4: Computation

With the usual notations, the Kruskal-Wallis H is given by

$$H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1) \qquad \text{(i)}$$

The data presented are combined, converted to ranks and retabulated as follows:

Methods of Selection

| Employee No. | Interviews Only Rating | Interviews Only Rank | Screening Test Only Rating | Screening Test Only Rank | Interviews & Screening Test Rating | Interviews & Screening Test Rank |
|---|---|---|---|---|---|---|
| 1 | 55 | 11 | 50 | 8 | 65 | 19.5 |
| 2 | 35 | 2 | 65 | 19.5 | 65 | 19.5 |
| 3 | 45 | 5 | 70 | 22.5 | 70 | 22.5 |
| 4 | 60 | 15 | 65 | 19.5 | 55 | 11 |
| 5 | 60 | 15 | 55 | 11 | 50 | 8 |
| 6 | 40 | 3 | 45 | 5 | 75 | 24 |
| 7 | 30 | 1 | 60 | 15 | 60 | 15 |
| 8 | 50 | 8 | | | 60 | 15 |
| 9 | | | | | 45 | 5 |

$n_1 = 8$, $R_1 = 60$, $\bar{R}_1 = 7.50$; $n_2 = 7$, $R_2 = 100.50$, $\bar{R}_2 = 14.36$; $n_3 = 9$, $R_3 = 139.50$, $\bar{R}_3 = 15.50$.

Substituting the above computed values into (i) above and noting that $N = n_1 + n_2 + n_3 = 8 + 7 + 9 = 24$, we get

$$H = \frac{12}{24(25)} \left[ \frac{(60)^2}{8} + \frac{(100.50)^2}{7} + \frac{(139.50)^2}{9} \right] - 3(25).$$

Evaluating, we get

$$
\begin{aligned}
H &= \frac{1}{50}(450.00 + 1{,}442.8929 + 2{,}162.25) - 75 \\
  &= \frac{1}{50}(4{,}055.1429) - 75 \\
  &= 81.102857 - 75 = 6.102857,
\end{aligned}
$$

i.e., $H_{obs} \approx 6.10$ (correct to 2 decimal places).

Step 5: Decision

Comparing $H_{obs} = 6.10$ to the decision rules (Step 3), we note that $(H_{obs} = 6.10) > (H_{crit} = 5.99)$. Therefore $H_0$ is rejected at the 0.05 level of significance.
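The whole computation for part (a), together with the correction for ties mentioned in Section 12.5, can be sketched in Python. The chapter does not state the correction formula; the sketch assumes the commonly used form in which H is divided by $1 - \sum (t^3 - t) / (N^3 - N)$, where $t$ is the number of observations tied at a given score. The function name `kruskal_wallis` is a hypothetical helper, and the sketch is a check on the arithmetic rather than a definitive implementation.

```python
from collections import Counter


def kruskal_wallis(groups):
    """Return (H, H corrected for ties) for a list of samples of raw scores."""
    counts = Counter(x for g in groups for x in g)
    n_total = sum(len(g) for g in groups)

    # Average rank of each distinct score: t tied scores occupying ranks
    # first .. first + t - 1 all receive first + (t - 1) / 2.
    rank_of, first = {}, 1
    for score in sorted(counts):
        t = counts[score]
        rank_of[score] = first + (t - 1) / 2.0
        first += t

    weighted = sum(sum(rank_of[x] for x in g) ** 2 / float(len(g)) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # ASSUMPTION: standard tie-correction factor, not given in the chapter.
    tie_term = sum(t ** 3 - t for t in counts.values())
    correction = 1.0 - tie_term / float(n_total ** 3 - n_total)
    return h, h / correction


interviews = [55, 35, 45, 60, 60, 40, 30, 50]
screening = [50, 65, 70, 65, 55, 45, 60]
both = [65, 65, 70, 55, 50, 75, 60, 60, 45]

h, h_corrected = kruskal_wallis([interviews, screening, both])
print(round(h, 2), round(h_corrected, 2))
```

On the worked-example data the uncorrected value rounds to 6.10 and the tie-corrected value to 6.22, so the decision at the 0.05 level is the same either way.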

Step 6: Interpretation

At the 0.05 level of significance, a difference exists in job performance among the employees selected using the three methods of selection in that particular year.

[NB: If the question had required us to evaluate the relative effectiveness of the three methods of selection on work performance, then after rejecting $H_0$ there would have been the need to compare the average ranks among the three groups, taking them two at a time, using the Mann-Whitney U test (see Section 12.6). By inspecting the average ranks of the three groups in our worked example, it appears that using a screening test only for selection led to higher job performance than using interviews only ($\bar{R}_2 = 14.36 > \bar{R}_1 = 7.50$), and using both interviews and screening test also appears to have led to higher job performance than using interviews only ($\bar{R}_3 = 15.50 > \bar{R}_1 = 7.50$). However, it also appears that using screening test only and using both interviews and screening test did not lead to a difference in job performance between the two groups ($\bar{R}_2 = 14.36$ is quite close to $\bar{R}_3 = 15.50$). Try to confirm or disconfirm these observations by using the Mann-Whitney U test to do the actual comparisons as in Section 12.6.]

Note: One would have observed that many ties occurred in the rankings, and a correction for ties would have been needed, particularly if the $H_{obs}$ value had been found to be less than but quite close to 5.99. In this example, H (corrected for ties) = 6.22, which is not much different from the computed H value of 6.10.

(b) If the first 4 subjects in each group were females, then we are dealing with small sample sizes where $n_1 = 4$, $n_2 = 4$, and $n_3 = 4$. H is calculated from the given data as follows:

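Method of Selection

| Employee No. | Interviews Only Rating | Interviews Only Rank | Screening Test Only Rating | Screening Test Only Rank | Interviews & Screening Test Rating | Interviews & Screening Test Rank |
|---|---|---|---|---|---|---|
| 1 | 55 | 4.5 | 50 | 3 | 65 | 8.5 |
| 2 | 35 | 1 | 65 | 8.5 | 65 | 8.5 |
| 3 | 45 | 2 | 70 | 11.5 | 70 | 11.5 |
| 4 | 60 | 6 | 65 | 8.5 | 55 | 4.5 |

$n_1 = 4$, $R_1 = 13.50$, $\bar{R}_1 = 3.375$; $n_2 = 4$, $R_2 = 31.50$, $\bar{R}_2 = 7.875$; $n_3 = 4$, $R_3 = 33$, $\bar{R}_3 = 8.25$.

$$
\begin{aligned}
H &= \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1) \\
  &= \frac{12}{12(13)} \left[ \frac{(13.50)^2}{4} + \frac{(31.50)^2}{4} + \frac{(33.00)^2}{4} \right] - 3(13) \\
  &= \frac{1}{13}(45.5625 + 248.0625 + 272.25) - 39 \\
  &= \frac{1}{13}(565.875) - 39 = 43.528846 - 39 = 4.528846,
\end{aligned}
$$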

i.e., $H_{obs} \approx 4.5288$ (correct to 4 decimal places).

Referring $H_{obs} = 4.5288$ to the Kruskal-Wallis probability tables (Appendix 2.9) with $n_1 = 4$, $n_2 = 4$, and $n_3 = 4$, we find that we need an H value of 5.6923 or greater to reject $H_0$ at $p = 0.049$ and an H value of 5.6538 or greater to reject $H_0$ at $p = 0.054$. Because 0.05 lies between 0.049 and 0.054, the critical value of H at $p = 0.05$ lies between 5.6538 and 5.6923. Since our $H_{obs} = 4.5288$ is less than the $H_{crit}$ value even at $p = 0.054$, $H_0$ must be retained. Our conclusion then is that the three methods of selection did not produce differences in job performance among the female employees in the company in that particular year. It is to be noted that the value of H corrected for ties here is 4.7272. This value of H is, however, still not significant at the 0.05 level.

## 12.9: Exercises on the Kruskal-Wallis H Test

12.9.1: Three types of drugs, A, B, and C, that are believed to lead to the suppression of opportunistic infections associated with HIV were clinically tested on volunteer HIV-positive patients over a period of two years. During the two-year period of observation, the number of times that the patients reported symptoms known to be associated with HIV was recorded. The following data were gathered over the two-year period of observation.

Number of HIV-related symptoms reported, by HIV treatment drug

| Patient No. | Drug A | Drug B | Drug C |
|---|---|---|---|
| 1 | 20 | 25 | 10 |
| 2 | 28 | 28 | 15 |
| 3 | 18 | 30 | 12 |
| 4 | 40 | 50 | |

Was there a difference in the efficacy levels of the three drugs in the suppression of opportunistic infections associated with HIV?

12.9.2: Twenty-one (21) children with reading disabilities in English were classified by a specialist in Special Education into mild, moderate, and severe forms of the disability, based on an instrument for measuring reading skill that counted the number of errors (incorrect pronunciations, improper intonations, etc.) in the reading of a standard English passage. The following data were recorded for the three groups of children. Assuming that the errors recorded represent at best measurement on an ordinal scale, can the categorization by the specialist be considered valid in the light of the above data?

Mild Disability, Moderate Disability, Severe Disability