
The Modified Levene Test

Supplement to Section 6.4


Brian Habing, University of South Carolina
Last Updated: July 16, 2004

While the F-tests for regression and ANOVA are fairly robust, the standard F-test for two variances is extremely non-robust to a lack of normality. The F-max test for testing equality of several variances (described on pg. 238) is similarly non-robust. Two major reasons for this can be seen by simply examining the formula for the variance:

$$ s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1} $$

A single outlier will cause the mean to change greatly, and the squaring amplifies the effect. An alternative would be to use some procedure that replaces the mean with the median and squaring with absolute values. The difficulty in doing this directly is that the calculus underlying the distributional theory we commonly use becomes very difficult for either medians or absolute values.

The Modified Levene test (called the Brown and Forsythe test by SAS) begins by considering what it means for different populations to have equal standard deviations. If populations have the same standard deviation, then the average deviation from the center of each population should be the same. In particular, the average of the |y_i - median(y)| should be equal for each population. The test is constructed by calculating this absolute deviation from the sample median for each observation, and then using ANOVA to test that the means of this quantity are the same for all of the populations. This is worked out below for the example discussed on page 238 (data on page 227).
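In symbols, the construction just described can be sketched as follows. The notation is introduced here for illustration and is not taken from the text: y_ij is observation j in sample i, \tilde{y}_i is the median of sample i, n_i is the size of sample i, k is the number of samples, and N is the total number of observations.

```latex
\[
z_{ij} = \lvert y_{ij} - \tilde{y}_i \rvert ,
\qquad
F = \frac{\displaystyle \sum_{i=1}^{k} n_i \,(\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})^2 \,/\, (k - 1)}
         {\displaystyle \sum_{i=1}^{k} \sum_{j=1}^{n_i} (z_{ij} - \bar{z}_{i\cdot})^2 \,/\, (N - k)}
\]
```

Here \bar{z}_{i\cdot} is the mean of the absolute deviations in sample i and \bar{z}_{\cdot\cdot} is their overall mean. This is the usual one-way ANOVA F-statistic applied to the z_ij, referred to an F distribution with k - 1 and N - k degrees of freedom.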

Sample from                                                                Sample
Population     Observation (|Observation - Median|)                        Median
1              934 (47.5)    1041 (59.5)   1028 (46.5)   935 (46.5)        981.5
2              880 (55)      963 (28)      924 (11)      946 (11)          935
3              987 (23.5)    951 (12.5)    976 (12.5)    840 (123.5)       963.5
4              992 (149.5)   1143 (1.5)    1140 (1.5)    1191 (49.5)       1141.5

DATA deviations;
INPUT sample $ deviations @@;
CARDS;
1 47.5 1 59.5 1 46.5 1 46.5
2 55 2 28 2 11 2 11
3 23.5 3 12.5 3 12.5 3 123.5
4 149.5 4 1.5 4 1.5 4 49.5
;
PROC GLM DATA=deviations;
CLASS sample;
MODEL deviations = sample;
RUN;

The GLM Procedure
Dependent Variable: deviations

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3        1538.18750      512.72917       0.25    0.8607
Error              12       24740.75000     2061.72917
Corrected Total    15       26278.93750
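As a check on this output, the sums of squares can be reproduced by hand from the sixteen absolute deviations. The symbols \bar{z}_i (mean absolute deviation in sample i) and \bar{z} (overall mean of the absolute deviations) are notation introduced here, not taken from the text.

```latex
\begin{align*}
\bar{z}_1 &= \tfrac{200}{4} = 50, \quad
\bar{z}_2 = \tfrac{105}{4} = 26.25, \quad
\bar{z}_3 = \tfrac{172}{4} = 43, \quad
\bar{z}_4 = \tfrac{202}{4} = 50.5, \quad
\bar{z} = \tfrac{679}{16} = 42.4375 \\[4pt]
SS_{\text{Model}} &= 4\left[(50 - 42.4375)^2 + (26.25 - 42.4375)^2 + (43 - 42.4375)^2 + (50.5 - 42.4375)^2\right] \\
                  &= 4\,(384.546875) = 1538.1875 \\[4pt]
SS_{\text{Error}} &= 121 + 1294.75 + 8721 + 14604 = 24740.75 \\[4pt]
F &= \frac{1538.1875 / 3}{24740.75 / 12} = \frac{512.72917}{2061.72917} \approx 0.25
\end{align*}
```

The four terms in SS_Error are the within-sample sums of squared differences between each absolute deviation and its sample mean, for samples 1 through 4 in order.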

The test of H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2$ versus the alternative that at least one is different gives an F-statistic of 0.25 on 3 and 12 degrees of freedom, resulting in a p-value of 0.8607. We would therefore fail to reject the null hypothesis that the variances are equal. (Note that these values are different from those shown on page 239, where the means were used instead of the medians.)

The commands to carry this out from the original data using SAS are given below. HOVTEST stands for homogeneity of variances test, and BF stands for Brown and Forsythe.

DATA original;
INPUT sample $ values @@;
CARDS;
1 934 1 1041 1 1028 1 935
2 880 2 963 2 924 2 946
3 987 3 951 3 976 3 840
4 992 4 1143 4 1140 4 1191
;
PROC GLM DATA=original;
CLASS sample;
MODEL values = sample;
MEANS sample / HOVTEST=BF;
RUN;

The GLM Procedure

Brown and Forsythe's Test for Homogeneity of values Variance
ANOVA of Absolute Deviations from Group Medians

Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
sample     3            1538.2          512.7       0.25    0.8607
Error     12           24740.8         2061.7
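The two approaches in this supplement can also be connected by letting SAS compute the absolute deviations itself instead of typing them in by hand. The following is a minimal sketch of one way to do that, not the method used in the text; the data set name medians and the variable name med are illustrative choices made here.

```sas
/* Raw observations, entered as in the HOVTEST program */
DATA original;
  INPUT sample $ values @@;
  CARDS;
1 934 1 1041 1 1028 1 935
2 880 2 963 2 924 2 946
3 987 3 951 3 976 3 840
4 992 4 1143 4 1140 4 1191
;
RUN;

/* BY-group processing requires the data to be sorted by sample */
PROC SORT DATA=original;
  BY sample;
RUN;

/* One row per sample holding that sample's median.
   "medians" and "med" are hypothetical names chosen for this sketch. */
PROC MEANS DATA=original NOPRINT;
  BY sample;
  VAR values;
  OUTPUT OUT=medians MEDIAN=med;
RUN;

/* Attach each sample's median to its observations,
   then compute |observation - median| */
DATA deviations;
  MERGE original medians(KEEP=sample med);
  BY sample;
  deviations = ABS(values - med);
RUN;

/* One-way ANOVA on the absolute deviations */
PROC GLM DATA=deviations;
  CLASS sample;
  MODEL deviations = sample;
RUN;
QUIT;
```

Because the deviations data set built this way should hold the same sixteen values that were entered by hand earlier, the resulting ANOVA table would be expected to agree with both sets of output shown above.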
