
Chi square tests

BITS Pilani Dr. Udayan Chanda, Department of Management, BITS Pilani.


Pilani|Dubai|Goa|Hyderabad
Learning Objectives
In this lecture, you learn:
• How and when to use the chi-square test for contingency tables
  – The χ² test for the difference between two proportions
  – The χ² test for independence
• How to use the chi-square goodness-of-fit test to determine whether data fits a specified distribution

24-Oct-18 Chi square tests 2


BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956
Contingency Tables

• Useful in situations comparing multiple
population proportions
• Used to classify sample observations
according to two or more characteristics
• Also called a cross-classification table.

Contingency Table Example
Left-Handed vs. Gender
Dominant Hand: Left vs. Right
Gender: Male vs. Female

• 2 categories for each variable, so this is called a 2 x 2 table
• Suppose we examine a sample of 300 children
Contingency Table Example
(continued)

Sample results organized in a contingency table (sample size n = 300):

• 120 females, 12 were left-handed
• 180 males, 24 were left-handed

              Hand Preference
Gender      Left    Right    Total
Female        12      108      120
Male          24      156      180
Total         36      264      300

χ² Test for the Difference Between Two Proportions
H0: π1 = π2 (the proportion of females who are left-handed is equal to the proportion of males who are left-handed)
H1: π1 ≠ π2 (the two proportions are not the same; hand preference is not independent of gender)

• If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males
• Both proportions should then equal the proportion of left-handed people overall

The Chi-Square Test Statistic
The chi-square test statistic is:

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

• where:
  f_o = observed frequency in a particular cell
  f_e = expected frequency in a particular cell if H0 is true

$\chi^2_{STAT}$ for the 2 x 2 case has 1 degree of freedom.

(Assumed: each cell in the contingency table has an expected frequency of at least 5)
Decision Rule
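The $\chi^2_{STAT}$ test statistic approximately follows a chi-squared distribution with one degree of freedom.

Decision Rule:
If $\chi^2_{STAT} > \chi^2_\alpha$, reject H0; otherwise, do not reject H0.

[Figure: chi-squared density; "do not reject H0" region to the left of $\chi^2_\alpha$, "reject H0" region of area α to the right]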
Computing the
Average Proportion
The average proportion is:

$$p = \frac{X_1 + X_2}{n_1 + n_2} = \frac{X}{n}$$

Here, with 120 females (12 left-handed) and 180 males (24 left-handed):

$$p = \frac{12 + 24}{120 + 180} = \frac{36}{300} = 0.12$$

i.e., based on all 300 persons, the proportion of left-handers is 0.12, that is, 12%.
Finding Expected Frequencies

• To obtain the expected frequency for left-handed females, multiply the average proportion left-handed (p) by the total number of females
• To obtain the expected frequency for left-handed males, multiply the average proportion left-handed (p) by the total number of males

If the two proportions are equal, then
P(Left-Handed | Female) = P(Left-Handed | Male) = 0.12

i.e., we would expect (0.12)(120) = 14.4 females to be left-handed and (0.12)(180) = 21.6 males to be left-handed.
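The expected frequencies described here can be reproduced with a few lines of Python. This is only a minimal sketch: the dictionary layout and the variable names (`observed`, `p`, `expected`) are illustrative choices, not part of the lecture.

```python
# Observed 2 x 2 counts from the hand-preference example.
observed = {
    "Female": {"Left": 12, "Right": 108},
    "Male": {"Left": 24, "Right": 156},
}

# Overall sample size and overall number of left-handers.
n = sum(sum(row.values()) for row in observed.values())
left_total = sum(row["Left"] for row in observed.values())

# Average (pooled) proportion of left-handers, p = X / n.
p = left_total / n

# Under H0 every group shares the same proportion p, so the expected
# count in each cell is the group size times p (or times 1 - p).
expected = {}
for gender, row in observed.items():
    group_size = sum(row.values())
    expected[gender] = {
        "Left": p * group_size,
        "Right": (1 - p) * group_size,
    }

for gender, row in expected.items():
    print(gender, {hand: round(value, 1) for hand, value in row.items()})
```

The "Right" column follows from the same logic with 1 - p, which matches the expected values 105.6 and 158.4 shown in the observed vs. expected table.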
Observed vs. Expected
Frequencies

                 Hand Preference
Gender      Left               Right               Total
Female      Observed = 12      Observed = 108      120
            Expected = 14.4    Expected = 105.6
Male        Observed = 24      Observed = 156      180
            Expected = 21.6    Expected = 158.4
Total       36                 264                 300

The Chi-Square Test Statistic
Using the observed and expected frequencies from the table above, the test statistic is:

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(12 - 14.4)^2}{14.4} + \frac{(108 - 105.6)^2}{105.6} + \frac{(24 - 21.6)^2}{21.6} + \frac{(156 - 158.4)^2}{158.4} = 0.7576$$
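As a quick arithmetic check, the four cell contributions can be summed in Python. This is a sketch using the observed and expected values from the slide; the list names are illustrative.

```python
# Cell order: Female/Left, Female/Right, Male/Left, Male/Right.
observed = [12, 108, 24, 156]
expected = [14.4, 105.6, 21.6, 158.4]

# Each cell contributes (f_o - f_e)^2 / f_e to the statistic.
contributions = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi2_stat = sum(contributions)

print(round(chi2_stat, 4))  # prints 0.7576
```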
Decision Rule

The test statistic is $\chi^2_{STAT} = 0.7576$; $\chi^2_{0.05}$ with 1 d.f. = 3.841.

Decision Rule:
If $\chi^2_{STAT} > 3.841$, reject H0; otherwise, do not reject H0.

Here, $\chi^2_{STAT} = 0.7576 < \chi^2_{0.05} = 3.841$, so we do not reject H0 and conclude that there is not sufficient evidence that the two proportions are different at α = 0.05.

[Figure: chi-squared density with 1 d.f.; "do not reject H0" region to the left of $\chi^2_{0.05} = 3.841$, "reject H0" region of area 0.05 to the right]
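The same decision can be reached with a p-value instead of a table lookup. The sketch below relies on one fact that is not on the slide: with 1 degree of freedom, a chi-square variable is the square of a standard normal variable, so its upper-tail area can be written with the complementary error function. The helper name `chi2_sf_1df` is hypothetical, and the approach only applies to the 1 d.f. case.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability P(chi-square with 1 d.f. > x).

    With 1 d.f. the statistic is the square of a standard normal
    variable, so the tail area equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


chi2_stat = 0.7576
alpha = 0.05
critical_value = 3.841  # table value quoted on the slide

p_value = chi2_sf_1df(chi2_stat)

# The critical-value rule and the p-value rule are two views of the same test.
reject_by_critical_value = chi2_stat > critical_value
reject_by_p_value = p_value < alpha

print(f"p-value = {p_value:.4f}")
print("reject H0" if reject_by_critical_value else "do not reject H0")
```

Both rules agree here: the statistic is below the critical value and the p-value is above α, so H0 is not rejected.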
χ² Test of Independence
(continued)

The chi-square test statistic is:

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$

• where:
  f_o = observed frequency in a particular cell of the r x c table
  f_e = expected frequency in a particular cell if H0 is true

$\chi^2_{STAT}$ for the r x c case has (r - 1)(c - 1) degrees of freedom.

(Assumed: each cell in the contingency table has an expected frequency of at least 1)
Expected Cell Frequencies

• Expected cell frequencies:

$$f_e = \frac{\text{row total} \times \text{column total}}{n}$$

where:
  row total = sum of all frequencies in the row
  column total = sum of all frequencies in the column
  n = overall sample size
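The formula translates directly into a small helper for any r x c table. The function name `expected_frequencies` and the list-of-rows layout are assumptions made for this sketch; the check reuses the 2 x 2 hand-preference counts from the earlier slides.

```python
def expected_frequencies(table):
    """Return f_e = row total * column total / n for every cell of an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: Female, Male. Columns: Left, Right.
hand_preference = [[12, 108], [24, 156]]

for row in expected_frequencies(hand_preference):
    print([round(value, 1) for value in row])
```

For the 2 x 2 case this gives the same numbers as multiplying each group size by the average proportion, since column total / n is that proportion.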

Decision Rule

• The decision rule is:

If $\chi^2_{STAT} > \chi^2_\alpha$, reject H0; otherwise, do not reject H0

where $\chi^2_\alpha$ is from the chi-squared distribution with (r – 1)(c – 1) degrees of freedom



Example
• The meal plan selected by 200 students is shown below:

                Number of meals per week
Class
Standing     20/week   10/week   none   Total
Freshman          24        32     14      70
Sophomore         22        26     12      60
Junior            10        14      6      30
Senior            14        16     10      40
Total             70        88     42     200

Example
(continued)

• The hypothesis to be tested is:


H0: Meal plan and class standing are independent
(i.e., there is no relationship between them)
H1: Meal plan and class standing are dependent
(i.e., there is a relationship between them)

Example:
Expected Cell Frequencies
(continued)
Observed:

                Number of meals per week
Class
Standing     20/wk   10/wk   none   Total
Fresh.          24      32     14      70
Soph.           22      26     12      60
Junior          10      14      6      30
Senior          14      16     10      40
Total           70      88     42     200

Expected cell frequencies if H0 is true:

                Number of meals per week
Class
Standing     20/wk   10/wk   none   Total
Fresh.        24.5    30.8   14.7      70
Soph.         21.0    26.4   12.6      60
Junior        10.5    13.2    6.3      30
Senior        14.0    17.6    8.4      40
Total           70      88     42     200

Example for one cell (Junior, 20/wk):

$$f_e = \frac{\text{row total} \times \text{column total}}{n} = \frac{30 \times 70}{200} = 10.5$$
Example: The Test Statistic
(continued)

• The test statistic value is:

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(24 - 24.5)^2}{24.5} + \frac{(32 - 30.8)^2}{30.8} + \cdots + \frac{(10 - 8.4)^2}{8.4} = 0.709$$

$\chi^2_{0.05}$ = 12.592 from the chi-squared distribution with (4 – 1)(3 – 1) = 6 degrees of freedom
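The twelve-cell sum is tedious by hand, so a short Python sketch of the whole calculation may help. It takes the observed meal-plan counts from the example, derives each expected frequency as row total × column total / n, and accumulates the statistic; the variable names are illustrative.

```python
# Rows: Freshman, Sophomore, Junior, Senior.
# Columns: 20/week, 10/week, none.
observed = [
    [24, 32, 14],
    [22, 26, 12],
    [10, 14, 6],
    [14, 16, 10],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2_stat = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n  # expected frequency under H0
        chi2_stat += (fo - fe) ** 2 / fe

# Degrees of freedom for an r x c table: (r - 1)(c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-square = {chi2_stat:.3f}, d.f. = {df}")
```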



Example:
Decision and Interpretation
(continued)

The test statistic is $\chi^2_{STAT} = 0.709$; $\chi^2_{0.05}$ with 6 d.f. = 12.592.

Decision Rule:
If $\chi^2_{STAT} > 12.592$, reject H0; otherwise, do not reject H0.

Here, $\chi^2_{STAT} = 0.709 < \chi^2_{0.05} = 12.592$, so do not reject H0.

Conclusion: there is not sufficient evidence that meal plan and class standing are related at α = 0.05.

[Figure: chi-squared density with 6 d.f.; "do not reject H0" region to the left of $\chi^2_{0.05} = 12.592$, "reject H0" region of area 0.05 to the right]
Chi-Square Goodness-of-Fit Test
• Does sample data conform to a hypothesized
distribution?
– Examples:
• Are technical support calls equal across all days
of the week? (i.e., do calls follow a uniform
distribution?)
• Do measurements from a production process
follow a normal distribution?

Chi-Square Goodness-of-Fit Test
(continued)
• Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?)
  – Sample data for 10 days per day of week:

Day          Sum of calls for this day
Monday       290
Tuesday      250
Wednesday    238
Thursday     257
Friday       265
Saturday     230
Sunday       192
Total        Σ = 1722
Logic of Goodness-of-Fit Test

• If calls are uniformly distributed, the 1722 calls would be expected to be equally divided across the 7 days:

$$\frac{1722}{7} = 246 \text{ expected calls per day if uniform}$$

• Chi-Square Goodness-of-Fit Test: test to see if the sample results are consistent with the expected results
Observed vs.
Expected Frequencies
             Observed   Expected
Day          f_o        f_e
Monday       290        246
Tuesday      250        246
Wednesday    238        246
Thursday     257        246
Friday       265        246
Saturday     230        246
Sunday       192        246
TOTAL        1722       1722

Chi-Square Test Statistic
H0: The distribution of calls is uniform
over days of the week
HA: The distribution of calls is not uniform

• The test statistic is:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \quad (\text{where } df = k - 1)$$

where:
  k = number of categories
  f_o = observed cell frequency for a category
  f_e = expected cell frequency for a category
The Rejection Region
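H0: The distribution of calls is uniform over days of the week
HA: The distribution of calls is not uniform

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

• Reject H0 if $\chi^2 > \chi^2_\alpha$ (with k – 1 degrees of freedom)

[Figure: chi-squared density; "do not reject H0" region to the left of $\chi^2_\alpha$, "reject H0" region of area α to the right]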
Chi-Square Test Statistic
H0: The distribution of calls is uniform over days of the week
HA: The distribution of calls is not uniform

$$\chi^2 = \frac{(290 - 246)^2}{246} + \frac{(250 - 246)^2}{246} + \cdots + \frac{(192 - 246)^2}{246} = 23.05$$

k – 1 = 6 (7 days of the week), so use 6 degrees of freedom: $\chi^2_{0.05}$ = 12.5916

Conclusion: $\chi^2 = 23.05 > \chi^2_{0.05} = 12.5916$, so reject H0 and conclude that the distribution is not uniform.

[Figure: chi-squared density with 6 d.f.; "do not reject H0" region to the left of $\chi^2_{0.05} = 12.5916$, "reject H0" region of area α = .05 to the right]
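The goodness-of-fit calculation for the call data can be sketched in Python as follows. The p-value step goes beyond the slide: it uses a closed form for the chi-square upper tail that holds only when the degrees of freedom are even, which happens to be the case here (6 d.f.). The helper name `chi2_sf_even_df` is hypothetical.

```python
import math

calls = {
    "Monday": 290, "Tuesday": 250, "Wednesday": 238, "Thursday": 257,
    "Friday": 265, "Saturday": 230, "Sunday": 192,
}

total = sum(calls.values())
k = len(calls)
expected = total / k  # same expected count for every day under H0 (uniform)

chi2_stat = sum((fo - expected) ** 2 / expected for fo in calls.values())
df = k - 1


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even d.f.

    For df = 2m the tail area is exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even d.f.")
    half = x / 2.0
    terms = (half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * sum(terms)


p_value = chi2_sf_even_df(chi2_stat, df)
print(f"chi-square = {chi2_stat:.2f}, d.f. = {df}, p-value = {p_value:.5f}")
```

The p-value falls well below 0.05, which agrees with the critical-value decision to reject H0.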


Normal Distribution Example
• Do measurements from a production
process follow a normal distribution
with μ = 50 and σ = 15?
• Process:
• Get sample data
• Group sample results into classes (cells)
(Expected cell frequency must be at least
5 for each cell)
• Compare actual cell frequencies with
expected cell frequencies
Normal Distribution Example
(continued)
• Sample data and values grouped into classes:

150 sample measurements: 80, 65, 36, 66, 50, 38, 57, 77, 59, …etc…

Class           Frequency
less than 30           10
30 but < 40            21
40 but < 50            33
50 but < 60            41
60 but < 70            26
70 but < 80            10
80 but < 90             7
90 or over              2
TOTAL                 150

Normal Distribution Example
(continued)
• What are the expected frequencies for these classes for a
normal distribution with μ = 50 and σ = 15 ?
Class           Frequency   Expected Frequency
less than 30           10   ?
30 but < 40            21   ?
40 but < 50            33   ?
50 but < 60            41   ?
60 but < 70            26   ?
70 but < 80            10   ?
80 but < 90             7   ?
90 or over              2   ?
TOTAL                 150
Expected Frequencies
Expected frequencies in a sample of size n = 150, from a normal distribution with μ = 50, σ = 15:

Class           P(X in class)   Expected frequency
less than 30          0.09121                13.68
30 but < 40           0.16128                24.19
40 but < 50           0.24751                37.13
50 but < 60           0.24751                37.13
60 but < 70           0.16128                24.19
70 but < 80           0.06846                10.27
80 but < 90           0.01892                 2.84
90 or over            0.00383                 0.57
TOTAL                 1.00000               150.00

Example:

$$P(x < 30) = P\left(z < \frac{30 - 50}{15}\right) = P(z < -1.3333) = 0.0912$$

$$(0.0912)(150) = 13.68$$
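The class probabilities and expected frequencies in this table can be regenerated from the normal CDF. The sketch below builds the CDF from `math.erf` in the standard library; the helper name `normal_cdf` and the list names are illustrative, and each class probability is taken as the difference between the CDF at its upper and lower boundaries.

```python
import math


def normal_cdf(x, mu, sigma):
    """P(X < x) for a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


mu, sigma, n = 50.0, 15.0, 150
boundaries = [30, 40, 50, 60, 70, 80, 90]
labels = [
    "less than 30", "30 but < 40", "40 but < 50", "50 but < 60",
    "60 but < 70", "70 but < 80", "80 but < 90", "90 or over",
]

# CDF at each class boundary, padded with 0 and 1 for the two open-ended classes.
cdf_values = [0.0] + [normal_cdf(b, mu, sigma) for b in boundaries] + [1.0]

# Probability of each class = CDF(upper boundary) - CDF(lower boundary).
probabilities = [hi - lo for lo, hi in zip(cdf_values, cdf_values[1:])]
expected = [n * p for p in probabilities]

for label, p, fe in zip(labels, probabilities, expected):
    print(f"{label:<14}{p:9.5f}{fe:9.2f}")
```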

The Test Statistic
                Frequency          Expected
Class           (observed, f_o)    Frequency, f_e
less than 30    10                 13.68
30 but < 40     21                 24.19
40 but < 50     33                 37.13
50 but < 60     41                 37.13
60 but < 70     26                 24.19
70 but < 80     10                 10.27
80 but < 90     7                  2.84
90 or over      2                  0.57
TOTAL           150                150.00

The test statistic is:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

• Reject H0 if $\chi^2 > \chi^2_\alpha$ (with k – 1 degrees of freedom)
The Rejection Region
H0: The distribution of values is normal with μ = 50 and σ = 15
HA: The distribution of values does not have this distribution

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(10 - 13.68)^2}{13.68} + \cdots + \frac{(2 - 0.57)^2}{0.57} = 12.097$$

8 classes, so use 7 d.f.: $\chi^2_{0.05}$ = 14.0671

Conclusion: $\chi^2 = 12.097 < \chi^2_{0.05} = 14.0671$, so do not reject H0.

[Figure: chi-squared density with 7 d.f.; "do not reject H0" region to the left of $\chi^2_{0.05} = 14.0671$, "reject H0" region of area α = .05 to the right]
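A short Python sketch can reproduce the statistic from the rounded expected frequencies shown in the table. It also lists the classes whose expected frequency is below 5, because the process slide for this example asks for at least 5 in every cell; the last two classes fall short, and merging them with a neighbouring class would be one way to meet that condition (a suggestion of this sketch, not a step taken in the lecture).

```python
# Observed counts and the rounded expected counts from the slide.
observed = [10, 21, 33, 41, 26, 10, 7, 2]
expected = [13.68, 24.19, 37.13, 37.13, 24.19, 10.27, 2.84, 0.57]

contributions = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi2_stat = sum(contributions)

df = len(observed) - 1   # 8 classes, so 7 d.f., as on the slide
critical_value = 14.0671  # chi-square table value for 7 d.f., alpha = 0.05

# Classes that do not meet the "expected frequency at least 5" guideline.
small_cells = [fe for fe in expected if fe < 5]

print(f"chi-square = {chi2_stat:.3f}, d.f. = {df}")
print("reject H0" if chi2_stat > critical_value else "do not reject H0")
print("expected frequencies below 5:", small_cells)
```

Most of the statistic comes from the two sparse upper classes, so the result is sensitive to how those classes are grouped.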
Example

Solution

Contingency Table of the observed and Expected
Frequencies

Computation of expected frequencies and chi-
square statistics
At the 95% confidence level, the critical value obtained from the chi-square table is greater than the calculated value of chi-square = 7.23.

Hence there is insufficient evidence to reject the null hypothesis that brand preference is independent of age group.
Chi-square Test with Excel, Based on Observed and Expected Frequencies (p-value approach)

Example:
Decision and Interpretation
(continued)

The test statistic is $\chi^2 = 7.23$, p-value = .7797

Decision Rule:
If the p-value is < α, reject H0; otherwise, do not reject H0

Here, the p-value is not < α, so do not reject H0
Conclusion: There is not sufficient evidence that brand preference and age group are related at α = .05

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.
Lecture Summary
• Developed and applied the χ² test for the difference between two proportions
• Examined the χ² test for independence
• Used the chi-square goodness-of-fit test to determine whether data fits a specified distribution
  – Example of a discrete distribution (uniform)
  – Example of a continuous distribution (normal)

