TUTORIAL: CHI-SQUARE TESTS

QUESTION 1
Where people turn for news differs across age groups. A study indicated where
different age groups primarily get their news.

                          Age group
News source        Under 36   36-50    50+   Total
Local TV                 97     115    130     342
National TV              71     101    126     298
Radio                    80      95    108     283
Local newspaper          50      80    103     233
Internet                105      85     71     261
Total                   403     476    538    1417

At the 5% level of significance, is there evidence of a significant relationship between
the age group and where people primarily get their news? If so, explain the
relationship.
Solution:

(a)

$H_0$: There is no relationship between the age group and where people primarily get
their news.

$H_1$: There is a relationship between the age group and where people primarily get
their news.

Excel output
Observed Frequencies
                          Age group
News source        Under 36   36-50    50+   Total
Local TV                 97     115    130     342
National TV              71     101    126     298
Radio                    80      95    108     283
Local newspaper          50      80    103     233
Internet                105      85     71     261
Total                   403     476    538    1417

TUTORIAL WEEK 7 2012_2013

Expected Frequencies
                          Age group
News source        Under 36      36-50        50+   Total
Local TV           97.26606   114.8850   129.8490     342
National TV        84.75229   100.1044   113.1433     298
Radio              80.48624   95.06563   107.4481     283
Local newspaper    66.26606   78.26958   88.46436     233
Internet           74.22936   87.67537   99.09527     261
Total                   403        476        538    1417

Data
Level of Significance        0.05
Number of Rows               5
Number of Columns            3
Degrees of Freedom           8

Results
Critical Value               15.50731
Chi-Square Test Statistic    30.92932
p-Value                      0.000145
Reject the null hypothesis

Decision rule: If $\chi^2_{STAT} > 15.5073$, reject $H_0$.

Test statistic: $\chi^2_{STAT} = 30.9293$

Decision: Since $\chi^2_{STAT} = 30.9293$ is greater than the critical bound of 15.5073, reject $H_0$.

There is evidence of a significant relationship between the age group and where people
primarily get their news. The 50+ group has a lower than expected frequency of getting
their news through the Internet, while the under 36 group has a higher than expected
frequency of getting their news through the Internet.
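
The Excel figures for Question 1 can be cross-checked with a few lines of Python. The sketch below is not part of the original tutorial. It assumes each expected frequency is computed as row total times column total divided by the grand total, and the names `observed`, `chi_square` and `internet_gap` are illustrative only.

```python
# Observed counts from the Question 1 table (rows: news source, columns: age group).
observed = {
    "Local TV":        [97, 115, 130],
    "National TV":     [71, 101, 126],
    "Radio":           [80, 95, 108],
    "Local newspaper": [50, 80, 103],
    "Internet":        [105, 85, 71],
}
age_groups = ["Under 36", "36-50", "50+"]


def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts) for a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under H0: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


stat, df, expected = chi_square(list(observed.values()))
print(f"chi-square = {stat:.4f}, df = {df}")

# Observed minus expected for the Internet row (the last row of the table),
# to see the direction of the departure from independence in each age group.
internet_gap = [o - e for o, e in zip(observed["Internet"], expected[-1])]
for group, gap in zip(age_groups, internet_gap):
    print(f"Internet, {group}: observed - expected = {gap:+.2f}")
```

A positive gap for the under 36 group and a negative gap for the 50+ group is what the explanation of the relationship relies on.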

QUESTION 2
Where people turn for news differs across age groups. Suppose that a
study was conducted on this issue, based on 200 respondents who were
between the ages of 36 and 50 and 200 respondents who were above 50. The
results are presented in the following table, with the specific breakdown of the
responses.
                        Age
Source of News    36-50   Above 50   Total
Newspapers           82        104     186
Other               118         96     214
Total               200        200     400

a) Is there evidence of a significant difference in the proportion who get their
news primarily from newspapers between those 36 to 50 years old and
those above 50 years old? Use a 5% level of significance. Please use
mathematical calculations to solve it.
b) Use Excel to determine the p-value in (a) and interpret its meaning.

Solution: In this problem we have a 2x2 contingency table. First we define the null and
alternative hypotheses.
a)
$H_0: \pi_1 = \pi_2$ (the proportion of the 36-50 age group and the proportion of the group above
50 who get their news primarily from newspapers are equal)

$H_1: \pi_1 \neq \pi_2$ (the two proportions of people aged 36 to 50 and above 50
who get their news primarily from newspapers are not the same; the way they get
their news is not independent of age)
For this type of problem we use the chi-square test statistic.

Test statistic:

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}, \qquad df = (r-1)(c-1) = (2-1)(2-1) = 1,$$

where r is the number of rows in the contingency table and c is the number of columns.

Also
$f_o$: observed frequency in a particular cell
$f_e$: expected frequency in a particular cell if $H_0$ is true
The average (pooled) proportion is

$$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{X}{n} = \frac{82 + 104}{200 + 200} = 0.465$$

and

$$1 - \bar{p} = \frac{118 + 96}{200 + 200} = 0.535$$

It follows that the expected frequency table is

                        Age
Source of News    36-50               Above 50            Total
Newspapers        200(0.465) = 93     200(0.465) = 93       186
Other             200(0.535) = 107    200(0.535) = 107      214
Total             200                 200                   400

Hence the chi-square test statistic is

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(82 - 93)^2}{93} + \frac{(104 - 93)^2}{93} + \frac{(118 - 107)^2}{107} + \frac{(96 - 107)^2}{107} \approx 4.864$$

The critical value of the chi-square distribution for 1 degree of freedom at the 5% level of
significance is $\chi^2_{0.05} = 3.841$.

Comparing the critical value with the value of the chi-square test statistic, we see that
$\chi^2_{STAT} \approx 4.86 > 3.841$. Hence we reject the $H_0$ hypothesis. At the 5% level of
significance, the two proportions are not the same: there is a significant difference in the
proportion who get their news primarily from newspapers between those 36 to 50 years
old and those above 50 years old.
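
The hand calculation in part a) can be reproduced as follows. This is only a sketch that mirrors the steps above (pooled proportion, expected frequencies, sum over the four cells). The variable names are illustrative, and the critical value 3.841 is taken from the solution rather than computed.

```python
# Observed counts from the Question 2 table, arranged with one row per age group
# (36-50, above 50) and columns (newspapers, other).
observed = [[82, 118], [104, 96]]

n_per_group = [sum(row) for row in observed]         # 200 and 200
newspaper_total = sum(row[0] for row in observed)    # 186
pooled = newspaper_total / sum(n_per_group)          # 186 / 400 = 0.465

# Expected counts if both age groups share the pooled proportion (H0 true).
expected = [[n * pooled, n * (1 - pooled)] for n in n_per_group]

chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

CRITICAL_5PCT_DF1 = 3.841  # table value quoted in the solution, not computed here
reject_h0 = chi2_stat > CRITICAL_5PCT_DF1
print(f"pooled = {pooled:.3f}, chi-square = {chi2_stat:.4f}, reject H0: {reject_h0}")
```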

b)

Results
Critical Value               3.8415
Chi-Square Test Statistic    4.8638
p-Value                      0.0274
Reject the null hypothesis

Based on the Excel output, the p-value is 0.0274. Since 0.0274 < 0.05, the p-value is smaller
than the significance level, hence we reject the $H_0$ hypothesis stated in a). The p-value is the
probability of obtaining a test statistic of 4.8638 or larger when the null hypothesis is true.
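
The p-value of 0.0274 in the Excel output can be checked without a statistics package. With 1 degree of freedom, a chi-square variable is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function. The helper name `chi2_pvalue_df1` below is hypothetical, and the shortcut applies only when df = 1.

```python
import math


def chi2_pvalue_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # For df = 1, chi-square is the square of a standard normal Z, so
    # P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2))


p_value = chi2_pvalue_df1(4.8638)  # test statistic from the Excel output
print(f"p-value = {p_value:.4f}")
```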

QUESTION 3
More shoppers do the majority of their grocery shopping on Saturday than on any
other day of the week. To check this statement, 600 shoppers were interviewed, 200
from each age group: under 35, between 35 and 54, and over 54. The results are
presented in the following table.
Observed Frequencies
                                   Age group
Majority shopping day        Under 35   35-54   Over 54   Total
Saturday                           48      56        24     128
A day other than Saturday         152     144       176     472
Total                             200     200       200     600

Is there evidence of a significant difference among the age groups with respect to
the majority shopping day? Use the 5% level of significance.
Solution: In this problem we have a 2x3 contingency table. First we define the null and
alternative hypotheses.

a)
$H_0: \pi_1 = \pi_2 = \pi_3$

$H_1$: at least one proportion differs

where population 1 = under 35, 2 = 35-54, 3 = over 54

For this type of problem we use the chi-square test statistic.

Test statistic:

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}, \qquad df = (r-1)(c-1) = (2-1)(3-1) = 2,$$

where r is the number of rows in the contingency table and c is the number of columns.

Also
$f_o$: observed frequency in a particular cell
$f_e$: expected frequency in a particular cell if $H_0$ is true
The average (pooled) proportion is

$$\bar{p} = \frac{X_1 + X_2 + X_3}{n_1 + n_2 + n_3} = \frac{48 + 56 + 24}{200 + 200 + 200} = 0.213$$

and

$$1 - \bar{p} = \frac{152 + 144 + 176}{200 + 200 + 200} = 0.787$$

It follows that the expected frequency table is

Expected Frequencies
                                   Age group
Majority shopping day        Under 35               35-54                  Over 54                Total
Saturday                     200(0.213) = 42.67     200(0.213) = 42.67     200(0.213) = 42.67       128
A day other than Saturday    200(0.787) = 157.33    200(0.787) = 157.33    200(0.787) = 157.33      472
Total                        200                    200                    200                      600

Hence the chi-square test statistic is

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(48 - 42.67)^2}{42.67} + \frac{(56 - 42.67)^2}{42.67} + \frac{(24 - 42.67)^2}{42.67} + \frac{(152 - 157.33)^2}{157.33} + \frac{(144 - 157.33)^2}{157.33} + \frac{(176 - 157.33)^2}{157.33} \approx 16.525$$

The critical value of the chi-square distribution for 2 degrees of freedom at the 5% level of
significance is $\chi^2_{0.05} = 5.991$.

Hence $\chi^2_{STAT} > \chi^2_{0.05} = 5.991$ and we reject $H_0$. There is enough evidence to conclude that the
age groups differ in the proportion who do the majority of their shopping on Saturday.
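
As a cross-check of the arithmetic in Question 3, the sketch below repeats the calculation with exact fractions, so the rounding of the pooled proportion to three decimals does not affect the result. It also reports a p-value, which the solution above does not give, using the fact that for 2 degrees of freedom the upper-tail probability is exp(-x/2). All names are illustrative.

```python
import math
from fractions import Fraction

# Observed counts from the Question 3 table.
saturday = [48, 56, 24]        # under 35, 35-54, over 54
other_day = [152, 144, 176]
group_sizes = [s + o for s, o in zip(saturday, other_day)]   # 200 each

# Pooled Saturday proportion, kept exact: 128/600 = 16/75.
pooled = Fraction(sum(saturday), sum(group_sizes))

chi2_stat = Fraction(0)
for s, o, n in zip(saturday, other_day, group_sizes):
    e_sat = n * pooled              # expected Saturday count under H0
    e_other = n * (1 - pooled)      # expected count for any other day under H0
    chi2_stat += (s - e_sat) ** 2 / e_sat + (o - e_other) ** 2 / e_other

# With 2 degrees of freedom, P(chi2 > x) = exp(-x / 2).
p_value = math.exp(-float(chi2_stat) / 2)
print(f"chi-square = {float(chi2_stat):.4f}, p-value = {p_value:.6f}")
```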

QUESTION 4
A sample of 500 shoppers was selected in a large metropolitan area in order to determine
various information concerning consumer behaviour. Among the questions asked
was "Do you enjoy shopping for clothing?" The results are summarized in the
following contingency table:
Observed Frequencies
                      Gender
Enjoy Shopping    Male   Female   Total
Yes                126      234     360
No                 104       36     140
Total              230      270     500

a) Is there evidence of a significant difference between the proportions of males
and females who enjoy shopping for clothing at the 1% level of significance?
b) Determine the p-value in (a) and interpret its meaning.
c) What are your answers to (a) and (b) if 206 males enjoyed shopping for
clothing and 24 did not?

Solution
Excel output:
Observed Frequencies
                      Gender
Enjoy Shopping    Male   Female   Total
Yes                126      234     360
No                 104       36     140
Total              230      270     500

Expected Frequencies
                      Gender
Enjoy Shopping     Male   Female   Total
Yes               165.6    194.4     360
No                 64.4     75.6     140
Total               230      270     500

Data
Level of Significance        0.01
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               6.634897
Chi-Square Test Statistic    62.6294
p-Value                      2.5E-15
Reject the null hypothesis

(a)

$H_0: \pi_1 = \pi_2$

$H_1: \pi_1 \neq \pi_2$

where population: 1 = males, 2 = females

Decision rule: df = 1. If $\chi^2_{STAT} > 6.635$, reject $H_0$.

Test statistic: $\chi^2_{STAT} = 62.6294$

Decision: Since $\chi^2_{STAT} = 62.6294$ is greater than the upper critical bound of 6.6349,
reject $H_0$. There is enough evidence to conclude that there is a significant difference
between the proportions of males and females who enjoy shopping for clothing at the
0.01 level of significance.

(b)

p-value = virtually zero. The probability of obtaining a test statistic of 62.6294 or
larger when the null hypothesis is true is virtually zero.

(c)

(a)

$H_0: \pi_1 = \pi_2$

$H_1: \pi_1 \neq \pi_2$

where populations: 1 = males, 2 = females

Decision: Since $\chi^2_{STAT} = 0.9881$ is less than the upper critical bound of
6.635, do not reject $H_0$. There is not enough evidence to conclude that the
proportions of males and females who enjoy shopping for clothing are
different.

(b)

p-value = 0.3202. The probability of obtaining a test statistic of 0.9881 or
larger when the null hypothesis is true is 0.3202.
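
To see how changing the male counts in part (c) moves the result, the same test can be run on both tables. This is a sketch under the assumption that expected frequencies are row total times column total over the grand total. The function name `two_by_two_test` is hypothetical, and the critical value 6.635 is copied from the solution.

```python
import math


def two_by_two_test(table):
    """Chi-square statistic and p-value (df = 1) for a 2x2 table given as two rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # df = 1, so the upper-tail probability is erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


CRITICAL_1PCT_DF1 = 6.635  # critical value quoted in the solution

# Rows: enjoy shopping (yes, no); columns: (male, female).
original = [[126, 234], [104, 36]]
revised = [[206, 234], [24, 36]]   # part (c): 206 males said yes, 24 said no

for label, table in (("original", original), ("part (c)", revised)):
    stat, p = two_by_two_test(table)
    print(f"{label}: chi-square = {stat:.4f}, p-value = {p:.4g}, "
          f"reject H0: {stat > CRITICAL_1PCT_DF1}")
```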

QUESTION 5
A survey was conducted in five countries. The percentages of respondents who said that
they eat out once a week or more are as follows:
GERMANY           10%
FRANCE            12%
UNITED KINGDOM    28%
GREECE            39%
US                57%
Suppose that the survey was based on 1000 respondents in each country.
a) At the 5% level of significance, determine whether there is a significant
difference in the proportion of people who eat out at least once a week among
the various countries.
b) Find the p-value in (a) and interpret its meaning.
Solution:
Excel output:
Observed Frequencies
                          Country
Eat out weekly    Germany   France     UK   Greece     US   Total
Yes                   100      120    280      390    570    1460
No                    900      880    720      610    430    3540
Total                1000     1000   1000     1000   1000    5000

Expected Frequencies
                          Country
Eat out weekly    Germany   France     UK   Greece     US   Total
Yes                   292      292    292      292    292    1460
No                    708      708    708      708    708    3540
Total                1000     1000   1000     1000   1000    5000

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            5
Degrees of Freedom           4

Results
Critical Value               9.487728
Chi-Square Test Statistic    742.3961
p-Value                      2.3E-159
Reject the null hypothesis

(a)

$H_0: \pi_1 = \pi_2 = \pi_3 = \pi_4 = \pi_5$ (the proportion of people who eat out is the same
in all countries)

$H_1$: Not all $\pi_j$ are equal (the proportion of people who eat out is NOT
the same in all countries)

where population 1 = Germany, 2 = France, 3 = UK, 4 = Greece, 5 = US

Test statistic:

$$\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = 742.3961$$

Decision: Since the calculated test statistic 742.3961 is greater than the critical value of
9.4877, you reject $H_0$ and conclude that there is a difference in the proportion of people
who eat out at least once a week in the various countries.
(b)

p-value is virtually zero. The probability of obtaining a data set which gives rise to a test
statistic of 742.3961 or more is virtually zero if there is no difference in the proportion of
people who eat out at least once a week in the various countries.
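
The size of this p-value can be checked by hand-style arithmetic in Python. The sketch below recomputes the statistic from the percentages and then evaluates the upper tail with a closed form that holds only when the degrees of freedom are even (here df = 4). The helper `chi2_pvalue_even_df` is a hypothetical name, not an Excel or library function.

```python
import math

# "Yes" counts out of 1000 respondents per country, from the Question 5 table.
yes = {"Germany": 100, "France": 120, "UK": 280, "Greece": 390, "US": 570}
n_per_country = 1000

pooled = sum(yes.values()) / (n_per_country * len(yes))   # 1460 / 5000 = 0.292
e_yes = n_per_country * pooled                            # 292 per country
e_no = n_per_country * (1 - pooled)                       # 708 per country

chi2_stat = sum(
    (y - e_yes) ** 2 / e_yes + ((n_per_country - y) - e_no) ** 2 / e_no
    for y in yes.values()
)
df = (2 - 1) * (len(yes) - 1)                             # (r - 1)(c - 1) = 4


def chi2_pvalue_even_df(stat, df):
    """Upper-tail chi-square probability, valid only when df is a positive even number."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = stat / 2
    # P(chi2 > x) = exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


p_value = chi2_pvalue_even_df(chi2_stat, df)
print(f"chi-square = {chi2_stat:.4f}, df = {df}, p-value = {p_value:.1e}")
```

The same helper applied to the Question 1 statistic with 8 degrees of freedom gives a value close to the 0.000145 shown in that Excel output.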
