
Statistics for Management Unit 10

Sikkim Manipal University Page No. 393






Unit 10 Chi-Square Test

Structure:
10.1 Introduction
Objectives
Relevance
10.2 Chi-Square test
Characteristics of Chi-Square test
Steps in solving problems related to Chi-Square test
Conditions for applying the Chi-Square test
Restrictions in applying Chi-Square test
Practical applications of Chi-Square test
Uses of Chi-Square test
Degrees of freedom
Levels of significance
Interpretation of Chi-Square values
10.3 Applications of Chi-Square Test
Tests for independence of attributes
Test of goodness of fit
Test for comparing variance
10.4 Summary
10.5 Glossary
10.6 Terminal Questions
10.7 Answers
10.8 Case Study

10.1 Introduction

In the previous unit, Testing of Hypothesis, we discussed how to test
hypotheses concerning parameters such as the mean and the proportion, using
data from either one or two samples. We used one-sample tests to determine
whether a mean or a proportion was significantly different from a
hypothesised value. In the two-sample tests, we examined the difference
between either two means or two proportions, and we tried to learn whether
this difference was significant.

Suppose, however, that we have proportions from five populations instead of
only two. In such cases, the two-sample methods for comparing proportions
do not apply; we must use the Chi-Square test (χ² test). In this unit, we
will discuss the Chi-Square tests, which enable us to test whether more
than two population proportions can be considered equal. The Chi-Square
test is a non-parametric test which can be applied to categorical or
qualitative data. It can be applied when we make few or no assumptions
about the population parameters.

Chi-Square tests allow us to do a lot more than just test for the
equality of several proportions. If we classify a population into several
categories with respect to two attributes (such as age and job performance),
we can then use a Chi-Square test to determine whether the two attributes
are independent of each other. So, Chi-Square tests can be applied to a
contingency table.

Objectives:
After studying this unit, you should be able to:
- describe the non-parametric method of testing hypotheses
- describe the Chi-Square characteristics
- identify the conditions required for applying Chi-Square test for a given
population distribution
- recognise the applications of Chi-Square test
- describe the steps in solving problems related to Chi-Square test
10.1.1 Relevance
Case-let
Women still earn less than men
On 27 February 2006, the Women and Work Commission (WWC) published
its report on the causes of the gender pay gap, that is, the difference
between men's and women's hourly pay. According to the report, British women
working full-time currently earn 17% less per hour than men. In February, the
European Commission also brought out its own report on the pay gap across
the European Union. Its findings were similar in that, on an hourly basis,
women earn 15% less than men for the same work.

In the United States, the difference in median pay between men and women
is around 20%. According to the WWC report, the gender pay gap opens
early. Boys and girls study different subjects in school, and boys' subjects
lead to more lucrative careers. They then work in different sorts of jobs. As a
result, average hourly pay for a woman at the start of her working life is only
91% of a man's, even though nowadays she is probably better qualified.
How do we compile this type of statistical information? We can use
Chi-Square testing when more than one type of population is involved.

(Source: Derek L Waller Published by Elsevier Inc Ed 2008).

10.2 Chi-Square test

The Chi-Square test is one of the most commonly used non-parametric tests
in statistical work. The Greek letter χ² is used to denote this test. χ²
describes the magnitude of discrepancy between the observed and the
expected frequencies. The value of χ² is calculated as:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \dots + \frac{(O_n - E_n)^2}{E_n}$$

Where O1, O2, O3, ..., On are the observed frequencies and E1, E2, E3, ..., En
are the corresponding expected or theoretical frequencies.
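
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original unit: the function name chi_square_statistic and the sample frequencies are assumptions chosen only to show the arithmetic.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies, chosen only for illustration.
observed = [18, 22, 20]
expected = [20, 20, 20]
chi_sq = chi_square_statistic(observed, expected)
print(round(chi_sq, 2))  # 4/20 + 4/20 + 0/20 = 0.4
```

Each term is non-negative, so the statistic is zero only when every observed frequency equals its expected frequency.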

10.2.1 Characteristics of Chi-Square test
The following are the characteristics of a Chi-Square test (χ² test):
- The χ² test is based on frequencies and not on parameters
- It is a non-parametric test where no rigid assumptions regarding the
population parameters are required
- The additive property also holds for the χ² test
- The χ² test is useful to test the hypothesis about the independence of
attributes
- The χ² test can be used in complex contingency tables
- The χ² test is very widely used for research purposes in behavioural and
social sciences, including business research
- While testing whether the observed frequencies of certain outcomes fit
the expected frequencies defined by a theoretical distribution, the χ²
value defined here follows the χ² distribution:

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

where Oi is the observed frequency and Ei is the expected frequency.

Key Statistic
The observed frequencies are the frequencies obtained from the
observation, which are sample frequencies. The expected frequencies
are the calculated frequencies.

10.2.2 Steps in solving problems related to Chi-Square test
Figure 10.1 depicts the steps required for solving the problems related to
Chi-Square test.
























Fig. 10.1: Procedural Steps in Solving Problems on Chi-Square Test

10.2.3 Conditions for applying the Chi-Square test
The following are the conditions for using the Chi-Square test:
1. The frequencies used in the Chi-Square test must be absolute and not
relative.
2. The total number of observations collected for this test must be large.
3. The observations which make up the sample for this test must be
independent of each other.
4. As the χ² test is based wholly on sample data, no assumption is made
concerning the population distribution. In other words, it is a
non-parametric test.
5. The χ² test is wholly dependent on degrees of freedom. As the degrees of
freedom increase, the Chi-Square distribution curve becomes more
symmetrical.
6. The expected frequency of any item or cell must not be less than 5. If
it is, the frequencies of adjacent items or cells should be pooled
together until the pooled frequency is 5 or more.
7. The data should be expressed in original units for convenience of
comparison, and the given distribution should not be replaced by relative
frequencies or proportions.
8. This test is used only for drawing inferences through tests of
hypotheses, so it cannot be used for estimating parameter values.
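
Condition 6 (pooling adjacent cells whose expected frequency is below 5) can be sketched as follows. The helper pool_small_cells, its left-to-right pooling rule and the sample frequencies are illustrative assumptions; the unit does not prescribe a particular pooling order.

```python
def pool_small_cells(observed, expected, minimum=5):
    """Merge adjacent cells until each pooled expected frequency is >= minimum."""
    pooled_obs, pooled_exp = [], []
    run_obs, run_exp = 0, 0
    for o, e in zip(observed, expected):
        run_obs += o
        run_exp += e
        if run_exp >= minimum:
            pooled_obs.append(run_obs)
            pooled_exp.append(run_exp)
            run_obs, run_exp = 0, 0
    if run_exp > 0:
        if pooled_exp:
            # Fold a too-small trailing run into the last pooled cell.
            pooled_obs[-1] += run_obs
            pooled_exp[-1] += run_exp
        else:
            # Everything together is still below the minimum; return one cell.
            pooled_obs.append(run_obs)
            pooled_exp.append(run_exp)
    return pooled_obs, pooled_exp


# Hypothetical frequencies, for illustration only.
observed = [2, 4, 10, 12, 3, 1]
expected = [3, 4, 9, 11, 3, 2]
pooled_observed, pooled_expected = pool_small_cells(observed, expected)
print(pooled_observed)  # [6, 10, 12, 4]
print(pooled_expected)  # [7, 9, 11, 5]
```

Pooling reduces the number of classes, so the degrees of freedom should be counted from the pooled cells rather than the original ones.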

10.2.4 Restrictions in applying Chi-Square test
The sample observations should be independently and normally distributed.
For this, either the parent population should be large (for example,
greater than 50), or sampling should be done with replacement.

Constraints imposed upon the observations must be of a linear character,
for example:

$$\sum O_i = \sum E_i$$

The χ² distribution is essentially a continuous distribution; however, its
character of continuity is maintained only when the individual frequencies of
the variate values remain greater than or equal to 5. So, in applying the χ²
test for goodness of fit or for the dependency of variables in a contingency
table, the cell frequency should not be less than 5. In practical problems,
we can combine a few values of small frequencies into one to get a pooled
frequency of at least 5.

Key Statistic
The results of Chi-Square test cannot be accurate if the cell frequencies
in a contingency table are less than 5.

10.2.5 Practical applications of Chi-Square test
In inferential statistics, the Chi-Square test can also be applied to
discrete distributions. In using the Chi-Square test, we need no assumptions
regarding the shape of sampling distributions. The applications of the
Chi-Square test include testing:
- the significance of sample variances
- the goodness of fit of a theoretical distribution
- the independence in a contingency table, or whether the observed results
are consistent with the expected segregations in breeding experiments
in genetics

The first is a parametric test and the other two are non-parametric tests.

10.2.6 Uses of Chi-Square test
The χ² test is used broadly to:
- Test goodness of fit for one-way classification or for one variable only
- Test independence or interaction for more than one row or column in the
form of a contingency table concerning several attributes
- Test population variance σ² through confidence intervals suggested by the
χ² test

10.2.7 Degrees of freedom
The number of degrees of freedom for n observations is n - k and is usually
denoted by ν, where k is the number of independent linear constraints
imposed upon them.

Example 1
Suppose we are asked to write any four numbers. We are free to choose all
four. If a restriction is imposed that the sum of these numbers should be
50, then only three numbers can be chosen freely, because the fourth is
fixed by the restriction. The degrees of freedom would now be 3.

If χ² is defined as the sum of the squares of n independent standardised
normal variates, and k linear constraints are imposed upon them (such as
the estimation of a population parameter), then n is replaced by n - k.
For instance, if the deviations are taken from the sample mean instead of
the population mean, then n is replaced by n - 1 = ν. This is because one
linear constraint has been imposed.

Key Statistic
The Chi-Square distribution has only one parameter, that is, the degrees
of freedom.
10.2.8 Levels of significance
Tables have been prepared for the values of P, where P is the probability of
getting a value of χ² greater than χ₀², and χ₀² is an observed value. From
these tables, we can find the value of P corresponding to an observed value
of χ² and then test whether the difference between the observed and
theoretical frequencies is significant or not. The smaller the value of P,
the greater the divergence between fact and theory, so small values lead us
to suspect the hypothesis. A value of P very near to unity may also lead to
a similar result: if P = 1, then χ² = 0, showing a perfect agreement between
fact and theory, which is a very improbable event. There are two
conventional levels of significance:

- If P < 0.05, we say that the observed value of χ² is significant at the
5% level of significance.

- Similarly, if P < 0.01, the value is significant at the 1% level.
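
Where printed tables are not at hand, the probability P can be computed directly. The sketch below is not from the unit; it uses only the Python standard library and the closed-form series for the upper tail of the Chi-Square distribution with integer degrees of freedom. The function name chi_square_p_value is an assumption.

```python
import math


def chi_square_p_value(chi_sq, df):
    """Upper-tail probability P(X > chi_sq) for integer degrees of freedom df."""
    if chi_sq < 0 or df < 1:
        raise ValueError("chi_sq must be >= 0 and df must be a positive integer")
    half = chi_sq / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k = 0 .. df/2 - 1 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    p = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * chi_sq / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= chi_sq / (2 * k - 1)
        p += term
    return p


# Tabulated 5% critical values quoted in this unit should give P close to 0.05.
for critical_value, df in [(3.84, 1), (7.81, 3), (9.49, 4)]:
    print(df, round(chi_square_p_value(critical_value, df), 3))
```

An observed χ² is then significant at the 5% level when the returned P is below 0.05, and at the 1% level when it is below 0.01.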

10.2.9 Interpretation of Chi-Square values
After calculating the χ² value, we refer to the χ² table. The table has
columns headed with symbols such as 0.05 for the 5% level of significance
and 0.01 for the 1% level of significance. The left-hand side indicates the
degrees of freedom. If the calculated value of χ² falls in the acceptance
region (that is, it is less than the tabulated value), the null hypothesis
H0 is accepted; otherwise, it is rejected. Figure 10.2 depicts the
acceptance and rejection regions of the Chi-Square distribution.


















Fig. 10.2: Acceptance and Rejection Regions under Chi-Square Distribution
Key Statistic
The Chi-Square curve will be on the positive side of x-axis because the
Chi-Square values are always positive.

10.3 Applications of Chi-Square test

10.3.1 Tests for independence of attributes
In the test for independence, the null hypothesis is that the row and column
variables are independent of each other. We have studied earlier, that the
hypothesis testing is done under the assumption that the null hypothesis is
true.

The following are the properties of the test for independence:
- The data are the observed frequencies
- The data is arranged in the form of a contingency table
- The degrees of freedom ν can be calculated as:

$$\nu = (\text{Number of rows} - 1) \times (\text{Number of columns} - 1)$$

- The test for independence has a Chi-Square distribution and is always a
right-tail test.
- The expected value is computed by taking the row total, multiplying it
by the column total and dividing by the grand total. That is given by:

$$E = \frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}$$

- The test statistic value does not change, if the order of the rows or
columns is interchanged. Also the value does not change even if the
rows and columns are interchanged.
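
The properties above can be checked numerically. The following sketch is illustrative only: the function names and the 3 x 2 table of frequencies are assumptions, not data from the unit. It computes the expected values from the marginal totals and confirms that the statistic is unchanged when rows are reordered or when rows and columns are interchanged.

```python
def expected_frequencies(table):
    """Expected cell values: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_frequencies(table)
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df


# Hypothetical 3 x 2 contingency table, for illustration only.
table = [[12, 8], [10, 10], [8, 12]]
chi_sq, df = chi_square_independence(table)

transposed = [list(col) for col in zip(*table)]    # rows and columns interchanged
chi_sq_t, df_t = chi_square_independence(transposed)

reordered = [table[2], table[0], table[1]]         # rows in a different order
chi_sq_r, df_r = chi_square_independence(reordered)

print(round(chi_sq, 4), df)  # 1.6 2
```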

Solved Problem 1
Calculate the degrees of freedom for a contingency table with three rows
and two columns.

Solution: The degrees of freedom, denoted by ν, is calculated as:
ν = (Number of rows - 1) × (Number of columns - 1)
ν = (3 - 1) × (2 - 1) = 2

Hence, a contingency table with three rows and two columns has two
degrees of freedom.

Solved Problem 2
Table 10.1 depicts the production in three shifts and the number of defective
goods that turned out in three weeks. Test at 5% level of significance
whether weeks and shifts are independent.

Table 10.1: Production of Defective Goods in Three Shifts
Shift    Week 1   Week 2   Week 3   Total
I        15       5        20       40
II       20       10       20       50
III      25       15       20       60
Total    60       30       60       150

Solution: Table 10.1a depicts the observed and expected values required
to calculate χ². Each expected value is computed as
Ei = (Row Total × Column Total) / Grand Total.

Table 10.1a: Observed and Expected Values
Observed Value Oi   Expected Value Ei        (Oi - Ei)²   (Oi - Ei)² / Ei
15                  (40 × 60) / 150 = 16     1            0.0625
20                  (50 × 60) / 150 = 20     0            0.0000
25                  (60 × 60) / 150 = 24     1            0.0417
5                   (40 × 30) / 150 = 8      9            1.1250
10                  (50 × 30) / 150 = 10     0            0.0000
15                  (60 × 30) / 150 = 12     9            0.7500
20                  (40 × 60) / 150 = 16     16           1.0000
20                  (50 × 60) / 150 = 20     0            0.0000
20                  (60 × 60) / 150 = 24     16           0.6667
                                             χ²cal =      3.6459

The steps to calculate χ² are described as follows:

1. Null hypothesis H0: The weeks and shifts are independent
Alternate hypothesis H1: The weeks and shifts are dependent
2. Level of significance is 5% and degrees of freedom
d.f. = (3 - 1) × (3 - 1) = 4
χ²tab = 9.49
3. Test statistics

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

χ²cal = 3.6459
4. Conclusion: Since χ²cal (3.6459) < χ²tab (9.49), H0 is accepted. Hence,
the attributes week and shifts are independent.
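
The arithmetic of Solved Problem 2 can be cross-checked with a short script. This is a sketch for verification only; the variable names are assumptions, and the critical value 9.49 is simply the tabulated figure quoted in the solution.

```python
# Observed defectives from Table 10.1 (rows: shifts I-III, columns: weeks 1-3).
observed = [
    [15, 5, 20],
    [20, 10, 20],
    [25, 15, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
        contribution = (o - e) ** 2 / e
        chi_sq += contribution
        print(f"O={o:>3}  E={e:>5.1f}  (O-E)^2/E={contribution:.4f}")

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 9.49  # tabulated value for 4 d.f. at the 5% level, as quoted in the text
print(f"chi-square = {chi_sq:.4f}, d.f. = {df}")
print("reject H0" if chi_sq > critical_value else "do not reject H0")
```

The unrounded total is about 3.6458; the figure 3.6459 in the solution comes from adding the terms after rounding each to four decimals. The conclusion is the same either way.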


Solved Problem 3
Out of 1000 people surveyed, 600 belonged to urban areas and rest to rural
areas. Among 500 who visited other states, 400 belonged to urban areas.
Test at 5% level of significance whether area and visiting other states are
dependent.

Solution: Table 10.2 depicts the information given in solved problem 3 in a
tabulated form.

Table 10.2: People Belonging to Urban and Rural Areas
Other States   Urban   Rural   Total
Visited        400     100     500
Not Visited    200     300     500
Total          600     400     1000

Table 10.2a depicts the observed and expected values for the calculation of
χ². Each expected value is computed as
Ei = (Row Total × Column Total) / Grand Total.

Table 10.2a: Observed and Expected Values
Observed Value Oi   Expected Value Ei   (Oi - Ei)²   (Oi - Ei)² / Ei
400                 300                 10000        33.33
200                 300                 10000        33.33
100                 200                 10000        50.00
300                 200                 10000        50.00
                                        χ²cal =      166.66

The steps for calculation of Chi-Square are described as follows:

1. Null hypothesis H0: Area and visit are independent.

Alternate hypothesis H1: They are dependent.

2. Level of significance is 5% and degrees of freedom
d.f. = (2 - 1) × (2 - 1) = 1

χ²tab = 3.84

3. Test statistics

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

χ²cal = 166.66
4. Conclusion: Since χ²cal (166.66) > χ²tab (3.84), H0 is rejected. Hence,
the area and visit are dependent.
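
For a 2 x 2 table, the result can also be cross-checked with the standard shortcut formula χ² = N(ad - bc)² / [(a + b)(c + d)(a + c)(b + d)], where a, b, c, d are the four cell frequencies and N is their total. The unit does not use this shortcut; it is shown here only as an independent check, and the function name is an assumption.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-Square for the 2 x 2 table [[a, b], [c, d]] via the shortcut formula."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Cell frequencies from Table 10.2: visited (urban, rural), not visited (urban, rural).
chi_sq = chi_square_2x2(400, 100, 200, 300)
print(round(chi_sq, 2))  # 166.67
```

The small difference from 166.66 in the solution arises because the table adds terms that were each cut to two decimals.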

10.3.2 Test of goodness of fit
The test of goodness of fit of a statistical model measures how well the
model fits a set of observations. This test measures and summarises the
differences, if any, between the observed values and the values expected
under the statistical model considered. The test results help us to know
whether the samples are drawn from identical distributions or not. The
degrees of freedom are n - 1. When the hypothesised distribution is uniform,
the expected value of each class is equal to the average of the observed
values.

Solved Problem 4
A personnel manager is interested in determining whether absenteeism is
greater on one day of the week than on another. The record for the past
years is available. Table 10.3 depicts the absenteeism for each working day
over a week. Test whether absenteeism is uniformly distributed over the
week.

Table 10.3: Comparison of Data about Absenteeism
Days of Week          Monday   Tuesday   Wednesday   Thursday   Friday
Number of absentees   66       57        54          48         75

Solution: If the absenteeism is uniformly distributed over the week, then
the expected number of absentees per day is given by:

$$E_i = \frac{66 + 57 + 54 + 48 + 75}{5} = 60$$

Table 10.3a depicts the expected values required for the calculation of χ²
for the data related to solved problem 4.

Table 10.3a: Observed and Expected Values for Calculation of χ²
Observed Value Oi   Expected Value Ei   (Oi - Ei)²   (Oi - Ei)² / Ei
66                  60                  36           0.6000
57                  60                  9            0.1500
54                  60                  36           0.6000
48                  60                  144          2.4000
75                  60                  225          3.7500
                                        χ²cal =      7.5000

The steps for calculation of Chi-Square are described as follows:

1. Null hypothesis H0: The observed frequencies fit the uniform
distribution.

2. Alternate hypothesis H1: The observed frequencies do not fit the
uniform distribution.
3. Level of significance is 5% and degrees of freedom (d.f.) = (5 - 1) = 4

χ²tab = 9.49

4. Test statistics

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

χ²cal = 7.50

5. Conclusion: Since χ²cal (7.5) < χ²tab (9.49), H0 is accepted. In other
words, we conclude at the 5% level of significance that absenteeism is
uniformly distributed and is independent of the days of the week.
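
The goodness-of-fit calculation for Solved Problem 4 can be reproduced as follows. This is a sketch for checking the arithmetic; the variable names are assumptions, and 9.49 is the tabulated value quoted in the solution.

```python
# Absentees per working day, from Table 10.3.
absentees = {"Monday": 66, "Tuesday": 57, "Wednesday": 54, "Thursday": 48, "Friday": 75}

observed = list(absentees.values())
# Under the uniform hypothesis each day gets an equal share of the total.
expected_per_day = sum(observed) / len(observed)

chi_sq = sum((o - expected_per_day) ** 2 / expected_per_day for o in observed)
df = len(observed) - 1
critical_value = 9.49  # 4 d.f. at the 5% level, as quoted in the text

print(f"expected per day = {expected_per_day}, chi-square = {chi_sq:.2f}, d.f. = {df}")
print("reject H0" if chi_sq > critical_value else "do not reject H0")
```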
Solved Problem 5
According to a theory in Genetics, the proportion of beans of A, B, C and D
types in a generation should be 9:3:3:1. In an experiment with 1600 beans,
the frequency of bean of A, B, C and D type was observed to be 882, 313,
287 and 118 respectively. Does the result support the theory?

Solution: The steps for calculation of Chi-Square are described as follows:

1. Null hypothesis H0: The result supports the theory

Alternate hypothesis H1: The result does not support the theory

2. Level of significance is 5% and degrees of freedom (d.f.) = (4 - 1) = 3

χ²tab = 7.81

3. Test statistics

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

Table 10.4 depicts the observed and expected values for calculation of χ²
for solved problem 5.

Table 10.4: Observed and Expected Values for Calculation of χ²
Observed Value Oi   Expected Value Ei        (Oi - Ei)²   (Oi - Ei)² / Ei
882                 (1600 × 9) / 16 = 900    324          0.36
313                 (1600 × 3) / 16 = 300    169          0.56
287                 (1600 × 3) / 16 = 300    169          0.56
118                 (1600 × 1) / 16 = 100    324          3.24
                                             χ²cal =      4.72


4. Conclusion: Since χ²cal (4.72) < χ²tab (7.81), H0 is accepted. Therefore,
the result supports the theory.
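
When the theory specifies a ratio rather than equal shares, the expected frequencies are obtained by splitting the total in that ratio. The sketch below checks Solved Problem 5; the helper name expected_from_ratio is an assumption.

```python
def expected_from_ratio(total, ratio):
    """Split `total` into expected frequencies proportional to `ratio`."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


observed = [882, 313, 287, 118]          # beans of types A, B, C and D
expected = expected_from_ratio(sum(observed), [9, 3, 3, 1])

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
critical_value = 7.81  # 3 d.f. at the 5% level, as quoted in the text

print(expected)  # [900.0, 300.0, 300.0, 100.0]
print(f"chi-square = {chi_sq:.4f}, d.f. = {df}")
print("reject H0" if chi_sq > critical_value else "do not reject H0")
```

The unrounded statistic is about 4.73; the solution reports 4.72 because each term was cut to two decimals before adding. Both are below 7.81.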
Solved Problem 6
The following table gives the classification of 100 workers according to
gender and the nature of work. Test whether the nature of work is
independent of the gender of the worker.

Table 10.5
           Skilled   Unskilled   Total
Males      40        20          60
Females    10        30          40
Total      50        50          100


The steps for calculation of Chi-Square are described as follows:

1. Null hypothesis H0: The nature of work is independent of the gender of
the worker

Alternate hypothesis H1: The nature of work depends on the gender of the
worker

2. Level of significance is 5% and degrees of freedom (d.f.) =
(r - 1)(c - 1) = (2 - 1)(2 - 1) = 1
χ²tab = 3.84

3. Test statistics

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

Table 10.5a depicts the observed and expected values for calculation of χ²
for solved problem 6.

Table 10.5a: Observed and Expected Values for Calculation of χ²
Observed Value Oi   Expected Value Ei   (Oi - Ei)²   (Oi - Ei)² / Ei
40                  30                  100          3.333
10                  20                  100          5.000
20                  30                  100          3.333
30                  20                  100          5.000
                                        χ²cal =      16.666


4. Conclusion: Since χ²cal (16.666) > χ²tab (3.84), H0 is rejected.
Therefore, the null hypothesis that gender and nature of work are
independent is rejected.

10.3.3 Test for comparing variance
When we have to use χ² as a test of population variance, then:

$$H_0: \sigma_s^2 = \sigma_p^2 \quad \text{and} \quad H_A: \sigma_s^2 \neq \sigma_p^2$$

$$\chi^2 = \frac{\sigma_s^2}{\sigma_p^2}\,(n - 1)$$

Where σs² = variance of the sample
σp² = variance of the population
(n - 1) = degrees of freedom, n being the number of items in the
sample.
Then, by comparing the calculated value with the table value of χ² for
(n - 1) degrees of freedom at a given level of significance, we may either
accept or reject the null hypothesis. If the calculated value of χ² is less
than the table value, the null hypothesis is accepted, but if the calculated
value is equal to or greater than the table value, the hypothesis is
rejected.
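
The variance test can be sketched as below. Everything in the example is an assumption made for illustration: the sample values, the hypothesised population variance of 20, and the choice of the sample variance with divisor n - 1 (which is what Python's statistics.variance computes). The critical value 16.92 is the standard tabulated 5% value for 9 degrees of freedom.

```python
import statistics


def variance_chi_square(sample, population_variance):
    """Return (chi-square, d.f.) for testing a sample against a population variance."""
    n = len(sample)
    sample_variance = statistics.variance(sample)  # divisor n - 1 (an assumption)
    chi_sq = sample_variance / population_variance * (n - 1)
    return chi_sq, n - 1


# Hypothetical sample of 10 measurements; hypothesised population variance of 20.
sample = [38, 40, 45, 53, 47, 43, 55, 48, 52, 49]
chi_sq, df = variance_chi_square(sample, 20)

critical_value = 16.92  # tabulated value for 9 d.f. at the 5% level
print(f"chi-square = {chi_sq:.2f}, d.f. = {df}")
print("reject H0" if chi_sq >= critical_value else "do not reject H0")
```

Here the sum of squared deviations from the sample mean of 47 is 280, so the statistic is 280 / 20 = 14, which is below the table value.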


Self Assessment Questions

1. χ² test is a ________ test.
2. A table with 4 rows and 2 columns has ________ degrees of freedom.
3. χ² test is wholly based on ________ data.
4. If there are four rows and five columns in a classification for the χ²
test, then the number of degrees of freedom is equal to ________.
5. If the calculated χ² value is less than the tabulated χ² value, then the
null hypothesis is ________.

Activity

Objective Questions:
1. What is the appropriate test to use if you want to determine whether
there is evidence that the proportion of successes is higher in group 1
than in group 2 and we have obtained independent samples from the
two groups?
i) The Z test
ii) The Chi-Square test
iii) Both of the above
iv) None of the above

2. Which of the following values cannot occur in a Chi-Square
distribution?
i) 100.0
ii) 38.4
iii) 0.61
iv) -2.45

3. What test would you use to determine whether a set of observed
frequencies differ from their corresponding expected frequencies?
i) The t test for dependent samples
ii) The Chi-Square test
iii) The t test for independent samples
iv) The F test

4. When using the Chi-Square test for differences in two proportions with
a contingency table that has r rows and c columns, how many degrees
of freedom will the test statistic have?
i) n - 1
ii) n1 + n2 - 2
iii) (r - 1) × (c - 1)
iv) (r - 1) + (c - 1)

5. When testing for independence in a contingency table with 3 rows
and 4 columns, how many degrees of freedom will the test statistic
have?
i) 5
ii) 6
iii) 7
iv) 12

6. Which of the following is true about the Chi-Square distribution?
i) It is a skewed distribution
ii) Its shape depends on the number of degrees of freedom
iii) As the degrees of freedom increase, the Chi-Square distribution
becomes more symmetrical
iv) All of the above

7. What other name is used for a contingency table?
i) A cross-classification table
ii) An ANOVA table
iii) A histogram
iv) None of the above

Solutions to Objective Questions
1. i) The Z test
2. iv) -2.45
3. ii) The Chi-Square test
4. iii) (r - 1) × (c - 1)
5. ii) 6
6. iv) All of the above
7. i) A cross-classification table

10.4 Summary

Let us recapitulate the important concepts discussed in this unit:
- Chi-Square test is a non-parametric test. The important applications of
Chi-Square test are the tests for independence of attributes, the test of
goodness of fit and the test for specified variance.

- χ² describes the magnitude of discrepancy between the observed and the
expected frequencies. The value of χ² is calculated as:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \dots + \frac{(O_n - E_n)^2}{E_n}$$

Where O1, O2, O3, ..., On are the observed frequencies and E1, E2, E3, ..., En
are the corresponding expected or theoretical frequencies.

- An important criterion for applying the Chi-Square test is that the sample
size should be very large.

10.5 Glossary

Chi-Square test: A non-parametric test in which no rigid assumptions
regarding the population parameters are required.
Level of significance: The probability of rejecting the null hypothesis
when it is true (Type I error), fixed before the test, usually at a number
such as 0.05 (5%). The null hypothesis is rejected in favour of the
alternative if the P-value of the test is less than this level. The P-value
is the chance of getting a sample like the one being analysed if the null
hypothesis were true. A small P-value implies that getting such a sample
was highly unlikely, suggesting that the null hypothesis is probably not
true.

10.6 Terminal Questions

1. Two groups of 400 items of a material were given treatments x and y
respectively to enhance the strength of the material. 80 gained strength
by treatment x and 20 gained strength by treatment y. Does the gain in
strength depend on the treatment?
2. The demand for a particular spare part was found to vary from day to
day. Table 10.6 depicts the information obtained in a sample study.
Test the hypothesis that the number demanded depends upon the day.

Table 10.6: Spare Part Demand from Monday to Saturday
Days                 Mon    Tue    Wed    Thur   Fri    Sat
Quantity Demanded    1124   1125   1110   1120   1126   1115

3. In a survey of 200 boys, of whom 75 were intelligent, 40 of the
intelligent boys had skilled fathers, while 85 of the unintelligent boys
had unskilled fathers. Can we say on the basis of this information that
skilled fathers had intelligent boys?
4. The number of car accidents per month in a town was as follows: 6, 9, 4,
12, 8, 20, 14, 15, 2, and 10. Test the hypothesis that the number of
accidents is the same every month.
5. In a particular industry, the postgraduates, graduates and undergraduates
are in the ratio 2:3:5. A firm belonging to the industry had 400, 550 and
1050 postgraduates, graduates and undergraduates on its payroll. Do they
follow the earlier observation about the industry?
6. Three hundred digits were chosen at random from a set of tables. The
frequencies of the digits were as follows:
Digits      0    1    2    3    4    5    6    7    8    9
Frequency   28   29   33   31   26   35   32   30   31   25
Using the Chi-Square test, assess the hypothesis that the digits were
distributed in equal numbers in the table.

10.7 Answers

Self Assessment Questions
1. Non-parametric
2. 3
3. Sample
4. 12
5. Not Rejected


Terminal Questions
1. χ²cal = 41.142, H0 rejected
2. χ²cal = 0.179, H0 accepted
3. χ²cal = 8.888, H0 rejected
4. χ²cal = 26.6, H0 rejected
5. χ²cal = 6.6667, H0 rejected
6. χ²cal = 2.864, H0 accepted

10.8 Case Study

Automobile Preference
A market research firm in an Asian country made a survey to see if there
was any correlation between a person's nationality and their preference in
the make of automobile they purchased. Table 10.7 depicts the sample
information obtained.

Table 10.7: Types of Automobile Purchased in Various Countries
                Pakistan   China   India   Sri Lanka   Nepal
Maruti Suzuki   40         28      30      25          50
Opel            32         35      29      39          35
Lancer          24         40      27      28          29
Ford            40         20      40      26          40
Fiat            26         10      35      35          46

Discussion Questions:
i. Indicate the appropriate null and alternative hypotheses to test if the
make of automobile purchased is dependent on an individual's
nationality.
ii. Using the critical value approach of the Chi-Square test at a 1%
significance level, does it appear that there is a relationship between
automobile purchase and nationality?
iii. Verify the result of Question ii by using the p-value approach of the
Chi-Square test.
iv. What has to be the significance level in order that there appears a
breakeven situation between dependency of nationality and
automobile preference?
v. What is your comment about the results?


