
9.

HYPOTHESIS TESTING

1) HYPOTHESIS ASSUMPTION ( about the parameter)

NULL HYPOTHESIS ( Ho)
ALTERNATIVE HYPOTHESIS ( Ha) : One tail and two tail tests

2) Level of significance (α): the probability of making a Type I error, i.e. of rejecting the null hypothesis when it is true

3) Test statistic :

4) Calculation

5) Inference: If the table value is greater than the calculated value, we accept Ho. There is no sample evidence to reject Ho.
P value approach : Reject Ho if the p value is less than or equal to α.


One tailed and Two tailed test:

In a two tailed test, the null hypothesis is rejected if the sample statistic is significantly higher or lower than the hypothesized population parameter, i.e. the rejection region is located in both tails. In a one tailed test, the rejection region is located in only one tail, either on the right side or on the left side.

Error in sampling :

               Accept Ho        Reject Ho
Ho is true     Correct          Type I error
Ho is false    Type II error    Correct

P (rejecting Ho | Ho is true) = alpha = size of Type I error
P (accepting Ho | Ho is false) = beta = size of Type II error

(Neither error should be committed, and both should be reduced to the minimum. They can be completely eliminated only when the full population is examined. In practice the probability of a Type I error is kept down to a low limit.)





Z test : Sample mean ( pop s.d known)

1. A machine is set to produce with mean 21.3 and standard deviation 0.4. A random sample
of 625 observations has 21.33 as mean. Test whether the sample mean differs
significantly from population mean.

Solution:

Step 1: Set up Null and alternative hypothesis
H0: μ = 21.3
v/s
H1: μ ≠ 21.3

Step 2: Level of significance = 0.05


Step 3:
Test statistic: $Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$

Step 4: Calculation

Given, Population standard deviation (σ) = 0.4    Sample size (n) = 625
Sample mean ($\bar{X}$) = 21.33    Population mean (μ) = 21.3

Substituting the values we get Z(cal) as

$Z = \dfrac{21.33 - 21.3}{0.4 / \sqrt{625}} = 1.875$

Step 5: Inference
Z(cal) = 1.875; Z(tab) at 5% level of significance = 1.96

Since the table value is greater than the calculated value, there is no sample evidence to reject H0.
At 5% level of significance, μ = 21.3.
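
The five steps of this Z test can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `one_sample_z` and `two_sided_p_value` are invented here, and the critical value 1.96 is taken from the worked solution rather than computed.

```python
import math

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Z statistic for a sample mean when the population s.d. is known."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def two_sided_p_value(z):
    """Two-tailed p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Machine example: mu = 21.3, sigma = 0.4, n = 625, sample mean = 21.33
z = one_sample_z(21.33, 21.3, 0.4, 625)
p_value = two_sided_p_value(z)

# Critical-value approach (1.96 is the two-tailed table value at 5%)
reject_h0 = abs(z) > 1.96
print(round(z, 3), reject_h0)
```

The p-value approach gives the same decision: H0 is rejected only if `p_value` is less than or equal to 0.05.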

PRACTICE :

A random sample of 100 students from the current year's batch gives a mean CGPA of 3.55 and variance 0.04. Can we say that this is the same as the mean CGPA of the previous batch, which was 3.5? Test at 5% level of significance.

It has been found from experience that the mean breaking strength of a brand of thread is 500 gms, with s.d. of 40 gms. From the supplies received during last month, a sample of 60 pieces of thread was tested, which showed a mean strength of 450 gms. Can we conclude that the thread supplied is inferior? Test at 5% level of significance.

A telephone company's records indicate that individual customers pay on average Rs. 155 per month for long distance telephone calls, with standard deviation Rs. 45. A random sample of 40 customers' bills during a given month produced a sample mean of Rs. 160 for long distance calls. At 5% level of significance, can we say that the company's records indicate a lower mean than the actual one, i.e. that the actual mean is more than Rs. 155?



PROPORTIONS:

1. It is believed that 90% of potential customers are familiar with a bank's logo. A study conducted on a sample of 100 consumers revealed that only 70% recognized the logo. Is this significantly less than the expected proportion at 5% significance level?

Solution:

Step 1: Set up null and alternative hypothesis
H0: 90% of potential customers are familiar with the bank's logo, i.e. π = 0.9
H1: Less than 90% of potential customers are familiar with the bank's logo, i.e. π < 0.9

Step 2: Level of significance = 0.05

Step 3: Test statistic

$Z = \dfrac{|p - \pi|}{\sqrt{\pi (1 - \pi) / n}} \sim N(0, 1)$

Step 4: Calculation

Given, Sample size (n) = 100
Proportion of customers familiar with the bank's logo (π) = 0.9
Sample proportion (p) = 0.7, q = 1 - p = 0.3

Substituting the values we get,

$Z = \dfrac{|0.7 - 0.9|}{\sqrt{0.9 (1 - 0.9) / 100}} = 6.67$

Step 5: Inference

Z(tab) = 1.645 (at 5% level of significance); Z(cal) = 6.67

Since the table value is less than the calculated value, H0 is rejected.
We can conclude that the proportion of consumers familiar with the logo is significantly less than 90% at 5% significance level.
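
As a rough check of the arithmetic, the one-sample proportion test can be sketched as follows. The helper name `one_proportion_z` is hypothetical, and the sketch keeps the sign of the statistic (negative, because the sample proportion is below 0.9) so that the left-tail decision is explicit.

```python
import math

def one_proportion_z(p_hat, pi0, n):
    """Z statistic for a sample proportion against a hypothesised value pi0."""
    standard_error = math.sqrt(pi0 * (1 - pi0) / n)
    return (p_hat - pi0) / standard_error

# Bank logo example: pi0 = 0.9, p = 0.7, n = 100
z = one_proportion_z(0.7, 0.9, 100)

# Left-tailed test: 1.645 is the one-tailed table value at 5%
reject_h0 = z < -1.645
print(round(abs(z), 2), reject_h0)
```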

PRACTICE:

A cable TV operator claims that 50% of the homes in a city have opted for his services. Before
sponsoring advertisements on the local cable channel; a company conducted a survey, and found
that 280 out of 600 persons were found to have cable TV services from the operator. On the basis
of this data can we accept the claim of the cable TV operator? Test at 5% level of significance

In a departmental store, 380 customers out of a sample of 800 customers were found to be using
visa credit card. Discuss whether this information supports the view that the majority of
customers of the store are using cards other than visa.

A manufacturer of LCD TV claims that it is becoming quite popular, and that about 5% homes
are having LCD TV. However, a dealer of conventional TVs claims that the percentage of homes
with LCD TV is less than 5%. A sample of 400 household is surveyed, and it is found that only
18 households have LCD TV. Test at 1% level of significance whether the claim of the company
is tenable.


Equality of two means


A civil rights group in the city claims that a female college graduate earns less than a male
college graduate. To test this claim, a survey of starting salary of 60 male graduates and 50
female graduates was taken. It was found that the average starting salary for female graduates
was Rs. 29,500 with a standard deviation of Rs. 500 and the average salary for male graduates
was Rs. 30,000 with a standard deviation of Rs. 600. At 1% level of significance, test if the claim
of this civil rights group is valid.

Solution:

Step 1: Set up Null and alternative hypothesis

H0: A female college graduate earns at least as much as a male college graduate.
v/s
H1: A female college graduate earns less than a male college graduate. (ONE TAIL TEST)


Step 2: Level of significance = 0.01


Step 3: Test statistic

$Z = \dfrac{|\bar{X}_1 - \bar{X}_2|}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \sim N(0, 1)$


Step 4: Calculation

Given, Male graduates (n1) = 60    Female graduates (n2) = 50
Average salary of male graduates ($\bar{x}_1$) = 30,000
Average salary of female graduates ($\bar{x}_2$) = 29,500
Standard deviation in salary of male graduates (s1) = 600
Standard deviation in salary of female graduates (s2) = 500

Substituting the values we get

$Z = \dfrac{30000 - 29500}{\sqrt{\dfrac{(600)^2}{60} + \dfrac{(500)^2}{50}}} = 4.767$

Step 5: Inference

Z(tab) = 2.33 (at 1% level of significance) and Z(cal) = 4.767

The table value at 1% level of significance, 2.33, is less than the calculated value 4.767, so H0 is rejected.
We can conclude that a female college graduate earns less than a male college graduate.
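
The calculation above can be reproduced with a short sketch. The function name `two_sample_z` is made up for illustration; the table value 2.33 is the one quoted in the solution.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for the difference of two means from large samples."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Sample 1 = male graduates, sample 2 = female graduates
z = two_sample_z(30000, 600, 60, 29500, 500, 50)

# Right-tailed test at 1%: H1 says the male mean exceeds the female mean
reject_h0 = z > 2.33
print(round(z, 3), reject_h0)
```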



PRACTICE:
An automobile company is interested in testing the mileage given by one of the car brands in two
different cities, Mumbai and Delhi. The company surveyed 100 car owners in Mumbai and found
that the average mileage is 12 Kms per litre. Out of 150 car owners in Delhi, the mileage
averaged to 12.5 Kms per litre. The standard deviation for mileage of this brand of car is known
to be 0.9 kms. Can we state that these two cities give different mileage? Test at 5% level of
significance


Difference in proportion :

Out of 80 batteries produced using process I, 3 batteries were found defective. In another sample of 130 batteries produced using process II, 2 batteries were found defective. Test whether the proportions of defectives in the two processes differ, using 1% level of significance.

Solution:

Step 1: Set up Null and alternative hypothesis

H0: The proportion of defective batteries in the two processes does not differ.

H1: The proportion of defective batteries in the two processes does differ significantly.


Step 2: Level of significance = 0.01


Step 3: Test statistic

$Z = \dfrac{|p_1 - p_2|}{\sqrt{p q \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \sim N(0, 1)$


Step 4: Calculation

Given, Sample size (n1) = 80    No. of defective batteries in sample 1 (x1) = 3
Sample proportion $p_1 = \dfrac{x_1}{n_1} = \dfrac{3}{80} = 0.0375$

Sample size (n2) = 130    No. of defective batteries in sample 2 (x2) = 2
Sample proportion $p_2 = \dfrac{x_2}{n_2} = \dfrac{2}{130} = 0.0154$

$p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $q = 1 - p$

Substituting the values we get

$p = \dfrac{80 (0.0375) + 130 (0.0154)}{80 + 130} = 0.0238$

$q = 1 - 0.0238 = 0.9762$

Substituting the values for p and q in the test statistic we get,

$Z = \dfrac{0.0375 - 0.0154}{\sqrt{(0.0238 \times 0.9762) \left( \dfrac{1}{80} + \dfrac{1}{130} \right)}} = 1.0210$


Step 5: Inference

Since the table value at 1% level is 2.58 and the calculated Z value is 1.0210, which is less than the table value, there is no sample evidence to reject H0.

At 1% level of significance, the proportion of defective batteries in the two processes does not differ significantly.
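
A minimal sketch of the pooled two-proportion test follows, assuming the defect counts from the battery example. The name `two_proportion_z` is hypothetical. Because the sketch works from the raw counts rather than the rounded proportions, its result may differ from the hand calculation in the last decimal places.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for two proportions using the pooled estimate p."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # same as (n1*p1 + n2*p2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# Process I: 3 defectives out of 80; process II: 2 defectives out of 130
z = two_proportion_z(3, 80, 2, 130)

# Two-tailed test at 1%: table value 2.58
reject_h0 = abs(z) > 2.58
print(round(z, 2), reject_h0)
```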


PRACTICE:

A firm wanted to choose a popular actor to be the brand ambassador for the firm's product. However, before taking the final decision, the firm conducted a market survey to know the opinion of its customers in Mumbai and Delhi. The surveys conducted in the two cities revealed that while 290 out of 400 customers favoured the choice in Mumbai, only 160 out of 300 customers favoured the choice in Delhi. Can the firm conclude that the proportions of customers who favoured the actor in Mumbai and Delhi are the same?
( p = 0.643, q = 0.357, SE = 0.037, p1 = 0.725, p2 = 0.533, Z cal = 5.24 )

An investment consultancy firm finds that 71 investors out of 100 investors in city A prefer
equity investments and 66 investors out of 90 investors in city B prefer equity investments. Test
at 1% level of significance, whether the two cities differ in the proportion of investors preferring
equity?











T test ( mean : population sd unknown)


An automobile tyre manufacturer claims that the average life of their tyres is more than 20,000 kms. A random sample of 16 tyres was tested and found to have a mean and standard deviation of 22,000 km and 5000 km respectively. Find whether the manufacturer's claim is valid at 5% level of significance.

Solution:
Step 1: Set up Null and alternative hypothesis

H0: The average life of tyres is at most 20,000 kms, i.e. μ <= 20,000 kms
V/s
H1: The average life of tyres is more than 20,000 kms, i.e. μ > 20,000 kms


Step 2: Level of significance = 0.05

Step 3: Test statistic

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} \sim t$ with $(n - 1)$ degrees of freedom


Step 4: Calculation
Given, Sample size (n) = 16    Sample mean ($\bar{x}$) = 22000
Sample standard deviation (s) = 5000    The average life of tyres (μ) = 20000

Substituting the values we get,

$t = \dfrac{22000 - 20000}{5000 / \sqrt{16}} = 1.6$

Step 5: Inference
Table value at 15 degrees of freedom and 5% level of significance = 1.753, whereas the calculated value = 1.6.
Since the table value is greater than the calculated value, there is no sample evidence to reject H0.
At 5% level of significance, the sample does not support the claim that the average life of tyres is more than 20,000 kms.
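
The same one-sample t calculation, written as a short sketch. The function name `one_sample_t` is illustrative only, and the table value 1.753 (15 df, one tail, 5%) is copied from the solution rather than computed.

```python
import math

def one_sample_t(sample_mean, hypothesised_mean, sample_sd, n):
    """t statistic for a sample mean when the population s.d. is unknown."""
    return (sample_mean - hypothesised_mean) / (sample_sd / math.sqrt(n))

# Tyre example: n = 16, sample mean = 22000, s = 5000, mu = 20000
t = one_sample_t(22000, 20000, 5000, 16)
degrees_of_freedom = 16 - 1

# Right-tailed test: reject H0 only if t exceeds the table value
reject_h0 = t > 1.753
print(t, degrees_of_freedom, reject_h0)
```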

PRACTICE:

A sample of size 10 drawn from a normal population has mean 31 and variance 2.25. Is it reasonable to assume that the mean of the population is 30? Assume alpha = 0.01.
( t = (x̄ - μ)/SE ~ t(α/2) with (n - 1) df; t cal = 2.11; table value = 3.25 )


A car manufacturer claims that its new car gives a mileage of at least 10 kms per litre of petrol.
A sample of 10 cars is taken , and their mileage recorded as follows ( in kmpl)
11.2 10.7 11.3 11.0 10.8 10.7 10.6 10.6 10.7 10.4
Is there any statistical evidence to support the claim of the manufacturer about the mileage of its
car? Test at 5% level of significance
H1: μ > 10
$s = \sqrt{\dfrac{\sum x^2 - n \bar{x}^2}{n - 1}} = 0.283$; mean = 10.8; t (cal) = 8.94; t (tab) = 1.83


The mean nicotine content of a brand of cigarette is 20.0 mg. A new process is proposed to lower the nicotine content without affecting the flavor. To test the new process, 16 cigarettes are selected at random from the week's output from the test plant. The sample mean nicotine content is found to be 18.5 mg. If the s.d. of nicotine content is calculated to be 2 mg, is the claim of the new process justified? Use 5% level of significance.
H1: μ < 20.0; t = -3.0; t (tab) = -1.75



Equality of mean :


The mean return of a portfolio of 15 shares listed on NSE, is found to be 18% with standard
deviation 4%. Another portfolio of 25 shares is having mean return of 22% with standard
deviation 5%. Do the two groups significantly differ in their yield at 5% level of significance?

Solution:

Step 1: Set up Null and alternative hypothesis

H0: There is no significant difference in the average yield between the two portfolios.

v/s

H1: There is significant difference in the average yield between the two portfolios.


Step 2: Level of significance = 0.05


Step 3: Test statistic

$t = \dfrac{|\bar{x}_1 - \bar{x}_2|}{S \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t$ with $(n_1 + n_2 - 2)$ degrees of freedom

Where $S^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$

Step 4: Calculation

Given, No. of shares in portfolio 1 (n1) = 15    Standard deviation of portfolio 1 (s1) = 4
Mean return of portfolio 1 ($\bar{x}_1$) = 18%
No. of shares in portfolio 2 (n2) = 25    Standard deviation of portfolio 2 (s2) = 5
Mean return of portfolio 2 ($\bar{x}_2$) = 22%

$S = \sqrt{\dfrac{(15 - 1)(4)^2 + (25 - 1)(5)^2}{15 + 25 - 2}} = \sqrt{\dfrac{824}{38}} = 4.6566$

Substituting values in the test statistic we get

$t = \dfrac{|18 - 22|}{4.6566 \sqrt{\dfrac{1}{15} + \dfrac{1}{25}}} = 2.630$

Step 5: Inference

Table value at 38 degrees of freedom and at 5% level of significance = 2.024
Calculated value = 2.630
Since the table value is less than the calculated value, Ho is rejected. Therefore we can conclude that there is a significant difference in the average yield between the two portfolios.
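
The pooled two-sample t test can be checked with the sketch below. The function name `pooled_t` is hypothetical. With the divisor n1 + n2 - 2 = 38 that the formula in Step 3 specifies, the sketch yields a pooled S of about 4.66 and |t| of about 2.63, which leads to the same decision against the table value 2.024.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, pooled s.d.) for two independent small samples."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, math.sqrt(pooled_var)

# Portfolio 1: n = 15, mean 18, s.d. 4; portfolio 2: n = 25, mean 22, s.d. 5
t, pooled_sd = pooled_t(18, 4, 15, 22, 5, 25)
degrees_of_freedom = 15 + 25 - 2

# Two-tailed test at 5% with 38 df: table value 2.024
reject_h0 = abs(t) > 2.024
print(round(pooled_sd, 4), round(abs(t), 3), reject_h0)
```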

In a test given to two groups of students , the marks obtained are as follows:
First Group : 18 20 36 50 49 36 34 49 41
Second group : 29 28 26 35 30 44 46
Examine the significance of the difference between the mean of the marks secured by the
students of the above two groups at 5% level of significance.
( x̄1 = 37, x̄2 = 34, s = 10.76, t cal = 0.551, t tab = 2.14; accept Ho. )

PRACTICE:

Strength tested on two varieties of yarn gave the following results.
Sample Size Mean Sample Variance
Type A 4 52 42
Type B 9 42 56

Is there a significant difference in the mean? Test at 5 % level of significance.



A car manufacturer is procuring car batteries from two companies. For testing whether the two
brands of batteries say A and B , had the same life, the manufacturer collected data about the
lives of both brands of batteries from 20 car owners- 10 using A brand and 10 using B brand.
The lives were reported as follows:
Lives in Months
Battery A : 50 61 54 60 52 58 55 56 54 53
Battery B : 65 57 60 55 58 59 62 67 56 61
Test whether both the brands of batteries have the same life. ( s = 3.68, t = -2.85, t(tab) = 2.101 )








DEPENDENT SAMPLE (PAIRED T TEST)

A company ABC of New Delhi goes for advertisement on Zee TV to be shown during prime
time. It gets its sales records of six months prior to advertising and of six months after
advertisement, which is as follows:
Month                                       1    2    3    4    5    6
Sales before advertisement (Rs. in lakh)    43   20   35   45   52   47
Sales after advertisement (Rs. in lakh)     48   21   34   52   46   50
Test at 5% level of significance whether advertising has any effect on sales.

Solution:

Step 1: Set up Null and alternative hypothesis

Ho: Advertisement has no effect on sales.
V/s
H1: Advertisement has an effect on sales.

Step 2: Level of significance = 0.05

Step 3: Test statistic

$t = \dfrac{\bar{d}}{s / \sqrt{n}} \sim t$ with $(n - 1)$ degrees of freedom (df)

Where, d = Sales after advertisement - Sales before advertisement

Step 4: Calculation


Month    Sales after (A)    Sales before (B)    d = A - B    d - d̄    (d - d̄)²
1        48                 43                  5            3.5       12.25
2        21                 20                  1            -0.5      0.25
3        34                 35                  -1           -2.5      6.25
4        52                 45                  7            5.5       30.25
5        46                 52                  -6           -7.5      56.25
6        50                 47                  3            1.5       2.25
Total                                           9                      107.5

$\bar{d} = \dfrac{9}{6} = 1.5$

Standard deviation $s = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\dfrac{107.5}{6 - 1}} = 4.6368$

Substituting the values we get,

$t = \dfrac{1.5}{4.6368 / \sqrt{6}} = 0.7924$

Step 5: Inference

The t table value at 5% level with 5 df is 2.571 and the calculated value is 0.7924.
Since the table value is greater than calculated value, there is no sample evidence to
reject Ho.
Therefore, advertisement has no effect on sales.
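
The paired t test reduces to a one-sample t test on the differences, which the following sketch makes explicit. It uses only the standard library; the variable names are illustrative, and the table value 2.571 is the one quoted in the solution.

```python
import math
import statistics

before = [43, 20, 35, 45, 52, 47]  # sales before advertisement
after = [48, 21, 34, 52, 46, 50]   # sales after advertisement

# Differences d = after - before, one per month
d = [a - b for a, b in zip(after, before)]
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)  # sample s.d. with divisor n - 1

t = d_bar / (s_d / math.sqrt(len(d)))
degrees_of_freedom = len(d) - 1

# Two-tailed test at 5% with 5 df: table value 2.571
reject_h0 = abs(t) > 2.571
print(d_bar, round(s_d, 4), round(t, 4), reject_h0)
```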



PRACTICE:

As per ET- TNS consumer confidence survey, the consumer confidence indices for some of the
cities changed from Dec to Sep as follows. Is the difference significant?
City Dec Sep
A 106 83
B 117 142
C 112 126
D 123 108
E 83 84
F 137 144
G 137 138
H 113 134

( standard error of d̄ = 5.90; d̄ = -3.875; t cal = -0.657; t tab = 2.365; H1: μ1 ≠ μ2 )

Five salesmen were given a one week specialized training for improving their selling skills. The following data on their sales per month was recorded during the month preceding the training and the month after the training. Can we conclude that the training has made a significant impact?
Before : 5 6.2 5.4 4.5 5.6
After : 5.5 7 5.6 5.5 6.6
( d mean = 0.7, s = 0.346, t cal = 4.54, t tab = 2.78 (right tail) )

TEST FOR CORRELATION:


1. Is a correlation coefficient of 0.5 significant if obtained from a random sample of 11 pairs
of values from a normal distribution? Use t test. (5% level of significance)


Solution:

Ho: The variables are uncorrelated in the population, i.e. ρ = 0
v/s
H1: The variables are correlated in the population, i.e. ρ ≠ 0

Level of significance = 0.05

Test statistic

$t = \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \sim t$ with $(n - 2)$ df    (1)

Calculation:
Given, n = 11, ρ = 0, r = 0.5
Substituting the values in eq. (1) we get,

$t = \dfrac{0.5 - 0}{\sqrt{\dfrac{1 - (0.5)^2}{11 - 2}}} = 1.732$

t (cal) = 1.732

Inference: t (tab) value at 9 df at 5% level is 2.262
Since the table value is greater than the calculated value, there is no sample evidence to reject Ho.
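
A small sketch of the t test for a correlation coefficient under Ho: ρ = 0 is shown below. The function name `correlation_t` is invented for illustration; it uses the algebraically equivalent form r·√(n − 2)/√(1 − r²).

```python
import math

def correlation_t(r, n):
    """t statistic for testing that the population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# r = 0.5 from n = 11 pairs
t = correlation_t(0.5, 11)
degrees_of_freedom = 11 - 2

# Two-tailed test at 5% with 9 df: table value 2.262
reject_h0 = abs(t) > 2.262
print(round(t, 3), reject_h0)
```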

PRACTICE:
A value of r = 0.6 is calculated for a random sample of 39 pairs of observations from a bivariate normal population. Is this value of r consistent with the hypothesis that ρ = 0.4? (1% level of significance)

In a sample of size 18, a correlation coefficient of 0.62 was observed. Does this indicate a significant correlation in the population? (1% significance level)

CHI SQUARE

Conditions :
ΣO = ΣE (the observed and expected frequencies have the same total)
If there are only two cells, the expected frequency in each cell should be 5 or more.

Uniform :

1. The demand for a particular spare part was found to vary from day to day. In a sample
study the following information was obtained.
Days                 Monday   Tuesday   Wednesday   Thursday   Friday   Saturday
Quantity demanded    1124     1125      1110        1120       1126     1115

Test the hypothesis at 1% level of significance that the number demanded depends upon
the day.

Solution:

Step 1: Set up the null and alternative hypothesis:
H0: The demand for the spare part is uniform on all days
v/s
H1: The demand for the spare part varies from day to day.

Step 2: Level of significance. = 0.01

Step 3: Test statistic

$\chi^2 = \sum \dfrac{(O - E)^2}{E} \sim \chi^2$ with $(n - 1)$ degrees of freedom
Where,
O = observed values
E = Expected values

Step 4: Calculation
O       E       O - E    (O - E)²    (O - E)²/E
1124    1120    4        16          0.014286
1125    1120    5        25          0.022321
1110    1120    -10      100         0.089286
1120    1120    0        0           0
1126    1120    6        36          0.032143
1115    1120    -5       25          0.022321
Total                                0.180357

Where,
$E = \dfrac{1124 + 1125 + 1110 + 1120 + 1126 + 1115}{6} = 1120$

Step 5: Inference
The tabulated χ² with 5 (= 6 - 1) degrees of freedom at 1% level of significance is 15.0863 (from Chi-square distribution table), whereas the calculated value of χ² is 0.180.
The tabulated value is greater than the calculated value. Hence, there is no sample evidence to reject H0.
Therefore, the demand for the spare part is uniform on all days.
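
The table in Step 4 can be reproduced with a short sketch. The helper `chi_square_statistic` is a hypothetical name; the critical value 15.0863 is taken from the solution.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Quantity demanded, Monday to Saturday
demand = [1124, 1125, 1110, 1120, 1126, 1115]

# Under H0 (uniform demand) every day has the same expected value
expected = [sum(demand) / len(demand)] * len(demand)

chi2 = chi_square_statistic(demand, expected)
degrees_of_freedom = len(demand) - 1

# Table value at 1% with 5 df
reject_h0 = chi2 > 15.0863
print(round(chi2, 6), degrees_of_freedom, reject_h0)
```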

PRACTICE:
The number of car accidents per month in a town was as follows: 6, 9, 4, 12, 8, 20, 14, 15, 2, and
10. Test the hypothesis that number of accidents is same every month. Test at 1% level of
significance.


Independence

1. The following table gives the liking of a particular car model by different age groups.
                            AGE
                            Below 20   20-39   40-59   60 and above   Total
Persons who liked car       140        80      40      20             280
Persons who disliked car    60         50      30      80             220
Total                       200        130     70      100            500

Test at 5% level of significance whether liking and age are independent.

Solution:

Step 1: Set up the null and alternative hypothesis
H0: The liking of a particular car model and age are independent.
v/s
H1: The liking of a particular car model and age are dependent.

Step 2: Level of significance = 0.05

Step 3: Test statistic

$\chi^2 = \sum \dfrac{(O - E)^2}{E} \sim \chi^2$ with $(r - 1)(c - 1)$ degrees of freedom
Where,
O = observed values
E = Expected values = $\dfrac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$


Step 4: Calculation
O      E       (O-E)    (O-E)²    (O-E)²/E
140    112     28       784       7
60     88      -28      784       8.909091
80     72.8    7.2      51.84     0.712088
50     57.2    -7.2     51.84     0.906294
40     39.2    0.8      0.64      0.016327
30     30.8    -0.8     0.64      0.020779
20     56      -36      1296      23.14286
80     44      36       1296      29.45455
Total                             70.16198

Where,
$E = \dfrac{280 \times 200}{500} = 112$; other values for E can be found in the same manner.



Step 5: Inference
The tabulated χ² with 3 degrees of freedom {(2 - 1) x (4 - 1)} at 5% level is 7.81473 (from Chi-square distribution table), whereas the calculated value of χ² is 70.16198.
Since the tabulated value is less than the calculated value, H0 is rejected.
Therefore, the liking of a particular car model and age are dependent.
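
The expected frequencies and the χ² value for a contingency table can be computed as in the sketch below. The function names are made up for illustration; the rows are liked/disliked and the columns are the four age groups.

```python
def expected_counts(table):
    """E = row total * column total / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom)."""
    expected = expected_counts(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

liking = [
    [140, 80, 40, 20],  # persons who liked the car
    [60, 50, 30, 80],   # persons who disliked the car
]
chi2, df = chi_square_independence(liking)

# Table value at 5% with 3 df
reject_h0 = chi2 > 7.81473
print(round(chi2, 5), df, reject_h0)
```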

PRACTICE:
1. A survey of 200 boys was conducted. 75 boys were found intelligent, out of which 40
boys had skilled fathers and 85 boys of the unintelligent boys had unskilled fathers. Can
we say on the basis of the information that skilled fathers had intelligent boys? Test at 5%
level of significance.

2. Five hundred students in a school were graded according to their intelligence and the
economic conditions of their homes. Examine whether there is any association between
economic conditions at home and intelligence at 5% level of significance.
Intelligence
Good Bad Total
Rich 85 75 160
Poor 165 175 340
Total 250 250 500


Ratio

1. In a particular industry the undergraduates, graduates, and post graduates are in the ratio
5:3:2. A firm belonging to the industry had 1050, 550 and 400 undergraduates, graduates
and postgraduates on its pay-roll. Does the firm follow earlier observation (ratio) about
the industry? Test at 5% level of significance.


Solution:



Step 1: Set up the null and alternative hypothesis

H0: The observations are in the ratio 5:3:2
v/s
H1: The observations are not in the ratio 5:3:2



Step 2: Level of significance = 0.05



Step 3: Test statistics


$\chi^2 = \sum \dfrac{(O - E)^2}{E} \sim \chi^2$ with $(n - 1)$ degrees of freedom
Where,
O = observed values
E = Expected values

Step 4: Calculation


O       E       (O-E)    (O-E)²    (O-E)²/E
400     400     0        0         0
550     600     -50      2500      4.166667
1050    1000    50       2500      2.5
Total                              6.666667


Step 5: Inference

The tabulated χ² (2 degrees of freedom) at 5% level is 5.99 and the calculated value is 6.67.
Since the tabulated value is less than the calculated value, H0 is rejected.
Therefore, the observations are not in the ratio 5:3:2.
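
The expected values in the table come from splitting the total of 2000 employees in the ratio 5:3:2. A hedged sketch of that step and of the test, with the hypothetical helper `expected_from_ratio`:

```python
def expected_from_ratio(ratio, total):
    """Split a total into expected frequencies according to a ratio."""
    return [total * part / sum(ratio) for part in ratio]

# Undergraduates, graduates, postgraduates on the firm's pay-roll
observed = [1050, 550, 400]
expected = expected_from_ratio([5, 3, 2], sum(observed))

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

# Table value at 5% with 2 df
reject_h0 = chi2 > 5.99
print(expected, round(chi2, 4), reject_h0)
```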

Goodness of fit

A book has 700 pages. The number of pages with various numbers of misprints is recorded
below:
No. of misprints :              0     1    2    3   4   5
No. of pages with misprints :   616   70   10   2   1   1
Can a Poisson distribution be fitted to this data?
( mean = 0.15; expected frequencies = 602.5, 90.38, 6.78, 0.34, 0.013, 0; calculated value = 44.159; df = 6-1-3 = 2; reject Ho )

Note 1 : [ Yates' continuity correction : The chi-square distribution is continuous, but the data are categorical and hence discrete. The correction suggested by Yates for a 2x2 contingency table with cell frequencies a, b, c, d and total N is as follows:

$\chi^2 = \dfrac{N \left( |ad - bc| - N/2 \right)^2}{(a + b)(c + d)(a + c)(b + d)}$

Or

$\chi^2 = \sum \dfrac{(|O - E| - 0.5)^2}{E}$ ]


Note 2 : Inference about a population variance :

$\chi^2 = \dfrac{(n - 1) s^2}{\sigma_0^2} \sim \chi^2$ with $(n - 1)$ df
A random sample of size 25 from a population gives the sample standard deviation of 8.5. Test
the hypothesis that the population standard deviation is 10.

F test :

Ratio of variance
1. The percentage sugar content in tobacco for two samples was found to be as follows:

Sample A    2.4   2.7   2.6   2.1   2.5
Sample B    2.7   3.0   2.8   3.1   2.2   3.6

Test whether their population variances are the same. Test at 5% level of significance.
Notation for a 2x2 contingency table (used in Note 1 above):
Attributes    A      Not A    Total
B             a      b        a+b
Not B         c      d        c+d
Total         a+c    b+d      N


Solution:


Step 1: Set up the null and alternative hypothesis

H0: there is no significant difference between population variances of the two samples, i.e. $\sigma_1^2 = \sigma_2^2$
v/s
H1: there is significant difference between population variances of the two samples, i.e. $\sigma_1^2 \neq \sigma_2^2$


Step 2: Level of significance = 0.05


Step 3: Test statistics

$F = \dfrac{S_1^2}{S_2^2} \sim F$ with $(n_1 - 1), (n_2 - 1)$ degrees of freedom
Where, $S_1^2$ = variance for sample 1
$S_2^2$ = variance for sample 2

Step 4: Calculation

X1      (X1 - X̄1)    (X1 - X̄1)²    X2      (X2 - X̄2)    (X2 - X̄2)²
2.4     -0.06         0.0036         2.7     -0.2          0.04
2.7     0.24          0.0576         3       0.1           0.01
2.6     0.14          0.0196         2.8     -0.1          0.01
2.1     -0.36         0.1296         3.1     0.2           0.04
2.5     0.04          0.0016         2.2     -0.7          0.49
                                     3.6     0.7           0.49
12.3                  0.212          17.4                  1.08



$S_1^2 = \dfrac{1}{n_1 - 1} \sum_{i=1}^{n_1} (X_{1i} - \bar{X}_1)^2 = \dfrac{0.212}{5 - 1} = 0.053$

$S_2^2 = \dfrac{1}{n_2 - 1} \sum_{i=1}^{n_2} (X_{2i} - \bar{X}_2)^2 = \dfrac{1.08}{6 - 1} = 0.216$

$F = \dfrac{S_2^2}{S_1^2}$ (since $S_2^2 > S_1^2$)

$F = \dfrac{0.216}{0.053} = 4.07547$

F (calculated value) = 4.07547

F(5, 4) degrees of freedom at 5% significance level = 6.26

Step 5: Inference

Since the tabulated value is greater than the calculated value, there is no sample evidence to reject H0.
Therefore, there is no significant difference between the variances.
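
A sketch of the variance-ratio calculation using the standard library follows. As in the solution, the larger sample variance goes in the numerator, so the degrees of freedom are ordered accordingly; the table value 6.26 is copied from the solution.

```python
import statistics

sample_a = [2.4, 2.7, 2.6, 2.1, 2.5]
sample_b = [2.7, 3.0, 2.8, 3.1, 2.2, 3.6]

# Sample variances with divisor n - 1
var_a = statistics.variance(sample_a)
var_b = statistics.variance(sample_b)

# Put the larger variance in the numerator so that F >= 1
if var_b >= var_a:
    f_ratio = var_b / var_a
    df = (len(sample_b) - 1, len(sample_a) - 1)
else:
    f_ratio = var_a / var_b
    df = (len(sample_a) - 1, len(sample_b) - 1)

# F(5, 4) table value at 5%
reject_h0 = f_ratio > 6.26
print(round(var_a, 3), round(var_b, 3), round(f_ratio, 4), df, reject_h0)
```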

PRACTICE:

1. Time taken by workers in performing a job are given below:
Method 1 20 16 26 27 23 22
Method 2 27 33 42 36 32 34 38
Test whether there is any significant difference between the variances of time distribution at 5%
level.

2. Techgene, Inc., is concerned about variability in the number of bacteria produced by
different cultures. If the cultures have significantly different variability in the number of
bacteria produced, then experiments are messed up and some strange things get produced.
(The management of the company gets understandably anxious when the scientists
produce strange things). The following data have been collected:
Number of Bacteria (in thousands)

Culture Type A    91   89   83   101   144   118   108   125   138
Culture Type B    62   76   90   75    110   140   145   130   110
a) Compute the variance for culture type A and B.
b) State explicit null and alternative hypotheses and then test at the 0.01 significance
level.

ANOVA one way

1. The following figures relate to the production in kg of three varieties A, B and C sown in 12 plots:
A 14 16 18
B 14 13 15 22
C 18 16 19 19 20
Is there any significant difference in the production of the three varieties?


2. Four brands of bulbs were tested for their life time (in '000 hours). From the following data obtained, test whether the life durations of the bulbs are significantly different.

Make I      20   23   18   17
Make II     19   15   17   20   16
Make III    21   19   20   17   16
Make IV     15   17   16   18

Solution:


Step 1: Set up the null and alternative hypothesis

Ho: There is no significant difference between the lengths of life of the bulbs.
Make I = Make II = Make III = Make IV

v/s

H1: There is significant difference between the lengths of life of the bulbs.
Not all of Make I, Make II, Make III and Make IV have the same mean life.


Step 2: Level of significance = 0.05


Step 3: Test statistics

$F = \dfrac{S_1^2}{S_2^2}$
Where, $S_1^2$ = variance between samples
$S_2^2$ = variance within samples


Step 4: Calculation

Make I      20   23   18   17          (total 78)
Make II     19   15   17   20   16     (total 87)
Make III    21   19   20   17   16     (total 93)
Make IV     15   17   16   18          (total 66)

Correction factor (CF) = $\dfrac{(\text{Grand total})^2}{n} = \dfrac{(324)^2}{18} = 5832$

Total sum of squares (TSS) = (Sum of squares of all the observations) - CF
Sum of squares of all observations = $20^2 + 23^2 + 18^2 + 17^2 + 19^2 + 15^2 + 17^2 + 20^2 + 16^2 + 21^2 + 19^2 + 20^2 + 17^2 + 16^2 + 15^2 + 17^2 + 16^2 + 18^2 = 5914$
TSS = 5914 - 5832 = 82

Treatment sum of squares (TrSS) = $\sum_i \dfrac{T_i^2}{n_i} - CF = \dfrac{(78)^2}{4} + \dfrac{(87)^2}{5} + \dfrac{(93)^2}{5} + \dfrac{(66)^2}{4} - 5832 = 21.6$

ANOVA TABLE

Source              D.F            Sum of squares    Mean sum of squares    F calculated               F tabulated
Treatment (Make)    4 - 1 = 3      21.6              7.2                    7.2 / 4.31428 = 1.6688     F(3, 14) = 3.34
Error               14             60.4              4.31428
Total               18 - 1 = 17    82


Step 5: Inference

Since the table value is greater than the calculated value, there is no sample evidence to reject Ho. Therefore, there is no significant difference between the lengths of life of the bulbs, i.e. Make I = Make II = Make III = Make IV.
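
The correction-factor method used above can be sketched as follows. Two assumptions are made here: the function name `one_way_anova` is invented, and the split of the 18 observations into makes of sizes 4, 5, 5 and 4 is inferred from the group totals 78, 87, 93 and 66 used in the treatment sum of squares.

```python
def one_way_anova(groups):
    """Return (F, treatment df, error df) via the correction-factor method."""
    all_values = [x for group in groups for x in group]
    n = len(all_values)
    cf = sum(all_values) ** 2 / n                       # correction factor
    tss = sum(x * x for x in all_values) - cf           # total sum of squares
    trss = sum(sum(g) ** 2 / len(g) for g in groups) - cf  # between groups
    ess = tss - trss                                    # within groups (error)
    df_treatment = len(groups) - 1
    df_error = n - len(groups)
    f_value = (trss / df_treatment) / (ess / df_error)
    return f_value, df_treatment, df_error

makes = [
    [20, 23, 18, 17],      # Make I
    [19, 15, 17, 20, 16],  # Make II
    [21, 19, 20, 17, 16],  # Make III
    [15, 17, 16, 18],      # Make IV
]
f_value, df_treatment, df_error = one_way_anova(makes)

# F(3, 14) table value at 5%
reject_h0 = f_value > 3.34
print(round(f_value, 4), df_treatment, df_error, reject_h0)
```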



PRACTICE:
1. The following data show the number of claims processed per day for a group of four insurance company employees observed for a number of days. Test the hypothesis that the employees' mean claims processed per day are the same. Use the 0.05 level of significance.

Employee 1    15   17   14   12
Employee 2    12   10   13   17
Employee 3    11   14   13   15   12   9
Employee 4    13   12   12   14   10


2. A research company has designed three different systems to clean up oil spills. The
following table contains the results, measured by how much surface area (in square
meters) is cleared in 1 hour. The data were found by testing each method in several trials.
Are the three systems equally effective? Use the 0.05 level of significance


System A    55   60   63   56   59   55
System B    57   53   64   49   62
System C    66   52   61   57




ANOVA 2 way

1. The data on production rate by five work men on four machines are as follows. Test
whether the rate is significantly different due to workers and machines. Test at 5% level
of significance.

Machines    Workmen
            1     2     3     4     5
1           46    48    36    35    40
2           40    42    38    40    44
3           49    54    46    48    51
4           38    45    34    35    41

Solution:

Step 1: Set up the null and alternative hypothesis

H01: There is no significant difference between the workers.
V/s
H11: There is significant difference between the workers.

H02: There is no significant difference between the machines.
V/s
H12: There is significant difference between the machines.



Step 2: Level of significance = 0.05



Step 3: Test statistic

$F = \dfrac{S_1^2}{S_2^2}$
Where, $S_1^2$ = variance between samples
$S_2^2$ = variance within samples


Step 4: Calculation

M/cs     Workmen                          Total
         1      2      3      4      5
1        46     48     36     35     40     205
2        40     42     38     40     44     204
3        49     54     46     48     51     248
4        38     45     34     35     41     193
Total    173    189    154    158    176    850



Correction factor (CF) = $\dfrac{(\text{Grand total})^2}{n} = \dfrac{(850)^2}{20} = 36125$

Total sum of squares (TSS) = (Sum of squares of all the observations) - CF
Sum of squares of all observations = $46^2 + 48^2 + 36^2 + 35^2 + 40^2 + 40^2 + 42^2 + 38^2 + 40^2 + 44^2 + 49^2 + 54^2 + 46^2 + 48^2 + 51^2 + 38^2 + 45^2 + 34^2 + 35^2 + 41^2 = 36754$

TSS = 36754 - 36125 = 629

Treatment sum of squares (TrSS) = $\sum_i \dfrac{T_i^2}{n_i} - CF$

Machine sum of squares = $\dfrac{(205)^2}{5} + \dfrac{(204)^2}{5} + \dfrac{(248)^2}{5} + \dfrac{(193)^2}{5} - 36125 = 353.8$

Workmen sum of squares = $\dfrac{(173)^2}{4} + \dfrac{(189)^2}{4} + \dfrac{(154)^2}{4} + \dfrac{(158)^2}{4} + \dfrac{(176)^2}{4} - 36125 = 201.5$


ANOVA TABLE

Source     D.F             Sum of squares    Mean sum of squares    F calculated                  F tabulated
Machine    4 - 1 = 3       353.8             117.9333               117.9333 / 6.1417 = 19.202    F(3, 12) = 3.49
Workmen    5 - 1 = 4       201.5             50.375                 50.375 / 6.1417 = 8.202       F(4, 12) = 3.26
Error      19 - 7 = 12     73.7              6.1417
Total      20 - 1 = 19     629


Step 5: Inference

Since the table value is less than the calculated value in both cases, both null hypotheses are rejected. There is a significant difference in the production rate between the workers and between the machines.
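
The two-way calculation can be sketched in the same style, assuming one observation per machine and workman combination. The function name `two_way_anova` is hypothetical. The sketch gives F values of roughly 19.2 for machines and 8.2 for workmen, both above their table values.

```python
def two_way_anova(table):
    """Return (F for rows, F for columns) with one observation per cell."""
    rows, cols = len(table), len(table[0])
    n = rows * cols
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / n                                  # correction factor
    tss = sum(x * x for row in table for x in row) - cf        # total SS
    row_ss = sum(sum(row) ** 2 for row in table) / cols - cf   # machines
    col_ss = sum(sum(col) ** 2 for col in zip(*table)) / rows - cf  # workmen
    error_ss = tss - row_ss - col_ss
    error_ms = error_ss / ((rows - 1) * (cols - 1))
    f_rows = (row_ss / (rows - 1)) / error_ms
    f_cols = (col_ss / (cols - 1)) / error_ms
    return f_rows, f_cols

# Rows are machines 1 to 4, columns are workmen 1 to 5
production = [
    [46, 48, 36, 35, 40],
    [40, 42, 38, 40, 44],
    [49, 54, 46, 48, 51],
    [38, 45, 34, 35, 41],
]
f_machines, f_workmen = two_way_anova(production)

# Table values at 5%: F(3, 12) = 3.49 and F(4, 12) = 3.26
reject_machines = f_machines > 3.49
reject_workmen = f_workmen > 3.26
print(round(f_machines, 2), round(f_workmen, 2), reject_machines, reject_workmen)
```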
