Chapter 3
HYPOTHESIS TESTING
3.1 BASIC CONCEPTS IN HYPOTHESIS TESTING
Statistical Decisions
Very often in practice we are called upon to make decisions about populations on the basis of
sample information. Such decisions are called statistical decisions.
For example, we may wish to decide on the basis of sample data whether a new serum is really
effective in curing a disease, whether one educational procedure is better than another, etc.
Hypothesis and types of hypotheses

A claim (or statement) about a population parameter is called a hypothesis.
Example 8.14:
a) The mean daily profit of a supermarket is 1000 Birr, or $\mu = 1000$ Birr.
b) The mean time to complete a certain assembly job is less than 2 hours, or $\mu < 2$ hours.
c) The proportion of customers in this area who prefer this product is more than 75 percent,
or $P > 0.75$.
There are two types of hypotheses:
1. Null hypothesis: A null hypothesis is a claim (or statement) about a population parameter
that is assumed to be true until it is declared false. It is denoted by $H_0$.
2. Alternative hypothesis: An alternative hypothesis is a claim about a population
parameter that will be true if the null hypothesis is false. The alternative hypothesis is
denoted by $H_A$, $H_a$ or $H_1$.
Example 3.1: A soft drink bottling company's advertisement states that a bottle of its
product contains 330 milliliters (ml). But customers are complaining that the company
is underfilling its bottles. To check whether the complaint is true or not, an inspector
may test the following hypotheses:
$H_0$: The average content of a bottle of this product is no less than 330 ml,
against
$H_A$: The average content of a bottle of this product is less than 330 ml.
Or symbolically,
$H_0: \mu \ge 330$ ml
$H_A: \mu < 330$ ml
If the inspector takes a random sample of bottles of this product and finds that the mean
content per bottle is much less than 330 ml, then he may conclude that the complaint of
the customers is correct.
Hypothesis testing is a procedure for checking the validity of a statistical hypothesis. It is
the process by which we decide whether the null hypothesis should be rejected or not.
The value computed from a sample that is used to determine whether the null hypothesis
has to be rejected or not is called a test statistic. Sometimes known population
quantities are used alongside sample values in the calculation of a test statistic.
Types of errors
Applying a hypothesis test may lead to a wrong conclusion. There are two kinds of possible
errors, called type I error and type II error.
Type I error: A type I error occurs when a true null hypothesis is rejected. The value of $\alpha$
represents the probability of committing this type of error; that is,
$$\alpha = P(H_0 \text{ is rejected} \mid H_0 \text{ is true}).$$
The value of $\alpha$ is called the significance level of the test.
Type II error: A type II error occurs when a false null hypothesis is not rejected. The value of
$\beta$ represents the probability of committing a type II error; that is,
$$\beta = P(H_0 \text{ is not rejected} \mid H_0 \text{ is false}).$$
The value of $1 - \beta$ is called the power of the test. It represents the probability of not making a
type II error, that is, of rejecting a false null hypothesis.
The two types of errors that occur in tests of hypotheses depend on each other. For a fixed sample
size we cannot lower the values of $\alpha$ and $\beta$ simultaneously: lowering the value of $\alpha$
raises the value of $\beta$, and lowering the value of $\beta$ raises the value of $\alpha$.
However, we can decrease both $\alpha$ and $\beta$ simultaneously by increasing the sample size.
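The trade-off between the two error probabilities can be checked numerically. The sketch below is only an illustration: it assumes an upper-tailed test of a mean with known standard deviation, and the function name `type_ii_error` and all numbers (a hypothesized mean of 100, a true mean of 104, a standard deviation of 20) are hypothetical, not taken from the text.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution N(0, 1)

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for the upper-tailed z-test of H0: mu = mu0 against HA: mu > mu0."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    # H0 is rejected when (xbar - mu0) / (sigma / sqrt(n)) > z_alpha.
    # When the true mean is mu_true, that statistic is N(shift, 1) with:
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # beta = P(statistic <= z_alpha | mu = mu_true)
    return STD_NORMAL.cdf(z_alpha - shift)

# Hypothetical setting: vary the sample size and the significance level.
for n in (25, 100):
    for alpha in (0.05, 0.01):
        beta = type_ii_error(mu0=100, mu_true=104, sigma=20, n=n, alpha=alpha)
        print(f"n={n:3d}  alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

For a fixed n, the smaller significance level gives the larger beta; moving from n = 25 to n = 100 lowers beta at both significance levels.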
The following table presents the possible conclusions and errors in performing a test.

                        Actual situation
Decision                $H_0$ is true               $H_0$ is false
Do not reject $H_0$     Correct decision            Type II error ($\beta$)
Reject $H_0$            Type I error ($\alpha$)     Correct decision
Types of Tests
Based on the form of the null and alternative hypotheses, we have two types of tests: one-sided
(one-tailed) tests and two-sided (two-tailed) tests.
i) Tests of the form
$H_0: \mu = \mu_0$ or $H_0: \mu \ge \mu_0$
$H_A: \mu < \mu_0$
OR
$H_0: \mu = \mu_0$ or $H_0: \mu \le \mu_0$
$H_A: \mu > \mu_0$
in which the alternative hypothesis is a one-directional inequality, are called one-sided (one-tailed) tests.
$\mu_0$ is the hypothesized (assumed) mean.
ii) Tests of the form
$H_0: \mu = \mu_0$
$H_A: \mu \ne \mu_0$,
in which the critical region (rejection region) includes both large and small values of the test
statistic, are called two-sided or two-tailed tests.
3.1.2 TESTS ABOUT MEAN
i) Population normal, $\sigma^2$ known, sample large or small
Suppose we have a random sample of size n (small or large) from a normal population with mean
$\mu$ and variance $\sigma^2$, where $\sigma^2$ is known.
1. To test the hypothesis
$H_0: \mu = \mu_0$
$H_A: \mu > \mu_0$, where $\mu_0$ is a specific value,
i.e. under $H_0$ there is no difference between the population mean $\mu$ and the specified value $\mu_0$. Since
the best estimator of $\mu$ is $\bar{X}$, the test statistic must depend on $\bar{X}$.
We know that $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, so
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
If $H_0$ is true, then
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1).$$
$Z_{\text{calculated}} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ is called the test statistic for testing a single mean.
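Let $\alpha$ = level of significance (the probability of a type I error).
[Figure: standard normal curve. Acceptance region with area $1 - \alpha$ to the left of $z_\alpha$; rejection (critical) region with area $\alpha$ to the right of $z_\alpha$.]
$$P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha\right) = \alpha$$
The critical region ($H_0$ rejected) is
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha,$$
where $z_\alpha = z_{\text{tabulated}}$ is the critical value that can be obtained from the standard normal
distribution table, and $\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$ is the calculated test statistic.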
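2. If $H_0: \mu = \mu_0$
$H_A: \mu < \mu_0$
[Figure: standard normal curve. Critical region with area $\alpha$ to the left of $-z_\alpha$; acceptance region with area $1 - \alpha$ to the right of $-z_\alpha$.]
$$P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_\alpha\right) = \alpha.$$
The critical region is
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_\alpha.$$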
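3. For the two-tailed test
$H_0: \mu = \mu_0$
$H_A: \mu \ne \mu_0$
[Figure: standard normal curve. Acceptance region with area $1 - \alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$; critical regions with area $\alpha/2$ in each tail.]
$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha \quad \text{(acceptance region)}$$
$$P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_{\alpha/2}\right) = \frac{\alpha}{2} \quad \text{and} \quad P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha/2}\right) = \frac{\alpha}{2}$$
$$P\left(\left|\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right| > z_{\alpha/2}\right) = \alpha$$
Thus the critical region ($H_0$ rejected) is
$$\left|\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right| > z_{\alpha/2}.$$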
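The critical values $z_\alpha$ and $z_{\alpha/2}$ are normally read from a standard normal table. As a rough cross-check, they can also be computed with Python's standard library; the helper name `critical_value` below is an assumption made for this sketch.

```python
from statistics import NormalDist

def critical_value(alpha, two_tailed=False):
    """Return z_alpha (one-tailed) or z_{alpha/2} (two-tailed) for N(0, 1)."""
    tail_area = alpha / 2 if two_tailed else alpha
    # z such that the area to its right under the standard normal curve is tail_area
    return NormalDist().inv_cdf(1 - tail_area)

# Common significance levels, one-tailed and two-tailed
for alpha in (0.10, 0.05, 0.01):
    one = critical_value(alpha)
    two = critical_value(alpha, two_tailed=True)
    print(f"alpha={alpha:.2f}  z_alpha={one:.3f}  z_alpha/2={two:.3f}")
```

For a lower-tailed test the same value is used with a negative sign, since the standard normal distribution is symmetric about zero.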
Hypothesis testing procedures

The major steps involved in testing a statistical hypothesis are the following:
1. State the null and alternative hypotheses.
2. Specify the level of significance ($\alpha$).
3. Calculate the appropriate test statistic.
4. Determine the acceptance and rejection regions.
5. Make a decision.
Example 8.14: A producer of electric bulbs claims that the average life length of its product
is 1800 hrs. A sample of 400 bulbs gave a mean life length of 1780 hrs. Suppose it is known that
life lengths are normally distributed with a standard deviation of 200 hrs.
a) Would you support the producer's claim at the 5% level of significance?
b) Test also whether the average life length of the bulbs is less than the producer's claim at the 1%
level of significance.
Solution:
a) Step 1: $H_0: \mu = \mu_0 = 1800$
$H_A: \mu \ne \mu_0 = 1800$
Step 2: $\alpha = 5\% = 0.05$
Step 3: The population is normal, $\sigma = 200$ (known), $n = 400$.
Then the test statistic is
$$Z_{\text{cal}} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
Step 4: The critical region is
$$\left|\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\right| > z_{\alpha/2}$$
But $\bar{X} = 1780$, $\sigma = 200$, $n = 400$ and $\mu_0 = 1800$.
Then
$$Z_{\text{cal}} = \frac{1780 - 1800}{200/\sqrt{400}} = \frac{-20}{10} = -2, \qquad |Z_{\text{cal}}| = 2$$
$Z_{\text{tab}} = z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$ (from a table).
Step 5: Since $|Z_{\text{cal}}| = 2 > Z_{\text{tab}} = 1.96$, reject $H_0$. That is, the data do not support the
producer's claim at the 5% level of significance.
b) $H_0: \mu = 1800$
$H_A: \mu < 1800$
$\alpha = 0.01$
The test statistic is
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
The critical region ($H_0$ is rejected) is
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_\alpha$$
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{1780 - 1800}{200/\sqrt{400}} = -2$$
$Z_{\text{tab}} = -z_\alpha = -z_{0.01} = -2.33$
Since $-2 > -2.33$, the test statistic is not in the critical region, so we do not reject $H_0$. That is,
at the 1% level there is not enough evidence that the average life length is less than the producer's
claim.
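The five steps can be traced in a few lines of Python for the bulb example above. This is a minimal sketch rather than a general-purpose routine: the function name `z_test_mean` and the labels "two-sided", "less" and "greater" are assumptions introduced here, and the critical value is computed instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(xbar, mu0, sigma, n, alpha, alternative):
    """z-test for a single mean with known sigma.

    alternative: "two-sided", "less" or "greater" (labels chosen for this sketch).
    Returns (z_cal, z_tab, reject_h0), where z_tab is the positive critical value.
    """
    # Steps 1 and 2: the hypotheses and alpha are supplied by the caller.
    # Step 3: calculate the test statistic.
    z_cal = (xbar - mu0) / (sigma / sqrt(n))
    # Step 4: determine the critical value (alpha/2 per tail if two-sided).
    tail_area = alpha / 2 if alternative == "two-sided" else alpha
    z_tab = NormalDist().inv_cdf(1 - tail_area)
    # Step 5: make a decision.
    if alternative == "two-sided":
        reject_h0 = abs(z_cal) > z_tab
    elif alternative == "less":
        reject_h0 = z_cal < -z_tab
    elif alternative == "greater":
        reject_h0 = z_cal > z_tab
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z_cal, z_tab, reject_h0

# Part (a): two-sided test at the 5% level.
part_a = z_test_mean(1780, 1800, 200, 400, 0.05, "two-sided")
# Part (b): lower-tailed test at the 1% level.
part_b = z_test_mean(1780, 1800, 200, 400, 0.01, "less")
print("a)", part_a)
print("b)", part_b)
```

The same statistic leads to rejection in part (a) and non-rejection in part (b) because the two parts use different alternatives and significance levels, and therefore different critical regions.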
Example 8.15: According to the advertisement of a car manufacturing company, its cars
average at least 32 miles per gallon (mpg) in the city. From past records it is known that
mileage is normally distributed with a standard deviation of 2.5 mpg. Tests on 16 cars showed
that the mean mileage in the city is 31.5 mpg. Do the data support the advertisement at the 99
percent confidence level (that is, at the 1% level of significance)?
Solution: Given: $\mu_0 = 32$ mpg, $\sigma = 2.5$ mpg,
$\bar{X} = 31.5$ mpg, $n = 16$
Step 1: $H_0: \mu \ge 32$ mpg
$H_A: \mu < 32$ mpg
Step 2: $\alpha = 0.01$
Step 3: The test statistic is:
$$Z_{\text{cal}} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{31.5 - 32}{2.5/\sqrt{16}} = \frac{-0.5}{0.625} = -0.8$$
Step 4: The critical region is
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -z_\alpha$$
$Z_{\text{cal}} = -0.8$, $Z_{\text{tab}} = -z_\alpha = -z_{0.01} = -2.33$
Step 5: Since $-0.8 > -2.33$, we do not reject $H_0$. That is, the data are consistent with a mean
mileage of at least 32 mpg.
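ii) Non-normal population, large sample, $\sigma$ known/unknown
We wish to test
$H_0: \mu = \mu_0$ against $H_A: \mu > \mu_0$, or
$H_0: \mu = \mu_0$ against $H_A: \mu < \mu_0$, or
$H_0: \mu = \mu_0$ against $H_A: \mu \ne \mu_0$.
By the central limit theorem, if n is large ($n \ge 30$), the sample mean $\bar{X}$ is approximately normal,
i.e. $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ approximately, and
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
The critical regions for all these tests are the same as in case (i) above. If $\sigma$ is
unknown, estimate it by the sample standard deviation S, where
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}},$$
so that, approximately,
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0, 1).$$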
Example 8.16: A company has a computer system that can process at most 1200 bills per hour.
A new system is tested which processes an average of 1260 bills per hour with a standard
deviation of 215 bills in a sample of 40 hours. Test whether the new system is significantly better than
the old one at the 5% level of significance.
Solution:
Step 1: $H_0: \mu \le 1200$
$H_A: \mu > 1200$
Step 2: $\alpha = 5\% = 0.05$
Step 3: The population is not assumed to be normal and $\sigma$ is unknown,
but $n = 40 \ge 30$ is large, $S = 215$ and $\bar{x} = 1260$.
Then by the CLT the test statistic is
$$Z_{\text{cal}} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{1260 - 1200}{215/\sqrt{40}} = \frac{60}{215/6.32} \approx \frac{60}{34.0} \approx 1.76$$
Step 4: The critical region is
$$\frac{\bar{X} - \mu_0}{S/\sqrt{n}} > z_\alpha$$
But $Z_{\text{tab}} = z_\alpha = z_{0.05} = 1.645$.
Step 5: Since $Z_{\text{cal}} = 1.76 > Z_{\text{tab}} = 1.645$, we reject $H_0$. That is, we conclude that the new
system represents an improvement over the old system at the $\alpha = 0.05$ level of significance.
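The arithmetic of the example above, and the $n - 1$ formula for S, can be checked with a short script. This is a sketch under stated assumptions: the function name `large_sample_z` is hypothetical, and the five raw values in `sample` are invented solely to illustrate how S is computed (five observations would be far too few for the large-sample test itself).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def large_sample_z(xbar, mu0, s, n):
    """z statistic with sigma replaced by the sample standard deviation s (n large)."""
    return (xbar - mu0) / (s / sqrt(n))

# Summary statistics from the bill-processing example.
z_cal = large_sample_z(xbar=1260, mu0=1200, s=215, n=40)
z_tab = NormalDist().inv_cdf(1 - 0.05)  # upper-tailed critical value
print(f"z_cal={z_cal:.2f}  z_tab={z_tab:.3f}  reject H0: {z_cal > z_tab}")

# Computing S from raw data with the n - 1 divisor (hypothetical values).
sample = [1210, 1305, 1180, 1275, 1330]
xbar = mean(sample)
s_manual = sqrt(sum((x - xbar) ** 2 for x in sample) / (len(sample) - 1))
# statistics.stdev uses the same n - 1 divisor, so the two values should agree.
print(f"S (manual)={s_manual:.3f}  S (statistics.stdev)={stdev(sample):.3f}")
```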
iii) Normal population, small sample and $\sigma$ unknown

Under such circumstances, we use the Student t distribution with (n-1) degrees of freedom (df)
to determine the critical value(s). If the assumed mean under $H_0$ is $\mu_0$, then the test statistic is
$$t_{\text{cal}} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}.$$
After specifying the level of significance $\alpha$, we have one of the following three cases:
i) To test the hypothesis: $H_0: \mu = \mu_0$
$H_A: \mu \ne \mu_0$
The critical region is:
$$\left|\frac{\bar{X} - \mu_0}{S/\sqrt{n}}\right| > t_{\alpha/2}(n-1)$$
ii) To test the hypothesis: $H_0: \mu \ge \mu_0$
$H_A: \mu < \mu_0$,
The critical region is:
$$\frac{\bar{X} - \mu_0}{S/\sqrt{n}} < -t_\alpha(n-1)$$
iii) To test the hypothesis: $H_0: \mu \le \mu_0$
$H_A: \mu > \mu_0$
The critical region is:
$$\frac{\bar{X} - \mu_0}{S/\sqrt{n}} > t_\alpha(n-1)$$
Example 8.17: The labour management contract calls for the mean daily output of a particular
production section to be no less than 50 units. A random sample of 22 days reveals a mean of
48.2 units with a standard deviation of 4 units. Assume that the daily output levels are
approximately normally distributed. Is the contract provision fulfilled? Test it at the 5%
significance level.
Solution:
Given $\mu_0 = 50$ units, $S = 4$ units, $\bar{X} = 48.2$ units, $n = 22$
The null and alternative hypotheses are
$H_0: \mu \ge 50$ units
$H_A: \mu < 50$ units
$\alpha = 5\% = 0.05$
Since $n < 30$ and the population standard deviation $\sigma$ is unknown, we use the Student t
distribution table to find the critical value.
The test statistic is:
$$t_{\text{cal}} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{48.2 - 50}{4/\sqrt{22}} = -2.11$$
The critical region is
$$\frac{\bar{X} - \mu_0}{S/\sqrt{n}} < -t_\alpha(n-1).$$
But $t_\alpha(n-1) = t_{0.05}(21) = 1.72$, so
$t_{\text{tab}} = -t_{0.05}(21) = -1.72$.
Since $t_{\text{cal}} = -2.11 < t_{\text{tab}} = -1.72$, we reject $H_0$ (accept $H_A$) and conclude that the provision is not fulfilled.
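The t statistic in this example can be verified with a few lines of Python. Python's standard library has no Student t quantile function, so this sketch simply hard-codes the table value $t_{0.05}(21) = 1.72$ used in the text; the names `t_statistic` and `T_TABLE_005_DF21` are assumptions made for illustration.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """t statistic for a single mean with unknown sigma; df = n - 1."""
    return (xbar - mu0) / (s / sqrt(n))

# Critical value copied from a t table, as in the text: t_{0.05}(21) = 1.72.
T_TABLE_005_DF21 = 1.72

t_cal = t_statistic(xbar=48.2, mu0=50, s=4, n=22)
# Lower-tailed test: reject H0 when t_cal < -t_alpha(n - 1).
reject_h0 = t_cal < -T_TABLE_005_DF21
print(f"t_cal={t_cal:.2f}  t_tab={-T_TABLE_005_DF21}  reject H0: {reject_h0}")
```

If SciPy is available, `scipy.stats.t.ppf(0.95, 21)` should give essentially the same critical value without a table lookup.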
Exercise:
It is believed that the number of hours of study per day needed to pass a certain exam is a normal
variable with mean 4 hrs. A sample of 16 students was asked and gave the following reports of their own
hours:
3, 4, 4, 5, 2, 5, 4, 3, 3, 2, 6, 6, 2, 1, 7, 4
a) Test at the 5% level of significance whether the data are consistent with the specified study
hours.
b) Test at the 1% level of significance whether students study fewer hours per day to pass the same
exam.
8.4.3 TESTS ABOUT PROPORTION

Large sample test
In the case of a proportion, the sample size is considered to be large if both $nP > 5$ and $nQ > 5$,
where $Q = 1 - P$. In that case, by the central limit theorem, we use the normal distribution to
perform tests about a proportion.
Let $P_0$ be the assumed population proportion. Under such circumstances, the test statistic
is
$$Z_{\text{cal}} = \frac{\hat{P} - P_0}{\sqrt{\dfrac{P_0(1 - P_0)}{n}}},$$
where $\hat{P}$ is the sample proportion.
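Case I
If the hypothesis to be tested is
$H_0: P = P_0$ or $H_0: P \le P_0$
$H_A: P > P_0$
then
$$P\left(\frac{\hat{P} - P_0}{\sqrt{P_0 Q_0 / n}} > z_\alpha\right) = \alpha, \quad \text{where } Q_0 = 1 - P_0.$$
The critical region is
$$\frac{\hat{P} - P_0}{\sqrt{P_0 Q_0 / n}} > z_\alpha$$
Case II
$H_0: P = P_0$ or $H_0: P \ge P_0$
$H_A: P < P_0$
The critical region is
$$\frac{\hat{P} - P_0}{\sqrt{P_0 Q_0 / n}} < -z_\alpha$$
Case III
For the two-tailed test
$H_0: P = P_0$
$H_A: P \ne P_0$
the critical region is
$$\left|\frac{\hat{P} - P_0}{\sqrt{P_0 Q_0 / n}}\right| > z_{\alpha/2}$$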
Example 8.11: A television manufacturer claims that at least 90% of his TV sets do not require
any repair during the first two years of operation. A consumer protection agency selects a random
sample of 100 sets and finds that 14 sets required some repair within the first two years of
operation. At the 1% level of significance, what conclusion should be reached by the consumer
protection agency?
Solution: Given $P_0 = 90\% = 0.9$, $n = 100$.
14 sets required repair, which means that 100 - 14 = 86 did not require repair. Hence, the proportion of TV
sets that did not require repair in the sample was
$$\hat{P} = \frac{x}{n} = \frac{86}{100} = 0.86.$$
The null and alternative hypotheses are
$H_0: P \ge 0.90$
$H_A: P < 0.90$
$nP_0 = 100 \times 0.9 = 90 > 5$
$nQ_0 = 100 \times 0.1 = 10 > 5$
Then by the CLT we use the Z distribution to test the hypothesis.
The test statistic is:
$$Z = \frac{\hat{P} - P_0}{\sqrt{P_0 Q_0 / n}}$$
The critical region is
$$\frac{\hat{P} - P_0}{\sqrt{P_0 Q_0 / n}} < -z_\alpha$$
$$Z_{\text{cal}} = \frac{\hat{P} - P_0}{\sqrt{P_0 Q_0 / n}} = \frac{0.86 - 0.9}{\sqrt{\dfrac{0.9 \times 0.1}{100}}} = \frac{-0.04}{0.03} = -1.33$$
$Z_{\text{tab}} = -z_\alpha = -z_{0.01} = -2.33$
Since $Z_{\text{cal}} = -1.33 > Z_{\text{tab}} = -2.33$, we do not reject $H_0$. Therefore, at the 1% level of
significance there is not enough evidence to reject the manufacturer's claim.
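The proportion test in this example can be reproduced with the standard library. The sketch below is illustrative only: the function name `proportion_z` is an assumption, it includes the large-sample check ($nP_0 > 5$ and $nQ_0 > 5$) described earlier, and the critical value is computed rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(p_hat, p0, n):
    """z statistic for a single proportion, using P0 in the standard error."""
    # Large-sample condition from the text: both n*P0 and n*Q0 must exceed 5.
    if n * p0 <= 5 or n * (1 - p0) <= 5:
        raise ValueError("sample too small for the normal approximation")
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

n, needed_repair = 100, 14
p_hat = (n - needed_repair) / n          # proportion that did NOT need repair
z_cal = proportion_z(p_hat, p0=0.90, n=n)
z_tab = -NormalDist().inv_cdf(1 - 0.01)  # lower-tailed critical value, alpha = 0.01
reject_h0 = z_cal < z_tab
print(f"p_hat={p_hat:.2f}  z_cal={z_cal:.2f}  z_tab={z_tab:.2f}  reject H0: {reject_h0}")
```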