Sampling Theory

Statistical Inference
Statistical inference is the technique of ascertaining an unknown parameter of a probability distribution on the basis of data collected from that distribution.

There are two approaches to ascertaining the unknown parameter: estimation and tests of hypothesis. These help in making inferences regarding population characteristics based on samples.

Population
Any statistical investigation usually deals with the study of some characteristic of a collection of objects. Such a group of objects under study is called a population or universe.

A population having a finite number of items is called a finite population; a population having an infinite number of items is called an infinite population. The population size is usually denoted by N.

Sampling
A finite subset of a population is called a sample, and the number of objects in a sample is called the sample size, usually denoted by n.

Random sampling
A random sample is one in which each and every unit of the population has an equal chance of being included in the sample.

For example, if we take a sample of 4 students out of 30 students in a class, there are 30C4 possible samples, each having the same chance of being selected.

Parameters
A population is considered to be known when we know the probability distribution f(x) of the associated random variable X.

If X is normally distributed, we say that the population is normally distributed or that we have a normal population. Similarly, if X is binomially distributed, we say that the population is binomially distributed or that we have a binomial population.

Certain quantities appear in f(x), such as μ and σ in the case of the normal distribution, or p in the case of the binomial distribution. All such quantities are often called population parameters. Population parameters are generally denoted by Greek letters. For example,

μ – population mean
σ – population standard deviation

Statistics
We can take random samples from the population and then use these samples to obtain values that serve to estimate and test hypotheses about the population parameters.

Any quantity obtained from a sample for the purpose of estimating a population parameter is called a sample statistic.

Sample statistics are generally denoted by Roman letters. For example,

x̄ – sample mean
s – sample standard deviation

The mean height μ of the students of a college is a parameter, whereas the mean height x̄ of 40 randomly selected students of the college is a statistic.

Note:
1. Mathematically, a sample statistic for a sample of size n can be defined as a function of the random variables X1, X2, …, Xn, i.e., t(X1, X2, …, Xn). The function t(X1, X2, …, Xn) is itself a random variable, whose values can be represented by t(x1, x2, …, xn).

2. Generally, the parameter value is not known, so it is estimated with the help of a statistic. A statistic t = t(x1, x2, …, xn) is an unbiased estimator of the population parameter θ if E(t) = θ (i.e., if E(statistic) = parameter).
Sampling Distribution of a Statistic
Let us consider a population of size N and draw all possible samples of a given size n. For each of these samples, we compute a statistic (sample mean, sample variance, sample proportion, etc.). The value of the statistic may vary from sample to sample.

The distribution of the values of the statistic for the different samples of the same size is called the sampling distribution of the statistic.

When we obtain a distribution of means, it is called the sampling distribution of the mean, and when we obtain a distribution of proportions, it is called the sampling distribution of the proportion.

Standard Error
The standard deviation of a sampling distribution is called the standard error (S.E.).

The standard error is a measure of the variability of the statistic. It is useful in estimation and in testing of hypotheses:
1. In the theory of estimation, the standard error is used to decide the efficiency and consistency of the statistic as an estimator.
2. In interval estimation, the standard error is used to write down the confidence intervals.
3. In testing of hypotheses, the standard error of the test statistic is used to standardize the distribution of the statistic.

Sampling distribution of means
Consider a population consisting of the four numbers 3, 7, 11, 15. For this population,

N = 4
μ = (3 + 7 + 11 + 15)/4 = 9
σ² = [(3 − 9)² + (7 − 9)² + (11 − 9)² + (15 − 9)²]/4 = 20
σ = 4.472

Case-I (With replacement)
Now, let us consider all samples of size n = 2 drawn with replacement from the given population. These are

(3,3)   (3,7)   (3,11)   (3,15)
(7,3)   (7,7)   (7,11)   (7,15)
(11,3)  (11,7)  (11,11)  (11,15)
(15,3)  (15,7)  (15,11)  (15,15)

The corresponding sample means are

3   5   7    9
5   7   9    11
7   9   11   13
9   11  13   15

Now the mean of the sampling distribution of means, denoted by μx̄, is given by

μx̄ = (sum of all sample means)/16 = 9

Note that μx̄ = μ.

The variance σx̄² of the sampling distribution of means is obtained by subtracting the mean 9 from each of the sample means, squaring the results, adding all 16 numbers obtained and dividing by 16. The final result is

σx̄² = 10

Note that σ²/n = 20/2 = 10 = σx̄².

Case-II (Without replacement)
There are 4C2 = 6 samples of size n = 2 which can be drawn without replacement (this means that we draw one number and then another number different from the first). These are

(3,7)   (3,11)  (3,15)
(7,11)  (7,15)
(11,15)

The selection (3,7), for example, is considered the same as (7,3).

The corresponding sample means are

5, 7, 9, 9, 11, 13

Now the mean of the sampling distribution of means μx̄ is given by

μx̄ = (5 + 7 + 9 + 9 + 11 + 13)/6 = 9

Note that μx̄ = μ.
The variance of the sampling distribution of means is

σx̄² = [(5 − 9)² + (7 − 9)² + (9 − 9)² + (9 − 9)² + (11 − 9)² + (13 − 9)²]/6 = 20/3

Note that σx̄² ≠ σ²/n.

In fact, (σ²/n)·(N − n)/(N − 1) = (20/2)·(4 − 2)/(4 − 1) = 20/3 = σx̄².

Theorem-01
The expected value of the sample mean μx̄ is the population mean μ,
i.e., μx̄ = μ ----(1)

Note: Theorem-01 is illustrated in both Case-I and Case-II of the above example.

Theorem-02
If a population is infinite and the sampling is random, or if the population is finite and sampling is with replacement, then the variance of the sampling distribution of means, denoted by σx̄², is given by

σx̄² = σ²/n -----(2)

where σ² is the population variance and n is the sample size.

Note: Theorem-02 is illustrated in Case-I of the above example.

Theorem-03
If a population is finite of size N and if sampling is without replacement, then

σx̄² = (σ²/n)·(N − n)/(N − 1) -----(3)

Eqn(3) reduces to Eqn(2) when N → ∞, while the mean of the sampling distribution remains μx̄ = μ in either case.

Note: Theorem-03 is illustrated in Case-II of the above example.

Theorem-04
If the population from which samples are taken is normally distributed with mean μ and variance σ², then the sample mean is normally distributed with mean μ and variance σ²/n.

Theorem-05 (Central limit theorem)
Suppose that the population from which samples are taken has a probability distribution with mean μ and variance σ² that is not necessarily a normal distribution. Then the standardized variable associated with x̄, given by

z = (x̄ − μ)/(σ/√n) -----(4)

is asymptotically normal, i.e.,

lim(n→∞) P(Z ≤ z) = (1/√(2π)) ∫₋∞^z e^(−z²/2) dz -----(5)

Note:
It is assumed here that the population is infinite or that sampling is with replacement. Otherwise, the above is true if we replace σ/√n in Eqn(4) by σx̄ from Eqn(3).

Conclusion:
If the population is normal, then the sampling distribution of the mean is also normal with mean μ and standard deviation σ/√n.

For large samples (usually n ≥ 30 is considered a large sample), the same result holds even if the distribution of the population is non-normal.
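As an illustration (this sketch is not part of the original notes; the variable and function names are mine), the short Python program below enumerates every sample of size n = 2 from the population 3, 7, 11, 15 and reproduces the numbers of the worked example, checking Theorems 01, 02 and 03 numerically.

```python
# Enumerate all samples of size n = 2 from {3, 7, 11, 15} and verify
# Theorems 01-03 of the notes (illustrative sketch, not from the notes).
from itertools import product, combinations

population = [3, 7, 11, 15]
N = len(population)
n = 2

mu = sum(population) / N                                # population mean = 9
sigma2 = sum((x - mu) ** 2 for x in population) / N     # population variance = 20

def sampling_distribution(samples):
    """Return (mean, variance) of the sample means of the given samples."""
    means = [sum(s) / len(s) for s in samples]
    m = sum(means) / len(means)
    v = sum((x - m) ** 2 for x in means) / len(means)
    return m, v

# Case-I: with replacement -> N**n = 16 ordered samples
mu_wr, var_wr = sampling_distribution(list(product(population, repeat=n)))

# Case-II: without replacement -> 4C2 = 6 unordered samples
mu_wor, var_wor = sampling_distribution(list(combinations(population, n)))

print(mu_wr, var_wr)     # 9.0, 10.0      -> Theorem-01 and Theorem-02 (sigma2/n)
print(mu_wor, var_wor)   # 9.0, 6.666...  -> Theorem-01 and Theorem-03
print(sigma2 / n, (sigma2 / n) * (N - n) / (N - 1))   # 10.0, 6.666...
```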
Estimator and Estimate
The statistic used for the estimation of an unknown parameter is called an estimator of the parameter.

An estimate is a specific value of the estimator for a specified sample.

There are two types of estimation:
1. Point estimation
2. Interval estimation

Point Estimation
A point estimate is a single value which is used to estimate an unknown population parameter.

For example, the head of the department would make a point estimate when he says that 70% of the students will get S grade in SEE.

A point estimate is not quite sufficient, as it would be either right or wrong. If it is wrong, we will not know how wrong it is. So it is always better if an estimate lies within an interval.

Interval Estimation
In interval estimation, an interval (T1, T2) which is likely to contain the parameter is proposed as an estimator of the parameter. The interval (T1, T2) is called a confidence interval. The limits T1 and T2 of the confidence interval are called the confidence limits.

For example, the head of the department would make an interval estimate when he says that the percentage of students getting S grade could be between 65% and 75%.

Testing of Hypothesis
Let us assume that the population parameter has a certain value. The unknown parameter value is then estimated using sample values. If the sample value is exactly the same as, or very close to, our assumption, it can be straight away accepted as the parameter. If it is far away, we can reject it outright. But if it is neither close nor far away, we have to develop a procedure to decide whether to accept the presumed value or not, on the basis of the sample values. This procedure is called Testing of Hypothesis.

Definitions
1. Hypothesis is a statement about the values of the population parameter.

2. Testing of hypothesis is a procedure for deciding whether to accept or reject the hypothesis.

3. Null hypothesis is a hypothesis tested for possible rejection under the assumption that it is true. It is usually denoted by H0.

4. Alternative hypothesis is a hypothesis which is accepted when the null hypothesis is rejected. The alternative hypothesis is denoted by H1.

5. Test statistic is the statistic based on whose distribution the test is conducted. In general,

Test statistic = (Relevant statistic − E(t))/S.E.(t) = (t − E(t))/S.E.(t)

6. Critical region
The test procedure divides the possible values of the test statistic into two regions, namely an acceptance region for H0 and a rejection region for H0. The region where H0 is rejected is known as the critical region.

If the value of the test statistic falls in the critical region, we reject the null hypothesis H0.

7. Errors of the first and second kind

     Actual fact       Decision based on the sample    Decision            Error
 1   H0 is true        Accept H0                       Correct decision    -----
 2   H0 is true        Reject H0                       Wrong decision      Type I
 3   H0 is not true    Accept H0                       Wrong decision      Type II
 4   H0 is not true    Reject H0                       Correct decision    -----

The probability of occurrence of a Type I error is denoted by α.
The probability of occurrence of a Type II error is denoted by β.
The value 1 − β is called the power of the test.

8. Power of the test is the probability of rejecting H0 when H0 is not true.
9. Level of significance is the probability of rejecting H0 when it is true.

Level of significance = α = P(Type I error)

Usually two levels are used, namely the 5% and 1% levels of significance. When we take 5% as the level of significance, the probability of committing a Type I error is α = 0.05. Similarly, when we take 1% as the level of significance, the probability of committing a Type I error is α = 0.01.

10. One-tailed test
The nature of the critical region depends on the alternative hypothesis H1.

A test of a statistical hypothesis where H1 is of the "more than" or "less than" type (right-tailed or left-tailed) is called a one-tailed test.

For example,
When H0: μ = μ0 versus H1: μ < μ0, it is a left-tailed test.
When H0: μ = μ0 versus H1: μ > μ0, it is a right-tailed test.

(Figure: critical region of a left-tailed test)
(Figure: critical region of a right-tailed test)

11. Two-tailed test
A test of a statistical hypothesis where H1 is of the "not equal to" type is called a two-tailed test.

For example,
When H0: μ = μ0 versus H1: μ ≠ μ0, it is a two-tailed test.

12. Critical values

Test           1% Level       5% Level
Two-tailed     Zα = 2.58      Zα = 1.96
Right-tailed   Zα = 2.33      Zα = 1.645
Left-tailed    Zα = −2.33     Zα = −1.645

13. 95% Confidence limits for μ
(x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n) for large samples

14. 99% Confidence limits for μ
(x̄ − 2.58 σ/√n, x̄ + 2.58 σ/√n) for large samples

Example
Let us consider the example of the weight of bananas. Here, the unknown mean weight of bananas (population mean μ) is estimated using a sample of size n = 100, for which the sample mean is x̄ = 84 g. Let the population standard deviation be σ = 5 g. Let the weight of bananas be normally distributed, so that x̄ has mean μ and standard deviation σ/√n = 5/√100 = 0.5. Then

the 95% confidence limits for μ are
(x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n) = (83.02, 84.98)

So with 95% confidence we say that bananas on average weigh between 83.02 and 84.98 g.

The 99% confidence limits for μ are
(x̄ − 2.58 σ/√n, x̄ + 2.58 σ/√n) = (82.71, 85.29)

So with 99% confidence we say that bananas on average weigh between 82.71 and 85.29 g.
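A minimal Python sketch (added here for illustration; it is not part of the notes, and the variable names are mine) that reproduces the banana example using the large-sample critical values 1.96 and 2.58 from the table above.

```python
# Large-sample confidence limits for the banana example
# (n = 100, x_bar = 84 g, sigma = 5 g) -- illustrative sketch.
import math

n, x_bar, sigma = 100, 84.0, 5.0
se = sigma / math.sqrt(n)                        # standard error of the mean = 0.5

ci_95 = (x_bar - 1.96 * se, x_bar + 1.96 * se)   # about (83.02, 84.98)
ci_99 = (x_bar - 2.58 * se, x_bar + 2.58 * se)   # about (82.71, 85.29)

print(ci_95, ci_99)
```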
Test procedure
The steps in the application of a statistical test procedure for testing a null hypothesis are as follows:

a) Set up the null hypothesis.
b) Set up the alternative hypothesis.
c) Identify the test statistic.
d) Set a suitable level of significance such as 1% or 5% (default 5%).
e) Identify the critical region.
f) Make a decision based on the calculated value and the critical value.

Large sample tests
For large samples (n ≥ 30), most sampling distributions tend to normality, and so the test may be based on the normal distribution.

Test for Mean
Suppose the mean μ of a population is not known. We want to test whether the mean is a given value μ0.

a) The null hypothesis is H0: μ = μ0 (the population mean is μ0).
b) For a large random sample of size n, under H0, the distribution of z = (x̄ − μ0)/(σ/√n) is N(0,1), and so the test statistic is

z = (x̄ − μ0)/(σ/√n)

c) The alternative hypothesis may be any one of the following:
(i) H1: μ ≠ μ0 (two-tailed test)
(ii) H1: μ < μ0 (left-tailed test)
(iii) H1: μ > μ0 (right-tailed test)

Note
If σ is not known, then the test statistic z = (x̄ − μ0)/(s/√n) is used, where s is the sample standard deviation.

Problems
1. A sample of 900 members is found to have a mean of 3.4 cm. Can it be reasonably regarded as a truly random sample from a large population with mean 3.25 cm and S.D. 1.61 cm?

Soln: n = 900; x̄ = 3.4; μ = 3.25; σ = 1.61

H0: μ = 3.25
H1: μ ≠ 3.25 (two-tailed test)

z = (x̄ − μ)/(σ/√n) = 2.8

Since |z| = 2.8 > 1.96, the value lies in the critical region, and H0 is rejected at the 5% level of significance.

2. A sample of 900 men is found to have a mean height of 64 inches. If the sample has been drawn from a population with S.D. 20 inches, find the 99% confidence limits for the mean height of men in the population.

Soln: n = 900; x̄ = 64; σ = 20

The 99% confidence limits for μ are
(x̄ − 2.58 σ/√n, x̄ + 2.58 σ/√n) = (62.28, 65.72)

________________________________________________
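The calculation in Problem 1 can be scripted directly. The sketch below is illustrative only (it is not from the notes, and the helper function z_test_mean is my own naming); it applies the large-sample z-test for a mean with the two-tailed 5% critical value.

```python
# Large-sample z-test for a mean, applied to Problem 1 above
# (n = 900, x_bar = 3.4, mu_0 = 3.25, sigma = 1.61) -- illustrative sketch.
import math

def z_test_mean(x_bar, mu0, sigma, n, z_alpha=1.96):
    """Two-tailed large-sample test of H0: mu = mu0. Returns (z, reject_H0)."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    return z, abs(z) > z_alpha

z, reject = z_test_mean(3.4, 3.25, 1.61, 900)
print(round(z, 2), reject)   # 2.8, True -> H0 rejected at the 5% level
```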
Test for Proportion
If X is the number of successes in n independent trials with constant probability of success P for each trial, then E(X) = nP and V(X) = nPQ, where Q = 1 − P.

We know that for large n, the binomial distribution tends to a normal distribution. Hence for large n, X ~ N(nP, nPQ).

∴ z = (X − E(X))/S.E.(X) = (X − nP)/√(nPQ) ~ N(0,1)

Note (1)
If X is the number of persons in a sample of size n possessing the given attribute, then the observed proportion of successes is p = X/n.

∴ E(p) = E(X/n) = nP/n = P
Also, V(p) = V(X/n) = V(X)/n² = nPQ/n² = PQ/n

∴ S.E.(p) = √(V(p)) = √(PQ/n)

Therefore, the test statistic for a proportion is

z = (p − E(p))/S.E.(p) = (p − P)/√(PQ/n) ~ N(0,1)

Note (2)
Since the probable limits for a normal variate X are E(X) ± 3√(var(X)), the probable limits for the observed proportion of successes are E(p) ± 3 S.E.(p), i.e., P ± 3√(PQ/n).

Note (3)
If P is not known, then the probable limits for the proportion in the population are p ± 3√(pq/n).

Problems
1. A die was thrown 9000 times and a throw of 3 or 4 was observed 3240 times. Show that the die cannot be regarded as an unbiased one at the 1% level of significance.

Soln: n = 9000

P = probability of getting 3 or 4 in a throw of the die
P = (1/6) + (1/6) = 1/3 ; Q = 2/3

The number of successes X = 3240

H0: The die is unbiased (P = 1/3)
H1: The die is biased (P ≠ 1/3); two-tailed test

z = (X − nP)/√(nPQ) = 5.4 = z_cal

We know that for a two-tailed test Zα = 2.58 (1% level).

As z_cal > Zα, the hypothesis is rejected at the 1% level of significance.

Conclusion: The die is biased.

Alternate method:
Given n = 9000
X = 3240 (getting 3 or 4 out of 9000 throws)

∴ p = X/n = 3240/9000 = 0.36

P = P[getting 3 or 4] = 1/6 + 1/6 = 1/3

H0: The die is unbiased (P = 1/3)
H1: The die is biased (P ≠ 1/3); two-tailed test

The test statistic is

z = (p − E(p))/S.E.(p) = (p − P)/√(PQ/n) = 0.02666667/0.004969 = 5.4 = z_cal

We know that for a two-tailed test Zα = 2.58 (1% level).

As z_cal > Zα, the hypothesis is rejected at the 1% level of significance.
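The following Python sketch (not from the notes; the function name is illustrative) carries out the proportion test of Problem 1 using the alternate-method formula z = (p − P)/√(PQ/n).

```python
# Large-sample test for a proportion, applied to Problem 1 above
# (9000 throws, 3240 showed a 3 or a 4, P0 = 1/3) -- illustrative sketch.
import math

def z_test_proportion(x, n, p0, z_alpha=2.58):
    """Two-tailed test of H0: P = p0 at the 1% level. Returns (z, reject_H0)."""
    p_hat = x / n
    se = math.sqrt(p0 * (1 - p0) / n)      # standard error of the sample proportion
    z = (p_hat - p0) / se
    return z, abs(z) > z_alpha

z, reject = z_test_proportion(3240, 9000, 1 / 3)
print(round(z, 1), reject)   # 5.4, True -> the die is biased
```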
2. A coin was tossed 400 times and a head turned up 216 times. Test the hypothesis that the coin is unbiased at the 5% level of significance.

Soln: n = 400; P = P(H) = 1/2; Q = 1/2
X = number of heads = 216

H0: The coin is unbiased (P = 1/2)
H1: The coin is biased (P ≠ 1/2); two-tailed test

z = (X − nP)/√(nPQ) = 1.6 = z_cal

We know that for a two-tailed test Zα = 1.96 (5% level).

As z_cal < Zα, the hypothesis is accepted at the 5% level of significance.

Conclusion: The coin is unbiased.

3. A sample of 900 days was taken in a coastal town and it was found that on 100 days the weather was very hot. Obtain the probable limits of the percentage of very hot weather.

Soln: n = 900
X = number of hot days = 100

∴ p = X/n = 1/9

P is not known and hence we take P = p = 1/9
∴ Q = 8/9

S.E. of the proportion of hot days = √(PQ/n) = 0.0105

Probable limits
P ± 3√(PQ/n) = 0.1111 ± 0.0315
i.e., 0.0796, 0.1426

∴ The percentage of hot days lies between 7.96 and 14.26, i.e., roughly between 8% and 14%.

4. A survey was conducted in a slum locality of 2000 families by selecting a sample of size 800. It was revealed that 180 families were illiterate. Find the probable limits of the number of illiterate families in the population of 2000.

Soln: n = 800
X = number of illiterate families = 180

∴ p = X/n = 0.225

P is not known and hence we take P = p = 0.225
∴ Q = q = 1 − p = 0.775

S.E. = √(PQ/n) = 0.0148

Probable limits
P ± 3√(PQ/n) = 0.225 ± 0.044
i.e., 0.1806, 0.2694

∴ The number of illiterate families out of 2000 families lies between 2000 × 0.1806 = 361.2 ≈ 361 and 2000 × 0.2694 = 538.8 ≈ 539.

Similar problems for practice
1. In 324 throws of a six-faced die, an odd number turned up 181 times. Is it reasonable to think that the die is an unbiased one?

2. A sample of 100 days is taken from the meteorological records of a certain district and 10 of them are found to be foggy. What are the probable limits of the percentage of foggy days in the district?

3. A die was thrown 9000 times and a throw of '5' or '6' was obtained 3240 times. On the assumption of random throwing, does the data indicate an unbiased die at the 0.01 level of significance?
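A short sketch (illustrative, not part of the notes) computing the probable limits of Problem 4; the small differences from the hand computation come only from rounding the standard error.

```python
# Probable limits p +/- 3*sqrt(pq/n) for Problem 4 above
# (800 families sampled out of 2000, 180 of them illiterate) -- illustrative sketch.
import math

n, x, N_population = 800, 180, 2000
p = x / n                                   # 0.225
q = 1 - p                                   # 0.775
se = math.sqrt(p * q / n)                   # about 0.0148

low, high = p - 3 * se, p + 3 * se          # about (0.181, 0.269); the notes,
print(round(low, 4), round(high, 4))        # rounding the S.E., give (0.1806, 0.2694)
print(round(N_population * low), round(N_population * high))   # about 361 to 539 families
```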
Small sample tests
A sample of size n is taken to be small if n < 30.

For small samples, we use the t-test, χ²-test, F-test, etc. The basic assumption in small-sample tests is that the population from which the samples are drawn is normal.

Student's t-distribution
The probability density function of a t-distribution with n degrees of freedom is given by

f(t) = [1/√(nπ)] · [Γ((n + 1)/2)/Γ(n/2)] · 1/(1 + t²/n)^((n+1)/2) ,  −∞ < t < ∞

Note
The number of degrees of freedom is the number of values in a set which may be assigned arbitrarily.

For example, if x + y + z = 10 and we assign values for any two variables, say x and y, then z can be computed depending on the values of x and y. Here x and y can be chosen arbitrarily, i.e., these two variables are free and independent choices for finding the third. Hence the degrees of freedom in this case are 2.

(Figure: the probability curve of the t-distribution)

t is approximately normally distributed for large samples.

t-test for a single mean
The following are the assumptions of the t-test for a single mean:
1. The parent population is normal.
2. σ² is unknown.
3. The sample size is small.

Under the null hypothesis

H0: μ = μ0 (the population mean is μ0)

the test statistic is given by

t = (x̄ − μ)/(s/√(n − 1))

which follows a t-distribution with (n − 1) degrees of freedom. Here s is the sample standard deviation.

Confidence limits
If σ is not known and n is small, then

1. The 95% confidence limits for μ are
(x̄ − t0.05 · s/√(n − 1), x̄ + t0.05 · s/√(n − 1))

2. The 99% confidence limits for μ are
(x̄ − t0.01 · s/√(n − 1), x̄ + t0.01 · s/√(n − 1))

Problems
1. A sample of 12 measurements of the diameter of a metal ball gave the mean x̄ = 7.38 mm with S.D. s = 1.24 mm. Find a) 95% and b) 99% confidence limits for the actual diameter.

Soln:
The 95% confidence limits for μ (actual diameter) are
(x̄ − t0.05 · s/√(n − 1), x̄ + t0.05 · s/√(n − 1))

From the table, t0.05 = 2.201 for n − 1 = 12 − 1 = 11 degrees of freedom.

∴ x̄ − t0.05 · s/√(n − 1) = 6.5571
and x̄ + t0.05 · s/√(n − 1) = 8.2029

Similarly, the 99% confidence limits for μ (actual diameter) are
(x̄ − t0.01 · s/√(n − 1), x̄ + t0.01 · s/√(n − 1))

From the table, t0.01 = 3.106 for 11 degrees of freedom.

∴ x̄ − t0.01 · s/√(n − 1) = 6.2187
and x̄ + t0.01 · s/√(n − 1) = 8.5413
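A small Python sketch (not part of the notes; the variable names are mine) that reproduces Problem 1 using the tabulated t-values 2.201 and 3.106 for 11 degrees of freedom quoted above.

```python
# Small-sample confidence limits x_bar +/- t * s/sqrt(n-1) for Problem 1 above
# (n = 12, x_bar = 7.38 mm, s = 1.24 mm) -- illustrative sketch.
import math

n, x_bar, s = 12, 7.38, 1.24
se = s / math.sqrt(n - 1)

t_05, t_01 = 2.201, 3.106                        # table values for 11 d.f.
ci_95 = (x_bar - t_05 * se, x_bar + t_05 * se)   # about (6.557, 8.203)
ci_99 = (x_bar - t_01 * se, x_bar + t_01 * se)   # about (6.219, 8.541)
print(ci_95, ci_99)
```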
2. Show that the 95% confidence limits for the mean μ of the population are x̄ ± t0.05 · s/√(n − 1). Deduce that for a random sample of 16 values with mean 41.5 inches and sum of squares of deviations from the mean 135 inches², drawn from a normal population, the 95% confidence limits for the mean of the population are 39.9 and 43.1 inches.

Soln: (a) The probability that |t| ≤ t0.05 is 0.95. Hence the 95% confidence limits for μ are given by

|x̄ − μ|/(s/√(n − 1)) ≤ t0.05

⟹ |x̄ − μ| ≤ t0.05 · s/√(n − 1)

⟹ −t0.05 · s/√(n − 1) ≤ x̄ − μ ≤ t0.05 · s/√(n − 1)

⟹ −t0.05 · s/√(n − 1) − x̄ ≤ −μ ≤ −x̄ + t0.05 · s/√(n − 1)

⟹ t0.05 · s/√(n − 1) + x̄ ≥ μ ≥ x̄ − t0.05 · s/√(n − 1)

or

x̄ − t0.05 · s/√(n − 1) ≤ μ ≤ x̄ + t0.05 · s/√(n − 1)

b) Given n = 16, ν = n − 1 = 15 d.f.

From the table, t0.05 = 2.131 for 15 d.f.

Given x̄ = 41.5 and Σ(xi − x̄)² = 135

∴ s² = Σ(xi − x̄)²/n = 135/16 = 8.4375
∴ s = 2.9047

x̄ − t0.05 · s/√(n − 1) = 39.9 and x̄ + t0.05 · s/√(n − 1) = 43.1

∴ The 95% confidence limits are 39.9 and 43.1 inches.

3. A random sample of 10 boys had the following I.Q.s: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data support the assumption of a population mean I.Q. of 100 at the 5% level of significance?

Soln: n = 10, μ = 100

H0: μ = 100
H1: μ ≠ 100 (two-tailed test)

Test statistic
t = (x̄ − μ)/(s/√(n − 1))

x̄ = Σxi/n = 97.2 ; s² = Σ(xi − x̄)²/n = 183.36 ; ∴ s = 13.54

Thus t = −0.6204 ; |t| = 0.6204

The table value for 9 d.f. at the 5% level of significance is t0.05 = 2.26.

Clearly |t| = 0.6204 < t0.05

∴ H0 is accepted.

4. The average breaking strength of steel rods is specified to be 18.5 thousand pounds. To test this, a sample of 14 rods was tested. The mean and standard deviation obtained were 17.85 and 1.955 respectively. Is the result of the experiment significant at the 95% confidence level?

Soln: n = 14, μ = 18.5, x̄ = 17.85, s = 1.955

H0: μ = 18.5
H1: μ ≠ 18.5 (two-tailed test)

Test statistic
t = (x̄ − μ)/(s/√(n − 1)) = −1.20

Thus |t| = 1.20

The table value for 13 d.f. at the 5% level of significance is t0.05 = 2.16.

Clearly |t| < t0.05

∴ H0 is accepted.
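The sketch below (illustrative only, not from the notes) reruns Problem 3, using the notes' convention s² = Σ(x − x̄)²/n and t = (x̄ − μ)/(s/√(n − 1)).

```python
# One-sample t-test for Problem 3 above (10 I.Q. scores, H0: mu = 100)
# -- illustrative sketch following the notes' conventions.
import math

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
mu0 = 100
n = len(iq)

x_bar = sum(iq) / n                                     # 97.2
s = math.sqrt(sum((x - x_bar) ** 2 for x in iq) / n)    # about 13.54
t = (x_bar - mu0) / (s / math.sqrt(n - 1))              # about -0.62

t_table = 2.26                                          # 9 d.f., 5% level, two-tailed
print(round(t, 4), abs(t) < t_table)   # True -> H0 accepted: the data support mu = 100
```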
5. In the past, a machine has produced washers having a thickness of 0.50 mm. To determine whether the machine is in proper working condition, a sample of 10 washers is chosen, for which the mean thickness is found to be 0.53 mm with standard deviation 0.03 mm. Test the hypothesis that the machine is in proper working condition, using a level of significance of (i) 0.05, (ii) 0.01.

Soln: n = 10, μ = 0.50, x̄ = 0.53, s = 0.03

H0: μ = 0.50 (the machine is working properly)
H1: μ ≠ 0.50 (two-tailed test)

Test statistic
t = (x̄ − μ)/(s/√(n − 1)) = 3
Thus |t| = 3

Case (i):
The table value for 9 d.f. at the 5% level of significance is t0.05 = 2.26.
Clearly |t| > t0.05
∴ H0 is rejected at the 5% level of significance.

Case (ii):
The table value for 9 d.f. at the 1% level of significance is t0.01 = 3.25.
Clearly |t| < t0.01
∴ H0 is accepted at the 1% level of significance.

Note
Since we can reject the null hypothesis at the 5% level of significance but not at the 1% level, we may conclude that the machine should be checked, or at least that another sample should be taken.

Similar problems for practice
1. The nine items of a sample have the following values: 45, 47, 50, 52, 48, 47, 49, 53, 51. Does the mean of these differ significantly from the assumed mean of 47.5?

2. A machinist is making engine parts with an axle diameter of 0.7 inch. A random sample of 10 parts shows a mean diameter of 0.742 inch with an S.D. of 0.04 inch. On the basis of the sample, would you say that the work is inferior?

3. Consider the sample consisting of the numbers 45, 47, 50, 52, 48, 47, 49, 53 and 51. The sample is drawn from a population whose mean is 48.5. Find whether the sample mean differs significantly from the population mean at the 5% level of significance.

Test for the difference between the means of two independent samples of sizes n1 and n2
The test statistic is

t = (x̄1 − x̄2) / √{ [(n1·s1² + n2·s2²)/(n1 + n2 − 2)] · (1/n1 + 1/n2) }

which follows a t-distribution with n1 + n2 − 2 d.f. Here s1 and s2 are the standard deviations of the samples.

Problems
1. A group of 10 rats fed on diet A and another group of 8 rats fed on a different diet B recorded the following increases in weight (in g):

Diet A: 5, 6, 8, 1, 12, 4, 3, 9, 6, 10
Diet B: 2, 3, 6, 8, 1, 10, 2, 8

Test whether diet A is superior to diet B.

Soln: n1 = 10, n2 = 8, x̄1 = 6.4, x̄2 = 5

s1² = Σ(x − x̄)²/n = Σx²/n − x̄² = 10.24

Similarly, for the second sample, s2² = 10.25

H0: μ1 = μ2 (no significant difference between the two diets)
H1: μ1 > μ2 (diet A is superior to diet B; right-tailed test)

Test statistic

t = (x̄1 − x̄2) / √{ [(n1·s1² + n2·s2²)/(n1 + n2 − 2)] · (1/n1 + 1/n2) } = 0.875
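A Python sketch of the two-independent-sample test applied to the diet data (not part of the notes; exact arithmetic gives t ≈ 0.87, which the notes round to 0.875 – the conclusion is unchanged).

```python
# Two-independent-sample t-test for Problem 1 above (diet A vs diet B),
# using the pooled-variance formula of the notes with s_i^2 = sum((x - x_bar_i)^2)/n_i.
import math

diet_a = [5, 6, 8, 1, 12, 4, 3, 9, 6, 10]
diet_b = [2, 3, 6, 8, 1, 10, 2, 8]

def mean_and_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)

x1, s1_sq = mean_and_var(diet_a)        # 6.4, 10.24
x2, s2_sq = mean_and_var(diet_b)        # 5.0, 10.25
n1, n2 = len(diet_a), len(diet_b)

pooled = (n1 * s1_sq + n2 * s2_sq) / (n1 + n2 - 2)
t = (x1 - x2) / math.sqrt(pooled * (1 / n1 + 1 / n2))   # about 0.87

t_table = 1.746                          # 16 d.f., 5% level, one-tailed
print(round(t, 3), t > t_table)          # False -> H0 accepted, A not shown superior
```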
The table value for n1 + n2 − 2 = 16 d.f. at the 5% level of significance is t0.05 = 1.746 (one-tailed).

Clearly |t| < t0.05

∴ H0 is accepted.

Test for the difference between the means of two independent samples of equal size
The test statistic is

t = (x̄1 − x̄2) / √{ [n(s1² + s2²)/(2n − 2)] · (2/n) }

which follows a t-distribution with 2n − 2 d.f. Here s1 and s2 are the standard deviations of the samples.

Problems
1. To compare the prices of a certain product in two cities, ten shops were selected at random in each town. The prices noted are given below. Test whether the average prices can be said to be the same in the two cities.

City 1: 61, 63, 56, 63, 56, 63, 59, 56, 44, 61
City 2: 55, 54, 47, 59, 51, 61, 57, 54, 64, 58

Soln: n = 10, x̄1 = 58.2, x̄2 = 56

s1² = Σ(x − x̄)²/n = 30.16 ; s2² = 21.8

H0: μ1 = μ2 (no significant difference between the mean price of the product in the two cities)
H1: μ1 ≠ μ2 (two-tailed test)

Test statistic

t = (x̄1 − x̄2) / √{ [n(s1² + s2²)/(2n − 2)] · (2/n) } = 0.92

The table value for 2n − 2 = 18 d.f. at the 5% level of significance is t0.05 = 2.10.

Clearly |t| < t0.05 ; ∴ H0 is accepted.

2. The table below gives the biological values of protein from cow's milk and buffalo's milk (6 observations each). Examine whether the differences are significant.

Cow's milk:     1.8, 2.0, 1.9, 1.6, 1.8, 1.5
Buffalo's milk: 2.0, 1.8, 1.8, 2.0, 2.1, 1.9

Soln: n = 6, x̄1 = 1.766, x̄2 = 1.933

s1² = Σ(x − x̄)²/n = 0.03124 ; s2² = 0.0135

H0: μ1 = μ2 (no significant difference)
H1: μ1 ≠ μ2 (two-tailed test)

Test statistic

t = (x̄1 − x̄2) / √{ [n(s1² + s2²)/(2n − 2)] · (2/n) } = −1.765

The table value for 2n − 2 = 10 d.f. at the 5% level of significance is t0.05 = 2.23.

Clearly |t| < t0.05 ; ∴ H0 is accepted.

_________________________________________

Paired t-test
(Test for equality of means for dependent samples)

Examples of dependent samples:
1. Patients undergoing YOGA treatment for high B.P. have two measurements of B.P. – one before treatment (x) and the other after treatment (y).
2. Students attending coaching classes score marks x in a test before coaching and y in another test after coaching.

In such situations, suppose we want to test whether the means μ1 and μ2 are equal; then the null hypothesis is

H0: μ1 = μ2

For n random pairs of observations (xi, yi), i = 1 to n, let di = xi − yi.
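For the equal-sample-size formula, the sketch below (illustrative only, not from the notes) reruns Problem 2. With unrounded means it gives t ≈ −1.84, whereas the notes, which round the means to 1.766 and 1.933 before computing the variances, obtain −1.765; the decision is the same either way.

```python
# Equal-sample-size t-test for Problem 2 above (cow's milk vs buffalo's milk,
# n = 6 each), following the formula with 2n - 2 degrees of freedom.
import math

cow = [1.8, 2.0, 1.9, 1.6, 1.8, 1.5]
buffalo = [2.0, 1.8, 1.8, 2.0, 2.1, 1.9]
n = len(cow)

def mean_and_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)

x1, s1_sq = mean_and_var(cow)            # about 1.767, 0.0289
x2, s2_sq = mean_and_var(buffalo)        # about 1.933, 0.0122

t = (x1 - x2) / math.sqrt(n * (s1_sq + s2_sq) / (2 * n - 2) * (2 / n))   # about -1.84

t_table = 2.23                           # 10 d.f., 5% level, two-tailed
print(round(t, 3), abs(t) < t_table)     # True -> H0 accepted, no significant difference
```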
Let d̄ be the sample mean and s_d the sample standard deviation of these differences. Then, under H0, the test statistic is

t = d̄ / (s_d/√(n − 1))

which is a Student's t-variate with (n − 1) degrees of freedom.

Problems
1. The following are the marks obtained by 5 students before and after attending special coaching:

Student                     1     2     3     4     5
Marks before coaching (x)   110   120   123   132   125
Marks after coaching (y)    120   118   125   136   121

Test whether the marks of the students have improved after the special coaching.

Soln:
H0: μ1 = μ2 (there is no change in marks before and after special coaching)
H1: μ1 < μ2 (left-tailed test: the marks of the students have improved after special coaching)

Test statistic
t = d̄ / (s_d/√(n − 1))

Student   Marks before (x)   Marks after (y)   d = x − y   d²
1         110                120               −10         100
2         120                118               2           4
3         123                125               −2          4
4         132                136               −4          16
5         125                121               4           16
                                               Σd = −10    Σd² = 140

d̄ = Σd/n = −2 ; s_d² = Σd²/n − d̄² = 28 − 4 = 24

∴ s_d = 4.899

t = d̄ / (s_d/√(n − 1)) = −0.8164

The table value for n − 1 = 4 d.f. at the 5% level of significance is t0.05 = −2.132 (left-tailed).

Clearly t > t0.05 ; ∴ H0 is accepted.

2. A certain drug administered to 5 patients suffering from hypertension resulted in the following changes of blood pressure: 3, −1, 8, 2, 0. Can it be concluded that the drug will in general be accompanied by a decrease in blood pressure?

Soln:
H0: μ1 = μ2 (no significant change in blood pressure)
H1: μ1 > μ2 (right-tailed test: a decrease in blood pressure)

Test statistic
t = d̄ / (s_d/√(n − 1))

d    3   −1   8    2   0     Σd = 12
d²   9   1    64   4   0     Σd² = 78

d̄ = Σd/n = 2.4 ; s_d² = Σd²/n − d̄² = 9.84

∴ s_d = 3.137

t = d̄ / (s_d/√(n − 1)) = 1.5302

The table value for n − 1 = 4 d.f. at the 5% level of significance is t0.05 = 2.132 (right-tailed).

Clearly t < t0.05 ; ∴ H0 is accepted.

__________________________________________
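Finally, a short sketch (not part of the notes; the variable names are mine) of the paired t-test applied to the coaching data of Problem 1, with d = x − y, s_d² = Σd²/n − d̄² and t = d̄/(s_d/√(n − 1)).

```python
# Paired t-test for Problem 1 above (marks before and after coaching)
# -- illustrative sketch following the notes' formulas.
import math

before = [110, 120, 123, 132, 125]
after = [120, 118, 125, 136, 121]

d = [x - y for x, y in zip(before, after)]      # [-10, 2, -2, -4, 4]
n = len(d)

d_bar = sum(d) / n                              # -2.0
s_d = math.sqrt(sum(di ** 2 for di in d) / n - d_bar ** 2)   # about 4.899
t = d_bar / (s_d / math.sqrt(n - 1))            # about -0.816

t_table = -2.132                                # 4 d.f., 5% level, left-tailed
print(round(t, 4), t > t_table)                 # True -> H0 accepted, no improvement shown
```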