
Session-16

We discussed the distribution of the sample variance, denoted by $S^2$.

Consider a random sample of size $N$, say $X_1, X_2, \ldots, X_N$.
The sample mean is
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$

Define the sample variance by
$$S^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}$$
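
As a small illustration of the two definitions above, the following sketch computes the sample mean and the sample variance (with the N - 1 divisor) for a made-up sample. The five data values are hypothetical and are not taken from the class notes.

```python
import statistics

# Hypothetical sample of N = 5 observations (illustrative values only).
sample = [72.0, 85.0, 91.0, 78.0, 84.0]
n = len(sample)

# Sample mean: sum of the observations divided by N.
x_bar = sum(sample) / n

# Sample variance: sum of squared deviations divided by N - 1.
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# statistics.variance uses the same N - 1 divisor, so it should agree.
s2_check = statistics.variance(sample)

print(x_bar, s2, s2_check)
```

Dividing by N - 1 rather than N is what makes the quantity match the $S^2$ used in the rest of these notes.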

Since $S^2$ is a random variable, it has a distribution. Our task is to find the
distribution of $S^2$.

Distribution of the sample variance $S^2$

Theorem
Let $X_1, X_2, \ldots, X_N$ be i.i.d. random variables, each having a normal
distribution with mean $\mu$ and variance $\sigma^2$. Then
$(N-1)S^2/\sigma^2$ follows a $\chi^2_{N-1}$ distribution (chi-square distribution with
$N-1$ degrees of freedom, d.f.).
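
The theorem can be checked informally by simulation. The sketch below draws many normal samples, computes $(N-1)S^2/\sigma^2$ for each, and compares the empirical mean and variance with the chi-square values $N-1$ and $2(N-1)$. The parameter values (mean 80, standard deviation 10, N = 10), the seed, and the number of repetitions are illustrative assumptions.

```python
import random
import statistics

random.seed(0)                    # fixed seed so the run is repeatable
mu, sigma, n = 80.0, 10.0, 10     # illustrative parameters (assumed)
reps = 20000                      # number of simulated samples (assumed)

scaled = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
    # The quantity that the theorem says is chi-square with n - 1 d.f.
    scaled.append((n - 1) * s2 / sigma ** 2)

mean_scaled = statistics.mean(scaled)      # should be close to n - 1 = 9
var_scaled = statistics.variance(scaled)   # should be close to 2(n - 1) = 18
print(mean_scaled, var_scaled)
```

Agreement of the first two moments does not prove the theorem; it is only a sanity check consistent with it.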

Properties of the $\chi^2_N$ distribution (chi-square distribution)


i) It is a right-skewed distribution.
ii) It takes only positive values.
iii) It has only one parameter, the degrees of freedom (d.f.).
iv) It is a leptokurtic distribution.
v) As the d.f. increases, the distribution (after standardization) converges to a normal distribution.
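
Properties i), iv) and v) can be read off from the standard moment formulas for a chi-square variable with $k$ d.f. These formulas are textbook facts, restated here for reference rather than taken from the class notes.

```latex
\begin{align*}
E[\chi^2_k] &= k, & \operatorname{Var}[\chi^2_k] &= 2k, \\
\text{skewness} &= \sqrt{8/k} > 0, & \text{excess kurtosis} &= 12/k > 0, \\
\frac{\chi^2_k - k}{\sqrt{2k}} &\xrightarrow{d} N(0, 1) \quad \text{as } k \to \infty.
\end{align*}
```

Both the skewness and the excess kurtosis shrink to 0 as $k$ grows, which agrees with the convergence to the normal distribution.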
Exercise-1
The Ever-Effective advertising company is constructing an aptitude test for a job.
Mr. Sukumar, the HRD manager, feels that it is important to plan for a fairly large
variance in the test scores so that the best applicants can be easily identified. For
a certain test, scores are assumed to be normally distributed with a mean of 80
and a standard deviation of 10. Ten applicants are to take the aptitude test.
Find the approximate probability that the sample variance of the scores for these
applicants is greater than 200.

Hint:
Here $N = 10$ and, according to the question, $\sigma^2 = 10^2 = 100$.
We have to find the probability $P(S^2 > 200)$:
$$P(S^2 > 200) = P\left(\frac{(N-1)S^2}{\sigma^2} > \frac{(N-1) \cdot 200}{\sigma^2}\right)$$
$(N-1)S^2/\sigma^2$ follows a $\chi^2_{N-1}$ distribution. With $N = 10$ and $\sigma^2 = 100$,
$$P(S^2 > 200) = P(\chi^2_9 > 18) \approx 0.035,$$
that is, roughly 3.5%.
Use an Excel function, for example CHISQ.DIST.RT(18, 9), to calculate the probability.
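
As an alternative to Excel, the tail probability in Exercise-1 can be sketched in Python. The helper `chi2_cdf` below is a hypothetical name written for this example; it evaluates the chi-square CDF through the power series for the regularized lower incomplete gamma function. If SciPy is available, `scipy.stats.chi2` would normally be used instead.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF (hypothetical helper, series for the incomplete gamma)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # first term of the series
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

n = 10            # number of applicants
sigma2 = 100.0    # population variance (standard deviation 10)
threshold = 200.0

# Scale the threshold the same way as S^2: (N - 1) * 200 / sigma^2 = 18.
stat = (n - 1) * threshold / sigma2

# Upper-tail probability P(chi2 with 9 d.f. > 18).
p = 1.0 - chi2_cdf(stat, n - 1)
print(stat, round(p, 3))
```

The scaled threshold is 18, and the resulting probability is a few percent, in line with the hint.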

Confidence interval (C.I.) for the population variance $\sigma^2$

Suppose we have taken a sample of size 25 and the sample variance is found to be
$S^2 = 36$.
Our goal is to construct an interval $(a, b)$ such that
$$P(a < \sigma^2 < b) = 95\%$$
$$P(1/b < 1/\sigma^2 < 1/a) = 95\%$$
$$P\left(\frac{(N-1)S^2}{b} < \frac{(N-1)S^2}{\sigma^2} < \frac{(N-1)S^2}{a}\right) = 95\%$$
$(N-1)S^2/\sigma^2$ follows a $\chi^2_{N-1}$ distribution. Here $N = 25$, so
$$P\left(\frac{(N-1)S^2}{b} < \chi^2_{24} < \frac{(N-1)S^2}{a}\right) = 95\%$$
Using Excel, we find that
$$P(\chi^2_{24} < 12.40) = 0.025$$
$$P(\chi^2_{24} < 39.36) = 0.975$$
Putting 2.5% in each tail, we set
$$\frac{(N-1)S^2}{b} = 12.40$$
With $S^2 = 36$ and $N - 1 = 24$, this gives
$$b = 69.67$$
Similarly,
$$\frac{(N-1)S^2}{a} = 39.36$$
$$a = 21.95$$
So, the 95% C.I. for $\sigma^2$ is $(21.95, 69.67)$.
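
The interval limits $a = (N-1)S^2/\chi^2_{0.975,24}$ and $b = (N-1)S^2/\chi^2_{0.025,24}$ can be sketched in Python as follows. The helpers `chi2_cdf` and `chi2_ppf` are hypothetical names written for this example (a series for the CDF and bisection for the quantile); with SciPy installed, `scipy.stats.chi2` would normally replace them.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF (hypothetical helper, series for the incomplete gamma)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_ppf(p, df):
    """Chi-square quantile by bisection on chi2_cdf (hypothetical helper)."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

n, s2, level = 25, 36.0, 0.95
alpha = 1.0 - level

q_low = chi2_ppf(alpha / 2.0, n - 1)          # about 12.40
q_high = chi2_ppf(1.0 - alpha / 2.0, n - 1)   # about 39.36

a = (n - 1) * s2 / q_high   # lower limit: divide by the LARGER quantile
b = (n - 1) * s2 / q_low    # upper limit: divide by the SMALLER quantile
print(round(a, 2), round(b, 2))
```

Because the quantiles sit in the denominator, the larger quantile produces the lower limit, so the resulting interval runs from about 21.95 to about 69.67.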

Hypothesis testing for population variance

The systolic blood pressure readings of males between the ages of 35 and 59 show
a standard deviation of 17 millimeters. A sample of 41 male runners in the
age group of 35 to 59 showed a (sample) standard deviation of 15 millimeters. Test
the claim that runners in this age group show less variability in their systolic blood
pressure, using a 5% level of significance.
Here
$$H_0: \sigma^2 = 17^2$$
$$H_1: \sigma^2 < 17^2$$

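Method-1
The decision rule is: reject the null hypothesis if
$$s^2 < k$$
where $k$ is chosen so that
$$P(\text{type-1 error}) = 5\%$$
$$P(\text{reject null when it is true}) = 5\%$$
$$P(s^2 < k \mid H_0 \text{ true}) = 5\%$$
$$P\left(\frac{(N-1)s^2}{\sigma^2} < \frac{(N-1)k}{\sigma^2} \,\middle|\, H_0 \text{ true}\right) = 5\%$$
$(N-1)S^2/\sigma^2$ follows a $\chi^2_{N-1}$ distribution. Here $N = 41$, so
$$P\left(\chi^2_{40} < \frac{(N-1)k}{\sigma^2} \,\middle|\, H_0 \text{ true}\right) = 5\%$$
Under $H_0$, $\sigma^2 = 17^2$, so
$$P\left(\chi^2_{40} < \frac{(N-1)k}{17^2}\right) = 5\%$$
Using Excel,
$$P(\chi^2_{40} < 26.50) = 0.05$$
Hence, with $N = 41$,
$$\frac{(N-1)k}{17^2} = 26.50$$
$$k = 191.46$$
According to the question,
$$s^2 = 15^2 = 225$$
so $s^2 > k$.
So, we cannot reject the null hypothesis at the 5% significance level.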
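
The critical value $k$ of Method-1 can be sketched in Python. The helpers `chi2_cdf` and `chi2_ppf` are hypothetical names written for this example; `scipy.stats.chi2.ppf` would normally be used instead if SciPy is available. Because the code uses the unrounded quantile (about 26.51) rather than 26.50, the computed $k$ differs slightly from 191.46.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF (hypothetical helper, series for the incomplete gamma)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    j = 1
    while term > 1e-15 * total:
        term *= z / (a + j)
        total += term
        j += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_ppf(p, df):
    """Chi-square quantile by bisection on chi2_cdf (hypothetical helper)."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

n = 41
sigma2_h0 = 17.0 ** 2    # variance under the null hypothesis
s2 = 15.0 ** 2           # observed sample variance
alpha = 0.05

q05 = chi2_ppf(alpha, n - 1)        # lower 5% quantile of chi-square, 40 d.f.
k = q05 * sigma2_h0 / (n - 1)       # solve (N - 1) k / 17^2 = q05 for k

reject = s2 < k                     # left-tailed decision rule
print(round(q05, 2), round(k, 2), reject)
```

Since the observed $s^2 = 225$ is above $k$, `reject` is `False`, matching the conclusion above.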
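Method-2
Reject the null hypothesis if
$$\frac{(N-1)s^2}{\sigma^2_{H_0}} < \chi^2_{0.05,40}$$

where $\chi^2_{0.05,40}$ is the lower 5% quantile (5th percentile) of a chi-square distribution with 40 d.f.

Under $H_0$, $\sigma^2_{H_0} = 17^2$, so
$$\frac{(N-1)s^2}{\sigma^2_{H_0}} = \frac{40 \times 225}{289} = 31.14, \qquad \chi^2_{0.05,40} = 26.50$$

Since $31.14 > 26.50$, we cannot reject the null hypothesis at the 5% significance level.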

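Method-3

p-value $= P(s^2 < s^2_{\text{obtained}} \mid H_0 \text{ true})$
$$= P(s^2 < 15^2 \mid H_0 \text{ true})$$
$$= P\left(\frac{(N-1)s^2}{\sigma^2} < \frac{(N-1) \cdot 15^2}{\sigma^2} \,\middle|\, H_0 \text{ true}\right)$$
Under $H_0$, $\sigma^2 = 17^2$.
$(N-1)S^2/\sigma^2$ follows a $\chi^2_{N-1}$ distribution. Here $N = 41$, so
$$\text{p-value} = P\left(\chi^2_{40} < \frac{(41-1) \cdot 225}{17^2}\right) = P(\chi^2_{40} < 31.14) \approx 0.16$$

The p-value is about 16%, which is greater than 5%.
So, we cannot reject the null hypothesis at the 5% significance level.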
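
The test statistic of Method-2 and the p-value of Method-3 can be sketched together in Python. The helper `chi2_cdf` is a hypothetical name written for this example; `scipy.stats.chi2.cdf` would normally be used instead if SciPy is available. The critical value 26.50 is the one quoted in the notes.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF (hypothetical helper, series for the incomplete gamma)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    j = 1
    while term > 1e-15 * total:
        term *= z / (a + j)
        total += term
        j += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

n = 41
sigma2_h0 = 17.0 ** 2    # variance under the null hypothesis
s2 = 15.0 ** 2           # observed sample variance
alpha = 0.05
crit = 26.50             # lower 5% quantile for 40 d.f., as quoted in the notes

# Method-2: compare the scaled statistic with the critical value.
stat = (n - 1) * s2 / sigma2_h0
reject_by_stat = stat < crit

# Method-3: left-tail p-value P(chi2 with 40 d.f. < stat).
p_value = chi2_cdf(stat, n - 1)
reject_by_p = p_value < alpha

print(round(stat, 2), reject_by_stat, reject_by_p)
```

Both decisions agree with Method-1, as they must, because the three methods are equivalent ways of applying the same left-tailed test.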

We also solved a problem on the variability of the times at which IIMN students
wake up on weekday mornings (see class notes).

Next, we started our discussion of two-population mean tests and two-population
variance tests.
