3. INTERVAL ESTIMATION
One of the most important tasks in statistics is to use sample statistics to estimate unknown
population parameters, for example the population mean or the population proportion. There are two ways to
estimate a parameter: point estimation and interval estimation.
Point Estimation
In point estimation, we use a single numerical value to estimate the value of a population parameter. For
example, we may want to estimate the value of an unknown population mean. In this case, some of the possible
candidates that we can use to estimate the population mean include sample mean, sample median, sample mode,
etc. The question then becomes which of these candidates is the best estimator for the population mean.
Similarly, among those candidates that we can choose to be the point estimator for the population proportion,
which one is the best?
Statisticians use sample mean to estimate the population mean, sample proportion to estimate the population
proportion, and sample standard deviation to estimate the population standard deviation. This implies that
statisticians think these sample statistics are the best estimators for the corresponding population parameters.
But why? By what standards are these sample statistics considered better than other estimators? Among the
criteria that statisticians use to choose a point estimator, the following are the more important ones:
1. Unbiased Estimator A sample statistic is called an unbiased estimator of a population parameter if the
mean of all possible values of that sample statistic is equal to the value of that parameter.
Recall from our previous example about the distribution of sample means. We found that the mean of the
sample means is always equal to the population mean. Hence, the sample mean is an unbiased estimator of the
population mean. Similarly, it can be shown that the sample proportion and the sample variance (computed
with divisor n - 1) are unbiased estimators of the population proportion and the population variance,
respectively. The sample standard deviation is not exactly unbiased: on average it slightly underestimates
the population standard deviation, although the bias shrinks as the sample size grows. You should know that
the sample mean may not be the only unbiased estimator of the population mean. For unimodal and symmetric
distributions, the sample median is also an unbiased estimator of the population mean.
2. Consistency An unbiased estimator is said to be consistent if the average squared difference between the
estimator and the parameter shrinks toward zero as the sample size becomes larger.
Again, recall from our previous example on the distribution of sample means. We found that the standard
error of the sample mean is equal to $\sigma/\sqrt{n}$, where n is the sample size and $\sigma$ is the population standard
deviation. Hence, as the sample size becomes larger, the average distance between sample means and the
population mean (measured by the standard error of the sample mean) becomes smaller. Therefore, the sample
mean is a consistent estimator of the population mean. Similarly, the sample proportion is a consistent
estimator of the population proportion.
3. Relative Efficiency If there are two unbiased estimators of a parameter, the one whose variance is smaller
is said to be relatively more efficient. For example, the sample mean and the sample median are both unbiased
estimators of the population mean when the population is normal. However, the variance of the sample mean,
$\sigma^2/n$, is smaller than the variance of the sample median, approximately $1.57\,\sigma^2/n$. Hence, the sample mean is
relatively more efficient than the sample median as an estimator of the population mean.
Statisticians argue that a good estimator should satisfy as many of the above criteria as possible. Relatively
speaking, the sample mean, sample proportion, sample variance, and sample standard deviation do better on
these criteria than the other candidate estimators.
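The three criteria above can be checked numerically. The sketch below is only an illustration under stated assumptions: the five-value population, the sample sizes, and the function names are all hypothetical choices, and the efficiency comparison is a seeded simulation rather than an exact result. It enumerates every possible sample drawn with replacement to check unbiasedness and consistency, then simulates normal samples to compare the variances of the sample mean and sample median.

```python
import itertools
import math
import random
import statistics

population = [1, 2, 3, 4, 5]                 # hypothetical toy population
mu = statistics.mean(population)             # population mean
sigma2 = statistics.pvariance(population)    # population variance (divisor N)

def all_samples(n):
    """Every ordered sample of size n drawn with replacement."""
    return list(itertools.product(population, repeat=n))

# 1. Unbiasedness: average each statistic over all possible samples of size 2.
samples = all_samples(2)
mean_of_means = statistics.mean(statistics.mean(s) for s in samples)
mean_of_variances = statistics.mean(statistics.variance(s) for s in samples)

# 2. Consistency: the variance of the sample mean equals sigma^2 / n,
#    so it shrinks as the sample size n grows.
def variance_of_sample_mean(n):
    return statistics.pvariance([statistics.mean(s) for s in all_samples(n)])

# 3. Relative efficiency: simulated ratio var(median) / var(mean)
#    for samples from a standard normal population.
def efficiency_ratio(n=25, reps=4000, seed=0):
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(medians) / statistics.pvariance(means)

print("population mean:", mu, " mean of sample means:", mean_of_means)
print("population variance:", sigma2, " mean of sample variances:", mean_of_variances)
for n in (1, 2, 4):
    print("n =", n, " variance of sample mean:", variance_of_sample_mean(n))
print("var(median) / var(mean), simulated:", round(efficiency_ratio(), 2))
```

The first two lines of output should show each average matching the corresponding population value, and the simulated ratio should come out above 1, in the neighbourhood of the 1.57 quoted above (that figure is a large-sample value, so a finite sample size gives a somewhat smaller ratio).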
Interval Estimation
Point estimation is good in the sense that it provides a very sharp estimate of the unknown population
parameter. However, it has the following major drawbacks:
1. With only a single point, it is unlikely that the point estimate will actually hit the target. For example,
when we use a sample with mean equal to 10 to estimate the population mean, it is unlikely that the
population mean will be exactly equal to 10. (You may get very close, but it’s hard to be exactly right.)
2. A point estimate doesn’t allow us to specify how confident we are in our estimate. For example, again,
when we use a sample mean of 10 to estimate a population mean, how confident are we in this
estimate?
Mainly for these two reasons, statisticians often use an interval, rather than a single point, to make
an estimate. For example, instead of estimating the population mean to be 10, we may estimate it to be
somewhere between 8 and 12. This interval, from 8 to 12, is our interval estimate of the population mean.
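Because about 95% of all possible sample means lie within 1.96 standard errors of the population mean, a 95%
confidence interval for the population mean $\mu$ is $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$, where $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ is the standard
error of the sample mean. By the same reasoning, a 90% confidence interval for $\mu$ is $\bar{x} \pm 1.645\,\sigma_{\bar{x}}$,
and a 99% confidence interval for $\mu$ is $\bar{x} \pm 2.58\,\sigma_{\bar{x}}$. (Why?)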
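One way to see where the multiplier 1.96 comes from is the short derivation below. It is a sketch that assumes the sample mean $\bar{x}$ is (at least approximately) normally distributed with mean $\mu$ and standard error $\sigma_{\bar{x}}$, so that the standardized sample mean behaves like a standard normal Z.

```latex
% Assumption: \bar{x} is (approximately) normal with mean \mu and standard error \sigma_{\bar{x}}.
\begin{align*}
P\!\left(-1.96 \le \frac{\bar{x}-\mu}{\sigma_{\bar{x}}} \le 1.96\right) &= 0.95 \\
P\!\left(-1.96\,\sigma_{\bar{x}} \le \bar{x}-\mu \le 1.96\,\sigma_{\bar{x}}\right) &= 0.95 \\
P\!\left(\bar{x}-1.96\,\sigma_{\bar{x}} \le \mu \le \bar{x}+1.96\,\sigma_{\bar{x}}\right) &= 0.95
\end{align*}
```

Replacing 1.96 with the Z value that leaves 5% (or 0.5%) in each tail gives the 90% (or 99%) interval in the same way.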
In summary, to construct a $100(1-\alpha)\%$ confidence interval for $\mu$, we use the following formula:
$$\bar{x} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}, \qquad (1)$$
where $(1 - \alpha)$ is the ”confidence level”, and $Z_{\alpha/2}$ is the value of Z such that $P(Z \ge Z_{\alpha/2}) = \alpha/2$. Notice that, to use
the formula given by (1), it is required that either (a) the population is normal or (b) the sample size is
sufficiently large.
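Formula (1) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `z_confidence_interval` and its parameter names are hypothetical, and it assumes the population standard deviation is known and that the normality or large-sample condition above holds. The critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval from formula (1): sample_mean +/- Z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z value with area alpha/2 in the upper tail
    margin = z * sigma / sqrt(n)              # half-width of the interval
    return sample_mean - margin, sample_mean + margin

# A sample of size 100 with mean 80 from a population with sigma = 5, 90% confidence.
low, high = z_confidence_interval(80, 5, 100, confidence=0.90)
print(round(low, 2), round(high, 2))
```

With these inputs the rounded limits are 79.18 and 80.82.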
In Excel, the function “NORMSINV” can be used to find the critical value of Z. If the confidence level is
$1 - \alpha$, then entering “=NORMSINV(1 - $\alpha$/2)” will return the desired critical Z value. For example, the Z
value needed to construct a 95% confidence interval can be found by entering the formula
“=NORMSINV(0.975)”, which returns the value 1.96. Similarly, the Z value required for a 99% confidence
interval can be found by using the formula “=NORMSINV(0.995)”, which returns the value 2.58.
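For readers working outside Excel, the same lookup can be sketched in Python. The helper name `z_critical` is hypothetical. It relies on `statistics.NormalDist().inv_cdf`, the standard library's inverse of the standard normal cumulative distribution, which plays the same role here as NORMSINV.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Critical Z for a two-sided interval, i.e. the inverse normal CDF at 1 - alpha/2."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

To three decimals this gives 1.282, 1.645, 1.960, and 2.576, which round to the familiar 1.28, 1.645, 1.96, and 2.58.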
Examples:
1. Suppose a sample of size 9 was taken from a normal population with $\sigma$ = 3, and it was found that the
sample mean = 20.
a) Find a 90% confidence interval for $\mu$. (18.36-21.64)
b) Find a 95% confidence interval for $\mu$. (18.04-21.96)
c) Find a 99% confidence interval for $\mu$. (17.42-22.58)
d) Find an 80% confidence interval for $\mu$. (18.72-21.28)
2. Suppose a sample of size 100 was taken from a population with $\sigma$ = 5, and it was found that the sample
mean = 80.
a) Find a 90% confidence interval for $\mu$. (79.18-80.82)
b) Find a 95% confidence interval for $\mu$.
c) Find a 99% confidence interval for $\mu$.
d) Find an 80% confidence interval for $\mu$.
In Excel, the function “TINV” can be used to find the critical value of t. Notice that TINV is not used in
quite the same way as NORMSINV. (I’ll explain this in class. Your textbook explains this in detail;
see pp. 426-429.) The TINV function requires two arguments: the two-tailed probability and the degrees of
freedom. If the confidence level is $1 - \alpha$ and the degrees of freedom is df, then entering “=TINV($\alpha$, df)” will
return the desired critical t value. For example, if the degrees of freedom is 20, then the t value needed to
construct a 95% confidence interval can be found by entering the formula “=TINV(0.05, 20)”, which returns the value 2.0860.
Recall that when n is sufficiently large, the t distribution will be very close to the Z distribution. Hence, we can use
$Z_{\alpha/2}$ to replace $t_{\alpha/2,\,n-1}$ in the above formula when n gets sufficiently large.
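The critical t value can also be approximated without Excel. Python's standard library has no inverse t function, so the sketch below is a swapped-in numerical approach, not what TINV does internally: it integrates the t density with Simpson's rule and finds the quantile by bisection. All function names are hypothetical, and the interval function assumes the usual form $\bar{x} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}$ with s the sample standard deviation.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x > 0: one half plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value, analogous to =TINV(1 - confidence, df)."""
    target = 1 - (1 - confidence) / 2      # leave alpha/2 in the upper tail
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:          # widen until the quantile is bracketed
        hi *= 2
    for _ in range(60):                    # bisection on the bracket
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(sample_mean, s, n, confidence=0.95):
    """Assumed form: sample_mean +/- t_{alpha/2, n-1} * s / sqrt(n)."""
    margin = t_critical(confidence, n - 1) * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

print(round(t_critical(0.95, 20), 4))      # compare with =TINV(0.05, 20)
```

The printed value should agree with the 2.0860 returned by “=TINV(0.05, 20)”. As the degrees of freedom grow, `t_critical(0.95, df)` drifts down toward the Z value 1.96, which is the large-sample replacement described above.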
Examples:
3. A random sample of 25 homeowners in the New Territories showed that the sample mean mortgage payment is
HK$18,000 per month. The sample standard deviation was HK$1,500. Assuming that the mortgage payments
are approximately normally distributed, find an interval within which you can be 98% confident that the
true mean mortgage payment per month of all New Territories homeowners will lie. How about if you want
to be 90%, 80%, or 70% sure?
4. A random sample of 400 college students showed that, on average, they spent 1.5 hours studying, with a
sample standard deviation of 0.5 hours. Assuming that this study time follows a normal distribution, find a
90% confidence interval for the true average study time of the college students. How about an 80%, a 60%,
a 99%, or a 95% confidence interval? (90%: 1.46-1.54)