466 Chapter8

8 Parameter Estimation
8.1 Introduction
Parameter estimation is an important type of statistical
inference: we use the sample (observations) to estimate the value of an
unknown parameter that characterizes the population.
Examples of practical application:
- estimating the proportion $p$ of washers that are expected to fail before
the end of a 1-year warranty
- estimating the mean $\mu$ of the waiting time at a supermarket checkout
- estimating the standard deviation $\sigma$ of the measurement error of
an electronic instrument
The quantities $p$, $\mu$, and $\sigma$ are called target parameters.
Parameter Estimators
An estimator, $\hat\theta$, is a statistic (a function of the sample) used to estimate
the value of a parameter $\theta$.
Since estimators are functions of random variables, they are themselves
random variables.
Estimators (statistics) follow probability distributions, often referred to
as sampling distributions. They have
- expected values: $E(\hat\theta)$;
- variances: $V(\hat\theta) = \sigma^2_{\hat\theta}$;
- standard deviations (often called standard errors): $\sqrt{V(\hat\theta)} = \sigma_{\hat\theta}$.
Example. Data: 4.3 2.0 3.3 3.0 2.3 3.0 3.4 4.1 2.7 4.0
Model: $X_1, X_2, \ldots, X_{10}$ are i.i.d. random variables from $N(\mu, \sigma^2)$,
i.e., we assume that we know something about the distribution that
generated the data, but not the values of the parameters $\mu$ and $\sigma^2$.
Objectives:
- Estimate $\mu$ and $\sigma^2$
- Study the properties of the estimators
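Assumed model: $X_1, X_2, \ldots, X_n$ i.i.d. from $N(\mu, \sigma^2)$.
From this model, we often use the estimators:
Estimator for $\mu$:
$$\hat\mu = \bar X = \frac{1}{n} \sum_{i=1}^{n} X_i$$
Estimator for $\sigma^2$:
$$\hat\sigma^2 = S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)^2$$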
Estimates and Estimators
An estimate is the value obtained when an estimator is evaluated
at observed sample values. Since estimates are functions of numbers,
they are themselves numbers.
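Example 8.1.1 Data: 4.3 2.0 3.3 3.0 2.3 3.0 3.4 4.1 2.7 4.0
To estimate the mean $\mu$, we might use the sample mean:
$$\bar X = \frac{1}{10} \sum_{i=1}^{10} X_i \quad \text{(estimator)}$$
$$\bar x = \frac{1}{10} \sum_{i=1}^{10} x_i = 3.21 \quad \text{(estimate)}$$
To estimate the variance $\sigma^2$, we might use the sample variance:
$$S^2 = \frac{1}{10-1} \sum_{i=1}^{10} (X_i - \bar X)^2 \quad \text{(estimator)}$$
$$s^2 = \frac{1}{10-1} \sum_{i=1}^{10} (x_i - \bar x)^2 = 0.59 \quad \text{(estimate)}$$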
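The two estimates in Example 8.1.1 can be reproduced with a short script. This is a minimal sketch using Python's standard library; the variable names are illustrative and not part of the notes.

```python
from statistics import mean, variance

# Observed sample from Example 8.1.1
data = [4.3, 2.0, 3.3, 3.0, 2.3, 3.0, 3.4, 4.1, 2.7, 4.0]

x_bar = mean(data)   # sample mean: estimate of mu
s2 = variance(data)  # sample variance with divisor n - 1: estimate of sigma^2

print(round(x_bar, 2), round(s2, 2))  # prints 3.21 0.59
```

The functions `mean` and `variance` play the role of the estimators; the printed numbers are the estimates obtained from this particular sample.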
8.2 Properties of Estimators
Question: If a marksman fires once and hits the bull's eye, do you
conclude that he is an excellent shooter?
In statistical problems, the target itself is completely unknown. How can we
evaluate the performance of different estimators? The only way is
to use probabilistic methods.
The key idea is that, for each estimator, we obtain multiple measurements
(replicated estimates), and then study the average performance of
the estimator.
An Illustrative Example: Who do you hire?
Figure 1: Position of n = 50 shots by three marksmen (panels: Marksman I,
Marksman II, Marksman III, All marksmen; axes x and y). The red circle
indicates the bull's eye.
Desired Properties of Estimators
There are many possible estimators for a parameter. For example, for
a random sample from $N(\mu, \sigma^2)$, we might use the sample mean or
the sample median as estimators of the mean $\mu$.
Good accuracy:
- measured by the bias of the estimator
- prefer an estimator that is right on the target
Good precision:
- measured by the variance of the estimator
- prefer an estimator that is not too variable
We will learn three evaluation criteria:
- unbiasedness, variance (or standard deviation), and mean squared error
Properties of Estimators: Unbiasedness
An estimator, $\hat\theta$, of a parameter, $\theta$, is said to be unbiased if its
expected value is equal to the quantity it is estimating. That is, $\hat\theta$ is
unbiased if
$$E(\hat\theta) = \theta, \quad \text{for all values of } \theta.$$
In other words, an estimator is unbiased if
the mean of its sampling distribution
= the parameter being estimated.
In the long run, the estimator is centered at the truth $\theta$.
Measure of Bias
If an estimator is not unbiased, we say it is biased. The bias of an
estimator is measured as:
$$B(\hat\theta) = E(\hat\theta) - \theta.$$
An estimator, $\hat\theta$, for which $B(\hat\theta) = 0$ for all $\theta$
is said to be unbiased.
Example 8.2.1 For a random sample from $N(\mu, \sigma^2)$, is $\hat\mu = \bar X$ an
unbiased estimator of $\mu$?
Proof:
Example 8.2.2 For a random sample with mean $\mu$ and variance $\sigma^2$,
$\hat\sigma^2 = S^2$ is an unbiased estimator of $\sigma^2$ (without assuming normality).
Proof:
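A simulation does not replace the proof, but it can make the claim of Example 8.2.2 concrete. The sketch below assumes a Uniform(0, 1) population (so no normality), a sample size of 5, and 20000 replications; all of these choices and the helper name `sample_variance` are illustrative.

```python
import random

random.seed(0)
n, reps = 5, 20000
true_var = 1 / 12  # variance of Uniform(0, 1)

def sample_variance(xs, ddof):
    """Sum of squared deviations divided by (len(xs) - ddof)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - ddof)

s2_values, biased_values = [], []
for _ in range(reps):
    sample = [random.random() for _ in range(n)]
    s2_values.append(sample_variance(sample, ddof=1))      # divisor n - 1
    biased_values.append(sample_variance(sample, ddof=0))  # divisor n

avg_s2 = sum(s2_values) / reps          # should be close to true_var
avg_biased = sum(biased_values) / reps  # should be close to (n - 1)/n * true_var

print(f"true variance      : {true_var:.4f}")
print(f"average of S^2     : {avg_s2:.4f}")
print(f"average with 1/n   : {avg_biased:.4f}")
```

The average of $S^2$ over many replications lands near $\sigma^2$, while the version with divisor $n$ is centered near $\frac{n-1}{n}\sigma^2$, which is what a biased estimator looks like in repeated sampling.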
Properties of Estimators: Precision
The precision of an estimator $\hat\theta$ is a measure of the tendency of that
estimator to take values closely clustered around the mean of its
sampling distribution.
Precision is generally measured using the standard error, $\sigma_{\hat\theta}$, of the
estimator.
The smaller the standard error, the greater the precision.
Given two unbiased estimators of a parameter $\theta$, all other things
being equal, we would select the estimator with the smaller variance.
Figure 2: Sampling distribution for two unbiased estimators: (a) large
variation; (b) small variation.
Example. Suppose $Y_1, Y_2, Y_3$ is a random sample from $\mathrm{Exp}(\theta)$.
Consider the following estimators of $\theta$. Which are unbiased? Among
the unbiased estimators, which has the smallest variance?
$$\hat\theta_1 = Y_1, \quad \hat\theta_2 = \frac{Y_1 + Y_2}{2}, \quad \hat\theta_3 = \frac{Y_1 + 2Y_2}{3}, \quad \hat\theta_4 = \min(Y_1, Y_2, Y_3), \quad \hat\theta_5 = \bar Y.$$
Answer:
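Before working out the answer analytically, the five estimators can be compared by simulation. The sketch below assumes that $\mathrm{Exp}(\theta)$ is parameterized by its mean $\theta$ and uses the illustrative value $\theta = 2$; the dictionary keys and the helper `summarize` are hypothetical names.

```python
import random

random.seed(1)
theta, reps = 2.0, 50000  # assumed: theta is the mean of the exponential

def summarize(values):
    """Return (mean, sample variance) of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v

draws = {"theta1": [], "theta2": [], "theta3": [], "theta4": [], "theta5": []}
for _ in range(reps):
    # expovariate takes the rate, i.e. 1 / mean
    y1, y2, y3 = (random.expovariate(1 / theta) for _ in range(3))
    draws["theta1"].append(y1)
    draws["theta2"].append((y1 + y2) / 2)
    draws["theta3"].append((y1 + 2 * y2) / 3)
    draws["theta4"].append(min(y1, y2, y3))
    draws["theta5"].append((y1 + y2 + y3) / 3)

results = {name: summarize(vals) for name, vals in draws.items()}
for name, (m, v) in results.items():
    print(f"{name}: simulated mean = {m:.3f}, simulated variance = {v:.3f}")
```

Estimators whose simulated mean sits near $\theta$ are candidates for being unbiased; among those, the simulated variances suggest which one is most precise. The simulation is only a check on the algebra, not a substitute for it.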
Properties of Estimators: Mean Squared Error (MSE)
If $\hat\theta$ is an estimator of a parameter $\theta$, then the Mean Squared Error
(MSE) of $\hat\theta$ is defined by:
$$\mathrm{MSE}(\hat\theta) = E\left[(\hat\theta - \theta)^2\right].$$
The mean squared error is an important criterion for comparing two
estimators. Show that:
$$\mathrm{MSE}(\hat\theta) = V(\hat\theta) + [B(\hat\theta)]^2 = [\text{standard error}]^2 + [\text{bias}]^2.$$
MSE naturally takes into account both the bias and the variance of an
estimator and combines them into one number.
If $\hat\theta$ is unbiased, then $\mathrm{MSE}(\hat\theta) = V(\hat\theta)$.
Proof:
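One possible route to the decomposition is to add and subtract $E(\hat\theta)$ inside the square. The following is a sketch of that standard argument, written with the symbols already used in these notes.

```latex
\begin{align*}
\mathrm{MSE}(\hat\theta)
  &= E\big[(\hat\theta - \theta)^2\big] \\
  &= E\big[\big(\hat\theta - E(\hat\theta) + E(\hat\theta) - \theta\big)^2\big] \\
  &= E\big[(\hat\theta - E(\hat\theta))^2\big]
     + 2\,\big(E(\hat\theta) - \theta\big)\,E\big[\hat\theta - E(\hat\theta)\big]
     + \big(E(\hat\theta) - \theta\big)^2 \\
  &= V(\hat\theta) + 0 + [B(\hat\theta)]^2 .
\end{align*}
```

The cross term vanishes because $E(\hat\theta) - \theta$ is a constant and $E[\hat\theta - E(\hat\theta)] = 0$.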
Example. Take a random sample $Y_1, Y_2, \ldots, Y_n$ from $\mathrm{Unif}(\theta, \theta + 1)$.
(a) Show that $\bar Y$ is a biased estimator for $\theta$ and compute the bias.
(b) Find a function of $\bar Y$ that is unbiased for $\theta$.
(c) Find $\mathrm{MSE}(\bar Y)$ when $\bar Y$ is used as an estimator of $\theta$.
Answer:
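A quick simulation can be used to check the algebra for parts (a) and (c). The sketch assumes illustrative values $\theta = 3$ and $n = 10$, and compares the simulated bias and MSE of $\bar Y$ with the closed forms $1/2$ and $\frac{1}{12n} + \frac14$ that follow from $E(Y_i) = \theta + \frac12$ and $V(Y_i) = \frac{1}{12}$.

```python
import random

random.seed(2)
theta, n, reps = 3.0, 10, 20000  # illustrative values

errors = []
for _ in range(reps):
    y_bar = sum(random.uniform(theta, theta + 1) for _ in range(n)) / n
    errors.append(y_bar - theta)  # estimation error of Y-bar

sim_bias = sum(errors) / reps
sim_mse = sum(e * e for e in errors) / reps

exact_bias = 0.5                   # E(Y-bar) - theta
exact_mse = 1 / (12 * n) + 0.25    # variance + bias^2

print(f"bias: simulated {sim_bias:.4f}, exact {exact_bias:.4f}")
print(f"MSE : simulated {sim_mse:.4f}, exact {exact_mse:.4f}")
```

Subtracting the bias, i.e. using $\bar Y - \frac12$, removes the systematic error while leaving the variance unchanged.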
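Comparison of Different Estimators
Given many estimators, we compare them based on their properties
(bias, precision, MSE, etc.).
- Bias: prefer $B(\hat\theta)$ to be small or zero.
- Precision: prefer $\sigma_{\hat\theta}$ to be small.
- MSE: prefer $\mathrm{MSE}(\hat\theta)$ to be small.
We often seek unbiased estimators with relatively small variances.
Let $\hat\theta_1$ and $\hat\theta_2$ be two estimators of $\theta$. The Relative Efficiency of
$\hat\theta_1$ relative to $\hat\theta_2$ is
$$\mathrm{RE}(\hat\theta_1, \hat\theta_2) = \frac{\mathrm{MSE}(\hat\theta_2)}{\mathrm{MSE}(\hat\theta_1)}.$$
If $\mathrm{RE}(\hat\theta_1, \hat\theta_2) > 1$, then we say $\hat\theta_1$ is better than $\hat\theta_2$ in terms of MSE;
otherwise, $\hat\theta_2$ is better than $\hat\theta_1$ in terms of MSE.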
Choosing an Estimator from a Class
Example 8.2.3 Suppose $X$ is a random variable that follows the exponential
distribution with mean $\theta$. We wish to estimate the parameter $\theta$ using
an estimator of the form $\hat\theta = cX$. How should we choose the constant
$c$?
1. Choose $c$ to construct an unbiased estimator $\hat\theta_1$.
2. Choose $c$ to construct an estimator $\hat\theta_2$ that minimizes the MSE.
3. Calculate the relative efficiency of $\hat\theta_1$ against $\hat\theta_2$.
Answer:
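The choice of $c$ can be explored numerically. The sketch below uses the facts $E(cX) = c\theta$ and $V(cX) = c^2\theta^2$ for an exponential variable with mean $\theta$, then searches a grid of $c$ values; the function name `mse_cx`, the grid, and the value $\theta = 1$ are illustrative assumptions.

```python
def mse_cx(c, theta):
    """MSE of the estimator cX when X is exponential with mean theta."""
    variance = (c * theta) ** 2   # V(cX) = c^2 * theta^2
    bias = (c - 1) * theta        # E(cX) - theta
    return variance + bias ** 2

theta = 1.0                                    # illustrative value
grid = [i / 1000 for i in range(0, 2001)]      # c from 0 to 2 in steps of 0.001
c_best = min(grid, key=lambda c: mse_cx(c, theta))

# Relative efficiency of the unbiased choice (c = 1) against the MSE-optimal choice
re_unbiased_vs_best = mse_cx(c_best, theta) / mse_cx(1.0, theta)

print(f"c minimizing MSE on the grid: {c_best}")
print(f"RE(theta_hat_1, theta_hat_2): {re_unbiased_vs_best}")
```

Since the MSE here is $\theta^2(2c^2 - 2c + 1)$, the minimizing $c$ does not depend on $\theta$, which is why a single illustrative $\theta$ suffices for the search.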
8.3 Some Common Unbiased Point Estimators
Standard Estimators for One Continuous Sample
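Data type: $Y_1, \ldots, Y_n$ i.i.d. with common mean $\mu$ and variance $\sigma^2$.
Sample size: $n$.
1. Parameter $\theta$: $\mu$
Estimator $\hat\theta$:
Standard error $\sigma_{\hat\theta}$:
Answer:
2. Parameter $\theta$: $\sigma^2$
Estimator $\hat\theta$:
Standard error $\sigma_{\hat\theta}$:
Answer:
Two Independent Continuous Samples
Data type: $Y_{i1}, \ldots, Y_{in_i}$ i.i.d. $(\mu_i, \sigma_i^2)$, $i = 1, 2$ (two independent
groups).
Parameter $\theta$: $\mu_1 - \mu_2$
Estimator $\hat\theta$:
Standard error $\sigma_{\hat\theta}$:
Answer:
Standard Estimators of Binomial Samples
Data type: $Y \sim \mathrm{Binomial}(n, p)$
Parameter $\theta$: $p$
Estimator $\hat\theta$:
Standard error $\sigma_{\hat\theta}$:
Answer:
Two Independent Binomial Samples
Data type: $Y_i \sim \mathrm{Binomial}(n_i, p_i)$, $i = 1, 2$ (independent)
Parameter $\theta$: $p_1 - p_2$
Estimator $\hat\theta$:
Standard error $\sigma_{\hat\theta}$:
Answer: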
Figure 3: Expected values and standard errors of some common point
estimators.
8.4 Error of Estimation for an Estimator

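For an estimator $\hat\theta$ of $\theta$, define the absolute error of estimation as
$$\varepsilon = |\hat\theta - \theta|.$$
The absolute error of estimation is a random quantity, so its smallness
(largeness) should be determined probabilistically.
Probabilistic Bound on Error of Estimation
Can we find a $b$ such that $P(\varepsilon < b) = 0.95$?
$$P(|\hat\theta - \theta| < b) = P(-b < \hat\theta - \theta < b) = P(\theta - b \le \hat\theta \le \theta + b) = \int_{\theta - b}^{\theta + b} f(\hat\theta)\, d\hat\theta.$$
We call $(\hat\theta - b, \hat\theta + b)$ the 95% confidence interval of $\theta$, as
$$P(|\hat\theta - \theta| \le b) = P(-b \le \hat\theta - \theta \le b) = P\big(\theta \in (\hat\theta - b, \hat\theta + b)\big) = 0.95.$$
Figure 4: Sampling distribution of a point estimator $\hat\theta$.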
Tchebysheff's Theorem
For any estimator $\hat\theta$ of $\theta$,
$$P(|\hat\theta - \theta| \le b) \ge 1 - \frac{\mathrm{MSE}(\hat\theta)}{b^2}.$$
Suppose $\hat\theta$ is an unbiased estimator of $\theta$. From the above result, if we
take $b = 2\sqrt{\mathrm{MSE}(\hat\theta)} = 2\sigma_{\hat\theta}$, then the probability that the error of
estimation will be less than this bound ($2\sigma_{\hat\theta}$) is at least 0.75.
This bound is usually very conservative. The actual probabilities
usually exceed the Tchebysheff bounds by a considerable amount.
We call the quantity $b = 2\sigma_{\hat\theta}$ the
2-standard-error bound on the error of estimation for an unbiased
estimator $\hat\theta$. In practice, we have to approximate the error bound.
An Empirical Rule:
If $\hat\theta$ is approximately normally distributed, and $\hat\theta$ is unbiased for $\theta$, then
the probability that the error of estimation is less than $2\sigma_{\hat\theta}$ is around
0.95. (Recall the empirical rule from Chapter 1.)
Figure 5: Probability $P(\mu - 2\sigma < Y < \mu + 2\sigma)$, assuming $Y$ has mean $\mu$
and variance $\sigma^2$.
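To see how conservative the Tchebysheff bound is, the guaranteed probability for a 2-standard-error bound can be set beside the exact probability under a normal sampling distribution. This is a small sketch using the standard library's `NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

k = 2  # number of standard errors in the bound

# Tchebysheff: P(|theta_hat - theta| <= k * sigma) >= 1 - 1/k^2 for an unbiased estimator
chebyshev_lower_bound = 1 - 1 / k**2

# Exact probability if theta_hat is normally distributed
std_normal = NormalDist()
normal_prob = std_normal.cdf(k) - std_normal.cdf(-k)

print(f"Tchebysheff lower bound: {chebyshev_lower_bound:.2f}")
print(f"Normal probability     : {normal_prob:.4f}")
```

The gap between the two numbers is the reason the empirical rule, rather than the Tchebysheff bound, is usually quoted when the estimator is approximately normal.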
Example 8.4.1 Suppose $n = 1000$ voters were selected at random
and $x = 560$ of them favor proposal A. We wish to estimate $p$, the
proportion of voters in favor of A.
1. Estimate: $\hat p = $ ?
2. Standard error: $\sigma_{\hat p} = $ ?
3. 2-standard-error bound: $2\sigma_{\hat p} = $ ?
4. Estimated 2-standard-error bound: $2\hat\sigma_{\hat p} = $ ?
5. Interpretation: the probability that the error of estimation is less
than ____ is approximately 0.95, i.e., $\hat p = $ ____ is likely to be within ____
of the true value of $p$.
6. The interval ____ is the approximate 95% confidence
interval of $p$.
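The numerical parts of Example 8.4.1 can be sketched as follows, assuming the usual binomial standard error $\sqrt{p(1-p)/n}$ with $\hat p$ substituted for the unknown $p$. Variable names are illustrative.

```python
import math

n, x = 1000, 560

p_hat = x / n                                 # point estimate of p
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
bound = 2 * se_hat                            # estimated 2-standard-error bound
interval = (p_hat - bound, p_hat + bound)     # approximate 95% interval

print(f"p_hat = {p_hat:.2f}")
print(f"estimated standard error = {se_hat:.4f}")
print(f"estimated 2-standard-error bound = {bound:.4f}")
print(f"approximate 95% interval = ({interval[0]:.2f}, {interval[1]:.2f})")
```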
Example 8.4.2 Two types of automobile tires are compared. A
testing sample of $n_1 = n_2 = 100$ tires was taken from each type. The
number of miles until wear-out was recorded. The measurements for the
two types of tires were independent. The summary statistics are
$$\bar y_1 = 26{,}400, \quad \bar y_2 = 25{,}100, \quad s_1^2 = 1{,}440{,}000, \quad s_2^2 = 1{,}960{,}000.$$
1. Estimate the difference in mean miles to wear-out.
2. What is the standard error of the estimator?
3. Estimate the standard error.
4. Find the estimated 2-standard-error bound for the error of estimation.
Answer:
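A minimal sketch of the arithmetic in Example 8.4.2, assuming the estimator $\bar Y_1 - \bar Y_2$ with standard error $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ and the sample variances substituted for the unknown $\sigma_i^2$. Variable names are illustrative.

```python
import math

n1 = n2 = 100
ybar1, ybar2 = 26_400, 25_100
s1_sq, s2_sq = 1_440_000, 1_960_000

diff = ybar1 - ybar2                          # estimate of mu_1 - mu_2
se_hat = math.sqrt(s1_sq / n1 + s2_sq / n2)   # estimated standard error
bound = 2 * se_hat                            # estimated 2-standard-error bound

print(f"estimated difference = {diff} miles")
print(f"estimated standard error = {se_hat:.2f} miles")
print(f"estimated 2-standard-error bound = {bound:.2f} miles")
```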
8.5 Confidence Interval for a Parameter
Given a small number $\alpha$ (say 0.05 or 0.01), can we find two
estimators $\hat\theta_L$ and $\hat\theta_U$ such that
$$P(\hat\theta_L \le \theta \le \hat\theta_U) = 1 - \alpha?$$
The answer is YES! In that case,
$\hat\theta_L$ and $\hat\theta_U$ are called the lower and upper confidence limits
of $\theta$, respectively.
The probability $(1 - \alpha)$ is called the confidence coefficient, or
the level of confidence.
Compare Point Estimate with Interval Estimate
Point estimate: You estimate that the proportion of people in favor of
proposal A is $\hat p = 0.56$.
Confidence interval: You are 95% confident that the value of $p$ falls in
the interval:
$$0.53 \le p \le 0.59$$
How can we construct such intervals? One common method is the
following pivotal method.
The Pivotal Method
1. Find a pivotal quantity:
(a) A pivotal quantity is an expression that is a function of the estimator
$T = T(X_1, \ldots, X_n)$, the unknown parameter $\theta$, and known
quantities.
(b) In this expression, $\theta$ is the only unknown quantity.
(c) The expression has a probability distribution that does not
depend on any unknown parameter.
2. Invert the pivotal function to find an interval for $\theta$ in terms of $T$.
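Example 8.5.1 Suppose $X_1, \ldots, X_n$ are i.i.d. $N(\mu, 1)$. Construct a
$(1 - \alpha)100\%$ confidence interval for $\mu$.
Note that $\bar X \sim N(\mu, \frac{1}{n})$, so
$Z = \sqrt{n}(\bar X - \mu) \sim N(0, 1)$. Thus $Z$ is a pivotal quantity for $\theta = \mu$.
Find $a, b$ such that $P(a \le Z \le b) = 1 - \alpha$, i.e.
$$P\big(a \le \sqrt{n}(\bar X - \mu) \le b\big) = 1 - \alpha.$$
Invert the two inequalities to obtain the lower and upper bounds for $\mu$:
$\hat\mu_L = $ ?
$\hat\mu_U = $ ?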
Answer:
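Normal Sample with $\sigma$ known:
Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$, where $\sigma^2$ is
known. To find the $(1 - \alpha)100\%$ confidence interval for $\mu$, start from
$$P\left(-z_{\alpha/2} \le \frac{\bar X - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha,$$
from which we get
$$P\left(\bar X - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar X + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha.$$
The confidence interval is therefore
$$\bar X - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar X + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$$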
Example 8.5.2 Length of pregnancy for a random sample of 9
healthy women (in days):
262, 278, 265, 258, 229, 264, 242, 265, 259
Assume $\sigma^2 = 225$. Find a 95% CI for the mean length $\mu$.
Answer:
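The known-$\sigma$ interval $\bar X \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ can be evaluated for the data in Example 8.5.2 as in the sketch below. It relies on `NormalDist.inv_cdf` from the standard library for $z_{\alpha/2}$; the variable names are illustrative.

```python
from statistics import NormalDist, mean

days = [262, 278, 265, 258, 229, 264, 242, 265, 259]
sigma = 225 ** 0.5          # known standard deviation (sigma^2 = 225)
alpha = 0.05

n = len(days)
x_bar = mean(days)
z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
half_width = z * sigma / n ** 0.5

lower, upper = x_bar - half_width, x_bar + half_width
print(f"x_bar = {x_bar}, z = {z:.3f}")
print(f"95% CI for mu: ({lower:.1f}, {upper:.1f})")
```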
Normal Sample with $\sigma$ unknown:
Notice that in the above, we have assumed that the standard deviation
$\sigma$ is known. In practice, $\sigma$ has to be estimated from the data.
- What would be a pivotal statistic to estimate $\mu$?
- What is the distribution of $T = \dfrac{\bar X - \mu}{S/\sqrt{n}}$?
- How would you construct a confidence interval for $\mu$ in this case?
$$\bar X - t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar X + t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}$$
Proof:
Deriving a Confidence Interval from a Pivotal
Example 8.5.3 (Exponential Sample) Suppose that $X$ is the waiting
time between arrivals of a certain bus. Assume $X \sim \mathrm{Exp}(\theta)$, where $\theta$
is the expected waiting time. Suppose there is one observation ($n = 1$),
$x = 7.25$ minutes. Construct 90% and 95% CIs for $\theta$.
Answer:
Figure 6: Density function of U.
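One way to carry out the pivotal method here is to use $U = X/\theta$, which has the standard exponential distribution with CDF $1 - e^{-u}$ regardless of $\theta$. The sketch below assumes an equal-tailed choice of $a$ and $b$ (probability $\alpha/2$ in each tail); other choices of $a$ and $b$ also give valid intervals. The function name `exp_ci` is hypothetical.

```python
import math

def exp_ci(x, alpha):
    """Equal-tailed (1 - alpha) CI for the mean theta from one Exp(theta) observation."""
    a = -math.log(1 - alpha / 2)   # P(U <= a) = alpha / 2
    b = -math.log(alpha / 2)       # P(U >= b) = alpha / 2
    # a <= x / theta <= b  is equivalent to  x / b <= theta <= x / a
    return x / b, x / a

x = 7.25
lo90, hi90 = exp_ci(x, 0.10)
lo95, hi95 = exp_ci(x, 0.05)
print(f"90% CI for theta: ({lo90:.3f}, {hi90:.2f})")
print(f"95% CI for theta: ({lo95:.3f}, {hi95:.2f})")
```

With a single observation the intervals are very wide, which reflects how little information one waiting time carries about the mean.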
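Commonly Used Pivotal Statistics for $\theta$.
- $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$, with $\sigma^2$ known. $\theta = \mu$.
- $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$, both $\mu$, $\sigma^2$ unknown. $\theta = \mu$.
- $X \sim N(0, \sigma^2)$, $\sigma^2$ unknown. $\theta = \sigma^2$.
- $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ with $\mu$ known. $\theta = \sigma^2$.
- $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$, both $\mu$, $\sigma^2$ unknown. $\theta = \sigma^2$.
- $X \sim \mathrm{Exp}(\beta)$. $\theta = \beta$.
- $X_1, X_2, \ldots, X_n \sim \mathrm{Exp}(\beta)$. $\theta = \beta$.
- $Y \sim \mathrm{Gamma}(\alpha, \beta)$.
- $X \sim \mathrm{Unif}(0, \theta)$.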
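One-sided Confidence Bounds (CB)
From the statement
$$P(\hat\theta_L \le \theta) = 1 - \alpha,$$
we obtain the $(1 - \alpha)100\%$ lower confidence bound for the
unknown parameter $\theta$: $\hat\theta_L$.
From the statement
$$P(\theta \le \hat\theta_U) = 1 - \alpha,$$
we obtain the $(1 - \alpha)100\%$ upper confidence bound for the
unknown parameter $\theta$: $\hat\theta_U$.
If the $(1 - \alpha)100\%$ CI has the symmetric form
$$\hat\theta - w_{\alpha/2} \le \theta \le \hat\theta + w_{\alpha/2},$$
where $w_\alpha$ is an $\alpha$-th percentage point, then
- the $(1 - \alpha)100\%$ lower CB has the form $\hat\theta - w_\alpha$,
- the $(1 - \alpha)100\%$ upper CB has the form $\hat\theta + w_\alpha$.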
Example 8.5.4 Length of pregnancy for a random sample of 9
healthy women (in days):
262, 278, 265, 258, 229, 264, 242, 265, 259
Assume the length of pregnancy is normally distributed, and $\sigma^2 = 225$.
Find a 95% upper CB for the mean $\mu$.
Answer:
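For a one-sided bound the full $\alpha$ goes into one tail, so $z_\alpha$ replaces $z_{\alpha/2}$. A minimal sketch for Example 8.5.4, with illustrative variable names:

```python
from statistics import NormalDist, mean

days = [262, 278, 265, 258, 229, 264, 242, 265, 259]
sigma = 225 ** 0.5     # known standard deviation
alpha = 0.05

n = len(days)
z_alpha = NormalDist().inv_cdf(1 - alpha)              # z_alpha, not z_{alpha/2}
upper_bound = mean(days) + z_alpha * sigma / n ** 0.5  # 95% upper confidence bound

print(f"z_alpha = {z_alpha:.3f}")
print(f"95% upper confidence bound for mu: {upper_bound:.2f} days")
```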
Interpretation of CI
A 95% CI for $\theta$, $[\hat\theta_L, \hat\theta_U]$, satisfies
$$P(\hat\theta_L \le \theta \le \hat\theta_U) = 95\%.$$
The interpretation of a 95% CI is based on the concept of repeated
sampling. If samples of the same size are drawn repeatedly, and a
confidence interval is computed from each sample in the same manner,
then approximately 95% of these intervals will contain the true value
of the parameter of interest.
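The repeated-sampling interpretation can be illustrated by simulation. The sketch below assumes a normal population with known $\sigma$; the values $\mu = 258$, $\sigma = 15$, $n = 9$ and the number of replications are illustrative choices, not results from the notes.

```python
import random
from statistics import NormalDist

random.seed(3)
mu, sigma, n, reps = 258.0, 15.0, 9, 5000   # illustrative "true" values

z = NormalDist().inv_cdf(0.975)
half_width = z * sigma / n ** 0.5

covered = 0
for _ in range(reps):
    x_bar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    # Does this sample's interval contain the true mean?
    if x_bar - half_width <= mu <= x_bar + half_width:
        covered += 1

coverage = covered / reps
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

Each individual interval either contains $\mu$ or it does not; the 95% describes the long-run fraction of intervals that do.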
8.6 Large Sample CI
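Let $\hat\theta$ be an estimator for $\theta$ with standard error $\sigma_{\hat\theta}$. For large $n$, suppose
$$Z = \frac{\hat\theta - \theta}{\sigma_{\hat\theta}}$$
is approximately standard normally distributed. In this case, $Z$ is a
pivotal quantity (at least approximately), and
$$P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha.$$
So an approximate large-sample $(1 - \alpha)100\%$ CI for $\theta$ is
$$\hat\theta - z_{\alpha/2}\,\sigma_{\hat\theta} \le \theta \le \hat\theta + z_{\alpha/2}\,\sigma_{\hat\theta}.$$
The $100(1 - \alpha)\%$ lower bound for $\theta$ is $\hat\theta - z_\alpha\,\sigma_{\hat\theta}$.
The $100(1 - \alpha)\%$ upper bound for $\theta$ is $\hat\theta + z_\alpha\,\sigma_{\hat\theta}$.
Because in this case we are assuming that the sample size is large, we
may replace the unknown values of $\sigma_{\hat\theta}$ with appropriate estimators, which
we may denote by $\hat\sigma_{\hat\theta}$. Theoretical justification for the substitution is
given in Section 9.3.
Figure 7: Location of $z_{\alpha/2}$ and $-z_{\alpha/2}$.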
Normal Example
Example 8.6.1 The shopping times of $n = 64$ randomly selected
customers at a supermarket were recorded. The average and variance
of the sample were 33 minutes and 256 minutes$^2$. Estimate $\theta = \mu$, the
average shopping time per customer, and find a 90% CI.
Answer:
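A sketch of the large-sample interval for Example 8.6.1, assuming $\hat\theta = \bar x$ with estimated standard error $s/\sqrt{n}$. Variable names are illustrative.

```python
from statistics import NormalDist

n = 64
x_bar = 33.0       # sample mean, minutes
s_sq = 256.0       # sample variance, minutes^2
alpha = 0.10

se_hat = (s_sq / n) ** 0.5                 # estimated standard error of x_bar
z = NormalDist().inv_cdf(1 - alpha / 2)    # z_{0.05}
lower, upper = x_bar - z * se_hat, x_bar + z * se_hat

print(f"estimated standard error = {se_hat:.2f}")
print(f"90% CI for mu: ({lower:.2f}, {upper:.2f}) minutes")
```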
Binomial Example
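In Binomial samples, for large $n$, the pivotal quantity
$$Z = \frac{\hat p - p}{\sigma_{\hat p}}$$
is approximately $N(0, 1)$. The $100(1 - \alpha)\%$ CI is $\hat p \pm z_{\alpha/2}\,\sigma_{\hat p}$. When $n$
is large, as $\hat p \approx p$, we approximate
$\sigma_{\hat p} = \sqrt{p(1 - p)/n}$ with $\hat\sigma_{\hat p} = \sqrt{\hat p(1 - \hat p)/n}$.
This does not change the approximate normality of the pivotal quantity $Z$, so
$$\hat p_L = \hat p - z_{\alpha/2}\sqrt{\frac{\hat p(1 - \hat p)}{n}}, \qquad \hat p_U = \hat p + z_{\alpha/2}\sqrt{\frac{\hat p(1 - \hat p)}{n}}.$$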
Example. Binomial trials with true $p = 0.5$, $n = 35$, and number of
successes $y = 18$. Then $\hat p = y/35 = 0.514$. Find a 95% CI for $p$.
Answer:
Figure 8: Twenty-four realized 95% CIs for a population proportion.
The true value $p = 0.5$ is contained in 23 out of 24 (95.8%) of the intervals
observed.
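The large-sample interval for a proportion can be sketched as a small function and applied to the example with $n = 35$ and $y = 18$. The function name `proportion_ci` is a hypothetical helper, not something defined in the notes.

```python
import math
from statistics import NormalDist

def proportion_ci(y, n, conf):
    """Large-sample CI for a binomial proportion: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = y / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = proportion_ci(18, 35, 0.95)
print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")
```

In this instance the interval contains the true value $p = 0.5$, as most (but not all) such intervals would in repeated sampling.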
Two Binomial Samples
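Suppose we want to compare two proportions $p_1$ and $p_2$ from two
populations. Take two Binomial samples $Y_1$ and $Y_2$,
$$Y_i \sim \mathrm{Binomial}(n_i, p_i), \quad i = 1, 2,$$
$$\hat p_i = \frac{Y_i}{n_i}, \quad \hat q_i = 1 - \hat p_i, \quad i = 1, 2.$$
The parameter is $\theta = p_1 - p_2$. The estimator $\hat\theta = \hat p_1 - \hat p_2$ has standard
error
$\sigma_{\hat\theta} = $ ____ .
Then a large-sample $(1 - \alpha)100\%$ CI for $\theta = p_1 - p_2$ is
____ .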
Example 8.6.2 Of 50 people with a certain gene, 12 have a certain
trait. Of 60 people without the gene, 12 have the trait. Find a large-sample
98% CI for the difference between the true proportions, $p_1 - p_2$.
Answer:
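A sketch of the computation for Example 8.6.2, assuming the standard error $\sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}$ for the difference of two independent sample proportions, with $\hat p_i$ substituted for $p_i$. Variable names are illustrative.

```python
import math
from statistics import NormalDist

y1, n1 = 12, 50    # with the gene
y2, n2 = 12, 60    # without the gene
conf = 0.98

p1_hat, p2_hat = y1 / n1, y2 / n2
diff = p1_hat - p2_hat
se_hat = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{0.01}

lower, upper = diff - z * se_hat, diff + z * se_hat
print(f"estimated difference = {diff:.2f}, estimated SE = {se_hat:.4f}")
print(f"98% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

The interval contains zero, so these data are compatible with no difference between the two proportions.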
8.7 Choice of Sample Size
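Use $\hat\theta$ to estimate $\theta$. The sample size, $n$, may be chosen to achieve:
(1) the desired bound $B$ on the error of estimation with probability $1 - \alpha$;
(2) a specified maximum CI width $w$.
Assume $X_1, \ldots, X_n$ is a normal sample with $\sigma^2$ known, $\theta = \mu$, and $\hat\theta = \bar X$.
(1) To achieve the desired bound $B$ with probability $1 - \alpha$, set
$$z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = B \quad \Longrightarrow \quad n = \left(\frac{z_{\alpha/2}\,\sigma}{B}\right)^2.$$
Round up to the next integer.
(2) The $(1 - \alpha)100\%$ CI for $\mu$ is $\bar X \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ and the width is
$2 z_{\alpha/2}\,\sigma/\sqrt{n}$. Find the minimum sample size $n$ to control the width $w$:
$$n \ge \left(\frac{2 z_{\alpha/2}\,\sigma}{w}\right)^2.$$
Round up to the next integer.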
Example 8.7.1 Length of pregnancy for a random sample of $n$
healthy women (in days). Assume $\sigma^2 = 225$. Find the sample size
required to ensure that the width of the 95% CI for the mean $\mu$ is no more
than 10 days.
Answer:
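The width formula $n \ge (2 z_{\alpha/2}\,\sigma / w)^2$ translates directly into code. The sketch below uses a hypothetical helper `n_for_width` and applies it to Example 8.7.1.

```python
import math
from statistics import NormalDist

def n_for_width(sigma, width, conf):
    """Smallest n such that the known-sigma CI for the mean has width <= width."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((2 * z * sigma / width) ** 2)

n_required = n_for_width(sigma=15.0, width=10.0, conf=0.95)
print(f"required sample size: {n_required}")
```

Rounding up (rather than to the nearest integer) guarantees that the width requirement is met.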
Choice of Sample Size
What if the variance, $\sigma^2$, is unknown?
- Use an estimate, $s^2$, from a previous study
- Run a small pilot study to estimate the variance
- Guess
- Rule of thumb: guess at reasonable max and min values for the
random variable, and use
$\sigma \approx (\max - \min)/4 = \text{Range}/4$.
Example. In a Binomial experiment, how many trials must be done in
order to make the error of estimation less than 0.04 with probability
equal to 0.90? Assume that the true $p$ lies somewhere in the
neighborhood of 0.6.
Answer:
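For a proportion, setting $z_{\alpha/2}\sqrt{p(1-p)/n} = B$ and solving for $n$ gives $n = z_{\alpha/2}^2\, p(1-p)/B^2$. The sketch below assumes the guess $p \approx 0.6$ from the example is plugged in for the unknown $p$; the helper name `n_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist

def n_for_proportion(p_guess, bound, conf):
    """Smallest n with z * sqrt(p(1-p)/n) <= bound, using a guessed value of p."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / bound ** 2)

n_required = n_for_proportion(p_guess=0.6, bound=0.04, conf=0.90)
print(f"required number of trials: {n_required}")
```

If no reasonable guess for $p$ is available, using $p = 0.5$ gives the most conservative (largest) sample size, since $p(1-p)$ is maximized there.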
Example. We want to compare the effectiveness of two methods of
training employees to perform an operation. The selected employees
are divided into two groups of equal size, the first receiving method 1
and the second receiving method 2. The measurements for both groups
have a range of approximately 8 minutes. If the estimate of the
difference in mean times is to be correct to within 1 minute with
probability 0.95, how many workers must be included in each group?
Answer:
8.8 Small-Sample CI
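Large-sample CIs cannot be used if the sample sizes are not large
enough (say $n < 30$). We need finite-sample CIs in these cases.
One-sample Problems
Suppose $X_1, X_2, \ldots, X_n$ are i.i.d. $N(\mu, \sigma^2)$, with $(\mu, \sigma^2)$ both unknown.
Notice that
$$T = \frac{\sqrt{n}(\bar X - \mu)}{S} \sim t_{n-1}$$
is a pivotal quantity for $\mu$; and
$$\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2_{n-1}$$
is a pivotal quantity for $\sigma^2$.
The $(1 - \alpha)100\%$ confidence interval for $\mu$ is:
$$\bar X - t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar X + t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}$$
The $(1 - \alpha)100\%$ confidence interval for $\sigma^2$ is:
$$\frac{(n - 1)S^2}{\chi^2_{n-1,\alpha/2}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{n-1,1-\alpha/2}}$$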
Proof:
Example. Take a random sample of size $n = 8$ from a Normal
population with mean $\mu$. The observations are
3005 2925 2935 2995 3005 2937 2905.
Find a 95% CI for $\mu$.
Answer:
Two-sample Problems with Equal Variance
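Suppose we have two independent samples:
$$X_{11}, X_{12}, \ldots, X_{1n_1} \ \text{i.i.d.}\ N(\mu_1, \sigma_1^2)$$
$$X_{21}, X_{22}, \ldots, X_{2n_2} \ \text{i.i.d.}\ N(\mu_2, \sigma_2^2)$$
We wish to estimate $\mu_1 - \mu_2$. Assume that $\sigma_1^2 = \sigma_2^2 = \sigma^2$ (to simplify
the calculation).
Notice that
$$Z = \frac{(\bar X_1 - \bar X_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim N(0, 1).$$
Because $\sigma$ is unknown, we need to find an estimator of the common
variance $\sigma^2$.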
Recall Homework 2 Problem 10: an unbiased estimator is the pooled
estimator
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2},$$
where $S_1^2$ and $S_2^2$ are the sample variances from the first and the
second samples, respectively. Furthermore,
$$W = \frac{(n_1 + n_2 - 2)S_p^2}{\sigma^2} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{\sigma^2} \sim \chi^2_{n_1 + n_2 - 2}.$$
Prove the above results.
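Therefore,
$$T = \frac{Z}{\sqrt{W/(n_1 + n_2 - 2)}} \sim t_{n_1 + n_2 - 2}$$
is a pivotal quantity for $\mu_1 - \mu_2$.
So the $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is:
$$(\bar X_1 - \bar X_2) \pm t_{v,\alpha/2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$
where $v = n_1 + n_2 - 2$ and $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$.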
Example. We compare two methods of training employees: the
standard method and a new method, with $n_1 = n_2 = 9$.
Procedure   Measurements
Standard    32 37 35 28 41 44 35 31 34
New         35 31 29 25 34 40 27 32 31
Assume the samples are independent, the measurements are
approximately normally distributed, and the variances are equal for the
two methods. Estimate the mean difference $\mu_1 - \mu_2$ with a 95% CI.
Answer:
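The pooled-variance interval can be evaluated for these data as in the sketch below. Python's standard library has no $t$ quantile function, so the critical value $t_{16, 0.025} = 2.120$ is hard-coded from a standard $t$ table; that constant and the variable names are assumptions of this sketch.

```python
import math
from statistics import mean, variance

standard = [32, 37, 35, 28, 41, 44, 35, 31, 34]
new = [35, 31, 29, 25, 34, 40, 27, 32, 31]
n1, n2 = len(standard), len(new)

# Pooled estimate of the common variance
sp2 = ((n1 - 1) * variance(standard) + (n2 - 1) * variance(new)) / (n1 + n2 - 2)
sp = math.sqrt(sp2)

t_crit = 2.120   # t_{16, 0.025}, taken from a t table (hard-coded assumption)

diff = mean(standard) - mean(new)
margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
ci = (diff - margin, diff + margin)

print(f"difference in means = {diff:.3f}, pooled SD = {sp:.3f}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The interval includes zero, so with samples this small the data do not clearly separate the two training methods.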
Remark: If $\sigma_1^2 \ne \sigma_2^2$, finding a CI for $\mu_1 - \mu_2$ could be very tricky!
A suggested method for the case $\sigma_1^2 \ne \sigma_2^2$:
An approximate CI is given by:
$$(\bar X_1 - \bar X_2) \pm t_{v,\alpha/2} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}},$$
where the degrees of freedom
$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
rounded to the nearest integer.
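The degrees-of-freedom formula is easy to get wrong by hand, so a small function can help. The sketch below implements the expression for $v$ given above; the name `approx_df` is hypothetical, and the sample variances used in the demonstration are computed from the training-method data of the preceding example purely for illustration.

```python
from statistics import variance

def approx_df(s1_sq, s2_sq, n1, n2):
    """Approximate degrees of freedom v for the unequal-variance CI (before rounding)."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

standard = [32, 37, 35, 28, 41, 44, 35, 31, 34]
new = [35, 31, 29, 25, 34, 40, 27, 32, 31]

v = approx_df(variance(standard), variance(new), len(standard), len(new))
print(f"v = {v:.2f}, rounded to {round(v)}")
```

When the two sample variances and sample sizes are equal, the formula reduces to $n_1 + n_2 - 2$, the degrees of freedom of the pooled procedure; otherwise $v$ is smaller.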
