
Introduction to Probability

and Statistics
Twelfth Edition

Large-Sample Estimation

Mechanical Engineering Dept.


Faculty of Engineering
Mutah University
Copyright ©2006 Brooks/Cole
A division of Thomson Learning, Inc.
Introduction
• Populations are described by their probability
distributions and parameters.
– For quantitative populations, the location
and shape are described by μ and σ.
– For a binomial population, the location and
shape are determined by p.
• If the values of parameters are unknown, we
make inferences about them using sample
information.
Types of Inference
• Estimation:
– Estimating or predicting the value of the
parameter
– “What is (are) the most likely value(s) of μ
or p?”
• Hypothesis Testing:
– Deciding about the value of a parameter
based on some preconceived idea.
– “Did the sample come from a population
with μ = 5 or p = .2?”
Types of Inference
• Examples:
– A consumer wants to estimate the average
price of similar homes in her city before
putting her home on the market.
Estimation: Estimate μ, the average home price.

– A manufacturer wants to know if a new type
of steel is more resistant to high
temperatures than an old type was.
Hypothesis test: Is the new average resistance, $\mu_N$,
equal to the old average resistance, $\mu_O$?
Types of Inference
• Whether you are estimating parameters
or testing hypotheses, statistical methods
are important because they provide:
– Methods for making the inference
– A numerical measure of the
goodness or reliability of the
inference
Definitions
• An estimator is a rule, usually a
formula, that tells you how to calculate
the estimate based on the sample.
– Point estimation: A single number is
calculated to estimate the parameter.
– Interval estimation: Two numbers are
calculated to create an interval within
which the parameter is expected to lie.
Properties of
Point Estimators
• Since an estimator is calculated from
sample values, it varies from sample to
sample according to its sampling
distribution.
• An estimator is unbiased if the mean of
its sampling distribution equals the
parameter of interest.
– It does not systematically overestimate
or underestimate the target parameter.
Properties of
Point Estimators
• Of all the unbiased estimators, we prefer
the estimator whose sampling distribution
has the smallest spread or variability.

Measuring the Goodness
of an Estimator
• The distance between an estimate and the
true value of the parameter is the error of
estimation. Think of it as the distance between
the bullet and the bull’s-eye.
• In this chapter, the sample sizes are large,
so that our unbiased estimators will have
approximately normal sampling distributions,
because of the Central Limit Theorem.
The Margin of Error
• For unbiased estimators with normal
sampling distributions, 95% of all point
estimates will lie within 1.96 standard
deviations of the parameter of interest.
• Margin of error: The maximum error of
estimation, calculated as
1.96 × (standard error of the estimator)

Estimating Means
and Proportions
• For a quantitative population,
Point estimator of population mean μ: $\bar{x}$
Margin of error ($n \ge 30$): $\pm 1.96 \dfrac{s}{\sqrt{n}}$
• For a binomial population,
Point estimator of population proportion p: $\hat{p} = x/n$
Margin of error ($n \ge 30$): $\pm 1.96 \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
Example
• A homeowner randomly samples 64 homes
similar to her own and finds that the average
selling price is $252,000 with a standard
deviation of $15,000. Estimate the average
selling price for all similar homes in the city.
Point estimator of μ: $\bar{x} = 252{,}000$
Margin of error: $\pm 1.96 \dfrac{s}{\sqrt{n}} = \pm 1.96 \dfrac{15{,}000}{\sqrt{64}} = \pm 3675$
Example
A quality control technician wants to estimate
the proportion of soda cans that are underfilled.
He randomly samples 200 cans of soda and
finds 10 underfilled cans.
$n = 200$, $p$ = proportion of underfilled cans
Point estimator of p: $\hat{p} = x/n = 10/200 = .05$
Margin of error: $\pm 1.96 \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = \pm 1.96 \sqrt{\dfrac{(.05)(.95)}{200}} = \pm .03$
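The two worked examples above can be checked with a short script. This is a minimal sketch, not part of the original slides; the function names `margin_of_error_mean` and `margin_of_error_proportion` are illustrative choices.

```python
import math

def margin_of_error_mean(s, n, z=1.96):
    """Large-sample margin of error for a sample mean: z * s / sqrt(n)."""
    return z * s / math.sqrt(n)

def margin_of_error_proportion(x, n, z=1.96):
    """Return (p_hat, margin of error), where the margin is z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    q_hat = 1 - p_hat
    return p_hat, z * math.sqrt(p_hat * q_hat / n)

# Home-price example: n = 64, s = $15,000
home_moe = margin_of_error_mean(s=15_000, n=64)

# Soda-can example: 10 underfilled cans out of 200
p_hat, soda_moe = margin_of_error_proportion(x=10, n=200)

print(round(home_moe), p_hat, round(soda_moe, 2))
```

Both results agree with the slide values of 3675 and .03 after rounding.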

Interval Estimation
• Create an interval (a, b) so that you are fairly
sure that the parameter lies between these two
values.
• “Fairly sure” means “with high probability”,
measured using the confidence coefficient, 1-α.
Usually, 1-α = .90, .95, .98, or .99.
• Suppose 1-α = .95 and that the estimator has a
normal distribution.
Parameter ± 1.96 SE
Interval Estimation
• Since we don’t know the value of the parameter,
consider Estimator ± 1.96 SE, which has a variable
center.

[Figure: intervals from repeated samples, labeled “Worked”, “Worked”, “Worked”, and “Failed”]

• Only if the estimator falls in the tail areas will
the interval fail to enclose the parameter. This
happens only 5% of the time.
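The "5% of the time" claim can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not something taken from the slides: it draws repeated samples from a normal population with hypothetical values μ = 10 and σ = 2, treats σ as known, and counts how often Estimator ± 1.96 SE encloses μ.

```python
import math
import random

def coverage(trials=2000, n=50, mu=10.0, sigma=2.0, z=1.96, seed=0):
    """Fraction of simulated intervals xbar +/- z*SE that enclose mu."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    se = sigma / math.sqrt(n)      # sigma treated as known (assumption)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - z * se <= mu <= xbar + z * se:
            hits += 1              # this interval "worked"
    return hits / trials

rate = coverage()
print(f"Observed coverage: {rate:.3f}")
```

The observed rate should land close to .95, with the remaining intervals being the ones that "failed".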
To Change the
Confidence Level
• To change to a general confidence level, 1-α,
pick a value of z that puts area 1-α in the center
of the z distribution.

Tail area    $z_{\alpha/2}$
.05          1.645
.025         1.96
.01          2.33
.005         2.58

100(1-α)% Confidence Interval: Estimator ± $z_{\alpha/2}$ SE
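The table values can be reproduced without a printed z table. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}, the value leaving area alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for conf in (0.90, 0.95, 0.98, 0.99):
    print(f"{conf:.2f}  tail area {(1 - conf) / 2:.3f}  z = {z_critical(conf):.3f}")
```

Rounded to two or three decimals, these match the tabled values 1.645, 1.96, 2.33, and 2.58.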

Working Backwards
Find the value of z that has area .25 to its left.
1. Look for the four-digit area closest to .2500 in Table 3.
2. What row and column does this value correspond to?
3. z = -.67
4. What percentile does this value represent? 25th percentile, or 1st quartile (Q1)
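The same lookup can be done numerically. A minimal sketch, assuming the standard normal distribution and using the standard-library inverse CDF in place of Table 3:

```python
from statistics import NormalDist

# Value of z with area .25 to its left (the 25th percentile, Q1)
z_q1 = NormalDist().inv_cdf(0.25)
print(round(z_q1, 2))

# Check by going forward again: area to the left of z_q1
area = NormalDist().cdf(z_q1)
```

The inverse CDF returns a value that rounds to the table answer of -.67.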
Example:
Industrial engineers who specialize in ergonomics are concerned
with designing workspace and worker-operated devices so as to
achieve high productivity and comfort. One study examined the
preferred height for an experimental keyboard with large
forearm–wrist support. A sample of trained typists was selected, and the
preferred keyboard height was determined for each typist.
The resulting sample average preferred height was 80.0 cm. Assuming
that the preferred height is normally distributed with σ = 2.0 cm,
obtain a 95% CI for the true average preferred height for the
population of all experienced typists.

Random interval centered at


Example

Confidence Intervals
for Means and Proportions
• For a quantitative population,
Confidence interval for a population mean μ:
$\bar{x} \pm z_{\alpha/2} \dfrac{s}{\sqrt{n}}$

• For a binomial population,
Confidence interval for a population proportion p:
$\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
Example
• A random sample of n = 50 males showed a
mean daily intake of dairy products
equal to 756 grams with a standard deviation of
35 grams. Find a 95% confidence interval for the
population average μ.
$\bar{x} \pm 1.96 \dfrac{s}{\sqrt{n}} \Rightarrow 756 \pm 1.96 \dfrac{35}{\sqrt{50}} \Rightarrow 756 \pm 9.70$
or $746.30 < \mu < 765.70$ grams.
Example
• Find a 99% confidence interval for μ, the
population average daily intake of dairy
products for men.
$\bar{x} \pm 2.58 \dfrac{s}{\sqrt{n}} \Rightarrow 756 \pm 2.58 \dfrac{35}{\sqrt{50}} \Rightarrow 756 \pm 12.77$
or $743.23 < \mu < 768.77$ grams.
The interval must be wider to provide for the
increased confidence that it does indeed
enclose the true value of μ.
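The 95% and 99% intervals for the dairy-intake data can be computed side by side to see the widening directly. This is a minimal sketch; the helper name `mean_ci` is illustrative, and the z values are the rounded table values used on the slides.

```python
import math

def mean_ci(xbar, s, n, z):
    """Large-sample confidence interval for a mean: xbar +/- z * s / sqrt(n)."""
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

lo95, hi95 = mean_ci(xbar=756, s=35, n=50, z=1.96)
lo99, hi99 = mean_ci(xbar=756, s=35, n=50, z=2.58)

print(f"95%: {lo95:.2f} < mu < {hi95:.2f}")
print(f"99%: {lo99:.2f} < mu < {hi99:.2f}")
```

The 99% interval contains the 95% interval, which is the price paid for the higher confidence level.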
Example
• Of a random sample of n = 150 college students,
104 of the students said that they had played on a
soccer team during their K-12 years. Estimate the
proportion of college students who played soccer
in their youth with a 98% confidence interval.

$\hat{p} \pm 2.33 \sqrt{\dfrac{\hat{p}\hat{q}}{n}} \Rightarrow \dfrac{104}{150} \pm 2.33 \sqrt{\dfrac{.69(.31)}{150}}$
$\Rightarrow .69 \pm .09$ or $.60 < p < .78$.
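The soccer example (104 of 150 students, 98% confidence) can be sketched in code as follows. The helper name `proportion_ci` is illustrative. The slide rounds $\hat{p}$ to .69 and the margin to .09 before forming the interval, so carrying full precision gives a lower limit that rounds to .61 instead of .60.

```python
import math

def proportion_ci(x, n, z):
    """Large-sample CI for a binomial proportion: p_hat +/- z * sqrt(p_hat*q_hat/n)."""
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, half_width, (p_hat - half_width, p_hat + half_width)

p_hat, half_width, (lower, upper) = proportion_ci(x=104, n=150, z=2.33)
print(f"p_hat = {p_hat:.2f}, margin = {half_width:.2f}")
print(f"{lower:.3f} < p < {upper:.3f}")
```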
Estimating the Difference
between Two Means
• Sometimes we are interested in comparing the
means of two populations.
• The average growth of plants fed using two
different nutrients.
• The average scores for students taught with two
different teaching methods.
• To make this comparison,
A random sample of size $n_1$ drawn from
population 1 with mean $\mu_1$ and variance $\sigma_1^2$.
A random sample of size $n_2$ drawn from
population 2 with mean $\mu_2$ and variance $\sigma_2^2$.
Estimating the Difference
between Two Means
• We compare the two averages by making
inferences about $\mu_1 - \mu_2$, the difference in the
two population averages.
• If the two population averages are the same,
then $\mu_1 - \mu_2 = 0$.
• The best estimate of $\mu_1 - \mu_2$ is the difference
in the two sample means,
$\bar{x}_1 - \bar{x}_2$
The Sampling
Distribution of $\bar{x}_1 - \bar{x}_2$
1. The mean of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$, the difference in
the population means.
2. The standard deviation of $\bar{x}_1 - \bar{x}_2$ is $SE = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
3. If the sample sizes are large, the sampling distribution
of $\bar{x}_1 - \bar{x}_2$ is approximately normal, and SE can be estimated
as $SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
Estimating $\mu_1 - \mu_2$
• For large samples, point estimates and their
margin of error as well as confidence intervals
are based on the standard normal (z)
distribution.
Point estimate for $\mu_1 - \mu_2$: $\bar{x}_1 - \bar{x}_2$
Margin of error: $\pm 1.96 \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Confidence interval for $\mu_1 - \mu_2$:
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Example
Avg Daily Intakes    Men    Women
Sample size          50     50
Sample mean          756    762
Sample Std Dev       35     30

• Compare the average daily intake of dairy products of
men and women using a 95% confidence interval.
$(\bar{x}_1 - \bar{x}_2) \pm 1.96 \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$\Rightarrow (756 - 762) \pm 1.96 \sqrt{\dfrac{35^2}{50} + \dfrac{30^2}{50}} \Rightarrow -6 \pm 12.78$
or $-18.78 < \mu_1 - \mu_2 < 6.78$.
Example, continued
$-18.78 < \mu_1 - \mu_2 < 6.78$

• Could you conclude, based on this confidence
interval, that there is a difference in the average daily
intake of dairy products for men and women?
• The confidence interval contains the value $\mu_1 - \mu_2 = 0$.
Therefore, it is possible that $\mu_1 = \mu_2$. You would not
want to conclude that there is a difference in average
daily intake of dairy products for men and women.
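The two-sample calculation and the "does the interval contain 0?" check can be sketched as follows. The helper name `diff_means_ci` is an illustrative choice, and the data are the men's and women's summaries from the table above.

```python
import math

def diff_means_ci(xbar1, s1, n1, xbar2, s2, n2, z=1.96):
    """Large-sample CI for mu1 - mu2 using SE = sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se

lower, upper = diff_means_ci(756, 35, 50, 762, 30, 50)
contains_zero = lower <= 0 <= upper

print(f"{lower:.2f} < mu1 - mu2 < {upper:.2f}")
print("0 in interval:", contains_zero)
```

Because 0 lies inside the interval, the data do not support declaring a difference between the two population means.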

Estimating the Difference
between Two Proportions
• Sometimes we are interested in comparing the
proportion of “successes” in two binomial
populations.
• The germination rates of untreated seeds and seeds
treated with a fungicide.
• The proportion of male and female voters who
favor a particular candidate for governor.
• To make this comparison,
A random sample of size $n_1$ drawn from
binomial population 1 with parameter $p_1$.
A random sample of size $n_2$ drawn from
binomial population 2 with parameter $p_2$.
Estimating the Difference
between Two Proportions
• We compare the two proportions by making
inferences about $p_1 - p_2$, the difference in the
two population proportions.
• If the two population proportions are the
same, then $p_1 - p_2 = 0$.
• The best estimate of $p_1 - p_2$ is the difference
in the two sample proportions,
$\hat{p}_1 - \hat{p}_2 = \dfrac{x_1}{n_1} - \dfrac{x_2}{n_2}$
The Sampling
Distribution of $\hat{p}_1 - \hat{p}_2$

1. The mean of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$, the difference in
the population proportions.
2. The standard deviation of $\hat{p}_1 - \hat{p}_2$ is $SE = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$.
3. If the sample sizes are large, the sampling distribution
of $\hat{p}_1 - \hat{p}_2$ is approximately normal, and SE can be estimated
as $SE = \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$.
Estimating p1-p2
• For large samples, point estimates and their
margin of error as well as confidence intervals
are based on the standard normal (z)
distribution.
Point estimate for $p_1 - p_2$: $\hat{p}_1 - \hat{p}_2$
Margin of error: $\pm 1.96 \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$
Confidence interval for $p_1 - p_2$:
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$
Example
Youth Soccer     Male    Female
Sample size      80      70
Played soccer    65      39

• Compare the proportion of male and female college
students who said that they had played on a soccer team
during their K-12 years using a 99% confidence interval.
$(\hat{p}_1 - \hat{p}_2) \pm 2.58 \sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}$
$\Rightarrow \left(\dfrac{65}{80} - \dfrac{39}{70}\right) \pm 2.58 \sqrt{\dfrac{.81(.19)}{80} + \dfrac{.56(.44)}{70}} \Rightarrow .25 \pm .19$
or $.06 < p_1 - p_2 < .44$.
Example, continued
$.06 < p_1 - p_2 < .44$
• Could you conclude, based on this confidence
interval, that there is a difference in the proportion of
male and female college students who said that they
had played on a soccer team during their K-12 years?
• The confidence interval does not contain the value
$p_1 - p_2 = 0$. Therefore, it is not likely that $p_1 = p_2$. You
would conclude that there is a difference in the
proportions for males and females.
A higher proportion of males than
females played soccer in their youth.
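The comparison of two proportions follows the same pattern in code. This is a minimal sketch with an illustrative helper name, `diff_proportions_ci`. The slide rounds the sample proportions to .81 and .56 before subtracting, so full-precision arithmetic gives limits that differ from .06 and .44 in the second decimal place; the conclusion is unchanged.

```python
import math

def diff_proportions_ci(x1, n1, x2, n2, z):
    """Large-sample CI for p1 - p2 using estimated SE from the sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, z * se, (diff - z * se, diff + z * se)

diff, half_width, (lower, upper) = diff_proportions_ci(65, 80, 39, 70, z=2.58)
excludes_zero = lower > 0 or upper < 0

print(f"difference = {diff:.3f}, margin = {half_width:.3f}")
print(f"{lower:.3f} < p1 - p2 < {upper:.3f}; excludes 0: {excludes_zero}")
```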
One-Sided
Confidence Bounds
• Confidence intervals are by their nature two-sided
since they produce upper and lower
bounds for the parameter.
• One-sided bounds can be constructed simply
by using a value of z that puts α rather than
α/2 in the tail of the z distribution.
LCB: Estimator − $z_\alpha$ × (Std Error of Estimator)
UCB: Estimator + $z_\alpha$ × (Std Error of Estimator)
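As an illustration of the LCB and UCB formulas, the sketch below reuses the men's dairy-intake summary from the earlier example (mean 756, s = 35, n = 50) and assumes a 95% confidence level; that pairing is chosen here for illustration and is not an example from the slides. The helper name `one_sided_bounds` is likewise hypothetical.

```python
import math
from statistics import NormalDist

def one_sided_bounds(estimate, se, confidence=0.95):
    """Return (LCB, UCB), each using z_alpha (all of alpha in a single tail)."""
    z_alpha = NormalDist().inv_cdf(confidence)
    return estimate - z_alpha * se, estimate + z_alpha * se

se = 35 / math.sqrt(50)
lcb, ucb = one_sided_bounds(756, se)

print(f"95% lower confidence bound: mu > {lcb:.2f}")
print(f"95% upper confidence bound: mu < {ucb:.2f}")
```

Only one of the two bounds is reported in a given problem; reporting both together would give a 90% two-sided interval, not a 95% one.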

Choosing the Sample Size
• The total amount of relevant information in a
sample is controlled by two factors:
- The sampling plan or experimental design:
the procedure for collecting the information
- The sample size n: the amount of
information you collect.
• In a statistical estimation problem, the
accuracy of the estimation is measured by the
margin of error or the width of the
confidence interval.
Choosing the Sample Size
1. Determine the size of the margin of error, B, that
you are willing to tolerate.
2. Choose the sample size by solving for n or $n = n_1 = n_2$
in the inequality: 1.96 SE ≤ B, where SE is a
function of the sample size n.
3. For quantitative populations, estimate the population
standard deviation using a previously calculated
value of s or the range approximation σ ≈ Range / 4.
4. For binomial populations, use the conservative
approach and approximate p using the value p = .5.
Example
A producer of PVC pipe wants to survey
wholesalers who buy his product in order to
estimate the proportion who plan to increase their
purchases next year. What sample size is required if
he wants his estimate to be within .04 of the actual
proportion with probability equal to .95?
$1.96 \sqrt{\dfrac{pq}{n}} \le .04 \Rightarrow 1.96 \sqrt{\dfrac{.5(.5)}{n}} \le .04$
$\Rightarrow \sqrt{n} \ge \dfrac{1.96 \sqrt{.5(.5)}}{.04} = 24.5 \Rightarrow n \ge 24.5^2 = 600.25$
He should survey at least 601
wholesalers.
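Solving 1.96 SE ≤ B for n gives a closed-form sample size, sketched below for both the binomial and the quantitative case. The function names are illustrative. The result is rounded up because n must be a whole number at least as large as the solution.

```python
import math

def sample_size_proportion(B, p=0.5, z=1.96):
    """Smallest n with z * sqrt(p*(1-p)/n) <= B; p = .5 is the conservative choice."""
    return math.ceil(z**2 * p * (1 - p) / B**2)

def sample_size_mean(B, sigma, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= B, given a planning value for sigma."""
    return math.ceil((z * sigma / B) ** 2)

n_pvc = sample_size_proportion(B=0.04)
print(n_pvc)
```

For the PVC survey this returns 601, matching the hand calculation.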
Example

Key Concepts
I. Types of Estimators
1. Point estimator: a single number is calculated to estimate
the population parameter.
2. Interval estimator: two numbers are calculated to form an
interval that contains the parameter.
II. Properties of Good Estimators
1. Unbiased: the average value of the estimator equals the
parameter to be estimated.
2. Minimum variance: of all the unbiased estimators, the best
estimator has a sampling distribution with the smallest
standard error.
3. The margin of error measures the maximum distance
between the estimator and the true value of the parameter.

Properties of t -Distributions

Table

Table A5 Critical values of t

The One-Sample t Confidence
Interval

Proposition

Example:
Although traditional markets for sweetgum lumber have
declined, large section solid timbers traditionally used for
construction bridges and mats have become increasingly
scarce. The article “Development of Novel Industrial
Laminated Planks from Sweetgum Lumber” (J. of Bridge
Engr., 2008: 64–66) described the manufacturing and
testing of composite beams designed to add value to
low-grade sweetgum lumber. The data are values of the modulus of rupture
(psi; the article contained summary data expressed in
MPa): n = 30, sample mean = 7203.19 psi, sample standard deviation s
= 543.54 psi.
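A one-sample t interval for these summary statistics can be sketched as follows. Two assumptions are made here that the slide text does not state: the confidence level is taken to be 95%, and the critical value $t_{.025,\,29} = 2.045$ is the standard t-table entry for 29 degrees of freedom, hard-coded because the Python standard library has no t distribution.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """One-sample t confidence interval: xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Assumed: 95% confidence, so t_crit = t_{.025, n-1} with n - 1 = 29 df
T_CRIT_95_DF29 = 2.045  # standard t-table value (assumption, not from the slide text)

lower, upper = t_interval(xbar=7203.19, s=543.54, n=30, t_crit=T_CRIT_95_DF29)
print(f"{lower:.2f} psi < mu < {upper:.2f} psi")
```

With n = 30 the t critical value is only slightly larger than the z value 1.96, so the interval is slightly wider than the large-sample z interval would be.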

Example

Confidence Interval on the Difference
in Means : Variances unknown

Where the degrees of freedom
Example

CONFIDENCE INTERVAL ON THE
VARIANCE AND STANDARD
DEVIATION OF A NORMAL
POPULATION

Example

Confidence Interval on the Ratio
of Two Variances

Definition

Example;

Key Concepts
III. Large-Sample Point Estimators
To estimate one of four population parameters when the
sample sizes are large, use the following point estimators with
the appropriate margins of error.

Key Concepts
IV. Large-Sample Interval Estimators
To estimate one of four population parameters when the
sample sizes are large, use the following interval estimators.

Key Concepts
1. All values in the interval are possible values for the
unknown population parameter.
2. Any values outside the interval are unlikely to be the value
of the unknown parameter.
3. To compare two population means or proportions, look for
the value 0 in the confidence interval. If 0 is in the interval,
it is possible that the two population means or proportions
are equal, and you should not declare a difference. If 0 is
not in the interval, it is unlikely that the two means or
proportions are equal, and you can confidently declare a
difference.
V. One-Sided Confidence Bounds
Use either the upper (+) or lower (-) two-sided bound,
with the critical value of z changed from $z_{\alpha/2}$ to $z_\alpha$.
