Stats - 4

DADM-1
How Many German Tanks?

• WW-II: Allied forces wanted to know how
many tanks the German military was
producing. They tried traditional
espionage techniques, but these kept
producing absurdly high estimates.
Number of Tiles in the Bag?
• A bag contains numbered tiles
• You randomly pulled six tiles from the bag
10, 23, 17, 9, 35, and 3
• Can you come up with an estimator that
uses this sample of numbers to estimate the
total number of tiles in the bag?
Estimator?
• 2 × Max Sample Value: The biggest integer in our six-number
sample is 35, so twice that is 2 × 35 = 70. That is way
bigger than the real value (42).
• 2 × Mean Value: The mean of the six-number sample (10, 23,
17, 9, 35, and 3) is just over 16, so twice the mean is about
32. That's definitely better than 70, but it's off by about
23%, which is still quite a bit.
• 2 × Median: The sample median is 13.5, so this gives an
estimate of 27, which differs from the actual value by about 36%.
Clearly, these estimators just aren’t good enough.

Better Estimator?
Population Maximum ≈ Sample Maximum + (Sample Maximum / Sample Size) − 1
For our sample: 35 + (35 / 6) − 1 ≈ 39.8, which is just a hair under 40. The
actual maximum value is 42, so our estimate of 40 is
only about 5% less than the actual value!
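The estimators compared for the tile sample can be checked in a few lines. This is a minimal Python sketch; the function name `german_tank_estimate` and the variable names are illustrative choices, not part of the slides.

```python
from statistics import mean, median

def german_tank_estimate(sample):
    """Sample maximum + (sample maximum / sample size) - 1."""
    m, n = max(sample), len(sample)
    return m + m / n - 1

tiles = [10, 23, 17, 9, 35, 3]          # the six tiles drawn from the bag

twice_max = 2 * max(tiles)              # naive estimator 1
twice_mean = 2 * mean(tiles)            # naive estimator 2
twice_median = 2 * median(tiles)        # naive estimator 3
better = german_tank_estimate(tiles)    # the "better" estimator

for name, est in [("2*max", twice_max), ("2*mean", twice_mean),
                  ("2*median", twice_median), ("max + max/n - 1", better)]:
    print(f"{name:>16}: {est:6.1f}")
```

Only the last estimator lands close to the true bag size of 42 stated on the slide.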
German Tank Problem
• Calculating the population maximum using the
serial numbers from the captured tanks yielded an
estimate of only 256 per month. And after the war
when all the official German documents were
analyzed, it was found that the true value was 255
per month. That’s only 1 less than was estimated!
Statistical Inference
Statistical inference: Acquire information and draw

conclusions about populations from samples. Two
broad methods:
– Estimation
Estimate unknown population parameter using
sample statistics
– Hypothesis testing
Make a hypothesis about unknown population
parameters and test it against evidence in sample
Estimation
Estimator: Random variable based on

sample statistics that is used in estimation
– Point Estimator: Uses a single value
Ex: Infer population mean is 7
– Interval Estimator: Uses a range of values
and specifies the level of confidence
Ex: Infer µ is between 6.8 and 7.2 with 95%
confidence
Estimator Properties
Some important properties of an

estimator:
• Unbiased vs. biased
• Consistent vs. not consistent
• Relatively efficient vs. not relatively
efficient
Unbiasedness
Unbiased estimator: Estimator whose

expected value equals the population
parameter that it estimates
Does being unbiased rely on having

a large enough sample size?
Direction of Bias
Sometimes we can sign bias

• Upward bias: E[estimator] > parameter
• Downward bias: E[estimator] < parameter
For a Uniform population, is the sample range

an unbiased estimator of population range?
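One way to explore the question is by simulation. The sketch below assumes a continuous Uniform(0, 100) population (the slide does not specify one) and averages the sample range over many repeated samples; `mean_sample_range` is a hypothetical helper name.

```python
import random

def mean_sample_range(n, population_range=100.0, trials=2_000, seed=0):
    """Average (max - min) over many samples of size n from Uniform(0, population_range)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        draws = [rng.uniform(0.0, population_range) for _ in range(n)]
        total += max(draws) - min(draws)
    return total / trials

for n in (5, 50, 500):
    print(n, round(mean_sample_range(n), 1))
```

Because a sample range can never exceed the population range, the average falls below 100, which is a downward bias. For this continuous uniform case the expected sample range is (n − 1)/(n + 1) times the population range, so the gap shrinks as n grows.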
Consistency
Consistent estimator: As the sample size grows
without bound, the estimator converges
to the true population parameter
Consistent?
Consistency
Efficiency
Part I
Single Population
Hypothesis Testing
Hypothesis testing can be used to determine whether
a statement about the value of a population parameter
should or should not be rejected.
The null hypothesis, denoted by H0 , is a tentative
assumption about a population parameter.
The alternative hypothesis, denoted by Ha, is the
opposite of what is stated in the null hypothesis.
The hypothesis testing procedure uses data from a
sample to test the two competing statements
indicated by H0 and Ha.
Developing Null and Alternative Hypotheses
Null Hypothesis as an Assumption to be Challenged

Example:
The label on a soft drink bottle states that it
contains 67.6 fluid ounces.
Null Hypothesis:
The label is correct. µ ≥ 67.6 ounces.
Alternative Hypothesis:
The label is incorrect. µ < 67.6 ounces.
Hypotheses about a Population Mean
The equality part of the hypotheses always appears

in the null hypothesis.
In general, a hypothesis test about the value of a
population mean µ must take one of the following
three forms (where µ0 is the hypothesized value of
the population mean).
H0: µ ≥ µ0       H0: µ ≤ µ0       H0: µ = µ0
Ha: µ < µ0       Ha: µ > µ0       Ha: µ ≠ µ0
One-tailed       One-tailed       Two-tailed
(lower-tail)     (upper-tail)
Null and Alternative Hypotheses
A major west coast city provides one of the most
comprehensive emergency medical services in the
world. Operating in a multiple hospital system
with approximately 20 mobile medical units, the
service goal is to respond to medical emergencies
with a mean time of 12 minutes or less.
The director of medical services wants to
formulate a hypothesis test that could use a sample
of emergency response times to determine whether
or not the service goal of 12 minutes or less is being
achieved.
Null and Alternative Hypotheses
H0: µ ≤ 12   The emergency service is meeting the response goal;
             no follow-up action is necessary.
Ha: µ > 12   The emergency service is not meeting the response goal;
             appropriate follow-up action is necessary.
where: µ = mean response time for the population
of medical emergency requests
Type I Error
Because hypothesis tests are based on sample data,
we must allow for the possibility of errors.
A Type I error is rejecting H0 when it is true.
The probability of making a Type I error when the
null hypothesis is true as an equality is called the
level of significance.
Applications of hypothesis testing that only control
the Type I error are often called significance tests.
Type II Error
A Type II error is accepting H0 when it is false.

It is difficult to control for the probability of making
a Type II error.
Statisticians avoid the risk of making a Type II
error by using “do not reject H0” and not “accept H0”.
Steps of Hypothesis Testing
Step 1. Develop the null and alternative hypotheses.
Step 2. Specify the level of significance α.
Step 3. Collect the sample data and compute the test
statistic.
p-Value Approach
Step 4. Use the value of the test statistic to compute the
p-value.
Step 5. Reject H0 if p-value < α.
Critical Value Approach
Step 4. Use the level of significance α to determine the
critical value and the rejection rule.
Step 5. Use the value of the test statistic and the rejection
rule to determine whether to reject H0.
One-Tailed Tests About a Population Mean:
σ Known
Example: Metro EMS
The response times for a random sample of 40
medical emergencies were tabulated. The sample
mean is 13.25 minutes. The population standard
deviation is believed to be 3.2 minutes.
The EMS director wants to perform a hypothesis
test, with a .05 level of significance, to determine
whether the service goal of 12 minutes or less is
being achieved.
σ Known
p -Value and Critical Value Approaches
H0: µ ≤ 12
Ha: µ > 12
α = .05
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{13.25 - 12}{3.2/\sqrt{40}} = 2.47$$
σ Known
p –Value Approach
For z = 2.47, cumulative probability = .9932.

p–value = 1 − .9932 = .0068
Because p–value = .0068 < α = .05, we reject H0.
There is sufficient statistical evidence

to infer that Metro EMS is not meeting
the response goal of 12 minutes.
σ Known
Critical Value Approach
For α = .05, z.05 = 1.645
Reject H0 if z > 1.645
Because 2.47 > 1.645, we reject H0.
There is sufficient statistical evidence
to infer that Metro EMS is not meeting
the response goal of 12 minutes.
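The Metro EMS calculation can be reproduced with Python's standard library. This is a sketch under the slide's assumptions (σ known, normal sampling distribution); the function name `upper_tail_z_test` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """One-tailed (upper-tail) z test of H0: mu <= mu0 with sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)           # area to the right of z
    z_crit = NormalDist().inv_cdf(1 - alpha)    # z_alpha
    return z, p_value, z_crit

z, p_value, z_crit = upper_tail_z_test(13.25, 12, 3.2, 40, 0.05)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical value = {z_crit:.3f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

Both approaches lead to the same decision: the p-value is below α and z exceeds the critical value.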
p-Value Approach to
Two-Tailed Hypothesis Testing
Compute the p-value using the following three steps:
1. Compute the value of the test statistic z.
2. If z is in the upper tail (z > 0), find the area under
the standard normal curve to the right of z.
If z is in the lower tail (z < 0), find the area under
the standard normal curve to the left of z.
3. Double the tail area obtained in step 2 to obtain
the p –value.
The rejection rule:
Reject H0 if the p-value < α .
Critical Value Approach to
Two-Tailed Hypothesis Testing
The critical values will occur in both the lower and
upper tails of the standard normal curve.
Use the standard normal probability distribution
table to find zα/2 (the z-value with an area of α/2 in
the upper tail of the distribution).
The rejection rule is:
Reject H0 if z < -zα/2 or z > zα/2.
Two-Tailed Tests About a Population Mean:
σ Known
• Example: Glow Toothpaste
The production line for Glow toothpaste is
designed to fill tubes with a mean weight of 6 oz.
Periodically, a sample of 30 tubes will be selected in
order to check the filling process.
Quality assurance procedures call for the
continuation of the filling process if the sample
results are consistent with the assumption that the
mean filling weight for the population of toothpaste
tubes is 6 oz.; otherwise the process will be adjusted.
σ Known
Example: Glow Toothpaste
Assume that a sample of 30 toothpaste tubes
provides a sample mean of 6.1 oz. The population
standard deviation is believed to be 0.2 oz.
Perform a hypothesis test, at the .03 level of
significance, to help determine whether the filling
process should continue operating or be stopped and
corrected.
σ Known
p –Value and Critical Value Approaches
H0: µ = 6
Ha: µ ≠ 6
α = .03
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{6.1 - 6}{.2/\sqrt{30}} = 2.74$$
σ Known
p –Value Approach
For z = 2.74, cumulative probability = .9969

p–value = 2(1 − .9969) = .0062
Because p–value = .0062 < α = .03, we reject H0.

There is sufficient statistical evidence to
infer that the alternative hypothesis is true
(i.e. the mean filling weight is not 6 ounces).
σ Known
Critical Value Approach
For α/2 = .03/2 = .015, z.015 = 2.17
Reject H0 if z < -2.17 or z > 2.17
Because 2.74 > 2.17, we reject H0.
There is sufficient statistical evidence to
infer that the alternative hypothesis is true
(i.e. the mean filling weight is not 6 ounces).
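The two-tailed Glow Toothpaste test follows the same pattern, with the tail area doubled. A minimal sketch, assuming σ is known as on the slide; `two_tailed_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(x_bar, mu0, sigma, n, alpha):
    """Two-tailed z test of H0: mu = mu0 with sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    tail_area = 1 - NormalDist().cdf(abs(z))        # area beyond |z| in one tail
    p_value = 2 * tail_area                         # double it for two tails
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    return z, p_value, z_crit

z, p_value, z_crit = two_tailed_z_test(6.1, 6, 0.2, 30, 0.03)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical values = ±{z_crit:.2f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```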
Confidence Interval Approach to
Two-Tailed Tests About a
Population Mean
Select a simple random sample from the population
and use the value of the sample mean $\bar{x}$ to develop
the confidence interval for the population mean µ.
If the confidence interval contains the hypothesized
value µ0, do not reject H0. Otherwise, reject H0.
Confidence Interval Approach
The 97% confidence interval for µ is
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 6.1 \pm 2.17\left(.2/\sqrt{30}\right) = 6.1 \pm .07924$$
or 6.02076 to 6.17924
Because the hypothesized value for the
population mean, µ0 = 6, is not in this interval,
the hypothesis-testing conclusion is that the
null hypothesis, H0: µ = 6, can be rejected.
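The interval above can be rebuilt and checked against µ0 directly. A short sketch using the Glow Toothpaste numbers; `z_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a population mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

lower, upper = z_interval(6.1, 0.2, 30, 0.97)
mu0 = 6
reject = not (lower <= mu0 <= upper)            # reject H0 if mu0 lies outside
print(f"97% CI: {lower:.4f} to {upper:.4f}; reject H0: {reject}")
```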
Tests About a Population Mean:
σ Unknown
• Test Statistic
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
This test statistic has a t distribution
with n - 1 degrees of freedom.
Tests About a Population Mean:
σ Unknown
Rejection Rule: p -Value Approach
Reject H0 if p –value < α
Rejection Rule: Critical Value Approach
H0: µ ≥ µ0 Reject H0 if t < -tα
H0: µ ≤ µ0 Reject H0 if t > tα
H0: µ = µ0 Reject H0 if t < -tα/2 or t > tα/2

p -Values and the t Distribution
The format of the t distribution table provided in most
statistics textbooks does not have sufficient detail
to determine the exact p-value for a hypothesis test.
However, we can still use the t distribution table to
identify a range for the p-value.
An advantage of computer software packages is that
the computer output will provide the p-value for the
t distribution.
Example: Highway Patrol
A State Highway Patrol periodically samples
vehicle speeds at various locations on a particular
roadway. The sample of vehicle speeds is used to
test the hypothesis H0: µ ≤ 65.
The locations where H0 is rejected are deemed the
best locations for radar traps. At Location F, a
sample of 64 vehicles shows a mean speed of 66.2
mph with a standard deviation of 4.2 mph. Use α
= .05 to test the hypothesis.
One-Tailed Test About a Population Mean:
σ Unknown
H0: µ ≤ 65
Ha: µ > 65
α = .05
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{66.2 - 65}{4.2/\sqrt{64}} = 2.286$$
σ Unknown
p –Value Approach
For t = 2.286, the p-value must be less than .025
(for t = 1.998) and greater than .01 (for t = 2.387).
.01 < p-value < .025
Because p-value < α = .05, we reject H0.
We are at least 95% confident that the mean speed
of vehicles at Location F is greater than 65 mph.
σ Unknown
Critical Value Approach
For α = .05 and d.f. = 64 – 1 = 63, t.05 = 1.669
Reject H0 if t > 1.669
Because 2.286 > 1.669, we reject H0.
We are at least 95% confident that the mean speed
of vehicles at Location F is greater than 65 mph.
Location F is a good candidate for a radar trap.
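The Highway Patrol test statistic is easy to compute by hand or in code. The sketch below takes the critical value 1.669 from the slide's t table rather than computing it, since Python's standard library has no t distribution; `t_statistic` and `T_CRIT` are illustrative names.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """t statistic for a test about a population mean when sigma is unknown."""
    return (x_bar - mu0) / (s / sqrt(n))

t = t_statistic(66.2, 65, 4.2, 64)
T_CRIT = 1.669          # t.05 with 63 d.f., as quoted on the slide
reject = t > T_CRIT     # upper-tail rejection rule
print(f"t = {t:.3f}, reject H0: {reject}")
```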
Tests About a Population Proportion
Test Statistic
$$z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}}$$
where:
$$\sigma_{\bar{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}}$$
assuming np > 5 and n(1 – p) > 5

Tests About a Population Proportion
Rejection Rule: p –Value Approach

Reject H0 if p –value < α
Rejection Rule: Critical Value Approach
H0: p ≤ p0 Reject H0 if z > zα
H0: p ≥ p0 Reject H0 if z < -zα
H0: p = p0 Reject H0 if z < -zα/2 or z > zα/2

Two-Tailed Test About a
Population Proportion
Example: National Safety Council (NSC)
For a Christmas and New Year’s week, the
National Safety Council estimated that 500 people
would be killed and 25,000 injured on the nation’s
roads. The NSC claimed that 50% of the accidents
would be caused by drunk driving.
A sample of 120 accidents showed that 67 were
caused by drunk driving. Use these data to test the
NSC’s claim with α = .05.
H0: p = .5
Ha: p ≠ .5
α = .05
$$\sigma_{\bar{p}} = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{.5(1 - .5)}{120}} = .045644$$
$$z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}} = \frac{(67/120) - .5}{.045644} = 1.28$$
p−Value Approach
For z = 1.28, cumulative probability = .8997

p–value = 2(1 − .8997) = .2006
Because p–value = .2006 > α = .05, we cannot reject H0.

Critical Value Approach
For α/2 = .05/2 = .025, z.025 = 1.96
Reject H0 if z < -1.96 or z > 1.96
Because -1.96 < 1.278 < 1.96, we cannot reject H0.
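The NSC example can be verified with a short script. This is a sketch assuming the normal approximation stated on the slides; `proportion_z_test` is a hypothetical helper name. Because it uses the unrounded z (about 1.278), its p-value differs slightly from the slide's .2006, which is based on z = 1.28.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Two-tailed z test of H0: p = p0 using the normal approximation."""
    p_bar = successes / n
    se = sqrt(p0 * (1 - p0) / n)                    # standard error under H0
    z = (p_bar - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed p-value
    return z, p_value

z, p_value = proportion_z_test(67, 120, 0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```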
Part II
Two Populations
Inferences About the Difference Between
Two Population Means: σ 1 and σ 2 Known
• Interval Estimation of µ 1 – µ 2
• Hypothesis Tests About µ 1 – µ 2
Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
Expected Value
$$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$$
Standard Deviation (Standard Error)
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
where: σ1 = standard deviation of population 1
σ2 = standard deviation of population 2
n1 = sample size from population 1
n2 = sample size from population 2
Interval Estimation of µ1 - µ2:
σ 1 and σ 2 Known
Interval Estimate
$$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
where:
1 - α is the confidence coefficient
Hypothesis Tests About µ 1 − µ 2:
σ 1 and σ 2 Known
Hypotheses
H0: µ1 − µ2 ≥ D0      H0: µ1 − µ2 ≤ D0      H0: µ1 − µ2 = D0
Ha: µ1 − µ2 < D0      Ha: µ1 − µ2 > D0      Ha: µ1 − µ2 ≠ D0
Left-tailed           Right-tailed          Two-tailed
Test Statistic
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
σ 1 and σ 2 Known
Example: Par, Inc.
Can we conclude, using α = .01, that the
mean driving distance of Par, Inc. golf balls is
greater than the mean driving distance of Rap, Ltd.
golf balls?
σ 1 and σ 2 Known
1. Develop the hypotheses.
H0: µ1 - µ2 ≤ 0
Ha: µ1 - µ2 > 0
where:
µ1 = mean distance for the population
of Par, Inc. golf balls
µ2 = mean distance for the population
of Rap, Ltd. golf balls
2. Specify the level of significance. α = .01

σ 1 and σ 2 Known
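3. Compute the value of the test statistic.
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(235 - 218) - 0}{\sqrt{\dfrac{(15)^2}{120} + \dfrac{(20)^2}{80}}} = \frac{17}{2.62} = 6.49$$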
σ 1 and σ 2 Known
p-Value Approach
4. Compute the p-value.
For z = 6.49, the p-value < .0001.
5. Determine whether to reject H0.
Because p-value < α = .01, we reject H0.
At the .01 level of significance, the sample evidence
indicates the mean driving distance of Par, Inc. golf
balls is greater than the mean driving distance of Rap,
Ltd. golf balls.
σ 1 and σ 2 Known
Critical Value Approach
4. Determine the critical value and rejection rule.
For α = .01, z.01 = 2.33
Reject H0 if z > 2.33
5. Determine whether to reject H0.
Because z = 6.49 > 2.33, we reject H0.
The sample evidence indicates the mean driving
distance of Par, Inc. golf balls is greater than the mean
driving distance of Rap, Ltd. golf balls.
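The Par, Inc. test statistic can be recomputed as a check. The inputs below (sample means 235 and 218, population standard deviations 15 and 20, sample sizes 120 and 80) are read off the slide's worked computation; `two_sample_z` is an illustrative name. Without rounding the denominator to 2.62, z comes out near 6.48 rather than 6.49, which does not change the conclusion.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1_bar, x2_bar, sigma1, sigma2, n1, n2, d0=0.0):
    """z statistic for H0 about mu1 - mu2 when both sigmas are known."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # standard error of x1_bar - x2_bar
    return ((x1_bar - x2_bar) - d0) / se

z = two_sample_z(235, 218, 15, 20, 120, 80)
p_value = 1 - NormalDist().cdf(z)                # upper-tail p-value
z_crit = NormalDist().inv_cdf(1 - 0.01)          # z.01
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {z > z_crit}")
```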
Small Sample Case
• Each of the two populations is normally

distributed.
• The two samples are independent.
• At least one of the samples is small, n < 30.
• The values of the population variances are
unknown.
• The variances of the two populations are equal.
σ12 = σ22
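t Statistic
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2(n_1 - 1) + S_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
Confidence Interval
$$(\bar{X}_1 - \bar{X}_2) \pm t\,\sqrt{\dfrac{S_1^2(n_1 - 1) + S_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$$
where df = n1 + n2 − 2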
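A small function makes the pooled-variance formula concrete. The slides give no data for this case, so the numbers in the call below are purely hypothetical and chosen only to exercise the formula; `pooled_t` is an illustrative name.

```python
from math import sqrt

def pooled_t(x1_bar, x2_bar, s1, s2, n1, n2, d0=0.0):
    """Pooled-variance t statistic and its degrees of freedom (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = (s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / df
    se = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    return ((x1_bar - x2_bar) - d0) / se, df

# Hypothetical inputs, not taken from the slides
t, df = pooled_t(10.0, 8.0, 2.0, 2.0, 10, 10)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t would then be compared with a t-table value for df = n1 + n2 − 2.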
Two Population Means: Matched Samples
With a matched-sample design each sampled item
provides a pair of data values.
This design often leads to a smaller sampling error
than the independent-sample design because
variation between sampled items is eliminated as a
source of sampling error.
Example: Express Deliveries
A Chicago-based firm has documents that must
be quickly distributed to district offices throughout
the U.S. The firm must decide between two delivery
services, UPX (United Parcel Express) and INTEX
(International Express), to transport its documents.
Example: Express Deliveries
In testing the delivery times of the two services,
the firm sent two reports to a random sample of its
district offices with one report carried by UPX and
the other report carried by INTEX. Do the data on
the next slide indicate a difference in mean delivery
times for the two services? Use a .05 level of
significance.
Delivery Time (Hours)

District Office   UPX   INTEX   Difference
Seattle            32      25        7
Los Angeles        30      24        6
Boston             19      15        4
Cleveland          16      15        1
New York           15      13        2
Houston            18      15        3
Atlanta            14      15       -1
St. Louis          10       8        2
Milwaukee           7       9       -2
Denver             16      11        5
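1. Develop the hypotheses.
H0: µd = 0
Ha: µd ≠ 0
Let µd = the mean of the difference values for the
two delivery services for the population
of district offices
2. Specify the level of significance. α = .05
3. Compute the value of the test statistic.
$$\bar{d} = \frac{\sum d_i}{n} = \frac{7 + 6 + \dots + 5}{10} = 2.7$$
$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = \sqrt{\frac{76.1}{9}} = 2.9$$
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{2.7 - 0}{2.9/\sqrt{10}} = 2.94$$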
p-Value Approach
4. Compute the p-value.
For t = 2.94 and df = 9, the p-value is between
.02 and .01. (This is a two-tailed test, so we double
the upper-tail areas of .01 and .005.)
5. Determine whether to reject H0.
Because p-value < α = .05, we reject H0.
We are at least 95% confident that there is a
difference in mean delivery times for the two
services.
Critical Value Approach
4. Determine the critical value and rejection rule.
For α = .05 and df = 9, t.025 = 2.262.
Reject H0 if t < -2.262 or t > 2.262
5. Determine whether to reject H0.
Because t = 2.94 > 2.262, we reject H0.
We are at least 95% confident that there is a
difference in mean delivery times for the two
services.
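The matched-sample computation can be reproduced from the delivery-time table. In this sketch the critical value 2.262 is the slide's table value for t.025 with 9 degrees of freedom; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

upx   = [32, 30, 19, 16, 15, 18, 14, 10, 7, 16]
intex = [25, 24, 15, 15, 13, 15, 15, 8, 9, 11]

d = [u - i for u, i in zip(upx, intex)]   # paired differences per district office
d_bar = mean(d)                           # mean difference
s_d = stdev(d)                            # sample standard deviation of differences
t = (d_bar - 0) / (s_d / sqrt(len(d)))    # test statistic under H0: mu_d = 0

T_CRIT = 2.262                            # t.025 with 9 d.f., from the slide
reject = abs(t) > T_CRIT                  # two-tailed rejection rule
print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}, t = {t:.2f}, reject H0: {reject}")
```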
Two Population Proportions
• Interval Estimation of p1 - p2
• Hypothesis Tests About p1 - p2
Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
Expected Value
$$E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2$$
Standard Deviation (Standard Error)
$$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
where: n1 = size of sample taken from population 1
n2 = size of sample taken from population 2
Sampling Distribution of p1 − p2
The sample sizes are sufficiently
large if all of these conditions
are met:
n1p1 > 5      n1(1 - p1) > 5
n2p2 > 5      n2(1 - p2) > 5

Interval Estimation of p1 - p2
Interval Estimate
$$\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$$
Hypothesis Tests about p1 - p2
Hypotheses
We focus on tests involving no difference between
the two population proportions (i.e. p1 = p2)
H0: p1 − p2 ≥ 0      H0: p1 − p2 ≤ 0      H0: p1 − p2 = 0
Ha: p1 − p2 < 0      Ha: p1 − p2 > 0      Ha: p1 − p2 ≠ 0
Left-tailed          Right-tailed         Two-tailed
Standard Error of $\bar{p}_1 - \bar{p}_2$ when p1 = p2 = p
$$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
Pooled Estimator of p when p1 = p2 = p
$$\bar{p} = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2}$$
Test Statistic
$$z = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
Example: Market Research Associates

Can we conclude, using a .05 level of significance,
that the proportion of households aware of the
client’s product increased after the new advertising
campaign?
1. Develop the hypotheses.
H0: p1 - p2 ≤ 0
Ha: p1 - p2 > 0
p1 = proportion of the population of households
“aware” of the product after the new campaign
p2 = proportion of the population of households
“aware” of the product before the new campaign
$$\bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45$$
$$s_{\bar{p}_1 - \bar{p}_2} = \sqrt{.45(.55)\left(\frac{1}{250} + \frac{1}{150}\right)} = .0514$$
$$z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56$$
p –Value Approach
4. Compute the p –value.
For z = 1.56, the p–value = .0594

Because p–value > α = .05, we cannot reject H0.
We cannot conclude that the proportion of households
aware of the client’s product increased after the new
campaign.

For α = .05, z.05 = 1.645

Because 1.56 < 1.645, we cannot reject H0.

We cannot conclude that the proportion of households
aware of the client’s product increased after the new
campaign.
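The Market Research Associates numbers can be checked with the pooled-proportion formulas. The inputs (sample proportions .48 and .40 with sample sizes 250 and 150) are read off the slide's computation; `two_proportion_z` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p1_bar, n1, p2_bar, n2):
    """Pooled z statistic for H0: p1 - p2 <= 0 (or = 0)."""
    pooled = (n1 * p1_bar + n2 * p2_bar) / (n1 + n2)        # pooled estimator of p
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))    # standard error under H0
    return (p1_bar - p2_bar) / se, pooled, se

z, pooled, se = two_proportion_z(0.48, 250, 0.40, 150)
p_value = 1 - NormalDist().cdf(z)                           # upper-tail p-value
print(f"pooled = {pooled:.2f}, se = {se:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```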
Inferences About Population
Variances
Inferences about Two Populations Variances
Inference about a Population Variance
Inferences About a Population Variance
A variance can provide important decision-making

information.
Consider the production process of filling containers
with a liquid detergent product.
The mean filling weight is important, but also is the
variance of the filling weights.
By selecting a sample of containers, we can compute
a sample variance for the amount of detergent placed
in a container.
If the sample variance is excessive, overfilling and
underfilling may be occurring even though the mean
is correct.
Inferences About a Population
Variance
• Chi-Square Distribution
• Interval Estimation of σ 2
• Hypothesis Testing
Chi-Square Distribution
The chi-square distribution is the distribution of a sum of
squared standardized normal random variables such as
$z_1^2 + z_2^2 + z_3^2$ and so on.
The chi-square distribution is based on sampling
from a normal population.
The sampling distribution of (n - 1)s2/σ 2 has a chi-
square distribution whenever a simple random sample
of size n is selected from a normal population.
We can use the chi-square distribution to develop
interval estimates and conduct hypothesis tests
about a population variance.
Examples of Sampling Distribution of (n - 1)s2/σ 2
[Figure: chi-square density curves with 2, 5, and 10 degrees of freedom; horizontal axis (n − 1)s2/σ2 starting at 0]
Chi-Square Distribution
We will use the notation $\chi^2_\alpha$ to denote the value for the
chi-square distribution that provides an area of α to
the right of the stated $\chi^2_\alpha$ value.
For example, there is a .95 probability of obtaining a χ2
(chi-square) value such that
$$\chi^2_{.975} \le \chi^2 \le \chi^2_{.025}$$
Interval Estimation of σ 2
$$\chi^2_{.975} \le \frac{(n - 1)s^2}{\sigma^2} \le \chi^2_{.025}$$
[Figure: chi-square density with an area of .025 in each tail; 95% of the possible χ2 values lie between $\chi^2_{.975}$ and $\chi^2_{.025}$]
There is a (1 – α) probability of obtaining a χ2 value
such that
$$\chi^2_{(1-\alpha/2)} \le \chi^2 \le \chi^2_{\alpha/2}$$
Substituting (n – 1)s2/σ 2 for the χ2 we get
$$\chi^2_{(1-\alpha/2)} \le \frac{(n - 1)s^2}{\sigma^2} \le \chi^2_{\alpha/2}$$
Performing algebraic manipulation we get
$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{(1-\alpha/2)}}$$
• Interval Estimate of a Population Variance
$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{(1-\alpha/2)}}$$
where the χ2 values are based on a chi-square
distribution with n - 1 degrees of freedom and
where 1 - α is the confidence coefficient.
Interval Estimation of σ
Taking the square root of the upper and lower
limits of the variance interval provides the confidence
interval for the population standard deviation.
$$\sqrt{\frac{(n - 1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n - 1)s^2}{\chi^2_{(1-\alpha/2)}}}$$
Example: Buyer’s Digest (A)
Buyer’s Digest rates thermostats manufactured
for home temperature control. In a recent test,
10 thermostats manufactured by ThermoRite
were selected and placed in a test room that was
maintained at a temperature of 68oF. The
temperature readings of the ten thermostats are
shown on the next slide.
Example: Buyer’s Digest (A)

We will use the 10 readings below to develop a
95% confidence interval estimate of the population
variance.
Thermostat 1 2 3 4 5 6 7 8 9 10
Temperature 67.4 67.8 68.2 69.3 69.5 67.0 68.1 68.6 67.9 67.2
For n - 1 = 10 - 1 = 9 d.f. and α = .05
Selected Values from the Chi-Square Distribution Table

Degrees                          Area in Upper Tail
of Freedom     .99    .975     .95     .90     .10     .05    .025     .01
     5       0.554   0.831   1.145   1.610   9.236  11.070  12.832  15.086
     6       0.872   1.237   1.635   2.204  10.645  12.592  14.449  16.812
     7       1.239   1.690   2.167   2.833  12.017  14.067  16.013  18.475
     8       1.647   2.180   2.733   3.490  13.362  15.507  17.535  20.090
     9       2.088   2.700   3.325   4.168  14.684  16.919  19.023  21.666
    10       2.558   3.247   3.940   4.865  15.987  18.307  20.483  23.209

From the row for 9 degrees of freedom, our $\chi^2_{.975}$ value is 2.700
(area in upper tail = .975) and our $\chi^2_{.025}$ value is 19.023
(area in upper tail = .025), so
$$2.700 \le \frac{(n - 1)s^2}{\sigma^2} \le 19.023$$
Interval Estimation of σ 2
• Sample variance s2 provides a point estimate of σ 2.
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{6.3}{9} = .70$$
A 95% confidence interval for the population variance is
given by:
$$\frac{(10 - 1).70}{19.02} \le \sigma^2 \le \frac{(10 - 1).70}{2.70}$$
.33 ≤ σ 2 ≤ 2.33
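The interval for the ThermoRite readings can be reproduced from the raw data. In this sketch the two chi-square values are copied from the table row for 9 degrees of freedom rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import variance

temps = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
n = len(temps)
s2 = variance(temps)                     # sample variance (n - 1 in the denominator)

CHI2_975, CHI2_025 = 2.700, 19.023       # table values for 9 d.f.

var_low = (n - 1) * s2 / CHI2_025        # lower limit uses the larger chi-square value
var_high = (n - 1) * s2 / CHI2_975       # upper limit uses the smaller chi-square value
sd_low, sd_high = sqrt(var_low), sqrt(var_high)   # interval for sigma itself

print(f"s^2 = {s2:.2f}")
print(f"95% CI for variance: {var_low:.2f} to {var_high:.2f}")
print(f"95% CI for std dev:  {sd_low:.2f} to {sd_high:.2f}")
```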

Hypothesis Testing
About a Population Variance
Left-Tailed Test
• Hypotheses
$$H_0: \sigma^2 \ge \sigma_0^2 \qquad H_a: \sigma^2 < \sigma_0^2$$
where $\sigma_0^2$ is the hypothesized value
for the population variance
• Test Statistic
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$
• Rejection Rule
Critical value approach: Reject H0 if $\chi^2 \le \chi^2_{(1-\alpha)}$
p-Value approach: Reject H0 if p-value < α
where $\chi^2_{(1-\alpha)}$ is based on a chi-square
distribution with n - 1 d.f.
Hypothesis Testing
Right-Tailed Test
• Hypotheses
$$H_0: \sigma^2 \le \sigma_0^2 \qquad H_a: \sigma^2 > \sigma_0^2$$
• Test Statistic
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$
• Rejection Rule
Critical value approach: Reject H0 if $\chi^2 \ge \chi^2_{\alpha}$
where $\chi^2_{\alpha}$ is based on a chi-square
distribution with n - 1 d.f.
Hypothesis Testing
Two-Tailed Test
• Hypotheses
$$H_0: \sigma^2 = \sigma_0^2 \qquad H_a: \sigma^2 \ne \sigma_0^2$$
• Test Statistic
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$
• Rejection Rule
Critical value approach:
Reject H0 if $\chi^2 \le \chi^2_{(1-\alpha/2)}$ or $\chi^2 \ge \chi^2_{\alpha/2}$
p-Value approach:
Reject H0 if p-value < α
where $\chi^2_{(1-\alpha/2)}$ and $\chi^2_{\alpha/2}$ are based on a
chi-square distribution with n - 1 d.f.
Hypothesis Testing
Example: Buyer’s Digest (B)
Recall that Buyer’s Digest is rating

ThermoRite thermostats. Buyer’s Digest gives an
“acceptable” rating to a thermostat with a
temperature variance of 0.5 or less.
We will conduct a hypothesis test (with α = .10)

to determine whether the ThermoRite thermostat’s
temperature variance is “acceptable”.
Hypothesis Testing
Example: Buyer’s Digest (B)
Using the 10 readings, we will conduct a
hypothesis test (with α = .10) to determine whether
the ThermoRite thermostat’s temperature variance is
“acceptable”.
Thermostat 1 2 3 4 5 6 7 8 9 10
Temperature 67.4 67.8 68.2 69.3 69.5 67.0 68.1 68.6 67.9 67.2
Hypothesis Testing
• Hypotheses
$$H_0: \sigma^2 \le 0.5 \qquad H_a: \sigma^2 > 0.5$$
• Rejection Rule
Reject H0 if χ2 > 14.684

Hypothesis Testing
For n - 1 = 10 - 1 = 9 d.f. and α = .10, the chi-square table
row for 9 degrees of freedom gives $\chi^2_{.10}$ = 14.684.
Rejection Region
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{9s^2}{.5}$$
[Figure: chi-square density with an upper-tail area of .10 to the right of 14.684; reject H0 in that region]
Hypothesis Testing
The sample variance s2 = 0.7
• Test Statistic
$$\chi^2 = \frac{9(.7)}{.5} = 12.6$$
Conclusion
Because χ2 = 12.6 is less than 14.684, we cannot
reject H0. The sample variance s2 = .7 is insufficient
evidence to conclude that the temperature variance
for ThermoRite thermostats is unacceptable.
Hypothesis Testing
• Using the p-Value

The rejection region for the ThermoRite
thermostat example is in the upper tail; thus, the
appropriate p-value is less than .90 (χ 2 = 4.168)
and greater than .10 (χ 2 = 14.684).
Because the p –value > α = .10, we cannot
reject the null hypothesis.
The sample variance of s 2 = .7 is insufficient
evidence to conclude that the temperature
variance is unacceptable (>.5).
The exact p-value is .18156.
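The test statistic and the exact p-value can be checked without a statistics package. The sketch below uses a standard closed form for the chi-square upper-tail probability that holds only for odd degrees of freedom (it is not the slides' method, which reads a table or software output); `chi2_upper_tail_odd_df` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def chi2_upper_tail_odd_df(x, df):
    """P(chi-square with odd df > x), via the closed form for odd degrees of freedom."""
    if df % 2 == 0:
        raise ValueError("this closed form only covers odd df")
    root = sqrt(x)
    std = NormalDist()
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term                 # adds root**(2r-1) / (1*3*...*(2r-1))
        term *= x / (2 * r + 1)       # step to the next term of the sum
    return 2 * (1 - std.cdf(root)) + 2 * std.pdf(root) * total

n, s2, sigma0_sq = 10, 0.7, 0.5
chi2 = (n - 1) * s2 / sigma0_sq                 # test statistic
p_value = chi2_upper_tail_odd_df(chi2, n - 1)   # upper-tail (right-tailed test)
print(f"chi-square = {chi2:.1f}, p-value = {p_value:.5f}")
print("reject H0" if p_value < 0.10 else "cannot reject H0")
```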
Inferences About Two Population Variances
We may want to compare the variances in:

product quality resulting from two different
production processes,
temperatures for two heating devices, or
assembly times for two assembly methods.
We use data collected from two independent random
samples, one from population 1 and another from
population 2.
The two sample variances will be the basis for making

inferences about the two population variances.
Hypothesis Testing About the
Variances of Two Populations
One-Tailed Test
• Hypotheses
$$H_0: \sigma_1^2 \le \sigma_2^2 \qquad H_a: \sigma_1^2 > \sigma_2^2$$
Denote the population providing the
larger sample variance as population 1.
• Test Statistic
$$F = \frac{s_1^2}{s_2^2}$$
• Rejection Rule
Critical value approach: Reject H0 if F > Fα
where the value of Fα is based on an
F distribution with n1 - 1 (numerator)
and n2 - 1 (denominator) d.f.
Two-Tailed Test
• Hypotheses
$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_a: \sigma_1^2 \ne \sigma_2^2$$
Denote the population providing the
larger sample variance as population 1.
• Test Statistic
$$F = \frac{s_1^2}{s_2^2}$$
• Rejection Rule
Critical value approach: Reject H0 if F > Fα/2
where the value of Fα/2 is based on an
F distribution with n1 - 1 (numerator)
and n2 - 1 (denominator) d.f.

Example: Buyer’s Digest (C)
Buyer’s Digest has conducted the same test, as
was described earlier, on another 10 thermostats,
this time manufactured by TempKing. The
temperature readings of the ten thermostats are
listed on the next slide.
We will conduct a hypothesis test with α = .10 to see
if the variances are equal for ThermoRite’s thermostats
and TempKing’s thermostats.
Example: Buyer’s Digest (C)
ThermoRite Sample
Thermostat 1 2 3 4 5 6 7 8 9 10
Temperature 67.4 67.8 68.2 69.3 69.5 67.0 68.1 68.6 67.9 67.2
TempKing Sample
Thermostat 1 2 3 4 5 6 7 8 9 10
Temperature 67.7 66.4 69.2 70.1 69.5 69.7 68.1 66.6 67.3 67.5
• Hypotheses
$$H_0: \sigma_1^2 = \sigma_2^2$$ (TempKing and ThermoRite thermostats
have the same temperature variance)
$$H_a: \sigma_1^2 \ne \sigma_2^2$$ (Their variances are not equal)
• Rejection Rule
The F distribution table (on next slide) shows that
with α = .10, 9 d.f. (numerator), and 9 d.f. (denominator),
Fα/2 = F.05 = 3.18.
Reject H0 if F > 3.18
Selected Values from the F Distribution Table
Denominator   Area in        Numerator Degrees of Freedom
Degrees       Upper
of Freedom    Tail        7       8       9      10      15
     8        .10      2.62    2.59    2.56    2.54    2.46
              .05      3.50    3.44    3.39    3.35    3.22
              .025     4.53    4.43    4.36    4.30    4.10
              .01      6.18    6.03    5.91    5.81    5.52
     9        .10      2.51    2.47    2.44    2.42    2.34
              .05      3.29    3.23    3.18    3.14    3.01
              .025     4.20    4.10    4.03    3.96    3.77
              .01      5.61    5.47    5.35    5.26    4.96
TempKing’s sample variance is 1.768
ThermoRite’s sample variance is .700
• Test Statistic
$$F = \frac{s_1^2}{s_2^2} = \frac{1.768}{.700} = 2.53$$
Conclusion
We cannot reject H0. F = 2.53 < F.05 = 3.18.
There is insufficient evidence to conclude that
the population variances differ for the two
thermostat brands.
• Determining and Using the p-Value
Area in Upper Tail            .10    .05    .025   .01
F Value (df1 = 9, df2 = 9)    2.44   3.18   4.03   5.35
Because F = 2.53 is between 2.44 and 3.18, the area
in the upper tail of the distribution is between .10
and .05.
But this is a two-tailed test; after doubling the
upper-tail area, the p-value is between .20 and .10.
Because α = .10, we have p-value > α and therefore
we cannot reject the null hypothesis.
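The two sample variances and the F statistic can be recomputed from the readings. In this sketch the critical value 3.18 is the slide's F.05 table value for 9 and 9 degrees of freedom; the variable names are illustrative.

```python
from statistics import variance

thermorite = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
tempking   = [67.7, 66.4, 69.2, 70.1, 69.5, 69.7, 68.1, 66.6, 67.3, 67.5]

s2_thermorite = variance(thermorite)
s2_tempking = variance(tempking)

# Population 1 is the one with the larger sample variance, so F >= 1
f_stat = max(s2_tempking, s2_thermorite) / min(s2_tempking, s2_thermorite)

F_CRIT = 3.18                 # F.05 with 9 and 9 d.f., from the slide's table
reject = f_stat > F_CRIT
print(f"s1^2 = {s2_tempking:.3f}, s2^2 = {s2_thermorite:.3f}, F = {f_stat:.2f}")
print("reject H0" if reject else "cannot reject H0")
```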
References/Sources
• Ken Black
• Anderson Sweeney Williams
• Levin Rubin
