Sampling and Sampling Distributions: 7.1 Definitions

Chapter 7
Sampling and Sampling Distributions

7.1 Definitions
• A statistical population is the set or collection of all possible observations of some characteristic.
• A sample is a part or subset of the population.
• A random sample of size n is a sample chosen in such a way that every sample of size n has the same probability of being chosen.
• A parameter is a number describing some (unknown) aspect of a population (e.g. μ).
• A statistic is some function of the sample observations (e.g. X̄).
• The probability distribution of a statistic is known as a sampling distribution (for example, how X̄ is distributed).
• We need to distinguish the distribution of a random variable, say X̄, from a realization of that random variable (e.g. we collect data and calculate a sample mean of X̄ = 4.2).
Populations and Samples
• A population is the set of all items or individuals of interest.
  Examples: all likely voters in the next election; all parts produced today; all sales receipts for November.
• A sample is a subset of the population.
  Examples: 1000 voters selected at random for interview; a few parts selected for destructive testing; random receipts selected for audit.
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chap 7-4
Figure 7.1:
Population vs. Sample
Population: a b c d e f g h i j k l m n o p q r s t u v w x y z
Sample: b c g i n o r u y
Figure 7.2:
Note on Statistics
• The value of a statistic changes from sample to sample, so we can think of it as a random variable with its own probability distribution.
• X̄ is a random variable.
• Repeated sampling and calculation of the resulting statistic gives rise to a distribution of values for that statistic.
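The idea of repeated sampling can be sketched in a short simulation. The population below is hypothetical (it does not come from these notes), and the function name `sample_means` is only an illustrative choice.

```python
import random
import statistics


def sample_means(population, n, repetitions, seed=0):
    """Draw `repetitions` samples of size n (with replacement) and return their means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(repetitions)]


# Hypothetical population of 1000 values: 0.0, 0.1, ..., 99.9
population = [x / 10 for x in range(1000)]
means = sample_means(population, n=25, repetitions=2000)

print("population mean:", statistics.fmean(population))
print("average of the sample means:", statistics.fmean(means))
print("number of distinct sample means:", len(set(means)))
```

Each run of the inner sampling step gives a different value of the sample mean, and the collection of those values approximates the sampling distribution of X̄.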
Sampling Distributions
A sampling distribution is a distribution of all of the possible values of a statistic for a given size sample selected from a population.
Figure 7.3:
Chapter Outline
Sampling distributions covered:
• Sampling distribution of the sample mean
• Sampling distribution of the sample proportion
• Sampling distribution of the difference in sample means
Figure 7.4:
7.2 Important Theorems Recalled
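Suppose $X_1, X_2, \ldots, X_n$ are independent with $E[X_i] = \mu_i$ and $V[X_i] = \sigma_i^2$ for all $i = 1, 2, \ldots, n$.
Suppose $Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n + b$. Then:

$$E[Y] = E\left[\sum_{i=1}^{n} a_i X_i + b\right] = a_1 E[X_1] + a_2 E[X_2] + \cdots + a_n E[X_n] + b = a_1 \mu_1 + \cdots + a_n \mu_n + b = \sum_{i=1}^{n} a_i \mu_i + b$$

and

$$V[Y] = V\left[\sum_{i=1}^{n} a_i X_i + b\right] = a_1^2 V[X_1] + a_2^2 V[X_2] + \cdots + a_n^2 V[X_n] = a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2 + \cdots + a_n^2 \sigma_n^2 = \sum_{i=1}^{n} a_i^2 \sigma_i^2 \quad \text{(because of independence)}$$

and if each $X_i$ is normal, i.e. $X_i \sim N(\mu_i, \sigma_i^2)$ independently for all $i$:

$$Y \sim N\left(\sum_{i=1}^{n} a_i \mu_i + b, \; \sum_{i=1}^{n} a_i^2 \sigma_i^2\right)$$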
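As a quick numerical check of these two formulas, the sketch below computes the mean and variance of a linear combination of independent random variables. The function name and the numbers are hypothetical and chosen only for illustration.

```python
def linear_combination_moments(a, mu, var, b=0):
    """Mean and variance of Y = sum(a_i * X_i) + b for independent X_i.

    a   : coefficients a_i
    mu  : means E[X_i]
    var : variances V[X_i]
    """
    mean_y = sum(ai * mi for ai, mi in zip(a, mu)) + b
    # The constant b shifts the mean but does not affect the variance.
    var_y = sum(ai ** 2 * vi for ai, vi in zip(a, var))
    return mean_y, var_y


# Hypothetical example: Y = 2*X1 - X2 + 3 with
# E[X1] = 1, E[X2] = 4, V[X1] = 9, V[X2] = 16
mean_y, var_y = linear_combination_moments([2, -1], [1, 4], [9, 16], b=3)
print(mean_y, var_y)
```

Note that the coefficient -1 enters the variance as (-1)^2 = 1, so variances add even when the random variables are subtracted.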
7.3 Frequently used Statistics

7.3.1 The sample mean
• Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$. The sample mean is:

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

1. The expected value of the sample mean is the population mean:

$$E[\bar{X}] = E\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \frac{1}{n}(\mu + \mu + \cdots + \mu) = \mu$$

2. The variance of the sample mean (the $X_i$'s independent):

$$V[\bar{X}] = V\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n^2} \sum_{i=1}^{n} V[X_i] = \frac{1}{n^2} \sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n}.$$
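3. If we do not have independence (for example, when sampling without replacement from a finite population), it can be shown that

$$V[\bar{X}] = \frac{\sigma^2}{n} \left(\frac{N - n}{N - 1}\right), \quad \text{where } N \text{ is the population size.}$$

$\left(\frac{N - n}{N - 1}\right)$ is called the correction factor, and if $N$ is large relative to $n$ then $\frac{N - n}{N - 1} \to 1$, so that $V[\bar{X}] \approx \frac{\sigma^2}{n}$.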
Note on Sample Mean
1. These results use the formulas for expected values and variances of sums of random variables that we saw in Chapter 5.
2. The variance of the sample mean is a decreasing function of the sample size.
3. The standard deviation of the sample mean (under independence) is

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
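The effect of the sample size on the standard deviation of the sample mean, with and without the correction factor, can be sketched as follows. The function name `standard_error` and the optional `population_size` argument are illustrative choices, not notation from the notes, and the numbers are hypothetical.

```python
import math


def standard_error(sigma, n, population_size=None):
    """Standard deviation of the sample mean.

    With population_size=None the observations are treated as independent,
    giving sigma / sqrt(n). Otherwise the variance sigma^2 / n is multiplied
    by the correction factor (N - n) / (N - 1).
    """
    variance = sigma ** 2 / n
    if population_size is not None:
        variance *= (population_size - n) / (population_size - 1)
    return math.sqrt(variance)


# The standard error shrinks as n grows (hypothetical sigma = 0.1).
for n in (4, 16, 64):
    print(n, standard_error(0.1, n))

# With a small finite population (N = 20) the correction factor matters.
print(standard_error(0.1, 4, population_size=20))
# With a very large population it is close to 1.
print(standard_error(0.1, 4, population_size=1_000_000))
```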
7.3.2 The sample variance

• Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a population with variance $\sigma^2$.
• The sample variance is:

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

1. $E[s^2] = \sigma^2$ (we omit the proof)
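Although the proof is omitted, the claim E[s²] = σ² can be checked by brute force on a tiny case. The sketch below assumes a hypothetical population of four equally likely values and enumerates every ordered sample of size 2 drawn with replacement; it illustrates the result and does not prove it.

```python
from itertools import product
import statistics


def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


# Hypothetical population with four equally likely values.
population = [18, 20, 22, 24]
pop_mean = sum(population) / len(population)
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

# All 16 equally likely ordered samples of size 2 (with replacement).
all_s2 = [sample_variance(s) for s in product(population, repeat=2)]
expected_s2 = sum(all_s2) / len(all_s2)

print("population variance:", pop_var)
print("average of s^2 over all samples:", expected_s2)
```

The hand-written function agrees with `statistics.variance` from the standard library, which also uses the n - 1 divisor.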
7.4 Sampling distribution of the Sample Mean

Sampling from a Normal Population
• Let $\bar{X}$ be the sample mean of an independent random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$.
• Then we know that $E[\bar{X}] = \mu$ and $V[\bar{X}] = \frac{\sigma^2}{n}$.
• If we further specify the population distribution as being normal, then $X_i \sim N(\mu, \sigma^2)$ for all $i$, and we can write:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$
7.5 Equation for the Standardized Sample Mean
Since $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, we can ask what transformation takes us to a standard normal.
• Generically the approach is ALWAYS

$$Z = \frac{\text{Random Variable} - \text{Mean of Random Variable}}{\text{Standard Deviation of Random Variable}}$$

• What does that mean for $\bar{X}$?
• Random Variable = $\bar{X}$
• Mean of Variable = $E[\bar{X}] = \mu$
• Standard Deviation of Variable = $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
• Put it all together:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}.$$
Z-value for Sampling Distribution of the Mean
Z-value for the sampling distribution of $\bar{X}$:

$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where: $\bar{X}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
$n$ = sample size
Figure 7.5:
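Sampling Distribution Properties (continued)
For sampling with replacement: as n increases, $\sigma_{\bar{X}}$ decreases. The figure compares the sampling distribution of $\bar{X}$ for a larger and a smaller sample size, both centered at μ; the larger sample size gives the narrower curve.
Figure 7.6: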
Example of Standardizing for Sample Mean

The lengths of individual machined parts coming off a production line at Morton Metalworks are normally distributed around their mean of μ = 30 centimeters. Their standard deviation around the mean is σ = 0.1 centimeter. An inspector just took a sample of n = 4 of these parts and found that X̄ for this sample is 29.875 centimeters. What is the probability of getting a sample mean this low or lower if the process is still producing parts at a mean of μ = 30?
Answer
• Given that the population is normally distributed with mean μ = 30 and standard deviation σ = 0.1, we know that $X_i \sim N(30, 0.1^2)$ for all $i$, so

$$\bar{X} \sim N(30, 0.1^2 / n).$$

Now we apply the standardizing transformation:

$$P(\bar{X} \le 29.875) = P\left(\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \le \frac{29.875 - 30}{0.1 / \sqrt{4}}\right) = P\left(Z \le \frac{29.875 - 30}{0.1 / \sqrt{4}}\right) = P(Z \le -2.50) = 0.0062.$$

Questions: 7.3.
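The Morton Metalworks calculation can be reproduced without a normal table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 30.0, 0.1, 4   # population mean, population sd, sample size
x_bar = 29.875                # observed sample mean

# Standardize: z = (x_bar - mu) / (sigma / sqrt(n))
z = (x_bar - mu) / (sigma / sqrt(n))
p_via_z = NormalDist().cdf(z)

# Equivalent route: evaluate the sampling distribution N(mu, sigma^2 / n) directly.
p_direct = NormalDist(mu, sigma / sqrt(n)).cdf(x_bar)

print(f"z = {z:.2f}")
print(f"P(X-bar <= {x_bar}) = {p_via_z:.4f}")
```

Both routes give the same probability, since standardizing only relabels the horizontal axis.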
7.5.1 Sampling from a Non Normal Distribution
• We have seen that we can obtain the exact sampling distribution for the sample mean if the individual $X_i$ are all independent normal variates.
• What happens when the $X_i$'s are not normally distributed?

Developing a Sampling Distribution
Assume there is a population of size N = 4 with members A, B, C and D. The random variable X is the age of individuals, with values 18, 20, 22, 24 (years).
Figure 7.7:
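Developing a Sampling Distribution (continued)
Summary measures for the population distribution:

$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

The population distribution is uniform: P(x) = 0.25 for each of x = 18, 20, 22, 24 (individuals A, B, C, D).
Figure 7.8: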
7.5.2 The Central Limit Theorem
Developing a Sampling Distribution (continued)
Now consider all possible samples of size n = 2 (sampling with replacement). There are 16 possible samples:

1st Obs | 2nd Observation
        | 18      20      22      24
18      | 18,18   18,20   18,22   18,24
20      | 20,18   20,20   20,22   20,24
22      | 22,18   22,20   22,22   22,24
24      | 24,18   24,20   24,22   24,24

The 16 sample means:

1st Obs | 2nd Observation
        | 18   20   22   24
18      | 18   19   20   21
20      | 19   20   21   22
22      | 20   21   22   23
24      | 21   22   23   24

Figure 7.9:
Developing a Sampling Distribution (continued)
Sampling Distribution of All Sample Means
The 16 sample means from the table of sample means are plotted as a distribution of $P(\bar{X})$ against $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24. The sample means distribution is no longer uniform.
Figure 7.10:
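The construction in Figures 7.7 to 7.10 can be reproduced by enumerating the 16 samples directly. The sketch below uses exact fractions so the comparison with σ²/n is not blurred by rounding; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [18, 20, 22, 24]   # ages of A, B, C, D
n = 2                           # sample size

# All ordered samples of size n, drawn with replacement (16 of them).
samples = list(product(population, repeat=n))
means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution of the sample mean: value -> probability.
counts = Counter(means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

mean_of_means = sum(m * p for m, p in distribution.items())
var_of_means = sum((m - mean_of_means) ** 2 * p for m, p in distribution.items())

pop_mean = Fraction(sum(population), len(population))
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

for m, p in distribution.items():
    print(float(m), float(p))
print("E[X-bar] =", mean_of_means, " population mean =", pop_mean)
print("V[X-bar] =", var_of_means, " sigma^2 / n =", pop_var / n)
```

The probabilities rise to a peak at 21 and fall away symmetrically, which is why the distribution of sample means is no longer uniform even though the population is.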
Let $X_1, X_2, \ldots, X_n$ be an independent random sample, identically distributed, from a population of any shape with mean $\mu$ and variance $\sigma^2$.
Then if n is large:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right), \quad \text{approximately for large } n$$

and similarly we can use the same transformation to standard form:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1), \quad \text{approximately for large } n$$
Notes on the Central Limit Theorem
1. This result holds only for large n, and we refer to such results as holding asymptotically. In this case we say that $\bar{X}$ is asymptotically normally distributed.
2. We have a shorthand way to write that the $X_i$ are independently and identically distributed (i.i.d.) with mean $\mu$ and variance $\sigma^2$:

$$X_i \sim \text{i.i.d.}(\mu, \sigma^2)$$

3. Identically implies that $E[X_i] = \mu$ and $V[X_i] = \sigma^2$ for all $i$. That is, the distribution of each observation ($i = 1, \ldots, n$) is the same.
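A small simulation can make the asymptotic statement concrete. The sketch below assumes an exponential population with mean 1 and variance 1 (a hypothetical, strongly skewed choice that does not come from the notes) and checks how the standardized sample means behave for n = 50.

```python
import random
import statistics
from math import sqrt


def simulate_sample_means(n, repetitions, seed=1):
    """Means of i.i.d. samples of size n from an exponential population (mean 1, variance 1)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]


n = 50
means = simulate_sample_means(n, repetitions=4000)

# Standardize each sample mean with mu = 1 and sigma / sqrt(n) = 1 / sqrt(n).
z_values = [(m - 1.0) / (1.0 / sqrt(n)) for m in means]
share_within = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)

print("mean of sample means:", statistics.fmean(means))
print("sd of sample means:", statistics.stdev(means), "vs", 1.0 / sqrt(n))
print("share of |Z| <= 1.96:", share_within)
```

If the normal approximation is reasonable, the share of standardized values within ±1.96 should be close to 0.95, even though the population itself is far from normal.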
Central Limit Theorem
As the sample size n gets large enough, the sampling distribution becomes almost normal regardless of the shape of the population.
Figure 7.11:
Example
Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected.
What is the probability that the sample mean is between 7.8 and 8.2?
Figure 7.12:
Example of Transformation to Standard Form when Sampling from a Non-Normal Distribution
• The delay time for inspection of baggage at a border station follows a bimodal distribution with a mean of μ = 8 minutes and a standard deviation of σ = 6 minutes. A sample of n = 64 from a particular minority group has a mean of X̄ = 10 minutes.
• Is there evidence that this minority group is being detained longer than usual?
• How likely is a sample mean of X̄ ≥ 10 if the population mean is 8?
Answer:
• Note from the question that the population from which we are sampling is non-normal.
• We know this from the word bimodal, since the normal distribution has only one mode.
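Example (continued)
Solution (continued), for the example with μ = 8, σ = 3 and n = 36:

$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3 / \sqrt{36}} < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < \frac{8.2 - 8}{3 / \sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$$

The figure shows three panels: the population distribution (unknown shape, μ = 8), the sampling distribution of $\bar{X}$ ($\mu_{\bar{X}}$ = 8, area between 7.8 and 8.2), and the standard normal distribution ($\mu_Z$ = 0, area between −0.4 and 0.4, which is 0.1554 + 0.1554).
Figure 7.13: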
• In the baggage inspection example, if the $X_i$ are i.i.d., then by applying (or invoking) the central limit theorem:

$$P(\bar{X} \ge 10) \approx P\left(Z \ge \frac{10 - 8}{6 / \sqrt{64}}\right) = P(Z \ge 2.67) = 0.0038,$$

approximately.
Questions: NCT 7.11, 7.18, 7.19, & 7.20.
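The baggage inspection probability can be checked numerically. This is a minimal sketch using `statistics.NormalDist`; it relies on the central limit theorem approximation, because the population itself is bimodal.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 6.0, 64   # population mean and sd (minutes), sample size
x_bar = 10.0                  # observed sample mean (minutes)

# Standardize the sample mean; by the CLT, Z is approximately N(0, 1).
z = (x_bar - mu) / (sigma / sqrt(n))

# Upper-tail probability P(Z >= z).
p_upper = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"P(X-bar >= {x_bar}) is approximately {p_upper:.4f}")
```

A probability this small suggests that a sample mean of 10 minutes would be unusual if the true mean delay for this group were 8 minutes.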
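7.6 Sampling Distribution: Differences of Sample Means X̄1 − X̄2
• Let $\bar{X}_1$ and $\bar{X}_2$ be the means of two samples from two separate and independent populations.

$$E[\bar{X}_1] = \mu_1, \qquad V[\bar{X}_1] = \frac{\sigma_1^2}{n_1}$$

$$E[\bar{X}_2] = \mu_2, \qquad V[\bar{X}_2] = \frac{\sigma_2^2}{n_2}$$

Since $\bar{X}_1$ and $\bar{X}_2$ are independent:

$$E[\bar{X}_1 - \bar{X}_2] = E[\bar{X}_1] - E[\bar{X}_2] = \mu_1 - \mu_2$$

$$V[\bar{X}_1 - \bar{X}_2] = V[\bar{X}_1] + V[\bar{X}_2] = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$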
7.6.1 Normal Case for X̄1 − X̄2

If the $X_{1,i}$ are i.i.d. as $N(\mu_1, \sigma_1^2)$, the $X_{2,i}$ are i.i.d. as $N(\mu_2, \sigma_2^2)$, and $X_1$ and $X_2$ are independent, then we have an exact normal result for any sample size:

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right).$$

• Recall that a linear combination of normals is normal.
7.6.2 Non Normal Case for X̄1 − X̄2
If the $X_{1,i}$ are i.i.d. with mean $\mu_1$ and variance $\sigma_1^2$, the $X_{2,i}$ are i.i.d. with mean $\mu_2$ and variance $\sigma_2^2$, and $X_1$ and $X_2$ are independent, then using the central limit theorem, for large $n_1$ and $n_2$:

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right) \quad \text{approximately (or asymptotically)}$$
7.6.3 Example of Difference of Means with Non-Normal Population

Suppose that right-handed (RH) students have a mean IQ of 80 units with variance 1,400 and left-handed (LH) students have a mean IQ of 80 with variance 1,320. What is the probability that the sample mean IQ of RH students will be at least 5 units higher than the sample mean IQ of LH students if we take a sample of 100 RH students and 120 LH students?
Answer: Let $X_{1,i}$ be the IQ of the ith RH student and $X_{2,i}$ be the IQ of the ith LH student.

$$\mu_1 = 80, \quad \sigma_1^2 = 1400, \quad n_1 = 100$$

$$\mu_2 = 80, \quad \sigma_2^2 = 1320, \quad n_2 = 120$$

and we want to find $P(\bar{X}_1 - \bar{X}_2 \ge 5)$.
Since $n_1$ and $n_2$ are both large we can apply the central limit theorem:

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right) \quad \text{approximately}$$

Here:

$$\mu_1 - \mu_2 = 80 - 80 = 0 \quad \text{and} \quad \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} = \frac{1400}{100} + \frac{1320}{120} = 14 + 11 = 25$$

Note: the formula for the standardizing transformation is:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}}, \quad \text{where} \quad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

So that

$$\bar{X}_1 - \bar{X}_2 \sim N(0, 25) \quad \text{approximately}$$

$$P(\bar{X}_1 - \bar{X}_2 \ge 5) = P\left(Z \ge \frac{5 - 0}{\sqrt{25}}\right) = P(Z \ge 1) = 0.1587$$
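The IQ example can be checked with a short calculation. The helper name `prob_diff_at_least` is hypothetical; the sketch applies the normal approximation for the difference of two independent sample means.

```python
from math import sqrt
from statistics import NormalDist


def prob_diff_at_least(d, mu1, var1, n1, mu2, var2, n2):
    """Approximate P(X1-bar - X2-bar >= d) for two independent samples."""
    mean_diff = mu1 - mu2
    var_diff = var1 / n1 + var2 / n2        # variances add under independence
    z = (d - mean_diff) / sqrt(var_diff)    # standardize the difference
    return z, 1.0 - NormalDist().cdf(z)


# RH students: mean 80, variance 1400, n = 100
# LH students: mean 80, variance 1320, n = 120
z, p = prob_diff_at_least(5, 80, 1400, 100, 80, 1320, 120)
var_diff = 1400 / 100 + 1320 / 120

print(f"variance of the difference = {var_diff}")
print(f"z = {z:.2f}, P(difference >= 5) is approximately {p:.4f}")
```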

7.7 Sampling Distribution of Sample Proportion
Let X be a binomially distributed random variable (the number of successes in n trials), with success probability π.

• Recall that the sample proportion

$$\hat{p} = \frac{X}{n}$$

is the fraction of successes in n trials.
• In Chapter 6 we used the normal approximation to the binomial as the number of trials got large ($n\pi(1 - \pi) \ge 9$).
• This is another application of the Central Limit Theorem, with

$$E[X] = \mu = n\pi \quad \text{and} \quad Var(X) = \sigma^2 = n\pi(1 - \pi).$$

• So we might ask what $E[\hat{p}]$ and $V[\hat{p}]$ are:

$$E[\hat{p}] = E\left[\frac{X}{n}\right] = \frac{E[X]}{n} = \frac{n\pi}{n} = \pi$$

• Recall that trials are independent, so that

$$V[\hat{p}] = V\left[\frac{X}{n}\right] = \frac{V[X]}{n^2} = \frac{n\pi(1 - \pi)}{n^2} = \frac{\pi(1 - \pi)}{n}$$

• So we apply the generic formula

$$Z = \frac{\text{Random Variable} - \text{Mean of Random Variable}}{\text{Standard Deviation of Random Variable}}$$

• What does that mean for $\hat{p}$?
• Random Variable is $\hat{p}$
• Mean of Variable is $E[\hat{p}] = \pi$
• Standard Deviation of Variable is $\sigma_{\hat{p}} = \sqrt{\frac{\pi(1 - \pi)}{n}}$
• Put it all together:

$$Z = \frac{\hat{p} - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}, \quad \text{for large } n$$
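The standardization of a sample proportion can be sketched as follows. The numbers are hypothetical (they do not come from the notes), and `proportion_z` is an illustrative name; `pi` here stands for the population proportion, not the constant 3.14159.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(p_hat, pi, n):
    """Z-value of a sample proportion p_hat when the population proportion is pi."""
    return (p_hat - pi) / sqrt(pi * (1 - pi) / n)


# Hypothetical case: population proportion 0.5, n = 100 trials, observed p_hat = 0.55
pi, n, p_hat = 0.5, 100, 0.55

# Rule of thumb from the notes for using the normal approximation.
large_enough = n * pi * (1 - pi) >= 9

z = proportion_z(p_hat, pi, n)
p_upper = 1.0 - NormalDist().cdf(z)

print("normal approximation reasonable:", large_enough)
print(f"z = {z:.2f}, P(p-hat >= {p_hat}) is approximately {p_upper:.4f}")
```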
7.8 Sampling Distribution of Sample Proportion: p̂1 − p̂2
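• Consider two independent populations.
1. Population 1:
$X_1$ = number of successes, $n_1$ = number in sample 1, so $\hat{p}_1 = \frac{X_1}{n_1}$.
Recall

$$E[\hat{p}_1] = \pi_1 \quad \text{and} \quad V[\hat{p}_1] = \frac{\pi_1(1 - \pi_1)}{n_1}.$$

2. Population 2:
$X_2$ = number of successes, $n_2$ = number in sample 2, so $\hat{p}_2 = \frac{X_2}{n_2}$, with

$$E[\hat{p}_2] = \pi_2 \quad \text{and} \quad V[\hat{p}_2] = \frac{\pi_2(1 - \pi_2)}{n_2}.$$

3. Form the difference of sample proportions: $\hat{p}_1 - \hat{p}_2$
• If $n_1$ and $n_2$ are large, i.e. $n_1 \pi_1 (1 - \pi_1) \ge 9$ and $n_2 \pi_2 (1 - \pi_2) \ge 9$, then:

$$\hat{p}_1 - \hat{p}_2 \sim N\left(\pi_1 - \pi_2, \; \frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}\right) \quad \text{approximately.}$$

• This is another application of the central limit theorem.
Questions: NCT 7.21, 7.22, 7.27 & 7.36.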
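The approximate distribution of the difference of two sample proportions can be sketched in the same way. The helper name and the numbers are hypothetical; `prop1` and `prop2` stand for the two population proportions.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_distribution(prop1, n1, prop2, n2):
    """Approximate normal distribution of p1-hat minus p2-hat for independent samples."""
    # Rule of thumb from the notes for the normal approximation.
    if n1 * prop1 * (1 - prop1) < 9 or n2 * prop2 * (1 - prop2) < 9:
        raise ValueError("sample sizes too small for the normal approximation")
    mean = prop1 - prop2
    variance = prop1 * (1 - prop1) / n1 + prop2 * (1 - prop2) / n2
    return NormalDist(mean, sqrt(variance))


# Hypothetical case: both population proportions are 0.5, both samples have n = 100.
dist = diff_proportions_distribution(0.5, 100, 0.5, 100)

# Probability that the first sample proportion exceeds the second by at least 0.1.
p_upper = 1.0 - dist.cdf(0.1)

print("mean:", dist.mean, "variance:", dist.variance)
print("P(p1-hat - p2-hat >= 0.1) is approximately", round(p_upper, 4))
```

Raising an error when the rule of thumb fails is a design choice for this sketch; a gentler alternative would be to return the distribution together with a warning.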
Omit Sampling Distribution of the Sample Variance
