
Econ 102A

Introduction to Statistical Methods for Social Scientists

Stanford University

Course Materials for Week 8

Professor Scott M. McKeon

Winter Quarter, 2019 - 20

© Scott M. McKeon
All Rights Reserved
Week 8

Goals:

1. Learning the Central Limit Theorem of Means verbally, visually and mathematically.

2. Becoming familiar with sampling from a population and its relation to the Central Limit
Theorem of Means.

3. Learning the concept of point estimators.

4. Understanding proportion problems as special cases of the Central Limit Theorem of Means.

5. Learning how to compute confidence intervals.

6. Understanding ‘margin of error’ and its relation to confidence intervals.


Handout #56
Econ 102A Statistical Methods for Social Scientists Page 1 of 1

Week 8 Worksheet

1. In a certain bookstore, the distribution of the number of copies sold of Richard Bach's
book, Jonathan Livingston Seagull (i.e., the distribution of X), on any particular day is
given by the following table:

Number of Books Sold Daily (X)    P(X)
              0                    .78
              1                    .16
              2                    .06

Determine how many copies of Jonathan Livingston Seagull the bookstore needs to have
on hand at the beginning of the month in order to have a 96% chance of not running out of
the book by the end of the month. (Assume the month has 30 days.)

2. Consider the Sample Data for Eastville Houses.

(a) Determine the 95% confidence interval for the population mean house selling price.

(b) How big a sample size would be necessary to have 95% confidence that the sample
mean price is within 5,000 of the population mean house selling price?

3. Consider a poll being conducted by CNN, where they are asking the general population
their opinion on a Yes/No question. The point of the poll is to infer what percentage of the
population is in favor of the issue in question. CNN wants to report a proportion that is off
by no more than 3% from the truth. Suppose the survey has begun and a few hundred
people have responded. Currently, 42% of the survey respondents have said ‘Yes’ to the
question. Determine how many total respondents need to participate in the poll in order
that the poll’s margin of error be 3%.
Handout #57
Econ 102A Statistical Methods for Social Scientists Page 1 of 3

Normal Distributions (Averages)

• Experiment #1: Reconsider the situation described in Experiment #1, Handout #49.
Suppose we again survey two people as to how often they dine out. But,
instead of summing their responses, we choose to average their responses
(i.e., sum up two observations and divide by two). Under this scenario we
have the following:

Average    Standardized Average (Z)    Probability
 0.0              -2.234                 .0049
 0.5              -1.915                 .0266
 1.0              -1.596                 .0557
 1.5              -1.276                 .0658
 2.0              -0.957                 .0762
 2.5              -0.638                 .0986
 3.0              -0.319                 .1081
 3.5               0                     .1264
 4.0               0.319                 .1256
 4.5               0.638                 .0866
 5.0               0.957                 .0759
 5.5               1.276                 .0622
 6.0               1.596                 .0423
 6.5               1.915                 .0330
 7.0               2.234                 .0121

In graphical form, the probability distribution of Z is:

[Bar graph: P(Z) versus the standardized averages Z = -2.234 through 2.234, with bar heights given by the probabilities in the table above, peaking at .1264 near Z = 0.]

• Experiment #2: Reconsider the situation described in Experiment #2, Handout #49.
Suppose we again survey ten people as to whether or not they have been to
the supermarket today. But instead of summing their responses, we
choose to average their responses (i.e., sum up ten observations and divide
by ten). Under this scenario we have the following:

Average    Standardized Average (Z)    Probability
 0.0              -2.476                 .0084
 0.1              -1.824                 .0514
 0.2              -1.173                 .1419
 0.3              -0.521                 .2319
 0.4               0.130                 .2487
 0.5               0.782                 .1829
 0.6               1.433                 .0934
 0.7               2.085                 .0327
 0.8               2.736                 .0075
 0.9               3.388                 .0010
 1.0               4.039                 .0001

In graphical form, the probability distribution of Z is:

[Bar graph: P(Z) versus the standardized averages Z = -2.476 through 4.039, with bar heights given by the probabilities in the table above, peaking at .2487 near Z = 0.130.]
Handout #58
Econ 102A Statistical Methods for Social Scientists Page 1 of 1

Central Limit Theorem of Means

• Experiment #1: Start with the one-trial distribution P(X) over X; average two trials
  (done many times), then standardize the averages to obtain Z.

• Experiment #2: Start with the one-trial distribution P(X) over X; average ten trials
  (done many times), then standardize the averages to obtain Z.

Central Limit Theorem of Means: Suppose we accumulate a group of i.i.d. random
variables and take their average. Assume we do this a large number of times. Then, the
probability distribution of these averages (i.e., means) will be a normal distribution no
matter what the initial distribution looks like. Further, if the means are standardized, the
resulting distribution will be a standard normal distribution (i.e., a normal distribution
having a mean of zero and standard deviation of one).
Handout #59
Econ 102A Statistical Methods for Social Scientists Page 1 of 3

Creating Normal Distributions through Averaging

Start with any distribution of X, having expected value μx and standard deviation σx.

  average n trials:   the average values (S̄) form the distribution of averages, with
                      expected value μx and standard deviation σx/√n   (see Note #1)

  standardize:        the standardized values (Z) form the distribution of standardized
                      averages, with expected value 0 and standard deviation 1   (see Note #2)

• Note #1: Calculating the expected value and standard deviation of the distribution of
  averages:

  The average of n trials is:  S̄n = (x1 + x2 + ⋯ + xn) / n

  The expected value of the distribution of averages is therefore:

  E(S̄n) = E( (x1 + x2 + ⋯ + xn) / n )
        = (1/n) E(x1 + x2 + ⋯ + xn)            (since E(cX) = c E(X))
        = (1/n) [E(x1) + E(x2) + ⋯ + E(xn)]    (by the properties of expected value)
        = (1/n) (μx + μx + ⋯ + μx)             (since each trial is identically distributed)
        = (1/n)(n μx) = μx
n

The variance of the distribution of averages is therefore:

Var(S̄n) = Var( (x1 + x2 + ⋯ + xn) / n )
        = (1/n²) Var(x1 + x2 + ⋯ + xn)              (since Var(cX) = c² Var(X))
        = (1/n²) [Var(x1) + Var(x2) + ⋯ + Var(xn)]  (since each trial is independent)
        = (1/n²) (σx² + σx² + ⋯ + σx²)              (since each trial is identically distributed)
        = (1/n²)(n σx²) = σx²/n

So, the standard deviation of the distribution of averages is:

√(σx²/n) = σx/√n

• Note #2: Standardizing the Distribution of Averages

As in the case of standardizing 'summed' distributions, the standardization of averages
takes the normal distribution formed through the averaging process and transforms it into a
new normal distribution having an expected value of zero and a variance of one (and,
therefore, a standard deviation of one).

Mathematically, moving from sums to averages simply implies dividing by n. So, the
standardizing process is exactly as before, but with each term being divided by n. That is,
in the case of averages, we convert to z-values through the relationship:

z = (S̄n - μx) / (σx/√n)

Then, when plotting the distribution of z-values versus probability, we obtain a
standardized normal distribution (i.e., a normal distribution having an expected value of
zero and a standard deviation of one).

Proof that the expected value is zero:

E(z) = E( (S̄n - μx) / (σx/√n) )
     = (1/(σx/√n)) E(S̄n - μx)         (since E(cX) = c E(X))
     = (1/(σx/√n)) [E(S̄n) - E(μx)]    (by the properties of expected value)
     = (1/(σx/√n)) (μx - μx) = 0       (by substitution)

Proof that the variance (and, therefore, the standard deviation) is one:

Var(z) = Var( (S̄n - μx) / (σx/√n) )
       = (1/(σx²/n)) Var(S̄n - μx)     (since Var(cX) = c² Var(X))
       = (1/(σx²/n)) Var(S̄n)          (since Var(X + b) = Var(X))
       = (1/(σx²/n)) (σx²/n) = 1       (by substitution)

Handout #60
Econ 102A Statistical Methods for Social Scientists Page 1 of 1

Standardizing the Average (or Mean) of Random Variables

When standardizing the average (or mean) of random variables (i.e., when converting averages
of random variables into z-values) we use the relation:

z = (S̄n - μx) / (σx/√n)

Here:  S̄n = Sn / n = the various potential averages

       n = how many trials you are averaging

       μx = the expected value of just one trial

       σx = the standard deviation of just one trial

       σx/√n = the standard deviation of the distribution of averages

When we plot the z-values on the horizontal axis and the associated probabilities on the
vertical axis, the resulting distribution will be normally distributed with expected value zero
and variance one (because we have standardized the distribution). That is, when converting to
these (standardized) z-values, we are in a position to use the z-table to make probabilistic
assessments.
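In Python, `statistics.NormalDist` can stand in for the z-table. The sketch below uses made-up numbers for μx, σx and n (they are not from the handouts) to compute a probability about an average exactly as described above:

```python
from statistics import NormalDist

def p_average_at_most(target, mu, sigma, n):
    """P(average of n trials <= target), via the Central Limit Theorem of Means.

    mu and sigma describe just one trial; sigma / sqrt(n) is the standard
    deviation of the distribution of averages.
    """
    z = (target - mu) / (sigma / n ** 0.5)   # standardize the average
    return NormalDist().cdf(z)               # plays the role of the z-table

# Illustrative numbers: one trial has mean 4 and standard deviation 2,
# and we average n = 25 trials.
# Here z = (4.5 - 4) / (2 / 5) = 1.25, and the z-table gives .8944.
print(round(p_average_at_most(4.5, mu=4, sigma=2, n=25), 4))  # 0.8944
```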
Handout #61
Econ 102A Statistical Methods for Social Scientists Page 1 of 4

Weeks 6 and 7 Worksheet (Question 3) Revisited – Two Different Methods

Any Central Limit Theorem question can be done in either of two ways: the 'sums' way or the
'means' way. Regardless of the method chosen, one will arrive at the same conclusion. So, a
person really needs to learn just one of these methods, since every Central Limit Theorem
application can be phrased from either a 'sums' perspective or a 'means' perspective.

As confirmation that one gets the same conclusion regardless of the method chosen, let us
return to the Week 6 and 7 Worksheet, Question 3 exercise. Originally, we did things the
‘sums’ way (which is recopied below) but this handout also does the exercise the ‘means’ way
(starting on Page 3) so you can directly contrast the methods as well as confirm that both
methods render the same solution.

The ‘Sums’ approach:

• The 'one person' or 'one trial' distribution is:

  Rating (X):   1     2     3     4
  P(X):        .06   .18   .34   .42

• We send the survey out to 1000 people and elicit their responses. Once we get the responses
back, we will then add them all up. The Central Limit Theorem of Sums tells us that the
sum of these 1000 responses will be normally distributed. By logic, we can further deduce
that this normal distribution will be situated between 1000 and 4000. So, the sum of the
1000 responses visually looks like:

[Normal curve for the sum S, situated between 1000 and 4000]

• Ultimately, this exercise asks for P(average rating ≥ 3.15). But, because we are doing things
the ‘sums’ way, we need to convert the question into a group sum instead of a group
average. Because there are 1000 people in the group, an average rating of 3.15 corresponds
to a group sum of (3.15) * (1000) = 3150. So, in the world of sums, this exercise asks for
P(sum of all respondents ≥ 3150).

• In actuality, to find the answer to this exercise, we would need to add up the bars at 3150,
3151, 3152, 3153, … all the way up to the bar at 4000. That is, we would need to add up
851 bars in a bar graph. But here is where the beauty of the normal distribution table saves
the day! Instead of adding up 851 bars in a graph we can approximate the probability by just
taking the area to the right of 3150. Visually, we need to find the following shaded area:

[Normal curve with the area to the right of 3150 shaded]

• To find this probability, we only need to standardize 3150 and consult the z-table. In the
world of sums the standardization is found through:

z = (Sn - nμx) / (√n σx)

where μx and σx are derived from the 'one person' distribution. In this instance,
standardizing Sn = 3150 leads to:

z = (3150 - 1000(3.12)) / (√1000 (.9086)) = 1.04

which, upon consulting the z-table, translates to a probability of 1 – .8508 = .1492.

Alternatively, we can use Excel to determine the probability either as 1 – NORMDIST(3150,
1000 * 3.12, SQRT(1000) * .9086, TRUE) = 1 – NORMDIST(3150, 3120, 28.7325, TRUE), or as
1 – NORMSDIST(1.04).

The ‘Means’ approach:

• As before, the 'one person' or 'one trial' distribution is:

  Rating (X):   1     2     3     4
  P(X):        .06   .18   .34   .42

• We send the survey out to 1000 people and elicit their responses. Once we get the responses
back, we will then take the overall average. The Central Limit Theorem of Means tells us
that the average of these 1000 responses will be normally distributed. By logic, we can
further deduce that this normal distribution will be situated between 1 and 4. So, the
average of the 1000 responses visually looks like:

[Normal curve for the average S̄, situated between 1 and 4]

Remember that the graph constructed above is just an approximation. The real distribution
would be a bar graph having 3001 separate bars (which need to be deduced from a
1000-stage probability tree). In the world of sums this 3001-barred graph ranges from 1000
to 4000; in the world of means this 3001-barred graph ranges from 1 to 4. So, although
there are the same number of bars in the graph, these bars are much more condensed in the
‘averages’ graph as opposed to the ‘sums’ graph.

• Ultimately, this exercise asks for P(average rating ≥ 3.15). When we solve the exercise via
the ‘sums’ method, we need to convert the question into a commentary about group sums.
But, there is no need for such a conversion when employing the ‘means’ method because the
question is already posed as a group average. So, the natural wording of the question is
already conducive to the Central Limit Theorem of Means.

• In actuality, to find the answer to this exercise, we would need to add up the bars at 3.150,
3.151, 3.152, 3.153, … all the way up to the bar at 4.000. That is, we would need to add up
851 bars in a bar graph (as before). But instead of adding up 851 bars in a graph we can
approximate the probability by just taking the area to the right of 3.15. Visually, we need to
find the following shaded area:

[Normal curve with the area to the right of 3.15 shaded]

• To find this probability, we only need to standardize 3.15 and consult the z-table. In the
world of means the standardization is found through:

z = (S̄n - μx) / (σx/√n)

where μ x and σ x are derived from the ‘one person’ distribution. Notice that this
standardization formula differs from the one used for group sums.

In this instance, standardizing S̄n = 3.15 leads to:

z = (3.15 - 3.12) / (.9086/√1000) = 1.04

which, upon consulting the z-table, translates to a probability of 1 – .8508 = .1492.

Alternatively, we can use Excel to determine the probability either as 1 – NORMDIST(3.15,
3.12, .9086 / SQRT(1000), TRUE) = 1 – NORMDIST(3.15, 3.12, .02873, TRUE), or as
1 – NORMSDIST(1.04).

So, we see that the same answer is obtained whether we approach the exercise from a ‘sums’
perspective or from a ‘means’ perspective.
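For confirmation outside of Excel, the two standardizations can be carried out side by side in Python, using the handout's figures μx = 3.12, σx = .9086 and n = 1000. Both produce the same z and hence the same probability:

```python
from statistics import NormalDist

mu, sigma, n = 3.12, 0.9086, 1000   # one-person mean, std dev, group size

# 'Sums' way: standardize the group sum 3150.
z_sums = (3150 - n * mu) / (n ** 0.5 * sigma)

# 'Means' way: standardize the group average 3.15.
z_means = (3.15 - mu) / (sigma / n ** 0.5)

# The two z-values agree, so both methods give the same probability.
p = 1 - NormalDist().cdf(z_means)
print(round(z_sums, 2), round(z_means, 2))  # 1.04 1.04
print(round(p, 4))                          # close to the z-table answer of .1492
```

The tiny difference from .1492 comes only from rounding z to two decimals before consulting the table.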
Handout #62
Econ 102A Statistical Methods for Social Scientists Page 1 of 1

A Visual Summary of the Central Limit Theorems

Start with any distribution P(X) over X, having expected value μx and standard deviation σx.

• Sum up n trials (do this multiple times): the sums Sn form a normal distribution P(Sn)
  with expected value nμx and standard deviation √n σx. Standardizing via

      z = (Sn - nμx) / (√n σx)

  yields a standard normal distribution Z with expected value 0.

• Take the average of n trials (do this multiple times): the averages S̄n form a normal
  distribution P(S̄n) with expected value μx and standard deviation σx/√n. Standardizing via

      z = (S̄n - μx) / (σx/√n)

  yields the same standard normal distribution Z.
Handout #63
Econ 102A Statistical Methods for Social Scientists Page 1 of 6

Chicago White Sox Attendance

During a recent baseball season, the Chicago White Sox had 81 home dates. The attendance
associated with these dates is given by:¹
Date Attendance Date Attendance Date Attendance Date Attendance
1 40,395 22 11,886 42 29,945 62 17,879
2 16,972 23 19,628 43 30,694 63 18,305
3 14,514 24 12,857 44 19,555 64 36,311
4 20,591 25 13,076 45 17,060 65 33,721
5 9,882 26 15,069 46 18,868 66 29,364
6 10,716 27 27,535 47 32,245 67 23,943
7 13,015 28 21,398 48 20,631 68 20,082
8 18,907 29 23,837 49 31,776 69 27,196
9 14,975 30 19,114 50 20,667 70 24,796
10 15,424 31 22,827 51 22,617 71 19,999
11 14,285 32 14,162 52 29,633 72 32,807
12 25,381 33 27,287 53 28,027 73 27,623
13 22,714 34 26,491 54 43,922 74 22,188
14 13,936 35 30,779 55 38,973 75 20,541
15 11,084 36 19,887 56 25,348 76 32,812
16 10,639 37 18,708 57 24,118 77 37,367
17 13,355 38 17,225 58 36,151 78 31,539
18 25,873 39 45,147 59 29,442 79 39,627
19 15,413 40 45,440 60 32,381 80 31,305
20 14,397 41 44,858 61 22,396 81 26,019
21 12,078

From the above data, the population distribution of attendance is given by:

[Bar graph: probability versus attendance in 3,000-seat bins from 9–12 up to 45–48 (000s); the thirteen bar probabilities are .173, .136, .111, .111, .099, .086, .086, .062, .049, .025, .025, .025 and .012, summing to 1.]

For the above population distribution: Mean = μ = 23,946 and Standard Deviation = σ = 8,991

¹ The Sports Network (http://www.sportsnetwork.com/home.asp)

After simulating a sample of size n = 7, the following distribution resulted:

[Bar graph: probability versus attendance (000s) for the simulated sample of size n = 7; one bin has probability .286 and five bins have probability .143 each.]

For the above distribution:  Sample Mean = x̄ = 25,082
                             Sample Standard Deviation = s = 7,707

After simulating a sample of size n = 15, the following distribution resulted:

[Bar graph: probability versus attendance (000s) for the simulated sample of size n = 15; the bin probabilities are .267, .200 and .133, with six further bins at .067 each.]

For the above distribution:  Sample Mean = x̄ = 24,661
                             Sample Standard Deviation = s = 9,454

After simulating a sample of size n = 25, the following distribution resulted:

[Bar graph: probability versus attendance (000s) for the simulated sample of size n = 25; the bin probabilities include .200, .160 and .120, with several bins at .080 and several at .040.]

For the above distribution:  Sample Mean = x̄ = 23,419
                             Sample Standard Deviation = s = 9,197

After simulating a sample of size n = 55, the following distribution resulted:

[Bar graph: probability versus attendance (000s) for the simulated sample of size n = 55; the bin probabilities are .236, .164, .145, .091 and .073, with three bins at .055, two at .036 and three at .018.]

For the above distribution:  Sample Mean = x̄ = 24,102
                             Sample Standard Deviation = s = 8,923

Consider taking random samples of size n = 20 from the population and calculating the
corresponding sample mean. After simulating 60 such random samples (all of size n = 20), the
following sample means resulted:

Sample     Sample        Sample     Sample
Number     Mean          Number     Mean

1 23,674 31 20,584
2 22,311 32 25,713
3 25,365 33 26,378
4 24,793 34 19,204
5 26,719 35 20,102
6 24,500 36 24,548
7 21,338 37 23,387
8 24,516 38 23,682
9 27,429 39 25,464
10 23,269 40 21,544
11 19,857 41 29,256
12 23,543 42 20,787
13 24,762 43 26,250
14 21,445 44 23,248
15 27,525 45 25,959
16 23,619 46 22,357
17 25,074 47 22,948
18 22,755 48 23,093
19 24,674 49 23,994
20 28,611 50 24,280
21 23,841 51 24,346
22 21,958 52 23,961
23 20,982 53 23,191
24 25,553 54 22,894
25 22,626 55 23,428
26 24,005 56 23,511
27 25,471 57 22,518
28 22,606 58 23,560
29 26,088 59 26,182
30 23,861 60 26,889
Handout #63
Page 5 of 6

Aggregating the above data into categories leads to:

Sample Mean (000s)    Frequency    Probability

  19.00 – 19.75           1            .017
  19.75 – 20.50           2            .033
  20.50 – 21.25           3            .050
  21.25 – 22.00           4            .067
  22.00 – 22.75           5            .083
  22.75 – 23.50           9            .150
  23.50 – 24.25          11            .183
  24.25 – 25.00           8            .133
  25.00 – 25.75           6            .100
  25.75 – 26.50           5            .083
  26.50 – 27.25           2            .033
  27.25 – 28.00           2            .033
  28.00 – 28.75           1            .017
  28.75 – 29.50           1            .017

which establishes the distribution of sample means as:

[Bar graph: probability versus sample average attendance (000s), bins 19.00–19.75 through 28.75–29.50, with bar heights given by the probabilities in the table above, peaking at .183 for the 23.50–24.25 bin.]



If all possible samples of size n = 20 are taken (as opposed to just the above 60) and we
graph the corresponding sample averages as above, we would then see a normal-shaped
distribution having E(x̄) = 23,946 and σx̄ = 8,991/√20 = 2,010.
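This claim can be sanity-checked with a quick simulation. The sketch below draws from a normal population with the stated μ = 23,946 and σ = 8,991 purely for convenience, rather than resampling the 81 actual home dates; by the Central Limit Theorem, the shape of the sample-mean distribution is insensitive to that choice:

```python
import random
import statistics

mu, sigma, n = 23946, 8991, 20   # population figures and sample size from above

random.seed(0)
# Take many samples of size n = 20 and record each sample mean.
means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(10_000)]

print(round(statistics.mean(means)))   # near E(x̄) = 23,946
print(round(statistics.stdev(means)))  # near 8,991 / sqrt(20) ≈ 2,010
```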
Handout #64
Econ 102A Statistical Methods for Social Scientists Page 1 of 2

Computing Point Estimators
(i.e., computing sample means and sample standard deviations)

When presented with a list of raw data (i.e., a sample from a population), a common first step
in performing some statistical analysis on the data is to compute the sample mean and the
sample standard deviation. This handout presents the formulas for these computations.

First, though, the conventional notation for a sample mean is x and the conventional notation
for a sample standard deviation is s . Taken together, the sample mean and sample standard
deviation are called ‘point estimators.’ The reasoning here is as follows: samples are meant to
mimic the underlying population. So, the sample mean is the single point which estimates the
population mean and, likewise, the sample standard deviation is the single point which
estimates the population standard deviation. In notational form, x is the single number (i.e.,
the point) which estimates μ and s is the single number (i.e., the point) which estimates σ .

In terms of the actual calculations, suppose you have a list of data which is represented as x1,
x2, x3, …, xn. The sample mean is simply the average of all the data points. That is:

x̄ = (x1 + x2 + x3 + ⋯ + xn) / n

The sample standard deviation is less intuitive. The idea is to first compute the sample
variance and then take the square root. Proceeding as we did in the first half of the class, a
person would think the calculation might go as follows:
s² = (1/n)(x1 - x̄)² + (1/n)(x2 - x̄)² + (1/n)(x3 - x̄)² + ⋯ + (1/n)(xn - x̄)²

This is the formula we used for variance in the first half of the class. Notice though that the
probabilities are each 1/n. This makes sense because each single data point constitutes 1/n of
the entire data set.

The above formula can then be simplified as:

s² = (1/n) [(x1 - x̄)² + (x2 - x̄)² + (x3 - x̄)² + ⋯ + (xn - x̄)²]

and a person might think the sample standard deviation would therefore be:

s = √( (1/n) [(x1 - x̄)² + (x2 - x̄)² + (x3 - x̄)² + ⋯ + (xn - x̄)²] )

In actuality this formula is not quite correct. The sample standard deviation is actually
computed as:

s = √( (1/(n - 1)) [(x1 - x̄)² + (x2 - x̄)² + (x3 - x̄)² + ⋯ + (xn - x̄)²] )

or, in slightly more simplified form:

s = √( [(x1 - x̄)² + (x2 - x̄)² + (x3 - x̄)² + ⋯ + (xn - x̄)²] / (n - 1) )

Notice the only difference between this formula and the one on the previous page is that we
have ‘n – 1’ in the denominator instead of ‘n.’ The reason for this is actually quite involved
mathematically. It falls under the heading of something called ‘biased and unbiased
estimators’ and the details are not discussed here. Instead, this issue will be elaborated upon in
Econ 102B.

The truth is, whether we use ‘n – 1’ or ‘n’ in the denominator, the ultimate calculation would
not be significantly different (because the data sets are typically quite large). But, the point is,
whenever your sample is a subset of the underlying population, you should be using ‘n – 1’ in
the denominator instead of ‘n.’

Overall, this handout attempts to make the following points:

• The sample mean (x̄) and sample standard deviation (s) are 'point estimators' for the
  underlying population. Specifically, x̄ estimates μ and s estimates σ.

• The sample mean (x̄) is calculated as:

  x̄ = (x1 + x2 + x3 + ⋯ + xn) / n

• The sample standard deviation (s) is calculated as:

  s = √( [(x1 - x̄)² + (x2 - x̄)² + (x3 - x̄)² + ⋯ + (xn - x̄)²] / (n - 1) )
Handout #65
Econ 102A Statistical Methods for Social Scientists Page 1 of 1

Contrasting the Different Standard Deviation Formulas

In the beginning of class data was often presented to us in the form of a distribution. When
sampling, data is typically in ‘raw’ form as simply a list of numerical survey responses. This
handout contrasts the different formulas used to compute standard deviation in each situation.

Overall:

• If the variable is presented as a distribution P(X), the standard deviation is calculated as:

      std dev = √( E[(x - μx)²] )

• If the variable is presented as raw data (n data points x1, x2, x3, …, xn):

  … if the n data points constitute the entire population, then:

      std dev = σ = √( (1/n) [(x1 - x̄)² + (x2 - x̄)² + ⋯ + (xn - x̄)²] )

      (in Excel this is the function STDEVP( ))

  … if the n data points constitute only a sample of the entire population, then:

      std dev = s = √( (1/(n - 1)) [(x1 - x̄)² + (x2 - x̄)² + ⋯ + (xn - x̄)²] )

      (in Excel this is the function STDEV( ))
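Python's standard library mirrors the two Excel functions: `statistics.pstdev` divides by n (like STDEVP) and `statistics.stdev` divides by n - 1 (like STDEV). A sketch with made-up data:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative raw data (mean is 5)

# Treating the list as the ENTIRE population -> divide by n (Excel STDEVP)
print(statistics.pstdev(data))          # 2.0

# Treating the list as a SAMPLE of a larger population -> divide by n - 1
# (Excel STDEV); this is always slightly larger than pstdev.
print(round(statistics.stdev(data), 4))
```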


Handout #66
Econ 102A Statistical Methods for Social Scientists Page 1 of 3

Distributions of Populations versus Distributions of Sample Means

Consider taking a sample from the population. Before we sample, we probably have no idea as
to what the distribution of the population looks like. It could have any shape.

Regardless of what the population distribution looks like, though, we can imagine that there
exists some expected value and standard deviation for this distribution. We assign the symbols
 and  to represent the population expected value and the population standard deviation
respectively.

Visually the situation looks like an arbitrary (unknown-shaped) distribution with:

Population expected value = μ
Population standard deviation = σ

Now, instead of sampling from this population one by one, suppose we choose to sample in
groups of size n. Upon amassing a particular group of size n, we then compute the sample
average of these n observations and record that number on a piece of paper. For example, if we
are interested in the average annual income of a Palo Alto resident, instead of questioning
various Palo Alto residents one by one as to their income and recording each number
individually, we instead write down only the single, overall average income among these n
residents.

Then, instead of just taking the average of one group of size n, suppose you repeat the process
many, many times. That is, keep taking groups of size n and then compute the average within
the group. Ultimately, we will have many, many group averages written down on our piece of
paper. Now summarize these numbers by graphing the group averages on the x-axis and their
probabilities on the y-axis, thereby forming the distribution of group averages.
The key insight here is that, because we are talking about a distribution of group averages, this
distribution will eventually mimic a normal distribution! This is a consequence of the
Central Limit Theorem of Means.

Aside from realizing that the distribution of the averages (i.e., the distribution of the sample
means) is normally distributed, consider how the expected value and standard deviation of this
distribution relate to the expected value and standard deviation of the population.

Upon taking a sample of n trials, the sample mean is given by:

S̄n = (x1 + x2 + ⋯ + xn) / n

so, the expected value of the distribution of sample means is:

E(S̄n) = E( (x1 + x2 + ⋯ + xn) / n )
      = (1/n) [E(x1) + E(x2) + ⋯ + E(xn)]

and, since the expectation of any individual trial is μx, we have:

      = (1/n)(n μx) = μx.

The variance of the distribution of sample means is:

Var(S̄n) = Var( (x1 + x2 + ⋯ + xn) / n )
        = (1/n²) [Var(x1) + Var(x2) + ⋯ + Var(xn)]

and, since the variance of any individual trial is σx², we have:

        = (1/n²)(n σx²) = σx²/n

So, the standard deviation of the distribution of sample means is:

σS̄ = σx/√n
Handout #66
Page 3 of 3

Visually the situation looks like a normal distribution with:

Sample mean expected value = μx
Sample mean standard deviation = σx/√n

So, in summary, we have:

• The population can be distributed in any way. The expected value and standard deviation of
  the population are μx and σx respectively.

• The sample means are always normally distributed. The expected value and standard
  deviation of the sample means are μx and σx/√n respectively.
n

Note: The above discussion implicitly assumes that the entire population is being graphed.
      That is, each observation within the population is included in exactly one of the
      sample groups. Since we often only have a sample of observations from the entire
      population, we then only have x̄ and s at our disposal, and these values serve as our
      best estimates of μx and σx respectively.
Handout #67
Econ 102A Statistical Methods for Social Scientists Page 1 of 2

Contrasting Population Means and Sample Means

When one records data from the real world an important distinction is whether the data
recorded represents the entire population of data or merely a sample from the population. As
mentioned in lecture, there is a different set of notation with regard to population data versus
sample data. Specifically, if the data we have is for the entire population the expected value is
represented as μx and the standard deviation is represented as σx. By contrast, if the data we
have is merely a sample from the population, the mean is represented as x̄ and the standard
deviation is represented as s. Notice that x̄ and s are variables as they (most likely) change
from sample to sample. But, μx and σx are fixed constants for a fixed population. Much of
inferential statistics is about analyzing how closely the variable values x̄ and s mimic the
'true' (fixed) values μx and σx respectively.

In exercises involving the Central Limit Theorem (of Means), the formula we have been using
for standardization is:

z = (S̄n - μx) / (σx/√n)

We have previously used S̄n to represent the average among a group of n observations. In the
realm of sampling, the sample mean x̄ is indeed an average of n observations, so we can use x̄
and S̄n interchangeably.

Recall that μx is typically unknown, since it is usually implausible for us to glean the entire
population of data for some statistic of interest. However, x̄ is known, since it is a direct
byproduct of the sample we collected. Overall, there are two main questions that get asked
when contrasting an observable sample mean x̄ versus an unobservable population mean μx:

(1) Based on the observed sample mean x̄, what would then be plausible values for the
unknown population mean μx?

(2) Suppose we think the unknown population mean μx is equal to μ0. Does the sample mean
x̄ seem consistent with this belief? (If the sample mean is close to the value of μ0 then the
belief would seem consistent; if the sample mean is quite different from the value of μ0
then the belief would seem inconsistent.)

Question (1) is addressed by the concept of 'confidence intervals' and Question (2) is
addressed by the concept of 'hypothesis testing.' These concepts will be discussed in detail
during lecture. The point here is to understand that both of these concepts are about comparing
x̄ and μx. But, how does one best compare x̄ and μx? One way would simply be to calculate
their difference; that is, compute x̄ – μx (assuming we have a particular guess for μx). But

suppose the difference is 2 or perhaps 10 or perhaps even 134. Are these large differences or
small differences? Frankly, it is hard to say what constitutes a large or a small difference
without knowing how much the sample means tend to vary from sample to sample. So, instead
of calculating the aggregate difference x̄ – μx, it is preferable to calculate the difference in
terms of the number of standard deviations they are apart. In other words, to determine how far
away x̄ is from μx, we will standardize x̄. As discussed in lecture, this then tells us how many
standard deviations x̄ is away from μx. Specifically, we compute:

z = (x̄ - μx) / (σx/√n)

It should be noted that σx is typically an unknown quantity, but we can use the sample standard
deviation, s, as our best estimate of σx when actually working out the calculation. Since this
standardized value follows a normal distribution, we can deduce how close x̄ and μx are by
noting the magnitude of z. Loosely speaking, the sample mean and population mean would be
considered close together for values such as |z| ≤ 1 but far apart for values such as |z| ≥ 2.

Overall, an important part of statistics is to infer the unobservable value μx from the observable
value x̄. We have mentioned that there are two concepts within statistics which address this
issue: confidence intervals and hypothesis tests, as both concepts are about comparing the
difference between a sample mean and a population mean. Again, the details will be discussed
in lecture. But, the important take-away at this point is to realize that both concepts are
intimately connected with the Central Limit Theorem of Means. With both concepts, we deal
with the standardized values of x̄ and hence with normal distributions. Thus, the story of both
confidence intervals and hypothesis tests will ultimately be applications of the Central Limit
Theorem of Means.
Handout #68
Econ 102A Statistical Methods for Social Scientists Page 1 of 2

A Summary of Notation

In the previous weeks, we have seen a fair amount of new notation. This handout attempts to
summarize this notation.

• Notation for Covariance and Correlation:

Cov(X, Y) = the covariance between X and Y. The covariance is computed as


Cov(X, Y) = E(XY) – E(X) E(Y).

ρ(X, Y) = the correlation coefficient between X and Y. The correlation coefficient is


computed as ρ(X, Y) = Cov(X, Y) / σx σy.

• Notation for the Central Limit Theorem of Sums:

μx = the expected value of just one trial.

σx = the standard deviation of just one trial.

Sn = the variable label for the sum of n trials.

nx = the expected value of the distribution of Sn.

√n σx = the standard deviation of the distribution of Sn.

• Notation for the Central Limit Theorem of Means:

μx = both the expected value of just one trial and the expected value of the distribution of S̄n
(i.e., both these values will always be the same).

σx = the standard deviation of just one trial.

S̄n = the variable label for the average (or mean) of n trials.

σx / √n = the standard deviation of the distribution of S̄n.

p = in a proportion problem, this value will simultaneously be (1) the probability of a ‘Yes’
for one trial, (2) the expected value of one trial, and (3) the expected value of the average
of n trials.

p (1  p) = in a proportion problem, the standard deviation of just one trial.

p (1  p)
= in a proportion problem, the standard deviation of the average of n trials.
n

• Notation for Populations and Samples:

μx = the population average of the statistic being analyzed (this value is typically unknown)

σx = the population standard deviation of the statistic being analyzed (this value is typically
unknown)

x̄ = the sample mean. This is also the ‘point estimator’ of μx and is calculated as:

x1  x 2    x n
x =
n

where x1 is the first data point, x2 is the second data point, and so on.

s = the sample standard deviation. This is also the ‘point estimator’ of σx and is calculated
as:

s = √[ ((x1 – x̄)² + (x2 – x̄)² + ⋯ + (xn – x̄)²) / (n – 1) ]

where x1 is the first data point, x2 is the second data point, and so on.
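As a minimal sketch of the two point estimators above, using a small made-up data set (the numbers are hypothetical):

```python
import math

# Made-up sample for illustration only.
data = [12.0, 15.0, 9.0, 14.0, 10.0]
n = len(data)

# Sample mean: the point estimator of the population mean mu_x.
x_bar = sum(data) / n

# Sample standard deviation: the point estimator of sigma_x,
# note the n - 1 divisor, exactly as in the formula above.
s = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

print(x_bar, round(s, 2))  # 12.0 2.55
```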
Handout #69
Econ 102A Statistical Methods for Social Scientists Page 1 of 2

Eastville Real Estate 2

Suppose you are given the task of performing statistical analysis on real estate data.
Specifically, you will focus on the city of Eastville, Oregon.

To get an initial feel for the types of houses in the area, you have collected the data on the
following page. Among other things, the data contains information on the selling price of
various homes in Eastville along with square footage, number of bedrooms and bathrooms, and
the presence of a basement and/or fireplace. In the data set:

SQ_FT = total square footage of the house

BEDS = total number of bedrooms

BATHS = total number of bathrooms

HEAT = 0 if gas forced air heating; 1 if electric heating

STYLE = 0 if one-story; 1 if two-story

GARAGE = total number of cars that fit into the garage

BASEMENT = 0 if basement absent; 1 if basement present

AGE = age of house in years

FIRE = 0 if fireplace absent; 1 if fireplace present

PRICE = selling price of house

SCHOOL = 0 if Eastville school district; 1 if Apple Valley school district

For this exercise we will concentrate on just one of these variables, the house selling price.

2
Adapted from Bryant, Peter G. and Smith, Marlene A., Practical Data Analysis: Case Studies in Business
Statistics, Irwin, Inc., 1995.

Sample Data for Eastville Houses

SQ_FT BEDS BATHS HEAT STYLE GARAGE BASEMENT AGE FIRE PRICE SCHOOL
1912 4 2 1 0 2 1 19 1 439,000 0
2238 3 2 1 0 1 1 12 1 449,900 1
1816 3 2 1 1 2 1 19 0 461,500 0
2008 5 2 1 1 2 1 17 0 463,500 0
2707 3 2 1 0 2 0 13 1 464,000 0
2296 4 2 1 0 2 1 17 0 466,500 0
2320 3 2 0 0 2 1 11 1 466,500 0
2210 3 2 1 0 1 0 6 1 466,900 0
1933 4 2 1 1 2 1 16 1 466,950 1
2296 3 2 1 0 2 1 17 1 468,000 0
2765 3 2 0 0 2 1 20 0 468,500 0
2725 4 3 1 1 2 1 12 0 479,000 0
2794 4 2 1 1 2 1 18 0 480,950 0
2294 3 2 0 0 2 0 13 1 481,000 0
2372 3 2 0 0 2 1 9 0 482,692 1
2162 3 2 1 0 1 0 8 1 482,801 0
2996 4 2 1 1 2 0 13 1 485,207 1
2764 4 2 1 0 2 1 13 1 486,000 0
2416 3 2 0 0 2 0 8 0 486,000 1
2730 4 2 1 0 2 0 15 1 487,500 0
2392 3 2 0 0 2 1 8 1 489,900 1
2664 3 3 0 1 2 0 11 1 489,900 0
2332 3 2 1 0 2 1 14 0 495,000 1
2752 3 3 1 0 2 0 18 1 496,800 0
3167 3 3 1 1 2 1 18 1 499,900 0
2664 3 2 1 0 2 0 9 1 500,000 0
2973 4 3 1 0 2 0 13 1 501,000 0
2384 3 2 0 0 2 0 5 1 501,280 0
2431 3 2 1 0 2 1 7 1 502,900 1
2950 5 3 0 1 2 0 13 1 510,000 0
2452 3 2 1 1 2 1 4 1 511,000 0
2829 3 2 1 1 2 0 10 1 511,439 1
2652 4 3 0 0 2 1 7 1 513,646 1
2516 3 2 1 0 2 1 10 1 514,293 1
2998 4 3 1 0 2 1 17 1 516,100 1
2984 4 3 1 1 2 1 9 1 516,149 0
2840 3 3 1 0 2 1 9 1 518,000 0
2823 4 3 1 1 2 1 3 1 520,000 1
3150 5 3 1 1 2 1 12 1 525,900 0
3096 3 3 1 0 2 1 9 1 538,000 1
3212 4 3 1 1 3 1 17 1 554,000 1
3375 4 3 1 1 2 1 11 1 568,000 1
3809 4 3 1 1 3 1 6 1 614,000 0
Handout #70
Econ 102A Statistical Methods for Social Scientists Page 1 of 1

T-Distributions in Excel

There are built-in Excel functions which can help you determine probabilities and/or random
variable values associated with t-distributions:

• TDIST(number for lookup, degrees of freedom, number of tails in distribution)

 use this when you know the variable value and want the corresponding probability
in the tail(s) of the t-distribution. Designating the ‘number of tails in distribution’
as ‘1’ gives the probability in one tail only; designating the ‘number of tails in
distribution’ as ‘2’ gives the collective probability in both tails combined.

 this function will be used during our discussions on hypothesis testing

• TINV(collective probability in two tails combined, degrees of freedom)

 use this when you know the probability and want the corresponding variable value

 this function will be used during our discussions on both confidence intervals and
hypothesis testing
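The same lookups can be sketched in plain Python with no add-ins. This is purely illustrative: the `tinv` below is a hand-rolled stand-in for Excel's TINV, built by numerically integrating the t-density, not an official Excel or course tool.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=4000):
    """P(T <= x) for x >= 0, via the trapezoid rule on [0, x]."""
    h = x / steps
    area = 0.5 * (t_pdf(0, df) + t_pdf(x, df))
    area += sum(t_pdf(i * h, df) for i in range(1, steps))
    return 0.5 + area * h

def tinv(two_tail_prob, df):
    """Mimic TINV: the value with probability two_tail_prob split over both tails."""
    target = 1 - two_tail_prob / 2
    lo, hi = 0.0, 50.0
    for _ in range(40):                      # bisection on the CDF
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if t_cdf(mid, df) < target else (lo, mid)
    return lo

# TDIST(x, df, 2) counterpart would be 2 * (1 - t_cdf(x, df)).
print(round(tinv(0.05, 42), 3))  # ≈ 2.018
```

As a check, `tinv(0.05, 42)` returns roughly 2.018, the same value TINV(.05, 42) produces in the confidence-interval example that follows.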
Handout #71
Econ 102A Statistical Methods for Social Scientists Page 1 of 1

T-Distribution Table from Textbook


Handout #72
Econ 102A Statistical Methods for Social Scientists Page 1 of 2

Confidence Intervals and T-Distributions

When developing confidence intervals we set a particular confidence level for the calculation.
As mentioned in lecture, conventional choices for the confidence level include 90%, 95% and
99%. The confidence level dictates the number of standard deviations (notated henceforth as
#SD) that the borderlines project outward from the sample mean in the ensuing confidence
interval. When the population standard deviation of the statistic of interest, σ x , is assumed to
be known we can use the z-table to determine #SD. However, in a real life setting, this would
rarely be the case. After all, if the population for some statistic consists of millions of
observations the only way to ever know σ x would be to amass all the millions of observations
and calculate the ensuing standard deviation of these millions of data points. The typical case
is that we approximate σ x by using the standard deviation of our sample data as a proxy. That
is, we estimate σ x by the point estimator s .

Although we hope s is close in value to σ x we can imagine situations where the sample
standard deviation does not closely mimic the population standard deviation since the original
sample may not be too indicative of the population by mere chance. So, when substituting s
for σ x we may be introducing some error into our confidence interval calculation. To
compensate, we then derive #SD from the t-distribution table instead of the z-table. You will
find that the t-distribution usually gives similar results as the z-table, but the t-distribution
consistently gives slightly higher values for #SD to compensate for the additional error that
may be introduced due to our lack of precision when estimating σ x .

When using the t-distribution one uses both the probability in the tail of the distribution as well
as the sample size. In particular, the t-table requires that we look up #SD with respect to the
‘degrees of freedom,’ denoted as df in the chart. The degrees of freedom is always taken to be
n – 1, where n is the sample size. In Excel, one can look up #SD by using the function TINV( ).

As an example, consider Question 2(a) on the Week 8 Worksheet. Here, we are trying to
predict the mean selling price of all homes in Eastville, Oregon based on a sample size of 43
data points. For the 43 data points, we have: x̄ = 496,270 and s = 32,222. We will use s as
our best estimate of σ x (the population standard deviation of all homes in Eastville, Oregon).
So, we derive #SD from the t-distribution table. The associated degrees of freedom is then
n – 1 = 43 – 1 = 42.

If we construct the 95% confidence interval, this implies a probability of .025 in each tail of the
normal distribution. Please see the visual below. By using the t-distribution from the textbook,
we find the consequent #SD is roughly 2.021. This is found as the entry corresponding to
column t.025 and row df = 40. Notice, in particular, that the table in the text allows us to look
up df = 40 but not df = 42. Notice further that the standard normal distribution table (i.e., the
z-table) would have delivered #SD = 1.96 which is close in value but slightly less, consistent
with the discussion above.

The ensuing 95% confidence interval is then given as (using #SD from the table):

496,270 ± (2.021) (32,222 / √43) = [486,339 to 506,201]

The visual for this situation is a t-distribution centered at x̄ = 496,270, with probability .475
between the center and each borderline (t = – 2.021 and t = 2.021), leaving .025 in each tail
and 95% in the middle.

Alternatively, we can look up #SD by using the TINV( ) function. Recall that, as parameters,
we need to input the total probability in both tails combined as well as the degrees of freedom.
In this example, these are .05 and 42 respectively, so we are led to TINV(.05, 42) = 2.018,
which is essentially the same as the 2.021 figure found via the table.
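The arithmetic of this interval can be sketched in a few lines (2.018 is the TINV(.05, 42) value found above; only the rounding differs from the table's 2.021):

```python
import math

# Eastville example: sample mean, sample SD, and sample size from the handout.
x_bar, s, n = 496_270, 32_222, 43
num_sd = 2.018                      # TINV(.05, 42), per the handout

margin = num_sd * s / math.sqrt(n)  # the margin of error
lower, upper = x_bar - margin, x_bar + margin
print(round(lower), round(upper))   # ≈ 486354 506186
```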
Handout #73
Econ 102A Statistical Methods for Social Scientists Page 1 of 1

Confidence Intervals Summary

Objective: Try to decipher the population mean, μx, based on the point estimators x̄ and s.

Step 1: Take a sample of data from the population you wish to analyze. The sample
should be taken in as IID a fashion as possible.

Step 2: Use the sample to calculate the point estimators x̄ and s (i.e., the sample mean
and the sample standard deviation). Specifically …

(i) for a non-proportion problem the sample mean ( x̄ ) is calculated as:

x̄ = (x1 + x2 + x3 + ⋯ + xn) / n

and the sample standard deviation ( s ) is calculated as:

s = √[ ((x1 – x̄)² + (x2 – x̄)² + (x3 – x̄)² + ⋯ + (xn – x̄)²) / (n – 1) ]

(ii) for a proportion problem the sample mean is simply p and the sample standard
deviation is simply √(p (1 – p)), where p = P(Yes) as taken from the sample.

Step 3: Decide on a confidence level for the interval. The confidence level is purely at
your discretion. How confident do you want to be in knowing that the true
population mean, μx, is actually contained in your final interval? Popular
confidence levels include 90%, 95% and 99%.

Step 4: Use the above information to construct the confidence interval. In general …

(i) for a non-proportion problem the interval is notated as x̄ ± (#SD) (s / √n)

(ii) for a proportion problem the interval is notated as p ± (#SD) √(p (1 – p) / n)

Note 1: #SD is determined by the confidence level chosen. In a non-proportion
problem (since we use s to estimate σx) we find this value via the
t-table.

Note 2: The distance from the middle of the interval to the upper boundary of
the confidence interval is referred to as the ‘margin of error.’ That is, the
margin of error is the distance from x̄ to the upper borderline of the
confidence interval.
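The two cases of Step 4 can be sketched as small functions. This is only a sketch: #SD is supplied separately (from the t-table for the non-proportion case, the z-table for proportions), and the example numbers echo the CNN-style worksheet question rather than any real poll.

```python
import math

def mean_ci(x_bar, s, n, num_sd):
    """Non-proportion interval: x_bar +/- num_sd * s / sqrt(n)."""
    m = num_sd * s / math.sqrt(n)          # margin of error (Note 2)
    return x_bar - m, x_bar + m

def proportion_ci(p, n, num_sd):
    """Proportion interval: p +/- num_sd * sqrt(p(1-p)/n)."""
    m = num_sd * math.sqrt(p * (1 - p) / n)
    return p - m, p + m

# Hypothetical poll: 42% 'Yes' among 800 respondents, 95% level (num_sd = 1.96).
lo, hi = proportion_ci(0.42, 800, 1.96)
print(round(lo, 3), round(hi, 3))  # 0.386 0.454
```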
Handout #74
Econ 102A Statistical Methods for Social Scientists Page 1 of 4

Week 8 Practice Exercises

1. Consider an experiment such that P(X = 1) = .30, P(X = 2) = .35 and P(X = 4) = .35.

(a) Suppose the experiment is performed two times and the resulting values of X are
averaged. Label the average of the two values of X as S̄2. Construct a probability tree
for this experiment, and thereby determine the distribution of S̄2.

(b) Suppose the experiment is performed three times and the resulting values of X are
averaged. Label the average of the three values of X as S̄3. Construct a probability tree
for this experiment, and thereby determine the distribution of S̄3.

(c) Suppose the experiment is performed four times and the resulting values of X are
averaged. Label the average of the four values of X as S̄4. Construct a probability tree
for this experiment, and thereby determine the distribution of S̄4.

(d) Based on the distributions you have derived in the previous parts, what do you conclude
about the shape of the ensuing distribution as we average more and more values of X?

2. A survey conducted by the American Automobile Association showed that a family of four
spends an average of $215.60 per day when on vacation. Assume that $215.60 is the
population mean expenditure per day for a family of four and that $85.00 is the population
standard deviation. Assume that a random sample of 40 families will be selected for
further study.

(a) Determine the mean and standard deviation of the distribution of sample means
(where the size of each sample is n = 40).

(b) Determine the probability that a sample of 40 families will provide a sample mean
within $20.00 of the population mean.

(c) Determine the probability that a sample of 40 families will provide a sample mean
within $10.00 of the population mean.

3. A library checks out an average of μx = 320 books per day, with a standard deviation of
σx = 75 books. Consider taking many samples where each sample records the number of
books checked out each day for one month of operation. That is, each sample contains 30
data points where each particular entry is taken from a distribution with μx = 320 books and
σx = 75 books.

(a) Determine the mean and standard deviation of the distribution of sample means.

(b) Determine the probability that any sample mean for the 30 days will be between 300
and 340 books.

(c) Determine the probability that any sample mean for the 30 days will show 325 or
more books checked out.

4. In the Weeks 6 and 7 Practice Exercises, you were presented with the following question:

Information on 3731 subscribers to The Wall Street Journal includes the following data on
household members:

Number of Household Members     Frequency

1 474
2 1664
3 627
4 522
5 444

Suppose we believe this table is indicative as to how future respondents will answer the
subscriber survey. Consider the next 800 respondents being surveyed. Determine the
probability that these 800 respondents aggregately have at least 2200 members in their
households.

Determine the solution to this exercise using the Central Limit Theorem of Means as
opposed to the Central Limit Theorem of Sums.

5. Assume that the population standard deviation of some random variable of interest is
σx = 25. Compute the standard deviation of the distribution of sample means, for sample
sizes of 50, 100, 150 and 200. What can you say about the size of the standard deviation of
sample means as the sample size increases? Does it increase, decrease or stay the same?
Write a few sentences explaining the intuition behind the conclusion you reach.

6. To obtain cost savings, a company is considering offering an early retirement incentive for
its older management personnel. The consulting firm that designed the early retirement
program has found that, historically, 22% of the employees qualifying for the program will
select early retirement during the first year of eligibility. Assume that the company offers
the early retirement program to 50 of its management personnel.

(a) What is the exact probability that at least 15 but no more than 19 employees will select
early retirement in the first year?

(b) Answer part (a) by using a Central Limit Theorem of Sums approach.

(c) Answer part (a) by using a Central Limit Theorem of Means approach.

7. The Food Marketing Institute shows that 17% of households spend more than $200 per
week on groceries. Assume a random sample of 800 households will be selected from the
population.

(a) For the sample, determine the mean and standard deviation of the percentage of
households spending more than $200 per week on groceries.

(b) Determine the probability that any sample of size 800 will produce a sample mean
within ± 2% of the population mean.

(c) Answer part (b) for a sample of 1600 households.

8. The California Highway Patrol maintains records showing the times between a report of an
accident and the arrival of an officer at the accident scene. A random sample of 10 records
shows the following times in minutes:

12.6 3.4 4.8 5.0 6.8 2.3 3.6 8.1 2.5 10.3

Determine the point estimate of both the population mean and standard deviation for the
time between an accident report and officer arrival.

9. J. D. Power & Associates annual quality survey for automobiles found that the industry
average number of defects per new car is 1.07. Suppose a sample of 34 new automobiles
taken from Manufacturer A provides the following data on number of defects per car:

0 1 1 2 1 0 0 2 3 0 2 1 0 4 3 1 1
0 2 0 0 2 1 3 0 2 1 0 2 0 3 1 0 2

(a) Determine the sample mean and sample standard deviation of this data.

(b) Provide a 95% confidence interval of the mean number of defects per car for the
population of cars produced by Manufacturer A.

(c) Upon considering the 95% confidence interval found in part (b), a statistical analyst
suggests that Manufacturer A test a larger number of new cars before drawing a
conclusion about how the quality of its cars compares to the J. D. Power & Associates
industry average of 1.07 defects per car. Do you support this idea? Why or why not?

10. Dailey Paints, Inc., implements a long-term test study designed to check the wear
resistance of its major brand of paint. The test consists of painting eight houses in various
parts of the United States and recording the number of months until signs of peeling are
observed. Suppose the number of months until signs of peeling are observed is normally
distributed and further suppose the following data are obtained in regard to the eight
homes in the test:

House:                                   1    2    3    4    5    6    7    8
Months Until Signs of Peeling Observed:  60   51   64   45   48   62   54   56

(a) Determine the 95% confidence interval to estimate the population mean number of
months until signs of peeling are observed.

(b) Determine the 99% confidence interval to estimate the population mean number of
months until signs of peeling are observed.
Handout #75
Econ 102A Statistical Methods for Social Scientists Page 1 of 7

Week 8 Practice Exercises – Solutions

1. Creating a Normal Distribution Exercise (Means)

(a) Upon constructing the probability tree, the probability distribution of S̄2 is derived as:

S̄2:      1.00    1.50    2.00    2.50    3.00    4.00
P(S̄2):   .0900   .2100   .1225   .2100   .2450   .1225

(b) Upon constructing the probability tree, the probability distribution of S̄3 is derived as:

S̄3:      1.00    1.33    1.67    2.00    2.33    2.67    3.00    3.33    4.00
P(S̄3):   .0270   .0945   .1103   .1374   .2205   .1286   .1103   .1286   .0429

(c) Upon constructing the probability tree, the probability distribution of S̄4 is derived as:

S̄4:      1.00   1.25   1.50   1.75   2.00   2.25   2.50   2.75   3.00   3.25   3.50   4.00
P(S̄4):   .008   .038   .066   .089   .147   .154   .126   .154   .090   .051   .060   .015

(d) By reflecting on the distributions derived above, as we average more and more
observations of X within each group, the ensuing distribution of averages becomes
more and more normally distributed (as dictated by the Central Limit Theorem of
Means).
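Parts (a) through (c) can be checked by brute-force enumeration of the probability tree, a sketch of which is:

```python
from itertools import product
from collections import defaultdict

# Distribution of X from Exercise 1.
dist = {1: 0.30, 2: 0.35, 4: 0.35}

def mean_distribution(n):
    """Enumerate every branch of the tree and tally P(average = each value)."""
    out = defaultdict(float)
    for branch in product(dist, repeat=n):
        prob = 1.0
        for value in branch:
            prob *= dist[value]
        out[sum(branch) / n] += prob
    return dict(out)

d2 = mean_distribution(2)
print(round(d2[3.0], 4))  # 0.245, matching part (a)
```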

2. American Automobile Association Exercise

We have x = 215.60, x = 85.00 and n = 40

(a) Based on the above, the distribution of sample means is such that:

E(S̄) = 215.60 and σS̄ = 85.00 / √40 = 13.44

(b) P(195.60 ≤ S̄ ≤ 235.60) = P( (195.60 – 215.60)/13.44 ≤ Z ≤ (235.60 – 215.60)/13.44 )

= P(– 1.49 ≤ Z ≤ 1.49) = .9319 – .0681 = .8638

(c) P(205.60 ≤ S̄ ≤ 225.60) = P( (205.60 – 215.60)/13.44 ≤ Z ≤ (225.60 – 215.60)/13.44 )

= P(– .74 ≤ Z ≤ .74) = .7704 – .2296 = .5408

3. Library Exercise

We have x = 320, x = 75 and n = 30

(a) Based on the above, the distribution of sample means is such that:

E(S̄) = 320 and σS̄ = 75 / √30 = 13.69

(b) P(300 ≤ S̄ ≤ 340) = P( (300 – 320)/13.69 ≤ Z ≤ (340 – 320)/13.69 )

= P(– 1.46 ≤ Z ≤ 1.46) = .9279 – .0721 = .8558

(c) P(S̄ ≥ 325) = P(Z ≥ (325 – 320)/13.69) = P(Z ≥ .37) = 1 – .6443 = .3557

4. Revisiting the Wall Street Journal Subscriber Exercise

Recall that we had previously determined:

E(X) = .127 (1) + .446 (2) + .168 (3) + .140 (4) + .119 (5) = 2.678

Var(X) = .127 (1)² + .446 (2)² + .168 (3)² + .140 (4)² + .119 (5)² – (2.678)² = 1.4663

σx = 1.211

If the aggregate number of members in the 800 households is 2200, then the mean number
of members per household is 2200 / 800 = 2.75.

So, upon using the Central Limit Theorem of Means, we consider the probability that the
average number of members per household exceeds 2.75.

Specifically, we have:

P(S̄ ≥ 2.75) = P(Z ≥ (2.75 – 2.678) / (1.211 / √800)) = P(Z ≥ 1.68) = 1 – .9535 = .0465.
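The calculation above can be checked with the standard library's normal CDF (a sketch; `NormalDist` requires Python 3.8 or later):

```python
import math
from statistics import NormalDist

# Values from the Wall Street Journal solution above.
mu, sigma, n = 2.678, 1.211, 800

z = (2.75 - mu) / (sigma / math.sqrt(n))
prob = 1 - NormalDist().cdf(z)    # P(sample mean >= 2.75)
print(round(z, 2))                # 1.68, as in the z-table lookup
print(round(prob, 4))
```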

5. Sample Mean Standard Deviation Versus Sample Size

With x = 25 and a sample size of n = 50, the standard deviation of the sample means
25
becomes = 3.54
50

With x = 25 and a sample size of n = 100, the standard deviation of the sample means
25
becomes = 2.50
100

With x = 25 and a sample size of n = 150, the standard deviation of the sample means
25
becomes = 2.04
150

With x = 25 and a sample size of n = 200, the standard deviation of the sample means
25
becomes = 1.77
200

Overall, the sample size and sample mean standard deviation are inversely related. That is,
as the sample size increases the standard deviation decreases. This result has intuitive
appeal: the greater the sample size, the more we should have a nice blend of high, low and
medium values. Consequently, the overall average of these numbers should tend more and
more toward the population average. After all, to get excessively deviant averages the
sample would have to have almost exclusively high values or low values. Thus, as sample
size increases we should expect the overall standard deviation of the sample mean to
diminish.

6. Early Retirement Exercise

Define: X = whether or not an employee chooses early retirement

Then, X is a Bernoulli random variable with P(X = 1) = .22

(a) To find the exact probability of between 15 and 19 employees choosing early
retirement, we have:

P(15 employees choose early retirement) = C(50, 15) (.22)^15 (.78)^35 = .0515
P(16 employees choose early retirement) = C(50, 16) (.22)^16 (.78)^34 = .0318
P(17 employees choose early retirement) = C(50, 17) (.22)^17 (.78)^33 = .0179
P(18 employees choose early retirement) = C(50, 18) (.22)^18 (.78)^32 = .0093
P(19 employees choose early retirement) = C(50, 19) (.22)^19 (.78)^31 = .0044

Therefore, P(between 15 and 19 employees choose early retirement) =

.0515 + .0318 + .0179 + .0093 + .0044 = .1149

(b) The situation involves summing up 50 Bernoulli random variables. By the Central
Limit Theorem of Sums we have:

n = 50, μx = .78 (0) + .22 (1) = .22, σ²x = .78 (0)² + .22 (1)² – (.22)² = .1716

σx = .41425

Therefore, P(15 ≤ S ≤ 19) = P( (15 – 50 (.22)) / (√50 (.41425)) ≤ Z ≤ (19 – 50 (.22)) / (√50 (.41425)) )

= P(1.37 ≤ Z ≤ 2.73) = .9968 – .9147 = .0821



(c) 15 employees opting for early retirement corresponds to an average of 15/50 = .30;
19 employees opting for early retirement corresponds to an average of 19/50 = .38.

Therefore, P(.30 ≤ S̄ ≤ .38) = P( (.30 – .22) / (.41425 / √50) ≤ Z ≤ (.38 – .22) / (.41425 / √50) )

= P(1.37 ≤ Z ≤ 2.73) = .9968 – .9147 = .0821
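The exact binomial sum of part (a) fits in one line of code (a sketch):

```python
from math import comb

# Exact P(15 <= X <= 19) for X ~ Binomial(50, .22), as in part (a).
n, p = 50, 0.22
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(15, 20))
print(round(exact, 4))
```

Note how far the normal approximations of parts (b) and (c) land from this exact .1149 value: with no continuity correction they give .0821, a reminder that n = 50 is still modest for a tail probability like this one.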

7. Food Marketing Institute Exercise

Define: X = whether or not a household spends more than $200 per week on groceries

Then, X is a Bernoulli random variable with P(X = 1) = .17

For the population distribution we therefore have μx = .83 (0) + .17 (1) = .17, and
σ²x = .83 (0)² + .17 (1)² – (.17)² = .1411

σx = .3756

(a) Based on the above, the distribution of sample means is such that:

E(S̄) = .17 and σS̄ = .3756 / √800 = .0133

(b) P(.15 ≤ S̄ ≤ .19) = P( (.15 – .17)/.0133 ≤ Z ≤ (.19 – .17)/.0133 )

= P(– 1.50 ≤ Z ≤ 1.50) = .9332 – .0668 = .8664

(c) When n = 1600, σS̄ = .3756 / √1600 = .0094

So, P(.15 ≤ S̄ ≤ .19) = P( (.15 – .17)/.0094 ≤ Z ≤ (.19 – .17)/.0094 )

= P(– 2.13 ≤ Z ≤ 2.13) = .9834 – .0166 = .9668

8. California Highway Patrol Exercise

The point estimate of the population mean is:

x̄ = (12.6 + 3.4 + 4.8 + 5.0 + 6.8 + 2.3 + 3.6 + 8.1 + 2.5 + 10.3) / 10 = 5.94

The point estimate of the population standard deviation is:

s = √[ ((6.66)² + (2.54)² + (1.14)² + (0.94)² + (0.86)² + (3.64)² + (2.34)² + (2.16)² + (3.44)² + (4.36)²) / 9 ]

= 3.46
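Both point estimates fall out of the standard library directly (a sketch; `statistics.stdev` uses the n – 1 divisor, matching s):

```python
import statistics

# Response times from the California Highway Patrol exercise.
times = [12.6, 3.4, 4.8, 5.0, 6.8, 2.3, 3.6, 8.1, 2.5, 10.3]

x_bar = statistics.mean(times)   # point estimate of the population mean
s = statistics.stdev(times)      # point estimate of the population SD

print(round(x_bar, 2), round(s, 2))  # 5.94 3.46
```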

9. J. D. Power & Associates Survey Exercise

(a) From the sample data:


x̄ = 41 / 34 = 1.21 and s = √(43.5588 / 33) = 1.15

(b) For the 95% confidence interval, #SD is found as TINV(.05, 33) = 2.03.

Lower bound of 95% confidence interval = 1.21 – 2.03 (1.15 / √34) = .81

Higher bound of 95% confidence interval = 1.21 + 2.03 (1.15 / √34) = 1.61

Therefore, the 95% confidence interval ranges from .81 up to 1.61.

(c) Since the 95% confidence interval goes as low as .81 and as high as 1.61 it is unclear
whether the mean number of automotive defects for Manufacturer A is above or below
the 1.07 industry average. So, the suggestion to test a larger number of cars (which
consequently refines the confidence interval since n is larger) is a wise one.

10. Dailey Paints, Inc. Exercise

(a) From the sample data, x̄ = 440 / 8 = 55 and s = √(322 / 7) = 6.782.

Further, for the 95% confidence interval, #SD is found as TINV(.05, 7) = 2.365.

Lower bound of 95% confidence interval = 55 – 2.365 (6.782 / √8) = 49.33

Higher bound of 95% confidence interval = 55 + 2.365 (6.782 / √8) = 60.67

Therefore, the 95% confidence interval is [49.33 up to 60.67]



(b) For the 99% confidence interval, #SD is found as TINV(.01, 7) = 3.499.

Lower bound of 99% confidence interval = 55 – 3.499 (6.782 / √8) = 46.61

Higher bound of 99% confidence interval = 55 + 3.499 (6.782 / √8) = 63.39

Therefore, the 99% confidence interval is [46.61 up to 63.39]
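Part (a) can be reproduced in a few lines (a sketch; 2.365 is the TINV(.05, 7) value from the solution above):

```python
import math
import statistics

# Months until peeling for the eight test houses.
months = [60, 51, 64, 45, 48, 62, 54, 56]

x_bar = statistics.mean(months)              # 55
s = statistics.stdev(months)                 # sqrt(322/7) ~ 6.782
margin = 2.365 * s / math.sqrt(len(months))  # margin of error at 95%

print(round(x_bar - margin, 2), round(x_bar + margin, 2))  # 49.33 60.67
```

Swapping in 3.499 for the #SD reproduces part (b)'s wider 99% interval.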
