
Lecture 7

Reasons to Sample
Sampling Methods
Sampling Distribution of the Sample Mean
The Central Limit Theorem

Reading: Chapter 8
Sampling Methods and
the Central Limit Theorem
When you have completed this chapter, you will be able to:
ONE
Explain why sampling is the
only feasible way to learn
about a population.
TWO
Describe methods to select
a sample.
THREE
Define and construct a
sampling distribution of the
sample mean.
FOUR
Explain the Central Limit
Theorem (CLT).
FIVE
Use the CLT to find probabilities of selecting possible
sample means from a specified population.
Why is it necessary to sample?

Why can't we just inspect all the
items in the population?
The physical impossibility of checking
all items in the population.

The destructive nature of certain tests.

The cost of studying all the items in a
population.

Contacting the whole population would
often be time-consuming.


The sample results are usually
adequate.
(1) Simple Random Sampling:
A sample formulated so that each item or
person in the population has the same
chance of being included.
Every item in the population is given an ID.
First, write the ID of each item on a small slip
of paper and put all of the slips in a box. The
selection begins after they have been
thoroughly mixed.

A computer program or a table of random
numbers (Appendix E) can be used to select
the sample.
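
As a minimal sketch of the "computer program" option, the draw can be done with Python's standard `random` module. The population of 100 IDs, the sample size of 10, and the seed are assumptions chosen only for illustration.

```python
import random

# Hypothetical population: 100 items with IDs 1..100.
population_ids = list(range(1, 101))

# A fixed seed makes the draw repeatable; drop it for a fresh sample each run.
rng = random.Random(42)

# random.sample draws without replacement, so every ID has the same
# chance of being included and no ID can appear twice.
sample = rng.sample(population_ids, k=10)
print(sorted(sample))
```

This plays the same role as mixing the slips in the box: the order of the IDs carries no information about which ones are chosen.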
(2) Systematic Random Sampling:
The items or individuals of the population
are arranged in some order. A random
starting point is selected and then every kth
member of the population is selected for the
sample.
Each item is given an ID in the population of 100.

To obtain a sample of 10, one item is drawn
from every 10 items. (k = 100/10 = 10)

A random starting point is selected, say 6. Thus the
items chosen to be in the sample will be
6, 16, 26, 36, 46, 56, 66, 76, 86 & 96
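
The selection rule above can be sketched in a few lines of Python. The function name `systematic_sample` and its parameters are hypothetical; the example call reuses the slide's numbers (population of 100, sample of 10, starting point 6).

```python
import random

def systematic_sample(population_size, sample_size, start=None, rng=random):
    """Return the 1-based IDs chosen by systematic random sampling."""
    k = population_size // sample_size      # sampling interval
    if start is None:
        start = rng.randint(1, k)           # random start within the first k items
    # Take every k-th ID from the starting point onward.
    return list(range(start, population_size + 1, k))[:sample_size]

print(systematic_sample(100, 10, start=6))
```

Leaving `start` unset picks the starting point at random, which is what makes the method a random sampling method.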
(3) Stratified Random Sampling:

A population is first divided into
subgroups, called strata, and a
sample is selected from each
stratum.

A sample from 352 large companies is to be chosen
for a study to determine whether firms with higher
returns on equity spend more on advertising.
The 352 companies are classified into 5 groups based
on their returns on equity.
A sample will be chosen from each group (stratum).
This ensures that each stratum is represented in the
sample.
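
A rough sketch of how such a stratified draw might look in Python follows. The slides do not give the stratum sizes, the overall sample size, or the allocation rule, so the sizes below (which sum to 352), the sample of 50, and the proportional allocation are all assumptions.

```python
import random

rng = random.Random(0)

# Assumed stratum sizes (they sum to 352); the slides do not list them.
stratum_sizes = {"A": 8, "B": 35, "C": 189, "D": 115, "E": 5}
total_sample = 50  # assumed overall sample size

# Give each company an ID and group the IDs by stratum.
strata = {}
next_id = 1
for name, size in stratum_sizes.items():
    strata[name] = list(range(next_id, next_id + size))
    next_id += size

population_size = sum(stratum_sizes.values())

# Proportional allocation (an assumption): each stratum contributes in
# proportion to its size, with at least one company so none is left out.
sample = {}
for name, ids in strata.items():
    n_h = max(1, round(total_sample * len(ids) / population_size))
    sample[name] = rng.sample(ids, n_h)

for name, chosen in sample.items():
    print(name, len(chosen))
```

The `max(1, ...)` guard reflects the point of the method: even the smallest stratum is represented in the sample.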
(4) Cluster Sampling:
A population is first divided into primary
units then samples are selected from the
primary units.


Cluster sampling is often used to lower the cost of
sampling when the population is dispersed over
a wide geographic area.

The area (Malaysia) is divided into primary
units (states). Then a few primary units are
chosen, and a random sample is selected
from each unit.
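
The two stages can be sketched as below. The state names are the 13 states of Malaysia; the 200 numbered units per state, the choice of 3 states, and the 20 units drawn per state are hypothetical values used only to make the example run.

```python
import random

rng = random.Random(1)

states = ["Johor", "Kedah", "Kelantan", "Melaka", "Negeri Sembilan",
          "Pahang", "Perak", "Perlis", "Pulau Pinang", "Sabah",
          "Sarawak", "Selangor", "Terengganu"]

# Hypothetical sampling frame: 200 numbered units in every state.
units_by_state = {s: [f"{s}-{i}" for i in range(1, 201)] for s in states}

# Stage 1: choose a few primary units (states) at random.
chosen_states = rng.sample(states, k=3)

# Stage 2: draw a random sample only within the chosen states.
cluster_sample = {s: rng.sample(units_by_state[s], k=20) for s in chosen_states}

for state, units in cluster_sample.items():
    print(state, units[:3], "...")
```

Fieldwork is then confined to three states instead of thirteen, which is where the cost saving comes from.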
The sampling distribution of the sample
mean is a probability distribution
consisting of all possible sample means of
a given sample size selected from a
population.
[Figure: probability P(X̄) plotted against the sample mean X̄]
Suppose that a population consists of
six families living in a neighbourhood.


A study on the number of children in the
families is to be done. The population
information is:
N = 6
Family Number of children
Andy 1
Badawi 2
Chandran 3
Daud 5
Elvin 3
Fong 4
(a) How many different samples of size
two can be selected from this
population?



There are 15 different samples. This is the
combination of 6 objects taken 2 at a time.
$${}_{6}C_{2} = 15$$
(b) List down all the possible samples and find
the sample means.

Let X be the number of children in a sample.

These 15 possible samples are shown in the
following slides:

Example 1 contd.
Sample #   Families in the sample   Number of children in sample   Sample mean, X̄
 1         A, B                     1 + 2 = 3                      1.5
 2         A, C                     1 + 3 = 4                      2.0
 3         A, D                     1 + 5 = 6                      3.0
 4         A, E                     1 + 3 = 4                      2.0
 5         A, F                     1 + 4 = 5                      2.5
 6         B, C                     2 + 3 = 5                      2.5
 7         B, D                     2 + 5 = 7                      3.5
 8         B, E                     2 + 3 = 5                      2.5
 9         B, F                     2 + 4 = 6                      3.0
10         C, D                     3 + 5 = 8                      4.0
11         C, E                     3 + 3 = 6                      3.0
12         C, F                     3 + 4 = 7                      3.5
13         D, E                     5 + 3 = 8                      4.0
14         D, F                     5 + 4 = 9                      4.5
15         E, F                     3 + 4 = 7                      3.5
Total                                                              45.0
(c) Find the mean of all the sample
means.
$$\mu_{\bar{X}} = \frac{\sum \bar{X}}{{}_{6}C_{2}} = \frac{45}{15} = 3.0$$
The mean of all the sample means is 3.0 children.
(d) Find the population mean number of
children per family.



$$\mu = \frac{\sum x}{N} = \frac{1 + 2 + 3 + 5 + 3 + 4}{6} = \frac{18}{6} = 3$$
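
Parts (a) to (d) can be checked with a short Python sketch. The data are the six families from the slides; the variable names are illustrative only.

```python
from itertools import combinations
from math import comb

# Population from the slides: family name -> number of children.
children = {"Andy": 1, "Badawi": 2, "Chandran": 3,
            "Daud": 5, "Elvin": 3, "Fong": 4}

# (a) Every unordered pair of families is one possible sample of size 2.
samples = list(combinations(children, 2))
num_samples = comb(len(children), 2)

# (b) Mean number of children in each sample.
sample_means = {pair: (children[pair[0]] + children[pair[1]]) / 2
                for pair in samples}

# (c) Mean of all the sample means, and (d) the population mean.
mean_of_sample_means = sum(sample_means.values()) / len(sample_means)
population_mean = sum(children.values()) / len(children)

for (a, b), m in sample_means.items():
    print(f"{a[0]}, {b[0]}: {m}")
print(num_samples, mean_of_sample_means, population_mean)
```

The last line shows the key result of the example: the mean of the 15 sample means equals the population mean.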
(e) What is the sampling distribution of the
sample mean for n = 2 ?


Sampling distribution of the sample means:
Mean number of children, X̄   Frequency   Probability (relative frequency)
1.5                           1           1/15 = 0.0667
2.0                           2           2/15 = 0.1333
2.5                           3           3/15 = 0.2000
3.0                           3           3/15 = 0.2000
3.5                           3           3/15 = 0.2000
4.0                           2           2/15 = 0.1333
4.5                           1           1/15 = 0.0667
Total                         15          1.0000
Compute the mean of the sample means.



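$$
\begin{aligned}
\mu_{\bar{X}} &= \sum \bar{X}\,P(\bar{X}) \\
&= 1.5\left(\tfrac{1}{15}\right) + 2.0\left(\tfrac{2}{15}\right) + 2.5\left(\tfrac{3}{15}\right) + 3.0\left(\tfrac{3}{15}\right) + 3.5\left(\tfrac{3}{15}\right) + 4.0\left(\tfrac{2}{15}\right) + 4.5\left(\tfrac{1}{15}\right) \\
&= \frac{45}{15} = 3.0
\end{aligned}
$$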

The mean of the sample means is 3.0 children.
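
The sampling distribution in part (e) and its mean can be reproduced with the sketch below. It uses exact fractions so the probabilities match the 1/15, 2/15, 3/15 values in the table; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Number of children in the six families (Andy .. Fong).
children = [1, 2, 3, 5, 3, 4]

# Mean of every possible sample of size 2, kept as an exact fraction.
means = [Fraction(a + b, 2) for a, b in combinations(children, 2)]

# Frequency of each distinct sample mean, then its relative frequency.
frequency = Counter(means)
distribution = {m: Fraction(count, len(means))
                for m, count in sorted(frequency.items())}

# Mean of the sampling distribution: sum of (value * probability).
expected_mean = sum(m * p for m, p in distribution.items())

for m, p in distribution.items():
    print(float(m), frequency[m], round(float(p), 4))
print("mean of sample means:", float(expected_mean))
```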


Consider a population of four values:
12, 12, 14, and 16.
N = 4
(a) List all samples of size 2, and compute the mean
of each sample.
(b) Compute the mean of the distribution of the
sample mean and the population mean. Compare
the two values.
(c) Compare the dispersion in the population with
that of the sample mean.
Number of possible samples of size 2:
$${}_{4}C_{2} = \frac{4!}{2!\,(4 - 2)!} = 6$$
Sample no.   Sample values   Sample mean, X̄
1            12, 12          12
2            12, 14          13
3            12, 16          14
4            12, 14          13
5            12, 16          14
6            14, 16          15
Total                        81
$$\mu_{\bar{X}} = \frac{\sum \bar{X}}{{}_{4}C_{2}} = \frac{81}{6} = 13.5$$
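(b) Compute the mean of the distribution of the
sample mean and the population mean. Compare
the two values.
$$\mu = \frac{\sum x}{N} = \frac{12 + 12 + 14 + 16}{4} = \frac{54}{4} = 13.5$$
The mean of the sample means equals the population mean: both are 13.5.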
(c) Compare the dispersion in the population with that
of the sample mean.
Population range = 16 - 12 = 4
Range of the sample means = 15 - 12 = 3
Range of the sample means < Population range
Dispersion of the sample means < Population dispersion
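
The three parts of this example can be verified with a short sketch. The four population values come from the slide; everything else is illustrative naming.

```python
from itertools import combinations

population = [12, 12, 14, 16]

# (a) All samples of size 2 and their means.
samples = list(combinations(population, 2))
sample_means = [sum(s) / len(s) for s in samples]

# (b) Mean of the sample means versus the population mean.
mean_of_sample_means = sum(sample_means) / len(sample_means)
population_mean = sum(population) / len(population)

# (c) Dispersion, measured by the range as in the slide.
population_range = max(population) - min(population)
sample_mean_range = max(sample_means) - min(sample_means)

print(samples)
print(sample_means)
print(mean_of_sample_means, population_mean)
print(population_range, sample_mean_range)
```

The two 12s are treated as distinct items, which is why the pair (12, 14) appears twice among the six samples.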
The standard error of the sample mean is the
standard deviation of the sampling distribution
of the sample means.
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
$\sigma_{\bar{X}}$ is the symbol for the standard error of the
sample mean.
$\sigma$ is the standard deviation of the population.
$n$ is the size of the sample.
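
As a small illustration of the formula, the sketch below evaluates the standard error for the two later examples in these slides (σ = 12 with n = 9, and σ = 0.28 with n = 35). The function name `standard_error` is hypothetical.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

print(standard_error(12, 9))               # population sd 12, sample size 9
print(round(standard_error(0.28, 35), 4))  # population sd 0.28, sample size 35
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.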
If the size of the sample is sufficiently
large, the sampling distribution of the
sample mean will approach a normal
distribution regardless of the shape of
the population.
Thus, even if a population does not
follow the normal distribution, the
sampling distribution of the sample
means will be approximately normal
when the sample size is at least 30.

$$X \sim \text{any distribution}(\mu, \sigma) \;\Rightarrow\; \bar{X} \sim N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{if } n \ge 30$$
The mean of the sampling distribution
is equal to $\mu$ and the variance is equal to
$\sigma^2/n$.
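
A quick simulation can make the theorem concrete. The sketch below assumes an exponential population with rate 1 (strongly right-skewed, with μ = 1 and σ = 1); that population, the seed, and the 5000 repetitions are illustrative choices, not part of the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(7)   # fixed seed so the run is repeatable
n = 30                   # sample size
num_samples = 5000       # number of repeated samples (arbitrary choice)

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# Draw many samples of size n and record each sample mean.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

print("mean of sample means:", round(mean(sample_means), 3), "vs mu =", mu)
print("std dev of sample means:", round(stdev(sample_means), 3),
      "vs sigma/sqrt(n) =", round(sigma / sqrt(n), 3))
```

The simulated mean and standard deviation of the sample means should land close to μ and σ/√n, even though individual observations are far from normal.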
To determine the probability a sample
mean falls within a particular region,
use:
$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Example 3 (Exercise 15)
A normal population has a mean of 60
and a standard deviation of 12.
You select a random sample of 9.

Compute the probability the sample mean is:
(a) Greater than 63
(b) Less than 56
(c) Between 56 and 63
Example 3 contd.
Compute the probability the sample mean is:
(a) Greater than 63
$$
\begin{aligned}
P(\bar{X} > 63) &= P\left(Z > \frac{63 - 60}{12/\sqrt{9}}\right) \\
&= P(Z > 0.75) \\
&= 0.5 - 0.2734 \\
&= 0.2266
\end{aligned}
$$
Example 3 contd.
Compute the probability the sample mean is:
(b) Less than 56
$$
\begin{aligned}
P(\bar{X} < 56) &= P\left(Z < \frac{56 - 60}{12/\sqrt{9}}\right) \\
&= P(Z < -1.00) \\
&= 0.5 - 0.3413 \\
&= 0.1587
\end{aligned}
$$
Example 3 contd.
Compute the probability the sample mean is:
(c) Between 56 and 63
$$
\begin{aligned}
P(56 < \bar{X} < 63) &= P\left(\frac{56 - 60}{12/\sqrt{9}} < Z < \frac{63 - 60}{12/\sqrt{9}}\right) \\
&= P(-1.00 < Z < 0.75) \\
&= 0.3413 + 0.2734 \\
&= 0.6147
\end{aligned}
$$
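
The three table lookups in Example 3 can be cross-checked with `statistics.NormalDist` from the Python standard library. This is only a verification sketch; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 60, 12, 9

# Sampling distribution of the sample mean: normal with mean mu
# and standard error sigma / sqrt(n) = 4.
xbar = NormalDist(mu, sigma / sqrt(n))

p_greater_63 = 1 - xbar.cdf(63)             # (a)
p_less_56 = xbar.cdf(56)                    # (b)
p_between = xbar.cdf(63) - xbar.cdf(56)     # (c)

print(round(p_greater_63, 4), round(p_less_56, 4), round(p_between, 4))
```

The results agree with the table-based answers to four decimal places.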
Example 3 contd.
The mean selling price of a gallon of gasoline in the
United States is $1.30.
The distribution is positively skewed, with a standard
deviation of $0.28.

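Given a sample of 35 gasoline stations, find the
probability that the sample mean is between $1.22 and
$1.38.

$$\mu = \$1.30; \quad \sigma = \$0.28; \quad X \sim \text{positively skewed}(\mu, \sigma)$$

Because the sample size of 35 is at least 30, the Central Limit
Theorem applies even though the population is skewed.

$$
\begin{aligned}
P(1.22 < \bar{X} < 1.38) &= P\left(\frac{1.22 - 1.30}{0.28/\sqrt{35}} < Z < \frac{1.38 - 1.30}{0.28/\sqrt{35}}\right) \\
&= P(-1.69 < Z < 1.69) \\
&= 2(0.4545) \\
&= 0.9090
\end{aligned}
$$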
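
The gasoline result can be checked the same way, treating the sample mean as approximately normal by the Central Limit Theorem. This is a sketch with illustrative variable names; because the table rounds z to 1.69, the computed value may differ from 0.9090 in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.30, 0.28, 35

# n >= 30, so the sample mean is treated as approximately normal
# with standard error sigma / sqrt(n).
standard_error = sigma / sqrt(n)
xbar = NormalDist(mu, standard_error)

z_upper = (1.38 - mu) / standard_error
p_between = xbar.cdf(1.38) - xbar.cdf(1.22)

print(round(z_upper, 2), round(p_between, 3))
```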
Estimation and Confidence Intervals

Point Estimates
Confidence Intervals for Means
Confidence Intervals for Proportions
Choosing an Appropriate Sample Size, n
Reading: Chapter 9
