
# Sampling Distribution of the Proportion

A sampling distribution of the proportion is the relative frequency distribution of the sample proportion ($\hat{p}$) over all possible samples of size n taken from a population of size N. For a simple random sample with sufficiently large n, the sampling distribution of a proportion has
• an approximately normal distribution
• a mean equal to the population proportion (p)
• a standard error ($\sigma_{\hat{p}}$) equal to

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{pq}{n}} \approx \sqrt{\frac{\hat{p}\hat{q}}{n}}$$

where q = 1 − p.

The sample proportion is computed by dividing the number of items in a sample that possess the characteristic, X, by the number of items in the sample, n.

$$\hat{p} = P_s = \frac{X}{n} = \text{sample proportion}$$

The equation, then, for a large sample confidence interval for p is as follows:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

The central limit theorem also applies to sample proportions: the normal distribution approximates the shape of the distribution of the sample proportion if np ≥ 5 and n(1 − p) ≥ 5, where p is the population proportion.

The mean of the sample proportion over all samples of size n randomly drawn from a population is p (the population proportion), and the standard deviation of the sampling distribution of sample proportions (the standard error of the proportion) is $\sqrt{pq/n}$, where q = 1 − p. The Z equation for the sample proportion is as follows:

$$Z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$$

• Sample proportion ($P_s$) provides an estimate of p
• 0 ≤ $P_s$ ≤ 1
• $P_s$ has a binomial distribution (assuming sampling with replacement from a finite population or without replacement from an infinite population)
• $P_s$ is approximated by a normal distribution if
  • np ≥ 5
  • n(1 − p) ≥ 5

where

$$\mu_{P_s} = p$$

and

$$\sigma_{P_s} = \sqrt{\frac{p(1-p)}{n}}$$

(where p = population proportion)
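The claims above (mean equal to p, standard error equal to the square root of p(1 − p)/n) can be checked with a small simulation. The following is only a sketch: the values p = 0.4 and n = 200, the number of repetitions, and the function name `simulate_sample_proportions` are illustrative assumptions, not part of the original notes.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return each sample's proportion X / n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    proportions = []
    for _ in range(num_samples):
        # X = number of sampled items that possess the characteristic
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

p, n = 0.4, 200  # illustrative values (assumed, not from the notes)
props = simulate_sample_proportions(p, n, num_samples=5000)

empirical_mean = mean(props)            # should be close to p
empirical_se = pstdev(props)            # spread of the simulated proportions
theoretical_se = sqrt(p * (1 - p) / n)  # sqrt(pq / n)

print(f"mean of sample proportions: {empirical_mean:.4f} (p = {p})")
print(f"empirical SE: {empirical_se:.4f}, theoretical SE: {theoretical_se:.4f}")
```

With a few thousand simulated samples, the empirical mean and spread should land close to p and the theoretical standard error, although the exact figures depend on the seed.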

If sampling is without replacement and n is greater than 5% of the population size N, then the standard error must include the finite population correction factor:

$$\sigma_{P_s} = \sqrt{\frac{p(1-p)}{n}}\,\sqrt{\frac{N-n}{N-1}}$$
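As a rough sketch of how the correction changes the standard error, the helper below multiplies the usual standard error by the square root of (N − n)/(N − 1) whenever n exceeds 5% of N. This is the commonly used form of the finite population correction and is assumed here. The function name `standard_error` and the example numbers are hypothetical.

```python
from math import sqrt

def standard_error(p, n, population_size=None):
    """Standard error of a sample proportion.

    Applies the finite population correction sqrt((N - n) / (N - 1))
    only when a population size is given and n exceeds 5% of it.
    """
    se = sqrt(p * (1 - p) / n)
    if population_size is not None and n > 0.05 * population_size:
        se *= sqrt((population_size - n) / (population_size - 1))
    return se

# Illustrative numbers (assumed): p = 0.5, n = 100
se_uncorrected = standard_error(0.5, 100)                     # no N given
se_corrected = standard_error(0.5, 100, population_size=1000) # n is 10% of N

print(f"uncorrected SE: {se_uncorrected:.4f}")
print(f"corrected SE:   {se_corrected:.4f}")
```

The correction factor is always below 1, so it shrinks the standard error. This reflects that a sample covering a large share of the population leaves less uncertainty.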

Example:

If the true proportion of voters who support Proposition A is p = 0.4, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?
Solution:
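A minimal sketch of this calculation, assuming the normal approximation applies (np = 80 and n(1 − p) = 120 are both well above 5). It uses `statistics.NormalDist` from the Python standard library for the standard normal CDF. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p = 0.40  # population proportion (given)
n = 200   # sample size (given)

# Standard error of the sample proportion: sqrt(p(1 - p) / n)
se = sqrt(p * (1 - p) / n)

# Standardize both ends of the interval 0.40 <= sample proportion <= 0.45
z_low = (0.40 - p) / se
z_high = (0.45 - p) / se

# P(0.40 <= sample proportion <= 0.45) = Phi(z_high) - Phi(z_low)
std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"standard error = {se:.4f}")
print(f"z range = [{z_low:.2f}, {z_high:.2f}]")
print(f"probability = {prob:.4f}")
```

The standard error is about 0.0346, the upper bound standardizes to z ≈ 1.44, and the probability comes out at roughly 0.43. A printed z table read at 1.44 gives a value that differs slightly in the third decimal place.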

Example 2:

machine in 2005. Salman believes that this proportion may not be true for the
region of Karachi. If he takes a random sample of 600 households and finds
that only 135 have an answering machine, what is the probability of getting a
sample proportion this small or smaller if the population proportion really is
0.43?
Solution:

The sample proportion is $\hat{p} = 135/600 = 0.225$, and z is

$$Z = \frac{0.225 - 0.43}{\sqrt{\dfrac{(0.43)(0.57)}{600}}} \approx -10.1$$

Almost all the area under the curve lies to the right of this Z value. The
probability of getting this sample proportion or a smaller one is virtually zero.
That is, the result obtained from this sample is far too different from the
43% proportion for Salman to accept the national figure for the region of Karachi.
The following graph shows this problem.
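The numbers in this example can be reproduced with a short sketch, again assuming the normal approximation. The lower-tail probability is computed with `math.erfc` because, this far into the tail, a CDF built on `erf` can round to exactly zero. The variable names are illustrative.

```python
from math import sqrt, erfc

p = 0.43  # hypothesized population proportion (from the example)
n = 600   # households sampled
x = 135   # households with an answering machine

p_hat = x / n               # sample proportion
se = sqrt(p * (1 - p) / n)  # standard error under the hypothesized p
z = (p_hat - p) / se        # standardized sample proportion

# Lower-tail probability P(Z <= z) = 0.5 * erfc(-z / sqrt(2))
prob = 0.5 * erfc(-z / sqrt(2))

print(f"sample proportion = {p_hat:.3f}")
print(f"z = {z:.2f}")
print(f"P(sample proportion <= {p_hat:.3f}) = {prob:.2e}")
```

The z value is about −10.1 and the tail probability is smaller than 10⁻²⁰, which is what "virtually zero" means here.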
Example
1000 randomly selected people were asked if they believed the minimum
wage should be raised. 600 said yes. Construct a 95% confidence interval
for the proportion of people who believe that the minimum wage should be
raised.

Solution:

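$$\hat{p} = \frac{600}{1000} = 0.6$$

$$z_{\alpha/2} = z_{0.025} = 1.96$$

and n = 1000

$$\hat{q} = 1 - \hat{p} = 1 - 0.6 = 0.4$$

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.6 \pm 1.96\sqrt{\frac{(0.6)(0.4)}{1000}} \approx 0.6 \pm 0.03$$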

Hence we can conclude, with 95% confidence, that between 57 and 63 percent of all people agree
with the proposal. In other words, 60% agree, with a margin of error of 0.03.
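The interval in this example can be computed with a small helper. This is a sketch of the large-sample formula, $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$. The function name `proportion_ci` is hypothetical, and the critical value is taken from `statistics.NormalDist.inv_cdf` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Large-sample confidence interval for a proportion.

    Returns (lower, upper, margin) for p_hat +/- z * sqrt(p_hat * q_hat / n).
    """
    p_hat = x / n
    q_hat = 1 - p_hat
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin, margin

# Values from the minimum-wage example: 600 "yes" answers out of 1000
low, high, margin = proportion_ci(600, 1000)

print(f"margin of error = {margin:.4f}")
print(f"95% CI = ({low:.4f}, {high:.4f})")
```

The margin of error is about 0.03, which gives an interval from roughly 0.57 to 0.63.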

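Determining n here follows the same general approach as with estimating the population mean. The bound is still the error of the estimate, but now we use the error estimate for proportions. To find the necessary sample size for a desired E, just solve for n.

We know that

$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \quad\text{so}\quad E^2 = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{n}$$

Solving for n

$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2}$$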

Here, another potential problem arises, in the estimates of p and q. What do you do if you don’t have any prior knowledge of p and q, and you don’t already have a sample to get estimates?

In this situation, we need to make as conservative an estimate of n as possible. In other words, we would rather have a larger n than necessary than an n that is too small. To accomplish this, we need conservative estimates of p and q.

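From the sample size equation, we can see that as the value of p·q gets bigger,
so does the value of n. Therefore, what values will make n the biggest if
everything else in the equation is fixed? The product p(1 − p) is largest at p = 0.5, where it equals 0.25, so the conservative choice is

$$\hat{p} = \hat{q} = 0.5$$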

Example
You have been hired by the Clear Optical company to design a study to
estimate the proportion of the Peshawar population who wear corrective
lenses. The desired margin of error is 1% (at the 95% confidence level).
What is the minimum sample size you should use? (Assume we don't know
$\hat{p}$ yet.)
Solution

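$$\alpha = 1 - 0.95 = 0.05$$

$$\alpha/2 = 0.025$$

$$z_{0.025} = 1.96$$

$$E = 0.01$$

$$\hat{p} = 0.5$$

$$\hat{q} = 1 - 0.5 = 0.5$$

$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{(1.96)^2(0.5)(0.5)}{(0.01)^2} = 9604 \text{ people}$$

(With the unrounded critical value z ≈ 1.95996, the formula gives 9603.6, which also rounds up to 9604.)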

Thus, to attain the desired margin of error (at the 95% confidence level), a
random sample of 9604 people should be used.
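The sample size calculation can be sketched as a small function that rounds up, since a fractional person cannot be sampled. The name `min_sample_size` and its default arguments are hypothetical. The default of 0.5 for the proportion is the conservative choice used when no prior estimate is available.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(margin, confidence=0.95, p_hat=0.5):
    """Smallest n with z^2 * p_hat * q_hat / n <= margin^2.

    p_hat = 0.5 is the conservative default: it maximizes p_hat * q_hat.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    n = z ** 2 * p_hat * (1 - p_hat) / margin ** 2
    return ceil(n)  # always round up so the margin is not exceeded

# Values from the corrective-lenses example: E = 1% at 95% confidence
n_required = min_sample_size(0.01)
print(f"minimum sample size: {n_required}")

# For comparison, a looser 3% margin needs far fewer respondents
print(f"for a 3% margin: {min_sample_size(0.03)}")
```

Because n grows with 1/E², cutting the margin of error from 3% to 1% multiplies the required sample size by about nine.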