Probability Distributions
Statistics
McClave/Sincich, A First Course in Statistics, 10th ed.
Chapter 4: Random Variables and Probability
Distributions
Where We've Been
Using probability to make inferences
about populations
Measuring the reliability of the
inferences
Where We're Going
Develop the notion of a random
variable
Numerical data and discrete random
variables
Discrete random variables and their
probabilities
4.1: Two Types of Random
Variables
A random variable is a variable that
assumes numerical values associated
with the random outcome of an
experiment, where one (and only one)
numerical value is assigned to each
sample point.
A discrete random variable can assume a
countable number of values.
Number of steps to the top of the Eiffel Tower*
A continuous random variable can
assume any value along a given interval of
a number line.
The time a tourist stays at the top
once s/he gets there
*Believe it or not, the answer ranges from 1,652 to 1,789. See Great Buildings
Discrete random variables
Number of sales
Number of calls
Shares of stock
People in line
Mistakes per page
Continuous random
variables
Length
Depth
Volume
Time
Weight
4.2: Probability Distributions
for Discrete Random Variables
The probability distribution of a
discrete random variable is a graph,
table or formula that specifies the
probability associated with each
possible outcome the random variable
can assume.
$p(x) \ge 0$ for all values of x
$\sum p(x) = 1$
Say a random variable
x follows this pattern:
$p(x) = (.3)(.7)^{x-1}$ for x > 0.
This table gives the
probabilities (rounded
to two digits) for x
between 1 and 10.
 x   p(x)
 1   .30
 2   .21
 3   .15
 4   .10
 5   .07
 6   .05
 7   .04
 8   .02
 9   .02
10   .01
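The table can be reproduced directly from the formula. The following is a minimal Python sketch; the function name `p` and the variable names are illustrative choices, not part of the slides.

```python
# Probabilities p(x) = (.3)(.7)^(x-1) for x = 1, ..., 10, rounded to two digits.
def p(x):
    """Probability that the random variable equals x (x = 1, 2, 3, ...)."""
    return 0.3 * 0.7 ** (x - 1)

table = {x: round(p(x), 2) for x in range(1, 11)}
for x, prob in table.items():
    print(f"{x:>2}  {prob:.2f}")

# The ten probabilities do not quite sum to 1, because x can exceed 10.
total_first_ten = sum(p(x) for x in range(1, 11))
print(f"sum for x = 1..10: {total_first_ten:.4f}")
```

Summing over all x > 0 (not just the first ten values) gives exactly 1, as the second requirement on p(x) demands.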
The mean, or expected value, of a
discrete random variable is
$\mu = E(x) = \sum x\,p(x).$
The variance of a discrete random variable is
$\sigma^2 = E[(x - \mu)^2] = \sum (x - \mu)^2 p(x).$
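As a small illustration of the mean and variance of a discrete random variable, the Python sketch below uses a made-up example that is not taken from the slides at this point: x is the number of heads in three fair coin flips, so p(0) = 1/8, p(1) = 3/8, p(2) = 3/8 and p(3) = 1/8.

```python
from math import sqrt

# Assumed example distribution: x = number of heads in three fair coin flips.
dist = {0: 1 / 8, 1: 3 / 8, 2: 3 / 8, 3: 1 / 8}

# A valid probability distribution must sum to 1.
assert abs(sum(dist.values()) - 1) < 1e-12

mu = sum(x * px for x, px in dist.items())               # sum of x * p(x)
var = sum((x - mu) ** 2 * px for x, px in dist.items())  # sum of (x - mu)^2 * p(x)
sigma = sqrt(var)

print(mu, var, round(sigma, 3))  # 1.5 0.75 0.866
```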
lose P
win P
On average, bettors lose about a nickel for each dollar they put down on a bet like this.
(These are the best bets for patrons.)
4.3: The Binomial Distribution
A Binomial Random Variable
n identical trials
Two outcomes: Success or Failure
P(S) = p; P(F) = q = 1 - p
Trials are independent
x is the number of Successes in n trials
Example: a coin flipped 3 times meets each condition
n identical trials: flip a coin 3 times
Two outcomes (Success or Failure): Heads or Tails
P(S) = p and P(F) = q = 1 - p: P(H) = .5 and P(T) = 1 - .5 = .5
Trials are independent: a head on flip i doesn't change P(H) on flip i + 1
x is the number of S's in n trials: the number of heads in the 3 flips
Results of 3 flips   Probability   Combined   Summary
HHH                  (p)(p)(p)     p^3        (1)p^3 q^0
HHT                  (p)(p)(q)     p^2 q
HTH                  (p)(q)(p)     p^2 q      (3)p^2 q^1
THH                  (q)(p)(p)     p^2 q
HTT                  (p)(q)(q)     p q^2
THT                  (q)(p)(q)     p q^2      (3)p^1 q^2
TTH                  (q)(q)(p)     p q^2
TTT                  (q)(q)(q)     q^3        (1)p^0 q^3
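The counts 1, 3, 3, 1 in the Summary column can be checked by listing all eight sequences. The Python sketch below is one way to do that; the variable names are illustrative, and p = .5 is assumed only to get concrete numbers.

```python
from collections import Counter
from itertools import product

p = 0.5      # P(Heads); assumed fair coin, any value between 0 and 1 works
q = 1 - p

ways_by_heads = Counter()   # number of sequences with a given number of heads
prob_by_heads = Counter()   # total probability for a given number of heads

for flips in product("HT", repeat=3):   # HHH, HHT, ..., TTT
    heads = flips.count("H")
    ways_by_heads[heads] += 1
    prob_by_heads[heads] += p ** heads * q ** (3 - heads)

for heads in range(3, -1, -1):
    print(heads, ways_by_heads[heads], prob_by_heads[heads])
```

Each group shares the same probability p^x q^(3-x), so the combined probability is the count of sequences times that common value.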
The Binomial Probability Distribution
$P(x) = \binom{n}{x} p^x q^{n-x}$
p = P(S) on a single trial
q = 1 - p
n = number of trials
x = number of successes
Say 40% of the
class is female.
What is the
probability that 6
of the first 10
students walking
in will be female?
$P(x) = \binom{n}{x} p^x q^{n-x}$
$P(6) = \binom{10}{6} (.4)^6 (.6)^{10-6} = 210(.004096)(.1296) = .1115$
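The same calculation can be sketched in Python. The helper name `binomial_pmf` is an illustrative choice; `math.comb` is the standard-library binomial coefficient (Python 3.8 or later).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# 40% of the class is female; probability that 6 of the first 10 are female.
prob = binomial_pmf(6, 10, 0.4)
print(comb(10, 6), round(prob, 4))  # 210 0.1115
```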
A Binomial Random Variable has
Mean: $\mu = np$
Variance: $\sigma^2 = npq$
Standard Deviation: $\sigma = \sqrt{npq}$
For example, with n = 1000 and p = .5:
$\mu = np = 1000(.5) = 500$
$\sigma^2 = npq = 1000(.5)(.5) = 250$
$\sigma = \sqrt{npq} = \sqrt{250} \approx 16$
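A short Python sketch of the same arithmetic, using the values n = 1000 and p = .5 from the example (the variable names are illustrative):

```python
from math import sqrt

n, p = 1000, 0.5
q = 1 - p

mean = n * p              # mu = np
variance = n * p * q      # sigma^2 = npq
std_dev = sqrt(variance)  # sigma = sqrt(npq)

print(mean, variance, round(std_dev, 2))  # 500.0 250.0 15.81
```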
4.5: The Normal Distribution
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$
Each combination of $\mu$ and $\sigma$ produces a
unique normal curve
The standard normal curve is used in
practice, based on the standard normal
random variable z ($\mu = 0$, $\sigma = 1$), with the
probability distribution
$f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$
The probabilities for z are given in Table IV
$P(0 < z < 1.00) = .3413$
$P(-1.00 < z < 0) = .3413$
$P(-1 < z < 1) = .3413 + .3413 = .6826$
$P(1 < z < 1.25) = P(0 < z < 1.25) - P(0 < z < 1.00) = .3944 - .3413 = .0531$
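These table lookups can be cross-checked in software. The sketch below assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the helper name `area` is illustrative. Because the table values are rounded to four places before being added or subtracted, software results can differ from the table-based figures in the last digit.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area(a, b):
    """P(a < z < b) for the standard normal random variable."""
    return z.cdf(b) - z.cdf(a)

for a, b in [(0, 1.00), (-1.00, 0), (-1, 1), (1, 1.25)]:
    print(f"P({a} < z < {b}) = {area(a, b):.4f}")
```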
For a normally distributed random variable x, if we know $\mu$ and $\sigma$,
$z_i = \frac{x_i - \mu}{\sigma}$
So any normally
distributed variable
can be analyzed
with this single
distribution
Say a toy car goes an average of 3,000 yards between
recharges, with a standard deviation of 50 yards (i.e.,
$\mu$ = 3,000 and $\sigma$ = 50)
What is the probability that the car will go more than
3,100 yards without recharging?
$P(x > 3100) = P\left(z > \frac{3100 - 3000}{50}\right)$
$= P(z > 2.00) = 1 - P(z < 2.00)$
$= 1 - .5 - P(0 < z < 2.00)$
$= 1 - .5 - .4772 = .0228$
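The toy car calculation can be sketched in Python as a check on the table lookup. This assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Distance between recharges: normal with mean 3,000 yards and sd 50 yards.
distance = NormalDist(mu=3000, sigma=50)

z = (3100 - 3000) / 50                    # z-score of 3,100 yards
p_more_than_3100 = 1 - distance.cdf(3100)  # upper-tail area beyond 3,100

print(z, round(p_more_than_3100, 4))  # 2.0 0.0228
```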
To find the probability for a normal random
variable
Sketch the normal distribution
Indicate x's mean
Convert the x variables into z values
Put both sets of values on the sketch, z below x
Use Table IV to find the desired probabilities
4.6: Descriptive Methods for
Assessing Normality
If the data are normal
A histogram or stem-and-leaf display will look like
the normal curve
The intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$ and $\bar{x} \pm 3s$ will contain
approximately the empirical rule percentages of the data
The ratio of the interquartile range to the standard
deviation will be about 1.3
A normal probability plot, a scatterplot with the
ranked data on one axis and the expected z-scores
from a standard normal distribution on the other
axis, will produce close to a straight line
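Two of the numerical checks in the list above (the IQR-to-standard-deviation ratio and the empirical rule percentages) can be sketched in Python. The data here are made up for illustration: 200 evenly spaced quantiles of a normal distribution with mean 100 and standard deviation 15, so the checks should pass. The function name `normality_summary` is hypothetical, and quartile conventions vary slightly between software packages.

```python
from statistics import NormalDist, mean, quantiles, stdev

def normality_summary(data):
    """Return IQR / s and the share of data within 1, 2 and 3 s of the mean."""
    x_bar, s = mean(data), stdev(data)
    q1, _, q3 = quantiles(data, n=4)  # quartiles (one of several conventions)
    shares = [
        sum(abs(x - x_bar) <= k * s for x in data) / len(data)
        for k in (1, 2, 3)
    ]
    return (q3 - q1) / s, shares

# Assumed, idealized data set used only to exercise the function.
model = NormalDist(mu=100, sigma=15)
data = [model.inv_cdf((i - 0.5) / 200) for i in range(1, 201)]

ratio, shares = normality_summary(data)
print(round(ratio, 2), [round(v, 2) for v in shares])
```

For data that are close to normal, the ratio should be near 1.3 and the three shares near 68%, 95% and 100%.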
Errors per MLB team in 2003
Mean: 106
Standard Deviation: 17
IQR: 22
[Figure: histogram of errors per team, 2003. Horizontal axis: errors, with bins at 77, 89.8, 102.6, 115.4, 128.2 and More. Vertical axis: frequency, 0 to 10.]
$IQR / s = 22 / 17 = 1.29$
$\bar{x} \pm s = 106 \pm 17 = (89, 123)$: 22 out of 30, 73%
$\bar{x} \pm 2s = 106 \pm 34 = (72, 140)$: 28 out of 30, 93%
$\bar{x} \pm 3s = 106 \pm 51 = (55, 157)$: 30 out of 30, 100%
A normal probability
plot is a scatterplot with
the ranked data on one
axis and the expected z-
scores from a standard
normal distribution on
the other axis
[Figure: normal probability plot of errors per team. Horizontal axis: Errors, 60 to 160. Vertical axis: Normal Quantile, -3 to 3.]
4.7: Approximating a Binomial
Distribution with the Normal Distribution
Discrete calculations may become very
cumbersome
The normal distribution may be used to
approximate discrete distributions
The larger n is, and the closer p is to .5, the
better the approximation
Since we need a range, not a value, the
correction for continuity must be used
A number r becomes r+.5
Calculate the mean plus/minus 3 standard deviations
$\mu \pm 3\sigma = np \pm 3\sqrt{npq}$
If this interval is in the range 0 to n,
the approximation will be reasonably close
Express the binomial probability as a range of values
$P(x \le a)$ or $P(x \le b) - P(x \le a)$
Find the z-values for each binomial value
$z = \frac{(a + .5) - \mu}{\sigma}$
Use the standard normal distribution to find
the probability for the range of values you calculated
Flip a coin 100 times and compare the binomial
and normal results
Binomial: $P(x = 50) = \binom{100}{50} (.5)^{50} (.5)^{50} = .0796$
Normal: $\mu = np = 100(.5) = 50$ and $\sigma = \sqrt{npq} = \sqrt{100(.5)(.5)} = 5$
$P(49.5 \le x \le 50.5) = P\left(\frac{49.5 - 50}{5} \le z \le \frac{50.5 - 50}{5}\right) = P(-0.10 \le z \le 0.10) = .0796$
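The comparison for 100 coin flips can be sketched in Python, assuming Python 3.8 or later for `math.comb` and `statistics.NormalDist`. The variable names are illustrative. The software value for the normal approximation may differ from the table-based .0796 in the fourth decimal place, since the table rounds each area before they are combined.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5
q = 1 - p
mu, sigma = n * p, sqrt(n * p * q)

# Exact binomial probability P(x = 50).
exact = comb(n, 50) * p ** 50 * q ** (n - 50)

# Normal approximation with the continuity correction: P(49.5 <= x <= 50.5).
z_low = (49.5 - mu) / sigma
z_high = (50.5 - mu) / sigma
std_normal = NormalDist()
approx = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# The approximation is reasonable when mu +/- 3 sigma lies between 0 and n.
interval_ok = 0 <= mu - 3 * sigma and mu + 3 * sigma <= n

print(round(exact, 4), round(approx, 4), interval_ok)
```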
4.8: Sampling Distributions
4.9: The Central Limit Theorem
A point estimator is a single number
based on sample data that can be used as
an estimator of the population parameter
p
s
2
2
p
x
x
o 41 .
2
3
=
=
x
x
o 0
2
1
=
=
x
x
o
Here's our small population again, this time with the standard deviations of the
sample means. Notice the mean of the sample means in each case equals the
population mean and the standard error falls as n increases.
If a random sample of n observations is
drawn from a normally distributed
population, the sampling distribution of
$\bar{x}$ will be normally distributed
The Central Limit Theorem
The sampling distribution of $\bar{x}$,
based on a random sample
of n observations, will be
approximately normal with
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
The larger the sample size, the
better the sampling
distribution will approximate
the normal distribution.
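A small simulation can make the theorem concrete. The Python sketch below draws repeated samples from a clearly non-normal population and compares the mean and standard deviation of the sample means with $\mu$ and $\sigma / \sqrt{n}$. The choice of an exponential population with mean 1 and standard deviation 1, the sample size 30, and the number of samples are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so repeated runs give the same numbers

n = 30               # observations per sample (assumed)
num_samples = 5000   # number of samples drawn (assumed)

# Assumed population: exponential with mean 1 and standard deviation 1.
pop_mu, pop_sigma = 1.0, 1.0

sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

print("mean of sample means:", round(mean(sample_means), 3))
print("sd of sample means:  ", round(stdev(sample_means), 3))
print("sigma / sqrt(n):     ", round(pop_sigma / sqrt(n), 3))
```

The first printed value should land close to 1 and the second close to the third, even though individual exponential observations are strongly skewed.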
Suppose existing houses for sale average 2200 square
feet in size, with a standard deviation of 250 ft².
What is the probability that a randomly selected house will have
at least 2300 ft²?
$P(x > 2300) = P\left(z > \frac{2300 - 2200}{250}\right) = P(z > 0.40) = .3446$
Suppose existing houses for sale average 2200 square
feet in size, with a standard deviation of 250 ft².
What is the probability that a randomly selected sample of 16
houses will average at least 2300 ft²?
$P(\bar{x} > 2300) = P\left(z > \frac{2300 - 2200}{250 / \sqrt{16}}\right) = P(z > 1.60) = .0548$
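The two house-size probabilities can be put side by side in a short Python sketch. It assumes Python 3.8 or later for `statistics.NormalDist` and, as the z-score calculation for a single house implicitly does, that individual house sizes are normally distributed. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 2200, 250   # population mean and standard deviation (square feet)

# One randomly selected house (assumes house sizes are normally distributed).
p_single = 1 - NormalDist(mu, sigma).cdf(2300)

# Mean of a random sample of n = 16 houses: standard error is sigma / sqrt(n).
n = 16
std_error = sigma / sqrt(n)
p_sample_mean = 1 - NormalDist(mu, std_error).cdf(2300)

print(std_error, round(p_single, 4), round(p_sample_mean, 4))  # 62.5 0.3446 0.0548
```

The sample mean is far less likely than a single house to reach 2300 square feet because its standard error, 62.5, is a quarter of the population standard deviation.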