Probability Distributions
Random Variables
Expected Value
And Normal Distributions
Random Variable
Outcomes of an experiment expressed numerically
e.g.: Toss a die twice; count the number of times the number 4 appears (0, 1 or 2 times)
Discrete Random Variable
Obtained by counting (1, 2, 3, etc.)
Usually a finite number of different values
e.g.: Toss a coin five times; count the number of tails (0, 1, 2, 3, 4, or 5 times)
Discrete Probability Distribution Example
Event: Toss 2 coins. Count the number of tails.
Probability Distribution
Values (number of tails)   Probability
0                          1/4 = .25
1                          2/4 = .50
2                          1/4 = .25
Discrete Probability Distribution
List of all possible [\(X_j\), \(P(X_j)\)] pairs
\(X_j\) = value of random variable
\(P(X_j)\) = probability associated with value
Mutually exclusive (nothing in common)
Collectively exhaustive (nothing left out)
\[0 \le P(X_j) \le 1, \qquad \sum_j P(X_j) = 1\]
Summary Measures
Expected value (the mean)
Weighted average of the probability distribution:
\[\mu = E(X) = \sum_j X_j P(X_j)\]
Summary Measures (continued)
Example of expected value (the mean):
Toss two coins, count the number of tails, compute expected value
\[\mu = \sum_j X_j P(X_j) = (0)(.25) + (1)(.5) + (2)(.25) = 1\]
Summary Measures (continued)
Variance
Weighted average squared deviation about the mean
\[\sigma^2 = E[(X - \mu)^2] = \sum_j (X_j - \mu)^2 P(X_j)\]
Summary Measures (continued)
Example of variance:
Toss two coins, count number of tails, compute variance
\[\sigma^2 = \sum_j (X_j - \mu)^2 P(X_j) = (0 - 1)^2(.25) + (1 - 1)^2(.5) + (2 - 1)^2(.25) = .5\]
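The two-coin example can be checked numerically. The sketch below is only an illustration: the dictionary layout and the helper names `expected_value` and `variance` are assumptions made here, not part of the slides.

```python
# Two-coin example: value of the random variable (number of tails) -> probability.
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

def expected_value(dist):
    """Weighted average of the values, using the probabilities as weights."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Weighted average squared deviation about the mean."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

mean = expected_value(distribution)
var = variance(distribution)
print(mean, var)
```

The same two helpers work for any discrete distribution whose probabilities sum to 1.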
Covariance and its Application
\[\sigma_{XY} = \sum_{i=1}^{N} [X_i - E(X)][Y_i - E(Y)] P(X_i Y_i)\]
\(X\) : discrete random variable
\(X_i\) : \(i\)th outcome of \(X\)
\(Y\) : discrete random variable
\(Y_i\) : \(i\)th outcome of \(Y\)
\(P(X_i Y_i)\) : probability of occurrence of the \(i\)th outcome of \(X\) and the \(i\)th outcome of \(Y\)

Computing the Mean for Investment Returns
Return per $1,000 for two types of investments
P(XiYi)   Economic condition    Dow Jones fund X   Growth Stock Y
.2        Recession             -$100              -$200
.5        Stable Economy        +$100              +$50
.3        Expanding Economy     +$250              +$350
\(E(X) = \mu_X = (-100)(.2) + (100)(.5) + (250)(.3) = \$105\)
\(E(Y) = \mu_Y = (-200)(.2) + (50)(.5) + (350)(.3) = \$90\)

Computing the Variance for Investment Returns
P(XiYi)   Economic condition    Dow Jones fund X   Growth Stock Y
.2        Recession             -$100              -$200
.5        Stable Economy        +$100              +$50
.3        Expanding Economy     +$250              +$350
\(\sigma_X^2 = (.2)(-100 - 105)^2 + (.5)(100 - 105)^2 + (.3)(250 - 105)^2 = 14{,}725\), so \(\sigma_X = 121.35\)
\(\sigma_Y^2 = (.2)(-200 - 90)^2 + (.5)(50 - 90)^2 + (.3)(350 - 90)^2 = 37{,}900\), so \(\sigma_Y = 194.68\)
Important Discrete Probability Distributions
Discrete probability distributions: Binomial, Hypergeometric, Poisson

The Binomial Random Variable
– An experiment of n identical trials
– 2 possible outcomes on each trial, denoted as S (success) and F (failure)
– Probability of success (p) is constant from trial to trial. Probability of failure (q) is 1 − p
– Trials are independent
– Binomial random variable: number of S's in n trials
Computer retailer selling desktop (D) and laptop (L) PCs online. Sales of 80% desktop, 20% laptop. What is the probability that the next 4 sales are laptops?
Sample points for next 4 online purchases, grouped by number of laptops
0 laptops: DDDD
1 laptop:  LDDD, DLDD, DDLD, DDDL
2 laptops: LLDD, LDLD, LDDL, DLLD, DLDL, DDLL
3 laptops: DLLL, LDLL, LLDL, LLLD
4 laptops: LLLL
Use multiplicative rule to calculate probabilities of the possible outcomes
\(P(DDDD) = (.8)(.8)(.8)(.8) = (.8)^4 = .4096\)
\(P(LDDD) = (.2)(.8)(.8)(.8) = (.2)(.8)^3 = .1024\)
…
\(P(LLLL) = (.2)(.2)(.2)(.2) = (.2)^4 = .0016\)
What is the probability that 3 of the next 4 online sales are laptops?
P(3 of the next 4 customers purchase laptops) = \(4(.2)^3(.8) = 4(.0064) = .0256\)
What is the probability that 3 of the next 4 online sales are desktops?
P(3 of the next 4 customers purchase desktops) = \(4(.8)^3(.2) = 4(.1024) = .4096\)
Do you see a pattern?
Formula for the probability distribution p(x)
\[p(x) = \binom{n}{x} p^x q^{n-x}\]
Where p = probability of success on single trial
q = 1 − p
n = number of trials
x = number of successes in n trials
\(\binom{n}{x} = \frac{n!}{x!(n - x)!}\)
Mean: \(\mu = np\)
Variance: \(\sigma^2 = npq\)
Standard deviation: \(\sigma = \sqrt{npq}\)
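The binomial formula can be tried out on the laptop example (n = 4, p = .2). This is a minimal sketch; the function name `binomial_pmf` is an assumption made here for illustration.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x): number of ways to place x successes, times p^x q^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 4, 0.2  # four online sales; "success" means a laptop sale
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = n * p                      # mu = np
std_dev = sqrt(n * p * (1 - p))   # sigma = sqrt(npq)

print([round(v, 4) for v in pmf])
print(mean, round(std_dev, 4))
```

The entry for x = 3 reproduces the .0256 found by counting sample points, and the five probabilities sum to 1.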

Using Binomial Tables
Binomial tables are cumulative tables; entries represent cumulative binomial probabilities P(x ≤ k)
Make use of additive and complementary properties to calculate probabilities of individual x's, or x being greater than a particular value.
If x ≤ 2, and p = .2, n = 10, then P(x ≤ 2) = .678
If x = 2, and p = .2, n = 10, then P(x = 2) = P(x ≤ 2) − P(x ≤ 1) = .678 − .376 = .302
If x > 2, and p = .2, n = 10, then P(x > 2) = 1 − P(x ≤ 2) = 1 − .678 = .322
Binomial probabilities for n = 10 (partial table)
     p
k    .01     .05     .10     .20     .30
0    .904    .599    .349    .107    .028
1    .996    .914    .736    .376    .149
2    1.000   .988    .930    .678    .383
3    1.000   .999    .987    .879    .650
4    1.000   1.000   .998    .967    .850
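The table lookups above can be reproduced by summing binomial probabilities directly. The sketch below assumes a helper named `binomial_cdf` (a name chosen here, not taken from the slides) and uses n = 10, p = .2 as in the example.

```python
from math import comb

def binomial_cdf(k, n, p):
    """Cumulative probability P(X <= k), i.e. one entry of a cumulative binomial table."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

n, p = 10, 0.2
p_le_2 = binomial_cdf(2, n, p)   # table entry for k = 2
p_le_1 = binomial_cdf(1, n, p)   # table entry for k = 1
p_eq_2 = p_le_2 - p_le_1         # additive property
p_gt_2 = 1 - p_le_2              # complementary property

print(round(p_le_2, 3), round(p_eq_2, 3), round(p_gt_2, 3))
```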

Expected Values of Discrete Random Variables
Probability Rules for a Discrete Random Variable
Chebyshev's Rule: applies to any distribution
Empirical Rule: applies to mound-shaped and symmetric distributions
Interval                    Chebyshev's Rule   Empirical Rule
P(µ − σ < x < µ + σ)        ≥ 0                ≈ .68
P(µ − 2σ < x < µ + 2σ)      ≥ 3/4              ≈ .95
P(µ − 3σ < x < µ + 3σ)      ≥ 8/9              ≈ 1.00
Poisson Distribution
Poisson process
Discrete events in an "interval"
The probability of one success in an interval is stable
The probability of more than one success in this interval is 0
The probability of success is independent from interval to interval
\[P(X = x \mid \lambda) = \frac{e^{-\lambda} \lambda^x}{x!}\]
e.g.: The number of customers arriving in 15 minutes
e.g.: The number of defects per case of light bulbs
Poisson Distribution Characteristics
Mean: \(\mu = E(X) = \lambda = \sum_{i=1}^{N} X_i P(X_i)\)
Standard deviation and variance: \(\sigma^2 = \lambda\), \(\sigma = \sqrt{\lambda}\)
[Figure: P(X) plotted against X for λ = 0.5 and for λ = 6]
Poisson Probability Distribution Function
\[P(X) = \frac{e^{-\lambda} \lambda^X}{X!}\]
\(P(X)\) : probability of X "successes" given λ
X : number of "successes" per unit
λ : expected (average) number of "successes"
e : 2.71828 (base of natural logs)
e.g.: Find the probability of four customers arriving in three minutes when the mean is 3.6.
\[P(X = 4) = \frac{e^{-3.6}(3.6)^4}{4!} = .1912\]
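The Poisson calculation in the example can be sketched in a few lines. The function name `poisson_pmf` is an assumption used only for this illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

# Four customers in three minutes when the mean is 3.6.
p_four = poisson_pmf(4, 3.6)
print(round(p_four, 4))
```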
Continuous Random Variables
Continuous Probability Distributions
Continuous Probability Distribution: areas under the curve correspond to probabilities for x
Area A corresponds to the probability that x lies between a and b
Do you see the similarity in shape between the continuous and discrete
probability distributions?
The Normal Distribution
"Bell shaped"
Symmetrical
Mean, median and mode are equal
Interquartile range equals 1.33 σ
Random variable has infinite range
[Figure: bell-shaped curve f(X) plotted against X, with the mean, median and mode at µ]
The Mathematical Model
\[f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2\sigma^2}(X - \mu)^2}\]
\(f(X)\) : density of random variable X
\(\pi = 3.14159\); \(e = 2.71828\)
\(\mu\) : population mean
\(\sigma\) : population standard deviation
X : value of random variable (\(-\infty < X < \infty\))
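To make the density formula concrete, here is a small sketch that evaluates it directly. The function name `normal_pdf` and the sample points are assumptions chosen for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal density f(x) for population mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / sqrt(2 * pi * sigma ** 2)

peak = normal_pdf(0.0, 0.0, 1.0)    # height of the curve at the mean
left = normal_pdf(-1.0, 0.0, 1.0)   # one standard deviation below the mean
right = normal_pdf(1.0, 0.0, 1.0)   # one standard deviation above the mean
print(round(peak, 4), round(left, 4), round(right, 4))
```

The equal values at −1 and +1 reflect the symmetry of the curve about its mean. Note that f(X) is a density, not a probability; probabilities come from areas under the curve.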
A normal random variable has a probability distribution called a normal distribution
Bell-shaped curve
Symmetrical about its mean μ
Spread determined by the value of its standard deviation σ
The mean and standard deviation affect the flatness and center of the curve, but not the basic shape
Probabilities associated with values or ranges of a random
variable correspond to areas under the normal curve
Calculating probabilities can be simplified by working with a
Standard Normal Distribution
A Standard Normal Distribution is a Normal distribution with
µ =0 and σ =1
The standard normal random variable is denoted by the symbol z
Table for Standard Normal Distribution contains probability for the area between 0 and z
Partial table below shows components of table
Value of z is a combination of column and row
Probability associated with a particular z value, in this case z = .13: p(0 < z < .13) = .0517
Z    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
.0   .0000  .0040  .0080  .0120  .0160  .0199  .0239  .0279  .0319  .0359
.1   .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
.2   .0793  .0832  .0871  .0910  .0948  .0987  .1026  .1064  .1103  .1141
.3   .1179  .1217  .1255  .1293  .1331  .1368  .1406  .1443  .1480  .1517
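Entries of a 0-to-z table like the one above can be reproduced with the error function from the standard library. This is a sketch; the helper name `area_zero_to_z` is an assumption made here.

```python
from math import erf, sqrt

def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return 0.5 * erf(z / sqrt(2))

area = area_zero_to_z(0.13)
print(round(area, 4))
# One row of the table: z = .10, .11, ..., .19
print([round(area_zero_to_z(0.1 + j / 100), 4) for j in range(10)])
```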
Many Normal Distributions
There are an infinite number of normal distributions
By varying the parameters σ and µ, we obtain different normal distributions
Finding Probabilities
Probability is the area under the curve!
\(P(c \le X \le d) = ?\)
[Figure: normal curve f(X) plotted against X, with the area between c and d shaded]
Which Table to Use?
An infinite number of normal distributions means an infinite number of tables to look up!
Solution: The Cumulative Standardized Normal Distribution
Cumulative Standardized Normal Distribution Table (Portion), \(\mu_Z = 0\), \(\sigma_Z = 1\)
Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
For Z = 0.12 the table gives a cumulative probability of .5478
[Figure: standardized normal curve with the area to the left of Z = 0.12 shaded (shaded area exaggerated)]
Only one table is needed.
What is P(-1.33 < z < 1.33)?
Table gives us area A1
Symmetry about the mean tells us that A2 = A1
P(-1.33 < z < 1.33) = P(-1.33 < z < 0) + P(0 < z < 1.33) = A2 + A1 = .4082 + .4082 = .8164
What is P(z < .67)?
Table gives us area A1
Symmetry about the mean tells us that A2 = .5
P(z < .67) = A1 + A2 = .2486 + .5 = .7486

What is P(|z| > 1.96)?
Table gives us area .5 − A2 = .4750, so A2 = .0250
Symmetry about the mean tells us that A2 = A1
P(|z| > 1.96) = A1 + A2 = .0250 + .0250 = .05
What if values of interest were not normalized? We want to know P(8 < x < 12), with μ = 10 and σ = 1.5
Convert to standard normal using
\[z = \frac{x - \mu}{\sigma}\]
P(8 < x < 12) = P(-1.33 < z < 1.33) = 2(.4082) = .8164
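The standardizing step can be sketched with `statistics.NormalDist` from the Python standard library. Two results are computed: one that rounds z to two decimals, mimicking a printed table, and one without rounding. The variable names are assumptions made for this illustration.

```python
from statistics import NormalDist

mu, sigma = 10.0, 1.5

# Standardize the endpoints and round to two decimals, as a table lookup requires.
z_low = round((8 - mu) / sigma, 2)
z_high = round((12 - mu) / sigma, 2)

std_normal = NormalDist()  # mean 0, standard deviation 1
table_style = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Same probability without rounding z, computed directly on the original scale.
x_dist = NormalDist(mu, sigma)
unrounded = x_dist.cdf(12) - x_dist.cdf(8)

print(z_low, z_high, round(table_style, 3), round(unrounded, 3))
```

The table-style value agrees with the slide's .8164 to three decimals; the unrounded value is slightly larger because z is really 1.333... rather than 1.33.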
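Standardizing Example
\[Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12\]
Normal Distribution: \(\mu = 5\), \(\sigma = 10\), X = 6.2
Standardized Normal Distribution: \(\mu_Z = 0\), \(\sigma_Z = 1\), Z = 0.12
[Figure: the two curves side by side with the corresponding areas shaded (shaded area exaggerated)]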
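Example: \(P(2.9 \le X \le 7.1) = .1664\)
\[Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21\]
Normal Distribution: \(\mu = 5\), \(\sigma = 10\)
Standardized Normal Distribution: \(\mu_Z = 0\), \(\sigma_Z = 1\)
[Figure: shaded area of .0832 on each side of the mean, between X = 2.9 and X = 7.1, equivalently between Z = −0.21 and Z = 0.21]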
Example: \(P(2.9 \le X \le 7.1) = .1664\) (continued)
Cumulative Standardized Normal Distribution Table (Portion), \(\mu_Z = 0\), \(\sigma_Z = 1\)
Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
For Z = 0.21 the table gives a cumulative probability of .5832 (shaded area exaggerated in the figure)
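Example: \(P(2.9 \le X \le 7.1) = .1664\) (continued)
Cumulative Standardized Normal Distribution Table (Portion)
Z      .00     .01     .02
-0.3   .3821   .3783   .3745
-0.2   .4207   .4168   .4129
-0.1   .4602   .4562   .4522
0.0    .5000   .4960   .4920
For Z = -0.21 the table gives a cumulative probability of .4168 (shaded area exaggerated in the figure)
The probability between the two Z values is the difference of the cumulative probabilities: .5832 − .4168 = .1664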
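Example: \(P(X \ge 8) = .3821\)
\[Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30\]
Normal Distribution: \(\mu = 5\), \(\sigma = 10\)
Standardized Normal Distribution: \(\mu_Z = 0\), \(\sigma_Z = 1\)
[Figure: shaded area of .3821 to the right of X = 8, equivalently to the right of Z = 0.30]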
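Example: \(P(X \ge 8) = .3821\) (continued)
Cumulative Standardized Normal Distribution Table (Portion)
Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
For Z = 0.30 the table gives a cumulative probability of .6179 (shaded area exaggerated in the figure)
The upper-tail probability is the complement: 1 − .6179 = .3821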
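Finding Z Values for Known Probabilities
What is Z given that the area between 0 and Z is 0.1217?
The corresponding cumulative probability is .5 + .1217 = .6217
Cumulative Standardized Normal Distribution Table (Portion), \(\mu_Z = 0\), \(\sigma_Z = 1\)
Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
.6217 lies in row 0.3 and column .01, so Z = .31
[Figure: standardized normal curve with the area up to Z = .31 shaded (shaded area exaggerated)]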
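Recovering X Values for Known Probabilities
Normal Distribution: \(\mu = 5\), \(\sigma = 10\), X = ?
Standardized Normal Distribution: \(\mu_Z = 0\), \(\sigma_Z = 1\), Z = 0.30
[Figure: cumulative area .6179 to the left of Z = 0.30 and area .3821 to the right]
\[X = \mu + Z\sigma = 5 + (.30)(10) = 8\]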
Steps for Finding a Probability Corresponding to a Normal Random Variable
•Sketch the distribution, locate mean, shade area of interest
•Convert to standard z values using \(z = \frac{x - \mu}{\sigma}\)
•Add z values to the sketch
•Use tables to calculate probabilities, making use of symmetry property where necessary
Given a normally distributed variable x with mean 100,000 and standard deviation of 10,000, what value of x identifies the top 10% of the distribution?
\[P(x \le x_0) = P\left(z \le \frac{x_0 - \mu}{\sigma}\right) = P\left(z \le \frac{x_0 - 100{,}000}{10{,}000}\right) = .90\]
The z value corresponding with an area of .40 between 0 and z (.90 − .50) is 1.28. Solving for x0:
x0 = 100,000 + 1.28(10,000) = 100,000 + 12,800 = 112,800
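The same cutoff can be found without a table by inverting the cumulative distribution. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library; the variable names are assumptions for illustration.

```python
from statistics import NormalDist

mu, sigma = 100_000, 10_000

# z value below which 90% of the standard normal distribution lies.
z_90 = NormalDist().inv_cdf(0.90)

# Cutoff on the original scale: x0 = mu + z * sigma.
x0 = mu + z_90 * sigma

print(round(z_90, 4), round(x0, 1))
```

The result differs slightly from 112,800 because the table value 1.28 is rounded to two decimals.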
Assessing Normality
Not all continuous random variables are normally distributed
It is important to evaluate how well the data set is approximated by a normal distribution
Assessing Normality (continued)
Construct charts
For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
For large data sets, does the histogram or polygon appear bell-shaped?
Compute descriptive summary measures
Do the mean, median and mode have similar values?
Is the interquartile range approximately 1.33 σ?
Is the range approximately 6 σ?
Assessing Normality (continued)
Observe the distribution of the data set
Do approximately 2/3 of the observations lie between mean ± 1 standard deviation?
Do approximately 4/5 of the observations lie between mean ± 1.28 standard deviations?
Do approximately 19/20 of the observations lie between mean ± 2 standard deviations?
Evaluate normal probability plot
Do the points lie on or close to a straight line with positive slope?
Assessing Normality (continued)
Normal probability plot
Arrange data into ordered array
Find corresponding standardized normal quantile values
Plot the pairs of points with observed data values on the vertical axis and the standardized normal quantile values on the horizontal axis
Evaluate the plot for evidence of linearity
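The plotting steps above can be sketched as follows. The slides do not say how the quantile values are chosen, so the plotting position (i − 0.5)/n used here is an assumption (one common convention among several). The sample data and the function name `normal_plot_points` are hypothetical.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each ordered observation with a standardized normal quantile."""
    ordered = sorted(data)                      # step 1: ordered array
    n = len(ordered)
    std_normal = NormalDist()
    # Step 2: quantile for the i-th smallest value.
    # Assumption: plotting position (i - 0.5) / n.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    # Step 3: (horizontal, vertical) = (normal quantile, observed value).
    return list(zip(quantiles, ordered))

sample = [62, 48, 71, 55, 59, 66, 52]  # hypothetical observations
points = normal_plot_points(sample)
for z, x in points:
    print(round(z, 2), x)
```

Plotting these pairs with any charting tool and checking whether they fall near a straight line with positive slope completes the last step.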
Assessing Normality (continued)
[Figure: normal probability plot for a normal distribution, X on the vertical axis against Z on the horizontal axis; look for a straight line]
[Figure: normal probability plots, X against Z, for left-skewed, right-skewed, rectangular and U-shaped distributions]
Example
e.g.: Customers arrive at the checkout line of a supermarket at the rate of 30 per hour. What is the probability that the arrival time between consecutive customers is greater than 5 minutes?
\(\lambda = 30\), \(X = 5/60\) hours
\[P(\text{arrival time} > X) = 1 - P(\text{arrival time} \le X) = 1 - \left(1 - e^{-30(5/60)}\right) = .0821\]
Descriptive Methods for Assessing
Normality
•Evaluate the shape from a histogram or
stem-and-leaf display
•Compute intervals about mean x ± s, x ± 2s, x ± 3s
and corresponding percentages
•Compute IQR and divide by standard
deviation. Result is roughly 1.3 if normal
•Use statistical package to evaluate a
normal probability plot for the data
Approximating a Binomial Distribution with a Normal Distribution
You can use a Normal Distribution as an approximation of a Binomial Distribution for large values of n
Often needed given limitation of binomial tables
Need to add a correction for continuity, because of the discrete nature of the binomial distribution
Correction is to add .5 to x when converting to standard z values
Rule of thumb: interval µ ± 3σ should be within range of binomial random variable (0 to n) for normal distribution to be adequate approximation
Approximating a Binomial Distribution with a Normal Distribution
Steps
Determine n and p for the binomial distribution
Calculate the interval \(\mu \pm 3\sigma = np \pm 3\sqrt{npq}\)
Express binomial probability in the form P(x ≤ a) or P(x ≤ b) − P(x ≤ a)
Calculate z value for each a, applying continuity correction
Sketch normal distribution, locate a's and use table to solve
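The steps above can be sketched as follows for P(x ≤ a). The values n = 100, p = .2 and a = 25 are hypothetical, chosen only so that the rule of thumb holds; they do not come from the slides. The exact binomial sum is included for comparison.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.2   # hypothetical binomial parameters
a = 25            # hypothetical cutoff: we want P(x <= a)

mu = n * p
sigma = sqrt(n * p * (1 - p))

# Rule of thumb: mu +/- 3 sigma should lie inside the range 0..n.
low, high = mu - 3 * sigma, mu + 3 * sigma
rule_of_thumb_ok = 0 <= low and high <= n

# Exact binomial probability, for comparison.
exact = sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(a + 1))

# Normal approximation with the continuity correction (add .5 to a).
z_corrected = (a + 0.5 - mu) / sigma
approx_corrected = NormalDist().cdf(z_corrected)

# Without the correction, for contrast.
approx_uncorrected = NormalDist().cdf((a - mu) / sigma)

print(rule_of_thumb_ok, round(exact, 4),
      round(approx_corrected, 4), round(approx_uncorrected, 4))
```

In this hypothetical case the corrected approximation lands closer to the exact value than the uncorrected one, which is the purpose of the continuity correction.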
