Session 5-6
Reading: SfM 5.1-5.5, 6
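Probability distribution of a discrete random variable X (taking x = 0, 1, ..., 5, which reproduces the stated mean):
x:     0     1     2     3     4     5
P(x):  0.15  0.35  0.2   0.15  0.1   0.05
Mean: $\mu = E(X) = \sum_{i=1}^{N} x_i P(X = x_i) = 1.85$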
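Variance: $\sigma^2 = \sum_{i=1}^{N} (x_i - \mu)^2 P(X = x_i)$
Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{\sum_{i=1}^{N} (x_i - \mu)^2 P(X = x_i)}$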
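As a quick check of the mean, variance and standard deviation definitions, the sketch below applies them to the six probabilities listed above. The outcome values 0 to 5 are an assumption (the notes do not list them); they are chosen because they reproduce the stated mean of 1.85.

```python
import math

# Assumption: the outcomes are 0..5; the notes only list the probabilities.
xs = [0, 1, 2, 3, 4, 5]
ps = [0.15, 0.35, 0.2, 0.15, 0.1, 0.05]

# Mean: sum of x_i * P(X = x_i)
mean = sum(x * p for x, p in zip(xs, ps))

# Variance: sum of (x_i - mean)^2 * P(X = x_i)
variance = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(f"mean={mean:.2f} variance={variance:.4f} std_dev={std_dev:.4f}")
```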
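A positive covariance indicates a positive relationship (occurrence of one makes the other more likely),
and a covariance of 0 indicates no linear relationship between the variables. Independent variables always have zero covariance, but zero covariance alone does not guarantee independence.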
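Covariance: $\sigma_{XY} = \sum_{i=1}^{N} [x_i - E(X)][y_i - E(Y)] P(x_i, y_i)$
Standard deviation: $\sigma_{X+Y} = \sqrt{\mathrm{Var}(X+Y)} = \sqrt{\sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}}$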
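Uniform distribution: every value is equally likely over the range $[a, b]$.
$f(x) = \frac{1}{b-a}$, $a \le x \le b$
Mean: $\mu = \frac{a+b}{2}$
Variance: $\sigma^2 = \frac{(b-a)^2}{12}$
Standard Deviation: $\sigma = \frac{b-a}{\sqrt{12}}$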
The binomial distribution is symmetrical at p=0.5.
As p increases beyond 0.5, the distribution becomes skewed to
the left.
The probabilities of the outcomes at a given value p are the same
as those for q = 1 - p, except in reverse order.
Q: In 10 tosses of an honest coin, what are the chances of a)
Exactly 7 heads b) Less than 5 heads?
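The coin-toss question can be worked through with the binomial formula P(X = k) = C(n, k) p^k q^(n-k). The sketch below is one way to do this with the Python standard library; the function name `binomial_pmf` is an illustrative choice, not something from the reading.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # 10 tosses of an honest coin

# a) Exactly 7 heads: C(10, 7) / 2^10 = 120 / 1024
exactly_7 = binomial_pmf(7, n, p)

# b) Less than 5 heads: sum the probabilities for 0, 1, 2, 3 and 4 heads
fewer_than_5 = sum(binomial_pmf(k, n, p) for k in range(5))

print(f"P(X = 7) = {exactly_7:.4f}")
print(f"P(X < 5) = {fewer_than_5:.4f}")
```

This gives roughly 0.117 for exactly 7 heads and roughly 0.377 for fewer than 5 heads.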
Hypergeometric Distribution
Whereas the binomial distribution applies when the sample data are selected with replacement
from a finite pool (or without replacement from an infinite pool), the hypergeometric
distribution applies when the samples are taken from a finite pool without
replacement.
If n samples are taken from a population of size N, and A members of the population are
of interest, then the probability of exactly x successes out of n samples is:
$P(X = x \mid N, A, n) = \frac{\binom{A}{x}\binom{N-A}{n-x}}{\binom{N}{n}}$
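To make the hypergeometric probability concrete, the sketch below evaluates the standard formula C(A, x) C(N-A, n-x) / C(N, n). The numbers used (a pool of 20 items, 5 of interest, 4 drawn) are hypothetical and chosen only for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, A, n):
    """Probability of exactly x successes when n items are drawn without
    replacement from a pool of N items, A of which count as successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# Hypothetical example: N = 20 items, A = 5 of interest, n = 4 drawn.
N, A, n = 20, 5, 4
p_two = hypergeom_pmf(2, N, A, n)  # 10 * 105 / 4845

print(f"P(X = 2) = {p_two:.4f}")
```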
Poisson Distribution
Characteristics of the Poisson Process:
The process is applied to a discrete random variable that takes integer values
The average value of the random variable over the given time period is already
known.
$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}$, where $\lambda$ is that average value.
If a binomial process has a large number of trials (n>20) and a small probability of
success (p<0.05), we can use the Poisson formula after substituting the binomial
mean np.
$P(X = x) = \frac{(np)^x e^{-np}}{x!}$
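The quality of this approximation can be checked numerically. The sketch below compares the exact binomial probability with the Poisson value for a hypothetical case (n = 100, p = 0.02, x = 3) that satisfies the n > 20, p < 0.05 rule of thumb; the numbers are illustrative and not from the reading.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x events when the mean is lam."""
    return lam**x * exp(-lam) / factorial(x)

# Hypothetical case: many trials, small success probability.
n, p, x = 100, 0.02, 3
lam = n * p  # substitute the binomial mean np for the Poisson mean

exact = binomial_pmf(x, n, p)
approx = poisson_pmf(x, lam)

print(f"binomial={exact:.4f} poisson={approx:.4f} diff={abs(exact - approx):.4f}")
```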
Exponential distribution: $f(X) = \lambda e^{-\lambda X}$
Mean = $1/\lambda$ = standard deviation.
$P(x \le X) = 1 - e^{-\lambda X}$
Integrating between two values X1, X2, gives us the probability of the variable
falling between those two values.
Normal distribution
$P(X_1 < X < X_2) = \int_{X_1}^{X_2} f(x)\,dx$
$P(X < X_1) = \int_{-\infty}^{X_1} f(x)\,dx$
$P(X > X_2) = \int_{X_2}^{\infty} f(x)\,dx$
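For the normal distribution, we can find the probability of the variable being below a certain value by using the Z-table in
the book.
Calculate $Z = \frac{X - \mu}{\sigma}$ and look up the corresponding probability. For the probability between two values X1 and X2,
calculate P(Z(X2))-P(Z(X1)).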
Evaluating skewness
Can be accomplished with a quantile-quantile plot or the normal probability plot.
Calculate the quantiles (5%, 10%, etc) of the data.
Map them in a scatter plot, with the actual values on the Y-axis, and the theoretical
normal distribution Z-value of each quantile (-1.65, -1.28, etc.) on the X-axis.
If the plot rises sharply at the higher Z-values, the data are right-skewed; if it instead falls away sharply at the
lower Z-values, the data are left-skewed.
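The steps above can be sketched with the standard library alone. The function name `qq_points` and the sample data are hypothetical; the data set is deliberately right-skewed so that the upper quantiles pull away from the median. Plotting the returned pairs with any charting tool gives the quantile-quantile plot.

```python
from statistics import NormalDist, quantiles

def qq_points(data, n=20):
    """Return (z, sample quantile) pairs at the 5%, 10%, ..., 95% levels."""
    sample_q = quantiles(data, n=n)          # n - 1 cut points of the data
    probs = [i / n for i in range(1, n)]     # 0.05, 0.10, ..., 0.95
    z_values = [NormalDist().inv_cdf(q) for q in probs]  # theoretical Z-values
    return list(zip(z_values, sample_q))

# Hypothetical right-skewed sample (long tail of large values).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 11, 15, 22, 30]
points = qq_points(data)

for z, q in points:
    print(f"z={z:+.2f}  quantile={q:.2f}")

# Compare the spread above and below the median as a rough skew check.
median = points[9][1]
upper_gap = points[-1][1] - median
lower_gap = median - points[0][1]
```

For this sample the gap between the 95% quantile and the median is much larger than the gap between the median and the 5% quantile, which is the sharp rise at high Z-values described above.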
Problem
On average, 50 candidates selected out of 500. Probability of there being between
30 and 40 selections?
Since this is close to a Binomial dist. we use those formulas to calculate $\sigma = \sqrt{npq} = \sqrt{500 \times \frac{50}{500} \times \frac{500-50}{500}} = 6.708$
P(30<=X<=40)=P(Z(40))-P(Z(30))= P(-1.49)-P(-2.98)
=0.0681-0.0014=0.0667
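The calculation above can be reproduced without a Z-table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; because it does not round Z to two decimals, the result differs from the table-based answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

n, selected = 500, 50
p = selected / n                 # probability of selection = 0.1
mu = n * p                       # binomial mean np = 50
sigma = sqrt(n * p * (1 - p))    # binomial standard deviation sqrt(npq)

# Convert the bounds to Z-values
z40 = (40 - mu) / sigma
z30 = (30 - mu) / sigma

# P(30 <= X <= 40) = P(Z(40)) - P(Z(30))
std_normal = NormalDist()
prob = std_normal.cdf(z40) - std_normal.cdf(z30)

print(f"sigma={sigma:.3f} z40={z40:.2f} z30={z30:.2f} prob={prob:.4f}")
```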
The annual household income of 300 surveyed families has a mean of 16 lakhs with
stdev 90,000. How many families have an income between 10 and 15 lakh?
Ans: 40 approx
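One way to arrive at this answer, assuming incomes are normally distributed and working in lakhs (1 lakh = 100,000, so a standard deviation of 90,000 is 0.9 lakh), is sketched below.

```python
from statistics import NormalDist

families = 300
mean_lakh = 16.0
stdev_lakh = 0.9  # 90,000 expressed in lakhs

# Assumption: household income follows a normal distribution.
income = NormalDist(mu=mean_lakh, sigma=stdev_lakh)

# P(10 < X < 15) = P(X < 15) - P(X < 10)
prob = income.cdf(15) - income.cdf(10)
expected_families = families * prob

print(f"prob={prob:.4f} expected families={expected_families:.1f}")
```

The lower bound of 10 lakh lies more than six standard deviations below the mean, so almost all of the probability comes from P(X < 15), with Z = (15 - 16)/0.9, about -1.11.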