
DOM101 2016

Session 5-6
Reading: SfM 5.1-5.5, 6

Probability distribution of a discrete variable


A discrete variable takes only distinct, separate values. These values need not be integers, but they do not form a continuous range.

The probability distribution of a discrete variable is a mutually exclusive list of all possible numerical outcomes, along with the probability of each outcome occurring.
E.g.: the number of possible absentees in an office:

No. of absentees    Probability
0                   0.15
1                   0.35
2                   0.20
3                   0.15
4                   0.10
5                   0.05

Expected value of a discrete variable


The expected value of a discrete variable is the weighted average of all the outcomes, the weights being the probability scores.

$$\mu = E(X) = \sum_{i=1}^{N} x_i \, P(X = x_i)$$

In the previous example, $\mu = 0.15(0)+0.35(1)+0.2(2)+0.15(3)+0.1(4)+0.05(5) = 1.85$
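The weighted average can be checked with a few lines of Python. This is only a minimal sketch; the variable names are illustrative, and the outcomes 0 to 5 are read off the absentee example.

```python
# Number of absentees and their probabilities (from the absentee example).
outcomes = [0, 1, 2, 3, 4, 5]
probabilities = [0.15, 0.35, 0.20, 0.15, 0.10, 0.05]

# A valid probability distribution must sum to 1.
assert abs(sum(probabilities) - 1.0) < 1e-9

# Expected value: each outcome weighted by its probability.
expected_value = sum(x * p for x, p in zip(outcomes, probabilities))
print(round(expected_value, 2))  # 1.85
```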

Variance and standard deviation of discrete variable


The variance of the discrete variable is the sum of the squared differences between each outcome and the expected value, each multiplied by the probability of that outcome.

$$\sigma^2 = \sum_{i=1}^{N} [x_i - E(X)]^2 \, P(X = x_i)$$

The standard deviation is the square root of the variance.

$$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_{i=1}^{N} [x_i - E(X)]^2 \, P(X = x_i)}$$
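Applied to the absentee example, the two formulas could be evaluated as follows. This is a sketch with illustrative names, not part of the original notes.

```python
import math

# Absentee example: outcomes and their probabilities.
outcomes = [0, 1, 2, 3, 4, 5]
probabilities = [0.15, 0.35, 0.20, 0.15, 0.10, 0.05]

# Expected value E(X).
mean = sum(x * p for x, p in zip(outcomes, probabilities))

# Variance: probability-weighted squared deviations from E(X).
variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probabilities))

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(variance, 4))  # about 1.9275
print(round(std_dev, 3))   # about 1.388
```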

Covariance of a probability distribution


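Covariance measures the direction and strength of the linear relationship between two numerical variables, based on their joint probability distribution.

A negative covariance indicates a negative relationship, a positive covariance indicates a positive relationship (when one variable is above its expected value, the other tends to be as well), and a covariance of 0 indicates no linear relationship. Independent variables always have a covariance of 0, but a covariance of 0 does not by itself prove independence.

$$\sigma_{XY} = \sum_{i=1}^{N} [x_i - E(X)][y_i - E(Y)] \, P(x_i y_i)$$

$P(x_i y_i)$ is the probability of both $x_i$ and $y_i$ occurring.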
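A small numeric sketch may make the covariance sum concrete. The three scenarios, their probabilities, and the payoffs below are hypothetical values chosen only for illustration; they do not come from the notes.

```python
# Hypothetical joint distribution: each scenario gives one (x, y) pair.
joint_probs = [0.2, 0.5, 0.3]   # P(x_i y_i) for each scenario
x_values = [-100, 100, 250]     # outcomes of X (hypothetical)
y_values = [200, 50, -100]      # outcomes of Y (hypothetical)

# Expected values of X and Y under the joint distribution.
mean_x = sum(x * p for x, p in zip(x_values, joint_probs))
mean_y = sum(y * p for y, p in zip(y_values, joint_probs))

# Covariance: probability-weighted product of the two deviations.
covariance = sum(
    (x - mean_x) * (y - mean_y) * p
    for x, y, p in zip(x_values, y_values, joint_probs)
)

print(mean_x, mean_y, covariance)
```

Here X is high when Y is low, so the covariance comes out negative.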

Sum of two discrete variables


Expected value of the sum of two variables:

E(X+Y) = E(X) + E(Y)

Variance of the sum of two variables:

$$Var(X+Y) = \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}$$

Standard deviation:

$$\sigma_{X+Y} = \sqrt{\sigma_{X+Y}^2}$$

Expected return and risk for a 2-asset portfolio


In a 2-asset portfolio, a weight w1 of the total investment is assigned to investment X and a weight w2 to investment Y.

The expected value of the portfolio is the weighted average of the expected values of the two investments, E(P) = (w1E(X) + w2E(Y))/(w1+w2).

The risk of the portfolio is the standard deviation of its return:

$$\sigma_P = \sqrt{\frac{w_1^2\sigma_X^2 + w_2^2\sigma_Y^2 + 2 w_1 w_2 \sigma_{XY}}{(w_1 + w_2)^2}}$$

From this, it can be seen that investments with negative covariance reduce portfolio risk, while investments with positive covariance increase it.

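Example: invest 1 lakh in Stock A, with E(A) = 30% = 0.3 and std. dev. = 20% = 0.2, and 2 lakh in Stock B, with E(B) = 20% = 0.2 and std. dev. = 12% = 0.12. The covariance is -0.005. What are the expected return and std. dev. of the combined investment?

E(I1) = 0.3, E(I2) = 0.2, w1 = 1, w2 = 2
E(I1+I2) = (1*0.3 + 2*0.2)/(1+2) = 0.233
Stdev(I1+I2) = $\sqrt{(1^2 \cdot 0.2^2 + 2^2 \cdot 0.12^2 + 2 \cdot 1 \cdot 2 \cdot (-0.005))/(1+2)^2} = \sqrt{0.0776}/3 \approx 0.093$

Without the division by the total weight, $\sqrt{0.0776} \approx 0.28$ is the standard deviation of the total return in lakh, not of the rate of return on the 3 lakh invested.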
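The portfolio formulas can be wrapped in a small helper to check the arithmetic. This is a sketch; `portfolio_stats` is a hypothetical name, and it assumes the risk formula with the division by (w1+w2)^2 given earlier in this section.

```python
import math

def portfolio_stats(w1, w2, mean_x, mean_y, sd_x, sd_y, cov_xy):
    """Return (expected return, risk) of a 2-asset portfolio."""
    total = w1 + w2
    expected = (w1 * mean_x + w2 * mean_y) / total
    variance = (
        w1**2 * sd_x**2 + w2**2 * sd_y**2 + 2 * w1 * w2 * cov_xy
    ) / total**2
    return expected, math.sqrt(variance)

# 1 lakh in Stock A, 2 lakh in Stock B, covariance -0.005.
expected_return, risk = portfolio_stats(1, 2, 0.3, 0.2, 0.2, 0.12, -0.005)

print(round(expected_return, 3))  # 0.233
print(round(risk, 3))             # 0.093
```

Multiplying the risk by the total weight (3) gives about 0.28, which is the square root of the numerator alone.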

The Uniform Distribution


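Also called the rectangular distribution. The variable has the same chance of occurrence anywhere in its range, from a minimum a to a maximum b.

$$f(X) = \frac{1}{b-a}, \quad a \le X \le b \quad (0 \text{ elsewhere})$$

Mean: $\mu = \frac{a+b}{2}$

Variance: $\sigma^2 = \frac{(b-a)^2}{12}$

Standard Deviation: $\sigma = \sqrt{\frac{(b-a)^2}{12}}$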

The Binomial Distribution


A discrete random variable distribution created by a Bernoulli Process, which has the following properties:

It is a series of trials; each trial has only two outcomes, with probabilities p and 1-p
The value of p stays fixed over the course of the process
The trials are statistically independent

If there are n trials, the chance of obtaining exactly r successes (r<=n) is given by the binomial formula (let q = 1-p):

$$P(r) = \frac{n!}{r!\,(n-r)!} \, p^r q^{n-r}$$

Graphical results of the binomial distribution


When p is small (around 0.1), the distribution is right-skewed
As p increases, the skewness becomes less noticeable until the distribution is symmetrical at p=0.5
As p increases beyond 0.5, the distribution becomes skewed to the left.
The probabilities of the outcomes for a given value of p are the same as those for 1-p, but in reverse order.
Q: In 10 tosses of an honest coin, what are the chances of a) exactly 7 heads b) less than 5 heads?
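One way to answer the coin question is to evaluate the binomial formula directly. The sketch below uses Python's `math.comb` for n!/(r!(n-r)!); the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n Bernoulli trials."""
    q = 1 - p
    return comb(n, r) * p**r * q ** (n - r)

n, p = 10, 0.5  # 10 tosses of an honest coin

# a) exactly 7 heads
p_exactly_7 = binomial_pmf(7, n, p)

# b) less than 5 heads: r = 0, 1, 2, 3, 4
p_less_than_5 = sum(binomial_pmf(r, n, p) for r in range(5))

print(round(p_exactly_7, 4))    # 0.1172
print(round(p_less_than_5, 4))  # 0.377
```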

Central Tendency of the Binomial Distribution


Mean: np
Variance: npq
Standard deviation: $\sqrt{npq}$
Final note: To apply the binomial distribution, we must first ensure that the process meets the conditions for a Bernoulli Process.

Look up Binomial distribution table in the book.

Hypergeometric Distribution
Where the binomial distribution applies when the sample data are selected with replacement from a finite pool (or without replacement from an infinite pool), the hypergeometric distribution applies when the samples are taken from a finite pool without replacement.
If n samples are taken from a population of size N, and A members of the population are of interest, then the probability of exactly x successes out of n samples is:

$$P(X = x \mid n, N, A) = \frac{\binom{A}{x}\binom{N-A}{n-x}}{\binom{N}{n}}$$

Mean = $\frac{nA}{N}$, Std. dev. = $\sqrt{\frac{nA(N-A)}{N^2}} \cdot \sqrt{\frac{N-n}{N-1}}$
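The hypergeometric probability, mean, and standard deviation can be cross-checked numerically. In the sketch below the population values (N = 10, A = 4, n = 3) are hypothetical, and the standard deviation uses the standard textbook form with the finite population correction factor (N-n)/(N-1).

```python
from math import comb, sqrt

def hypergeom_pmf(x, n, N, A):
    """P(exactly x successes) when drawing n items without replacement
    from a population of N that contains A items of interest."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# Hypothetical numbers, chosen only for illustration.
N, A, n = 10, 4, 3

# Full distribution over the possible numbers of successes.
pmf = {x: hypergeom_pmf(x, n, N, A) for x in range(min(n, A) + 1)}

# Mean and variance computed directly from the distribution ...
mean_from_pmf = sum(x * p for x, p in pmf.items())
var_from_pmf = sum((x - mean_from_pmf) ** 2 * p for x, p in pmf.items())

# ... and from the closed-form expressions.
mean_formula = n * A / N
sd_formula = sqrt(n * A * (N - A) / N**2) * sqrt((N - n) / (N - 1))

print(pmf[2], mean_formula, sd_formula)
```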

Poisson Distribution
Characteristics of the Poisson Process:
The process is applied to a discrete random variable that takes integer values
The average value of the random variable over the given time period is already known or can be calculated from past data
At any one second, the probability of a positive outcome is very small, and fixed.
At any one second, the probability of two or more positive outcomes is so small that we can assign it a value of zero.
The probability of a positive outcome at any given second is not only fixed, but also independent of the actual time and of the result in any other second.

The Poisson Formula


Let $\lambda$ be the mean number of occurrences in the interval of time under study.
e is the base of the natural logarithm system, approx. 2.71828
Poisson probability of x incidents occurring:

$$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$$

If a binomial process has a large number of trials (n>20) and a small probability of success (p<0.05), we can use the Poisson formula after substituting the binomial mean np for $\lambda$:

$$P(x) = \frac{(np)^x e^{-np}}{x!}$$
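To see how close the Poisson approximation is, the two formulas can be compared side by side. The values n = 100 and p = 0.02 below are hypothetical, chosen to satisfy n>20 and p<0.05; the function names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x occurrences with mean lam."""
    return lam**x * exp(-lam) / factorial(x)

# Hypothetical binomial process with many trials and a small p.
n, p = 100, 0.02
lam = n * p  # substitute the binomial mean np for lambda

for x in range(6):
    exact = binomial_pmf(x, n, p)
    approx = poisson_pmf(x, lam)
    print(x, round(exact, 4), round(approx, 4))
```

For these values the two columns agree to about two decimal places.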

The Exponential Distribution


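Right-skewed, ranges from zero to +infinity. Defined by the mean number of occurrences per unit time, $\lambda$.

$f(X) = \lambda e^{-\lambda X}$
Mean = $1/\lambda$ = standard deviation.

$P(x \le X) = 1 - e^{-\lambda X}$

A store receives 5 customers an hour on average. If the time between customer arrivals follows an exponential distribution, the number of arrivals in an hour follows a Poisson distribution with $\lambda = 5$. What is the probability of receiving 10 or more customers in an hour?
$P(X \ge 10) = 1 - P(X \le 9) = 1 - \sum_{x=0}^{9} \frac{5^x e^{-5}}{x!} \approx 0.032$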
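The store example can be evaluated numerically. The sketch below assumes, as is standard, that exponential times between arrivals correspond to a Poisson count of arrivals per hour with the same rate of 5; the 0.2-hour waiting time in the second part is a hypothetical value added for illustration.

```python
from math import exp, factorial

lam = 5  # average number of customers per hour

def poisson_pmf(x, lam):
    """Poisson probability of exactly x arrivals in one hour."""
    return lam**x * exp(-lam) / factorial(x)

# P(X >= 10) = 1 - P(X <= 9), summing the Poisson terms for x = 0..9.
p_10_or_more = 1 - sum(poisson_pmf(x, lam) for x in range(10))

# Exponential CDF: probability that the next customer arrives within
# 0.2 hours (12 minutes), a hypothetical waiting time.
p_wait_within = 1 - exp(-lam * 0.2)

print(round(p_10_or_more, 4))
print(round(p_wait_within, 4))
```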

The normal distribution


It is a continuous probability distribution. Also called the Gaussian distribution.

Can be used to approximate discrete distributions, with sufficiently large samples.

Can approximate the Binomial distribution if np and nq are both greater than 5.

It is symmetrical and bell-shaped in appearance; its interquartile range runs from -0.67 standard deviations to +0.67 standard deviations around the mean.
Every continuous random variable has a probability density function, which gives the relative likelihood of the variable taking values near a particular point (the probability of any single exact value is zero). For the normal distribution:

$$f(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left[\frac{X-\mu}{\sigma}\right]^2}$$

Integrating the density between two values X1, X2 gives us the probability of the variable falling between those two values.

Normal distribution
$$P(X_1 < X < X_2) = \int_{X_1}^{X_2} f(X)\,dX$$

$$P(X < X_1) = \int_{-\infty}^{X_1} f(X)\,dX$$

$$P(X > X_2) = \int_{X_2}^{\infty} f(X)\,dX$$

Normal Distribution from Z-score


Instead of integrating the density function, for a normal distribution we can find the probability of the variable being below a certain value by using the Z-table in the book.
Calculate $Z = \frac{X - \mu}{\sigma}$; the corresponding value in the table shows the probability of the variable being less than or equal to that value.

To find the probability of the variable being between X1 and X2 (X1<X2), calculate P(Z(X2))-P(Z(X1)).
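In code, the Z-table lookup can be replaced by the standard normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper names and the example values (mean 100, standard deviation 15) are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma

def prob_between(x1, x2, mu, sigma):
    """P(x1 < X < x2) for a normal variable, i.e. P(Z(x2)) - P(Z(x1))."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    upper = standard_normal.cdf(z_score(x2, mu, sigma))
    lower = standard_normal.cdf(z_score(x1, mu, sigma))
    return upper - lower

# Hypothetical variable with mean 100 and standard deviation 15:
# probability of falling within one standard deviation of the mean.
p = prob_between(85, 115, 100, 15)
print(round(p, 4))  # 0.6827
```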

Evaluating skewness
Can be accomplished with a quantile-quantile plot or the normal probability plot.
Calculate the quantiles (5%, 10%, etc.) of the data.

Map them in a scatter plot, with the actual values on the Y-axis and the theoretical normal distribution Z-value of each quantile (-1.65, -1.28, etc.) on the X-axis.
If the points lie close to a straight line, the data are approximately normal. If the plot rises sharply at the higher Z-values, the data are right-skewed; if it drops sharply at the lower Z-values, they are left-skewed.
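The points of such a plot could be computed as sketched below, leaving the actual drawing to any plotting tool. The data set is hypothetical, and `statistics.quantiles` uses its default "exclusive" interpolation method, which is one of several reasonable conventions for sample quantiles.

```python
from statistics import NormalDist, quantiles

def qq_points(data, n=20):
    """Return (theoretical Z, sample quantile) pairs.
    n=20 gives the 5%, 10%, ..., 95% quantiles."""
    sample_q = quantiles(data, n=n)  # n - 1 cut points of the data
    z_values = [NormalDist().inv_cdf(k / n) for k in range(1, n)]
    return list(zip(z_values, sample_q))

# Hypothetical right-skewed sample (a few large values in the upper tail).
data = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 9, 14, 22, 40]

points = qq_points(data)
for z, value in points:
    print(round(z, 2), value)

# Compare how far the top and bottom quantiles sit from the median.
median = points[len(points) // 2][1]
upper_gap = points[-1][1] - median
lower_gap = median - points[0][1]
print(upper_gap > lower_gap)  # a much larger upper gap suggests right skew
```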

Problem
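On average, 50 candidates are selected out of 500. What is the probability of there being between 30 and 40 selections?
Since this is close to a Binomial distribution with n = 500 and p = 50/500, we use those formulas to calculate the mean np = 50 and

$$\sigma = \sqrt{npq} = \sqrt{500 \cdot \frac{50}{500} \cdot \frac{500-50}{500}} = 6.708$$

P(30<=X<=40) = P(Z(40)) - P(Z(30)) = P(-1.49) - P(-2.98)

= 0.0681 - 0.0014 = 0.0667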
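The same calculation can be reproduced in code and compared with the exact binomial sum. This is a sketch with illustrative variable names; the exact figure may differ somewhat from the normal approximation, since the normal curve is continuous while the number of selections is a whole number.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 50 / 500
mu = n * p                     # binomial mean np = 50
sigma = sqrt(n * p * (1 - p))  # binomial standard deviation sqrt(npq)

# Normal approximation: P(Z(40)) - P(Z(30)).
standard_normal = NormalDist()
approx = standard_normal.cdf((40 - mu) / sigma) - standard_normal.cdf((30 - mu) / sigma)

# Exact binomial probability of 30 to 40 selections, for comparison.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(30, 41))

print(round(sigma, 3))  # 6.708
print(round(approx, 4), round(exact, 4))
```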

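The annual household income of 300 surveyed families has a mean of 16 lakhs with stdev 90,000 (0.9 lakh). How many families have an income between 10 and 15 lakh?
Ans: Assuming incomes are normally distributed, Z(15) = (15-16)/0.9 = -1.11 and Z(10) = (10-16)/0.9 = -6.67, so the probability is P(-1.11) - P(-6.67) = 0.1335 - 0 = 0.1335, and 300 * 0.1335 = 40 families approx.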
