
Notes on Discrete Probability

Prakash Balachandran
February 21, 2008
1 Probability
$P(E^c) = 1 - P(E)$.
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
For any events $A$, $B$, $C$,
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C).$$
Definition: The probability of $A$ given $B$ is $P(A|B) = \frac{P(A \cap B)}{P(B)}$.
Definition: Events $\{A_i\}_{i=1}^n$ are called independent if $P\left(\bigcap_{i=1}^n A_i\right) = \prod_{i=1}^n P(A_i)$.
Let $\{A_i\}_{i=1}^n$ be a partition of the sample space $S$. Then, for any event $E$,
$$P(E) = \sum_{j=1}^n P(E \cap A_j) = \sum_{j=1}^n P(E|A_j)\, P(A_j).$$
Let $E$ be an event, and $\{A_i\}_{i=1}^n$ be a partition of the sample space $S$. Then,
$$P(A_i|E) = \frac{P(A_i \cap E)}{P(E)} = \frac{P(E|A_i)\, P(A_i)}{\sum_{j=1}^n P(E|A_j)\, P(A_j)}.$$
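As a quick worked example of Bayes' theorem (the numbers here are illustrative, not from the original notes): suppose a test flags a condition with $P(E|A_1) = 0.99$ when the condition is present ($P(A_1) = 0.01$) and with $P(E|A_2) = 0.05$ when it is absent ($P(A_2) = 0.99$). Then
$$P(A_1|E) = \frac{(0.99)(0.01)}{(0.99)(0.01) + (0.05)(0.99)} = \frac{0.0099}{0.0594} = \frac{1}{6} \approx 0.167,$$
so a positive result is still far more likely to be a false positive than a true one.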
2 Discrete Random Variables
Definition: A discrete random variable $X$ is a measurable function $X : (\Omega, \mathcal{F}, P) \to (\mathbb{R}, \mathcal{B}, \lambda)$ with countable range.
Definition: Let $X$ be a discrete random variable. A probability function for $X$ is a function $p(x)$ which assigns a probability to each value of $X$, such that:
1. $p(x) \ge 0$ for all $x$.
2. $\sum_{x \in S} p(x) = 1$.
Definition: Let $X$ be a discrete random variable. The cumulative distribution function $F(x)$ for $X$ is defined by
$$F(x) = P[X \le x].$$
Definition: Let $X$ be a discrete random variable. The expected value of $X$ is defined by
$$\mu = E(X) = \sum_{x \in S} x\, p(x).$$
$E(aX + b) = aE(X) + b$.
$Var(aX + b) = a^2\, Var(X)$.
Definition: The mode of a probability function is the value of $x$ which has the highest probability $p(x)$.
Definition: The variance of a random variable $X$ is defined to be
$$\sigma^2 = Var(X) = E[(X - \mu)^2] = E[X^2] - E[X]^2.$$
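For instance, for a single roll of a fair die, $E(X) = \frac{1 + 2 + \cdots + 6}{6} = 3.5$ and $E[X^2] = \frac{1^2 + 2^2 + \cdots + 6^2}{6} = \frac{91}{6}$, so $Var(X) = \frac{91}{6} - (3.5)^2 = \frac{35}{12} \approx 2.92$.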
Definition: For any possible value $x$ of a random variable, the z-score is
$$z = \frac{x - \mu}{\sigma}.$$
Definition: Let $X$ be a discrete random variable with probability function $p(x)$. Define $f(x) = p(x)\, n$. Then, the population mean and standard deviation are
$$\mu = \frac{1}{n} \sum_{x \in S} f(x)\, x, \qquad \sigma = \sqrt{\frac{1}{n} \sum_{x \in S} f(x)\, (x - \mu)^2}.$$
Definition: Let $X$ be a discrete random variable with probability function $p(x)$. Define $f(x) = p(x)\, n$. Then, the sample mean and standard deviation are
$$\bar{x} = \frac{1}{n} \sum_{x \in S} f(x)\, x, \qquad s = \sqrt{\frac{1}{n - 1} \sum_{x \in S} f(x)\, (x - \bar{x})^2}.$$
Note: these numbers are estimates of $\mu$ and $\sigma$.
$$M_{aX+b}(t) = e^{bt} M_X(at).$$
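This follows in one line from the definition $M_X(t) = E[e^{tX}]$:
$$M_{aX+b}(t) = E[e^{t(aX+b)}] = e^{bt} E[e^{(at)X}] = e^{bt} M_X(at).$$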
3 Commonly Used Discrete Random Variables
3.1 The Binomial Distribution
Definition: A discrete random variable $X$ is said to be a binomial random variable if
$$p(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}, \qquad k = 0, 1, 2, \ldots, n.$$
$p(x)$ is the probability that there will be exactly $x$ successes in the $n$ trials.
Definition: If $n = 1$, the binomial is referred to as a Bernoulli distribution.
$E(X) = np$.
$Var(X) = np(1 - p)$.
$$M_X(t) = (1 - p + pe^t)^n.$$
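As a quick check (illustrative numbers): with $n = 5$ trials and $p = 0.5$,
$$p(X = 2) = \binom{5}{2}(0.5)^2(0.5)^3 = \frac{10}{32} = 0.3125,$$
and $E(X) = np = 2.5$.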
3.2 The Hypergeometric Distribution
Definition: A discrete random variable $X$ is said to be a hypergeometric random variable if
$$p(X = k) = \frac{\binom{N-r}{n-k} \binom{r}{k}}{\binom{N}{n}}, \qquad k = 0, \ldots, n, \quad r \le n.$$
$$\mu = \frac{nr}{N}, \qquad Var(X) = \frac{nr}{N} \left(1 - \frac{r}{N}\right) \left(\frac{N - n}{N - 1}\right).$$
3.3 The Poisson Distribution
Definition: A Poisson random variable is a discrete random variable $X$ with probability function
$$p(X = n) = \frac{e^{-\lambda} \lambda^n}{n!}, \qquad n = 0, 1, \ldots$$
$$\mu = \lambda, \qquad Var(X) = \lambda.$$
$$M_X(t) = e^{\lambda(e^t - 1)}.$$
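For example (illustrative numbers): if events occur at a rate of $\lambda = 2$ per unit time, then $p(X = 0) = e^{-2} \approx 0.135$ and $p(X = 2) = \frac{e^{-2}\, 2^2}{2!} = 2e^{-2} \approx 0.271$.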
3.4 The Geometric Distributions
Definition: A geometric random variable is a discrete random variable $X$ with probability function
$$p(X = k) = q^k p, \qquad k = 0, 1, \ldots,$$
where $q = 1 - p$.
$$\mu = \frac{q}{p}, \qquad Var(X) = \frac{q}{p^2}.$$
$$M_X(t) = \frac{p}{1 - qe^t}.$$
If $Y$ represents the trial number of the first success (i.e. $Y = X + 1$), where $X$ is as above,
$$p(y) = q^{y-1} p, \qquad y = 1, 2, 3, \ldots$$
$$E(Y) = \frac{1}{p}, \qquad Var(Y) = \frac{q}{p^2}.$$
3.5 The Negative Binomial Distribution
Definition: A negative binomial random variable is a discrete random variable $X$ with probability function
$$P(X = k) = \binom{r + k - 1}{r - 1} q^k p^r, \qquad k = 0, 1, \ldots$$
$$\mu = \frac{rq}{p}, \qquad Var(X) = \frac{rq}{p^2}.$$
$$M_X(t) = \left(\frac{p}{1 - qe^t}\right)^r.$$
3.6 The Discrete Uniform Distribution
Definition: A discrete uniform random variable on $1, 2, \ldots, n$ is a discrete random variable $X$ with probability function
$$P(X = k) = \frac{1}{n}, \qquad k = 1, \ldots, n.$$
$$\mu = \frac{n + 1}{2}, \qquad Var(X) = \frac{n^2 - 1}{12}.$$
$$M_X(t) = \frac{e^t (1 - e^{nt})}{n(1 - e^t)}.$$
4 Continuous Random Variables
4.1 The Density Function and Probabilities
Definition: A random variable $X : (\Omega, \mathcal{F}, P) \to (\mathbb{R}, \mathcal{B}, \lambda)$ is called a continuous random variable if there exists a Borel measurable function $f : \mathbb{R} \to \mathbb{R}$ such that:
1. $f(x) \ge 0$ for all $x \in \mathbb{R}$.
2. $P[X \in A] = \int_A f(x)\, dx$ for every Borel set $A$.
When such a function $f(x)$ exists, it is called the probability density function (pdf) of $X$.
Let $g(x)$ be a strictly increasing/decreasing function on the sample space, and let $Y = g(X)$. Then, $f_Y(y) = f_X(g^{-1}(y))\, |(g^{-1})'(y)|$.
4.2 The Cumulative Distribution Function
Definition: The cumulative distribution function (cdf) $F(x)$ for a continuous random variable $X$ with pdf $f(x)$ is defined by
$$F(x) = P[X \in (-\infty, x]] = \int_{-\infty}^{x} f(u)\, du.$$
When $F'(x)$ exists, $F'(x) = f(x)$.
$$P[a < X \le b] = F(b) - F(a).$$
$$P[X = x] = F(x) - \lim_{y \to x^-} F(y).$$
4.3 The Mode, Median, and Percentiles
Definition: The mode of a continuous random variable $X$ with pdf $f(x)$ is the value of $x$ for which $f(x)$ is a maximum.
Definition: The median $m$ of a continuous random variable $X$ with pdf $f(x)$ and cdf $F(x)$ is the solution to the equation
$$F(m) = P[X \le m] = 0.5.$$
Definition: Let $X$ be a continuous random variable with cdf $F(x)$. The $100p$-th percentile of $X$ is the number $x_p$ such that
$$F(x_p) = P[X \le x_p] = \int_{-\infty}^{x_p} f(u)\, du = p.$$
4.4 The Mean and Variance
Definition: Let $X$ be a continuous random variable with pdf $f(x)$. The expected value or mean of $X$ is
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\, dx.$$
Let $X$ be a continuous random variable with density $f(x)$. Then, for any Borel measurable function $g : \mathbb{R} \to \mathbb{R}$,
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\, dx.$$
$E(aX + b) = aE(X) + b$.
Definition: The $k$-th moment of $X$ is $E[X^k]$.
Definition: The $k$-th central moment of $X$ is $E[(X - \mu)^k]$.
Definition: The skewness of $X$ is $\frac{E[(X - \mu)^3]}{\sigma^3}$.
Definition: The coefficient of variation of $X$ is $\frac{\sqrt{Var(X)}}{E(X)}$.
Definition: Let $X$ be a continuous random variable with density function $f(x)$ and mean $\mu$. Then, the variance of $X$ is defined by
$$Var(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx = E(X^2) - \mu^2.$$
$Var(aX + b) = a^2\, Var(X)$.
4.5 Misc.
Suppose that $X$ is a continuous random variable with pdf $f_X(x)$ and cdf $F_X(x)$, and suppose that $u(x)$ is an injective, strictly increasing/decreasing function. Then, the random variable $Y = u(X)$ has density
$$f_Y(y) = f_X(u^{-1}(y))\, |(u^{-1})'(y)|.$$
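As a quick illustration (a standard exercise, not from the original notes): let $X$ be uniform on $(0, 1)$ and $Y = u(X) = -\frac{1}{\lambda} \ln X$. Then $u^{-1}(y) = e^{-\lambda y}$ and $(u^{-1})'(y) = -\lambda e^{-\lambda y}$, so
$$f_Y(y) = f_X(e^{-\lambda y}) \left| -\lambda e^{-\lambda y} \right| = \lambda e^{-\lambda y}, \qquad y > 0,$$
i.e. $Y$ is exponential with parameter $\lambda$ (see Section 5.3).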
5 Commonly Used Continuous Distributions
5.1 Survival Functions and Failure Rates
Definition: The survival function of a continuous random variable $X$ is defined by
$$S(t) = P(X > t) = 1 - F(t).$$
Definition: Let $X$ be a random variable with density function $f(x)$ and cdf $F(x)$. The failure rate function $\lambda(x)$ is defined by
$$\lambda(x) = \frac{f(x)}{1 - F(x)} = \frac{f(x)}{S(x)}.$$
Let $X$ be a random variable taking values in $[0, \infty)$. Then, $S(x) = e^{-\int_0^x \lambda(u)\, du}$.
Let $X$ be a random variable taking values in $[0, \infty)$. Then,
$$E(X) = \int_0^{\infty} S(x)\, dx = \int_0^{\infty} (1 - F(x))\, dx.$$
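Both identities are easy to verify for the exponential distribution of Section 5.3: a constant failure rate $\lambda(u) = \lambda$ gives $S(x) = e^{-\lambda x}$, and then $E(X) = \int_0^{\infty} e^{-\lambda x}\, dx = \frac{1}{\lambda}$, the exponential mean.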
5.2 The Uniform Distribution
Definition: A continuous random variable $X$ is a uniform random variable on $[a, b]$ if its pdf is
$$f(x) = \begin{cases} \frac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise.} \end{cases}$$
Its cdf is
$$F(x) = \begin{cases} 0 & x \le a \\ \frac{x-a}{b-a} & x \in (a, b) \\ 1 & x \ge b. \end{cases}$$
$$E(X^n) = \frac{b^{n+1} - a^{n+1}}{(n+1)(b-a)}.$$
$$Var(X) = \frac{(b-a)^2}{12}.$$
$$M_X(t) = \frac{e^{bt} - e^{at}}{(b-a)t}.$$
The median of $X$ is $m = \frac{a+b}{2}$.
Usage: Can be used as a simple lifetime model.
5.3 The Exponential Distribution
For $a > 0$ and $n$ a non-negative integer,
$$\int_0^{\infty} x^n e^{-ax}\, dx = \frac{n!}{a^{n+1}} = \frac{\Gamma(n+1)}{a^{n+1}}.$$
Definition: A continuous random variable $X$ is an exponential random variable with parameter $\lambda > 0$ if its pdf is
$$f(t) = \lambda e^{-\lambda t}, \qquad t \ge 0.$$
Its cdf is
$$F(t) = 1 - e^{-\lambda t}, \qquad t \ge 0,$$
and its survival function is
$$S(t) = 1 - F(t) = e^{-\lambda t}, \qquad t \ge 0.$$
$$E(X^n) = \int_0^{\infty} x^n \lambda e^{-\lambda x}\, dx = \frac{n!}{\lambda^n} = \frac{\Gamma(n+1)}{\lambda^n}.$$
$$Var(X) = \frac{1}{\lambda^2}.$$
$$M_X(t) = \frac{\lambda}{\lambda - t}, \qquad t < \lambda.$$
Lack of memory property: $P[X > x + y \mid X > x] = P[X > y]$.
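The proof is one line using the survival function $S(x) = e^{-\lambda x}$:
$$P[X > x + y \mid X > x] = \frac{S(x + y)}{S(x)} = \frac{e^{-\lambda(x+y)}}{e^{-\lambda x}} = e^{-\lambda y} = P[X > y].$$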
Usage: If the number of events in a time period of length 1 is Poisson with parameter $\lambda$, and the number of events in a time period of length $t$ is Poisson with parameter $\lambda t$, then the waiting time between events is exponential with parameter $\lambda$.
Conversely, if $X$ is used to model the time between successive events, then the number of events in $t$ units of time will have a Poisson distribution with parameter $\lambda t$.
5.4 The Gamma Distribution
$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\, dx \quad \text{for } \alpha > 0.$$
$$\Gamma(\alpha) = (\alpha - 1)\, \Gamma(\alpha - 1).$$
$$\Gamma(n) = (n - 1)! \quad \text{for any positive integer } n.$$
$$\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}.$$
Definition: A continuous random variable $X$ has a Gamma distribution with parameters $(\alpha, \beta)$, $\alpha, \beta > 0$, if its pdf is
$$f(x) = \frac{\beta^{\alpha} x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}, \qquad x \in [0, \infty).$$
$$E(X^n) = \frac{\prod_{j=1}^{n} (\alpha + n - j)}{\beta^n}.$$
$$Var(X) = \frac{\alpha}{\beta^2}.$$
$$M_X(t) = \left(\frac{\beta}{\beta - t}\right)^{\alpha} \quad \text{for } t < \beta.$$
Usage: If $\{X_j\}_{j=1}^n$ are i.i.d. continuous random variables with density $f(x) = \beta e^{-\beta x}$, then $S_n = \sum_{j=1}^n X_j$ has a Gamma distribution with parameters $(n, \beta)$. This can be used to model the waiting time for the $n$-th occurrence of an event if successive occurrences are independent.
5.5 The Normal Distribution
Definition: A random variable $X$ has a normal distribution if its pdf $f(x)$ is of the form
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \qquad x \in (-\infty, \infty).$$
$$M_X(t) = e^{\mu t + \frac{\sigma^2 t^2}{2}}.$$
Let $X$ be a normal random variable with mean $\mu$ and standard deviation $\sigma$. Then, the transformed random variable $Y = aX + b$ is also normal, with mean $a\mu + b$ and standard deviation $|a|\sigma$.
$Z = \frac{X - \mu}{\sigma}$ is a standard normal random variable ($E(Z) = 0$, $Var(Z) = 1$).
$$P[x_1 \le X \le x_2] = P[z_1 \le Z \le z_2], \quad \text{where } z_j = \frac{x_j - \mu}{\sigma}.$$
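For example (illustrative numbers): if $X$ is normal with $\mu = 100$ and $\sigma = 15$, then
$$P[85 \le X \le 115] = P[-1 \le Z \le 1] \approx 0.6827$$
from the standard normal table.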
$E(Z^n) = 0$ for $n$ odd. $E(Z^{2n}) = \frac{(2n)!}{2^n\, n!}$; equivalently, $E[(X - \mu)^{2n}] = \frac{(2n)!}{2^n\, n!}\, \sigma^{2n}$.
5.6 The Central Limit Theorem
Let $\{X_j\}_{j=1}^n$ be a sequence of i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. If $n$ is large, then the sum $S_n = \sum_{j=1}^n X_j$ will be approximately normal with mean $n\mu$ and variance $n\sigma^2$.
In evaluating $P[x_1 \le X \le x_2]$ using the continuity correction, we calculate $P[x_1 - 0.5 \le X \le x_2 + 0.5]$. Only use this if asked to in the exam, or if $\sigma$ is sufficiently small that $\frac{0.5}{\sigma}$ would change the second decimal place of the z-score.
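As an illustration (standard exam-style numbers): if $X_1, \ldots, X_{100}$ are i.i.d. Bernoulli with $p = 0.5$, then $S_{100}$ is approximately normal with mean $50$ and variance $25$, so with the continuity correction
$$P[S_{100} \le 55] \approx P\left[Z \le \frac{55.5 - 50}{5}\right] = \Phi(1.1) \approx 0.8643.$$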
5.7 The Lognormal Distribution
A random variable $Y$ is lognormal if $Y = e^X$ for some normal random variable $X$ with mean $\mu$ and standard deviation $\sigma$. In this case, the pdf $f(y)$ is given by
$$f(y) = \frac{1}{y \sigma \sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln y - \mu}{\sigma}\right)^2}, \qquad y > 0.$$
$$E(Y) = e^{\mu + \frac{1}{2}\sigma^2}.$$
$$Var(Y) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}.$$
$$F_Y(c) = F_X(\ln c).$$
Usage: Can be used to model insurance claim severity or investment returns.
5.8 The Pareto Distribution
A random variable $X$ has a Pareto distribution with parameters $(\alpha, \beta)$ if its pdf has the form
$$f(x) = \frac{\alpha \beta^{\alpha}}{x^{\alpha + 1}}, \qquad \alpha > 2, \quad x > \beta.$$
Its cdf is given by
$$F(x) = 1 - \left(\frac{\beta}{x}\right)^{\alpha}, \qquad \alpha > 2, \quad x > \beta.$$
$$E(X) = \frac{\alpha \beta}{\alpha - 1}.$$
$$Var(X) = \frac{\alpha \beta^2}{(\alpha - 2)(\alpha - 1)^2}.$$
$$\lambda(x) = \frac{\alpha}{x}.$$
Usage: Used to model certain insurance loss amounts.
5.9 The Weibull Distribution
A random variable $X$ has a Weibull distribution with parameters $(\alpha, \theta)$, $\alpha, \theta > 0$, if its pdf $f(x)$ has the form
$$f(x) = \alpha \theta x^{\theta - 1} e^{-\alpha x^{\theta}}, \qquad x \ge 0.$$
Its cdf is given by $F(x) = 1 - e^{-\alpha x^{\theta}}$, $x \ge 0$.
$$E(X) = \frac{\Gamma\!\left(1 + \frac{1}{\theta}\right)}{\alpha^{1/\theta}}.$$
$$Var(X) = \frac{1}{\alpha^{2/\theta}} \left[\Gamma\!\left(1 + \frac{2}{\theta}\right) - \Gamma\!\left(1 + \frac{1}{\theta}\right)^2\right].$$
$$\lambda(x) = \alpha \theta x^{\theta - 1}.$$
Usage: If the failure rate is constant, one might decide to use an exponential distribution. If the failure
rate increases with time, then the Weibull distribution can be useful.
5.10 The Beta Distribution
A random variable $X$ has a Beta distribution with parameters $(\alpha, \beta)$, $\alpha, \beta > 0$, if its pdf has the form
$$f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1}, \qquad 0 < x < 1.$$
$$\int_0^1 x^{\alpha - 1} (1 - x)^{\beta - 1}\, dx = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}.$$
$$E(X) = \frac{\alpha}{\alpha + \beta}.$$
$$Var(X) = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}.$$
Usage: Can be used to model random variables whose outcomes are percentages.
5.11 The Chi-Square Distribution with k Degrees of Freedom
This is a special case of the Gamma distribution with $\alpha = \frac{k}{2}$, $\beta = \frac{1}{2}$.
6 Multivariate Distributions
6.1 Joint and Marginal Distribution Functions
Definition: Let $X$ and $Y$ be discrete random variables. The joint probability function for $X$ and $Y$ is the function
$$p(x, y) = P(X = x, Y = y).$$
Definition: The individual distributions for the random variables $X$ and $Y$ are called marginal distributions. Their marginal probability functions are defined, respectively, by
$$p_X(x) = \sum_{y \in S} p(x, y), \qquad p_Y(y) = \sum_{x \in S} p(x, y).$$
Definition: Let $X$ and $Y$ be continuous random variables. The joint probability density function for $X$ and $Y$ is a measurable function $f(x, y)$ satisfying the following properties:
1. $f(x, y) \ge 0$ for all $x$ and $y$.
2. For any Borel set $A$, $P[(X, Y) \in A] = \iint_A f(x, y)\, dx\, dy$.
Definition: Let $f(x, y)$ be the joint density function for the continuous random variables $X$ and $Y$. Then, the marginal density functions of $X$ and $Y$ are defined by
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx.$$
Definition: The cumulative distribution function of a joint distribution is
$$F(x, y) = P[(X \le x) \cap (Y \le y)].$$
When $X$ and $Y$ are continuous:
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(s, t)\, dt\, ds \qquad \text{and} \qquad f(x, y) = \frac{\partial^2}{\partial x\, \partial y} F(x, y).$$
When $X$ and $Y$ are discrete:
$$F(x, y) = \sum_{s \le x} \sum_{t \le y} p(s, t).$$
When $X$ and $Y$ are continuous random variables, $F_X(x) = \lim_{y \to \infty} F(x, y)$.
Let $X$ and $Y$ be given. If $U = u(X, Y)$ and $V = v(X, Y)$, and $x = h(u, v)$, $y = k(u, v)$ are the inverse functions, then the joint density of $U$ and $V$ is
$$g(u, v) = f(h(u, v), k(u, v)) \left| \frac{\partial h}{\partial u} \frac{\partial k}{\partial v} - \frac{\partial h}{\partial v} \frac{\partial k}{\partial u} \right|.$$
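For instance (a standard illustration): for $U = X + Y$ and $V = X - Y$, the inverses are $h(u, v) = \frac{u + v}{2}$ and $k(u, v) = \frac{u - v}{2}$, the Jacobian factor is $\left|\frac{1}{2} \cdot \left(-\frac{1}{2}\right) - \frac{1}{2} \cdot \frac{1}{2}\right| = \frac{1}{2}$, and so
$$g(u, v) = \frac{1}{2}\, f\!\left(\frac{u + v}{2}, \frac{u - v}{2}\right).$$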
6.2 Conditional Distributions
Definition: Let $X$ and $Y$ be discrete random variables. The conditional probability function of $X$ given that $Y = y$ is given by
$$P(X = x|Y = y) = p(x|y) = \frac{p(x, y)}{p_Y(y)}.$$
Similarly, the conditional probability function of $Y$ given that $X = x$ is given by
$$P(Y = y|X = x) = p(y|x) = \frac{p(x, y)}{p_X(x)}.$$
Definition: Let $X$ and $Y$ be continuous random variables with joint density function $f(x, y)$. The conditional density function for $X$ given that $Y = y$ is given by
$$f(x|Y = y) = f(x|y) = \frac{f(x, y)}{f_Y(y)}.$$
Similarly, the conditional density for $Y$ given that $X = x$ is given by
$$f(y|X = x) = f(y|x) = \frac{f(x, y)}{f_X(x)}.$$
6.3 Conditional Expected Value
Definition: Let $X$ and $Y$ be discrete random variables with conditional probability functions $p(x|y)$ and $p(y|x)$. Then, the conditional expectation of $Y$ given that $X = x$ is given by
$$E(Y|X = x) = \sum_{y \in S} y\, p(y|x).$$
Similarly, the conditional expectation of $X$ given that $Y = y$ is given by
$$E(X|Y = y) = \sum_{x \in S} x\, p(x|y).$$
Let $X$ and $Y$ be continuous random variables with conditional density functions $f(x|y)$ and $f(y|x)$. Then, the conditional expectation of $Y$ given that $X = x$ is given by
$$E(Y|X = x) = \int_{-\infty}^{\infty} y f(y|x)\, dy.$$
Similarly, the conditional expectation of $X$ given that $Y = y$ is given by
$$E(X|Y = y) = \int_{-\infty}^{\infty} x f(x|y)\, dx.$$
6.4 Independence of Random Variables
Definition: Two discrete random variables $X$ and $Y$ are independent if
$$p(x, y) = p_X(x)\, p_Y(y), \quad \text{or equivalently} \quad F(x, y) = F_X(x)\, F_Y(y).$$
If $X$ and $Y$ are discrete and independent random variables, then
$$p(x|y) = p_X(x), \qquad p(y|x) = p_Y(y).$$
Definition: Two continuous random variables $X$ and $Y$ are independent if
$$f(x, y) = f_X(x)\, f_Y(y), \quad \text{or equivalently} \quad F(x, y) = F_X(x)\, F_Y(y).$$
If $X$ and $Y$ are continuous and independent random variables, then
$$f(x|y) = f_X(x), \qquad f(y|x) = f_Y(y).$$
Definition: Suppose that an experiment has $k$ possible outcomes, with probabilities $p_1, \ldots, p_k$ respectively. If the experiment is performed $n$ successive times (independently), let $X_i$ denote the number of experiments that resulted in outcome $i$, so that $X_1 + \cdots + X_k = n$. The multinomial probability function is
$$P[X_1 = n_1, \ldots, X_k = n_k] = \frac{n!}{n_1! \cdots n_k!}\, p_1^{n_1} \cdots p_k^{n_k}.$$
For each $i \in \{1, 2, \ldots, k\}$, $X_i$ is a random variable with
$$E[X_i] = np_i, \qquad Var[X_i] = np_i(1 - p_i).$$
Also, for $i \neq j$,
$$Cov(X_i, X_j) = -np_i p_j.$$
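A quick consistency check: when $k = 2$ we have $X_2 = n - X_1$, so
$$Cov(X_1, X_2) = Cov(X_1, n - X_1) = -Var(X_1) = -np_1(1 - p_1) = -np_1 p_2,$$
which agrees with the formula above.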
6.5 Covariance and Functions of Independent Random Variables
Let $X$ and $Y$ be discrete, independent random variables, with $S = X + Y$. Then,
$$p_S(s) = \sum_{x} p_X(x)\, p_Y(s - x).$$
Let $X$ and $Y$ be continuous, independent random variables, with $S = X + Y$. Then,
$$f_S(s) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(s - x)\, dx.$$
Let $X$ and $Y$ be independent exponential random variables with parameters $\lambda$ and $\mu$ respectively. If $M = \min(X, Y)$, then $M$ is an exponential random variable with parameter $\lambda + \mu$.
Let $X$ and $Y$ be independent random variables. Then,
$$S_{\min(X, Y)}(t) = S_X(t)\, S_Y(t), \qquad F_{\max(X, Y)}(t) = F_X(t)\, F_Y(t).$$
$E[h(X)] = \int_{-\infty}^{\infty} h(x) f(x)\, dx$, where the integral is replaced with summation when $X$ is discrete.
$E(h(X, Y)) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) f(x, y)\, dx\, dy$, where the integral is replaced with summation in the discrete case.
Let $X$ and $Y$ be independent random variables. Then,
$$E(XY) = E(X)\, E(Y).$$
If $g$ and $h$ are any measurable functions, and $X$ and $Y$ are independent, then $E[g(X) h(Y)] = E[g(X)]\, E[h(Y)]$.
Definition: Let $X$ and $Y$ be random variables. The covariance of $X$ and $Y$ is defined by
$$Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)].$$
Alternatively,
$$Cov(X, Y) = E(XY) - E(X)\, E(Y).$$
In particular, if $X$ and $Y$ are independent,
$$Cov(X, Y) = 0.$$
$$Var(aX + bY + c) = a^2 V(X) + b^2 V(Y) + 2ab\, Cov(X, Y).$$
$$Cov(aX + bY + c,\; dW + eZ + f) = ad\, Cov(X, W) + ae\, Cov(X, Z) + bd\, Cov(Y, W) + be\, Cov(Y, Z).$$
Definition: Let $X$ and $Y$ be random variables. The correlation coefficient between $X$ and $Y$ is defined by
$$\rho_{XY} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y} = \frac{Cov(X, Y)}{\sqrt{V(X)\, V(Y)}}.$$
If $X$ and $Y$ are independent random variables, $M_{X+Y}(t) = M_X(t)\, M_Y(t)$.
Definition: Let $X$ and $Y$ be random variables. The joint moment generating function is defined by
$$M_{X,Y}(s, t) = E(e^{sX + tY}).$$
Let $\{X_i\}_{i=1}^n$ be random variables. Then,
$$E\left(\sum_{j=1}^n X_j\right) = \sum_{j=1}^n E(X_j),$$
$$V\left(\sum_{j=1}^n X_j\right) = \sum_{j=1}^n V(X_j) + 2 \sum_{i<j} Cov(X_i, X_j).$$
$$E[E(X|Y)] = E(X), \qquad E[E(Y|X)] = E(Y).$$
$$V(X) = E[V(X|Y)] + V[E(X|Y)], \qquad V(Y) = E[V(Y|X)] + V[E(Y|X)].$$
Let $N$ be a Poisson random variable with parameter $\lambda$. If $\{X_j\}$ are i.i.d. random variables independent of $N$, with $S = X_1 + \cdots + X_N$, then
$$E(S) = E(N)\, E(X) = \lambda E(X),$$
$$V(S) = \lambda E(X^2) = \lambda \left[V(X) + (E(X))^2\right].$$
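For example (illustrative numbers): if claim counts are Poisson with $\lambda = 10$ and each claim amount has $E(X) = 100$ and $V(X) = 400$, then $E(S) = 10 \cdot 100 = 1000$ and $V(S) = 10\,[400 + 100^2] = 104{,}000$.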
6.6 Sums of Particular Distributions
Assume $X_1, \ldots, X_k$ are independent and $Y = X_1 + \cdots + X_k$.
$X_i$ Bernoulli $\Rightarrow Y$ Binomial.
$X_i$ Binomial with parameter $n_i \Rightarrow Y$ Binomial with parameter $\sum n_i$.
$X_i$ Poisson with parameter $\lambda_i \Rightarrow Y$ Poisson with parameter $\sum \lambda_i$.
$X_i$ Geometric with parameter $p \Rightarrow Y$ Negative Binomial with parameters $(k, p)$.
$X_i$ Negative Binomial with parameters $(r_i, p) \Rightarrow Y$ Negative Binomial with parameters $(\sum r_i, p)$.
$X_i$ Normal with parameters $(\mu_i, \sigma_i^2) \Rightarrow Y$ Normal with parameters $(\sum \mu_i, \sum \sigma_i^2)$.
$X_i$ Exponential with mean $\mu \Rightarrow Y$ Gamma with parameters $(k, \frac{1}{\mu})$.
$X_i$ Gamma with parameters $(\alpha_i, \beta) \Rightarrow Y$ Gamma with parameters $(\sum \alpha_i, \beta)$.
6.7 Order Statistics
Definition: Suppose that $X$ has pdf $f(x)$ and cdf $F(x)$. Let $\{X_i\}_{i=1}^n$ be a collection of i.i.d. random variables, each with this distribution. The order statistics of $\{X_i\}_{i=1}^n$ are $\{Y_i\}_{i=1}^n$, where the $Y_i$'s are ordered from smallest to largest, and
$$g_k(t) = \frac{n!}{(k-1)!\,(n-k)!}\, [F(t)]^{k-1} [1 - F(t)]^{n-k} f(t),$$
$$g_1(t) = n[1 - F(t)]^{n-1} f(t),$$
$$g_n(t) = n[F(t)]^{n-1} f(t),$$
where $g_i$ is the pdf of $Y_i$.
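For example, if the $X_i$ are exponential with parameter $\lambda$, then
$$g_1(t) = n\left[e^{-\lambda t}\right]^{n-1} \lambda e^{-\lambda t} = n\lambda e^{-n\lambda t},$$
so the minimum is exponential with parameter $n\lambda$, consistent with the result on minima of independent exponentials in Section 6.5.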
6.8 Mixtures of Distributions
Definition: Let $X_1$ and $X_2$ be random variables with pdfs $f_1(x)$ and $f_2(x)$, and let $0 < a < 1$. Define a new random variable $X$, called the mixture of $X_1$ and $X_2$, with pdf
$$f(x) = a f_1(x) + (1 - a) f_2(x).$$
$$E(X^k) = a E(X_1^k) + (1 - a) E(X_2^k).$$
$$F(x) = a F_1(x) + (1 - a) F_2(x).$$
$$M_X(t) = a M_{X_1}(t) + (1 - a) M_{X_2}(t).$$
7 Insurance Terminology
7.1 Insurance Policy Deductible
If $X$ represents a loss random variable with pdf $f_X(x)$ and cdf $F_X(x)$, then for an insurance policy with an ordinary deductible of amount $d$, the insurance will pay
$$Y = \begin{cases} 0 & X \le d \\ X - d & X > d \end{cases} = \max\{X - d,\, 0\}.$$
When a loss occurs, the expected amount paid by the insurance may be called the expected cost per loss, and is equal to
$$E[Y] = \int_d^{\infty} (x - d) f_X(x)\, dx = \int_d^{\infty} (1 - F_X(x))\, dx.$$
The expected cost per payment is the average amount paid by the insurance for the non-zero payments that are made. This is
$$\frac{\int_d^{\infty} (x - d) f_X(x)\, dx}{1 - F_X(d)}.$$
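As a worked example (illustrative numbers): if $X$ is exponential with mean $1000$ and $d = 500$, the expected cost per loss is $E[Y] = \int_{500}^{\infty} e^{-x/1000}\, dx = 1000\, e^{-0.5} \approx 606.53$, and the expected cost per payment is $\frac{1000\, e^{-0.5}}{e^{-0.5}} = 1000$, reflecting the exponential's lack of memory.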
7.2 Insurance Policy Limit
If $X$ represents a loss random variable with pdf $f_X(x)$ and cdf $F_X(x)$, then for an insurance policy with a policy limit of amount $u$, when a loss occurs, the amount paid by the insurance is
$$Z = \begin{cases} X & X \le u \\ u & X > u \end{cases} = \min\{X,\, u\}.$$
The average amount paid by the insurance when a loss occurs is
$$E[Z] = \int_0^u x f_X(x)\, dx + u[1 - F_X(u)] = \int_0^u [1 - F_X(x)]\, dx.$$
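Continuing the illustrative exponential example ($X$ with mean $1000$): with a policy limit $u = 500$,
$$E[Z] = \int_0^{500} e^{-x/1000}\, dx = 1000\left(1 - e^{-0.5}\right) \approx 393.47.$$
Note that $E[Y] + E[Z] = 1000 = E(X)$, as the next paragraph states.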
If insurance policy 1 has a deductible of $c$ and insurance policy 2 has a limit of $c$, then when a loss occurs, the combined payment of the two policies is $Y + Z = X$, so that the two policies combined cover the loss $X$.
7.3 Combined Policy Limit and Deductible
If the loss random variable is $X$ and a policy has a deductible of amount $d$ and maximum payment of $u - d$, then the policy pays
$$\begin{cases} 0 & X \le d \\ X - d & d < X \le u \\ u - d & X > u. \end{cases}$$
The expected cost per loss will be
$$\int_d^{u} (x - d) f_X(x)\, dx + (u - d)[1 - F_X(u)] = \int_d^{u} [1 - F_X(x)]\, dx.$$