
Chapter 4

Jointly distributed Random variables


Multivariate distributions

For a discrete RV, the joint probability function:

$$p(x,y) = P[X = x, Y = y]$$

Marginal distributions:

$$p_X(x) = \sum_y p(x,y), \qquad p_Y(y) = \sum_x p(x,y)$$

Conditional distributions:

$$p_{Y|X}(y|x) = \frac{p(x,y)}{p_X(x)}, \qquad p_{X|Y}(x|y) = \frac{p(x,y)}{p_Y(y)}$$
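As a quick numeric illustration of these definitions, here is a minimal sketch, assuming NumPy is available; the joint pmf table is hypothetical, chosen only to make the formulas concrete.

```python
import numpy as np

# Hypothetical joint pmf p(x, y): rows index x, columns index y.
p = np.array([[0.10, 0.20],
              [0.30, 0.40]])

p_x = p.sum(axis=1)              # marginal: p_X(x) = sum over y of p(x, y)
p_y = p.sum(axis=0)              # marginal: p_Y(y) = sum over x of p(x, y)
p_y_given_x = p / p_x[:, None]   # conditional: p_{Y|X}(y|x) = p(x, y) / p_X(x)

print(p_x, p_y)                  # [0.3 0.7] [0.4 0.6]
print(p_y_given_x)               # each row sums to 1
```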
For a continuous RV, the joint density function f(x,y) assigns probability to regions:

$$P[(X,Y) \in A] = \iint_A f(x,y)\,dx\,dy$$

(Note: for a continuous RV, P[X = x, Y = y] = 0; f(x,y) is a density, not a probability.)

Marginal distributions:

$$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx$$

Conditional distributions:

$$f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)}, \qquad f_{X|Y}(x|y) = \frac{f(x,y)}{f_Y(y)}$$
Definition: Independence

Two random variables X and Y are defined to be independent if

$$p(x,y) = p_X(x)\,p_Y(y) \quad \text{if } X \text{ and } Y \text{ are discrete}$$

$$f(x,y) = f_X(x)\,f_Y(y) \quad \text{if } X \text{ and } Y \text{ are continuous}$$

Note: under independence the conditional distributions reduce to the marginals:

$$p_{Y|X}(y|x) = \frac{p(x,y)}{p_X(x)} = \frac{p_X(x)\,p_Y(y)}{p_X(x)} = p_Y(y), \qquad p_{X|Y}(x|y) = \frac{p(x,y)}{p_Y(y)} = p_X(x)$$

Thus, in the case of independence: marginal distributions = conditional distributions.
The Multiplicative Rule for densities

$$p(x,y) = p_X(x)\,p_{Y|X}(y|x) = p_Y(y)\,p_{X|Y}(x|y) \quad \text{if } X \text{ and } Y \text{ are discrete}$$

$$\phantom{p(x,y)} = p_X(x)\,p_Y(y) \quad \text{if } X \text{ and } Y \text{ are independent}$$

$$f(x,y) = f_X(x)\,f_{Y|X}(y|x) = f_Y(y)\,f_{X|Y}(x|y) \quad \text{if } X \text{ and } Y \text{ are continuous}$$

$$\phantom{f(x,y)} = f_X(x)\,f_Y(y) \quad \text{if } X \text{ and } Y \text{ are independent}$$

Proof:
This follows from the definition for conditional densities:

$$p_{Y|X}(y|x) = \frac{p(x,y)}{p_X(x)} \quad \Rightarrow \quad p(x,y) = p_X(x)\,p_{Y|X}(y|x)$$

and

$$p_{X|Y}(x|y) = \frac{p(x,y)}{p_Y(y)} \quad \Rightarrow \quad p(x,y) = p_Y(y)\,p_{X|Y}(x|y)$$

The same is true for continuous random variables.
Finding CDFs (or pdfs) is difficult - Example

Suppose that a rectangle is constructed by first choosing its length, X,
and then choosing its width, Y.

Its length X is selected from an exponential distribution with mean
μ = 1/λ = 5. Once the length has been chosen, its width, Y, is selected
from a uniform distribution from 0 to half its length.

Find the probability that its area A = XY is less than 4.

Solution:

$$f_X(x) = \tfrac{1}{5}\,e^{-x/5} \quad \text{for } x \ge 0$$

$$f_{Y|X}(y|x) = \frac{1}{x/2} = \frac{2}{x} \quad \text{if } 0 \le y \le \tfrac{x}{2}$$

By the multiplicative rule:

$$f(x,y) = f_X(x)\,f_{Y|X}(y|x) = \tfrac{1}{5}\,e^{-x/5}\,\frac{2}{x} = \frac{2}{5x}\,e^{-x/5} \quad \text{if } 0 \le y \le \tfrac{x}{2},\; x \ge 0$$

The boundary curve xy = 4 and the line y = x/2 intersect where

$$\tfrac{x}{2}\cdot x = 4 \;\Rightarrow\; x^2 = 8 \;\Rightarrow\; x = 2\sqrt{2}, \qquad y = \tfrac{x}{2} = \sqrt{2}$$

i.e., at the point $(2\sqrt{2},\,\sqrt{2})$.
Finding CDFs (or pdfs) is difficult - Example

2 2, 2
? A
4
2 2
2
0 0 0 2 2
4 , ,
x
x
P XY f x y dydx f x y dydx
g
e !

Finding CDFs (or pdfs) is difficult - Example
? A
4
2 2
2
0 0 0
2 2
4 , ,
x
x
P XY f x y dydx f x y dydx
g
e !

1 1
5 5
4
2 2
2
2 2
5 5
0 0 0
2 2
x
x
x x
x x
e dydx e dydx
g

!

1 1
5 5
2 2
2 2 4
5 2 5
0 2 2
x x
x
x x x
e dx e dx
g

!

1 1
5 5
2 2
2
8 1
5 5
0 2 2
x x
e dx x e dx
g

!

This part can be
evaluated
This part may require Numerical
evaluation
Finding CDFs (or pdfs) is difficult - Example
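The numerical step can be checked directly; below is a minimal sketch, assuming NumPy and SciPy are available, with a Monte Carlo cross-check of the whole probability.

```python
import numpy as np
from scipy import integrate

a = 2 * np.sqrt(2)

# First piece has a closed form: 1 - e^{-2*sqrt(2)/5}.
part1 = 1 - np.exp(-a / 5)

# Second piece evaluated numerically: integral of 8/(5 x^2) e^{-x/5} from 2*sqrt(2) to infinity.
part2, _ = integrate.quad(lambda x: 8 / (5 * x**2) * np.exp(-x / 5), a, np.inf)
print("P[XY < 4] ≈", part1 + part2)

# Monte Carlo cross-check: X ~ Exponential(mean 5), then Y | X ~ Uniform(0, X/2).
rng = np.random.default_rng(0)
x = rng.exponential(scale=5, size=1_000_000)
y = rng.uniform(0, x / 2)
print("Monte Carlo:", np.mean(x * y < 4))
```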
Functions of Random Variables

Methods for determining the distribution of functions of Random Variables

Given some random variable X, we want to study some function h(X).

X is a transformation from (Ω, Σ, P) into the probability space (ℝ, A_X, P_X). Now, Y = h(X) is a transformation from (ℝ, A_X, P_X) into the probability space (ℝ, A_Y, P_Y).

That is, we need to discover how the probability measure P_Y relates to
the measure P_X.

For some F ∈ A_Y, we have that

$$P_Y[F] = P_X\{x : h(x) \in F\} = P\{\omega : h(X(\omega)) \in F\} = P\{\omega : X(\omega) \in h^{-1}(F)\} = P_X[h^{-1}(F)]$$

Methods for determining the distribution of functions of Random Variables

With non-transformed variables, we step "backwards" from the values
of X to the set of events in Ω. In the transformed case, we take two
steps backwards: 1) once from the range of the transformation back to
the values of X, and 2) again back to the set of events in Ω.

Potential problem: the transformation h(x) -we need to work with
h⁻¹(x)- may not yield unique results if h(X) is not monotonic.

Methods to get P_Y:
1. Distribution function method
2. Transformation method
3. Moment generating function method
Method 1: Distribution function method

Let X, Y, Z, … have joint density f(x, y, z, …)
Let W = h(X, Y, Z, …)

First step: Find the distribution function of W

$$G(w) = P[W \le w] = P[h(X, Y, Z, \ldots) \le w]$$

Second step: Find the density function of W

$$g(w) = G'(w)$$
Distribution function method: Example 1

Let X have a normal distribution with mean 0 and variance 1 -i.e., a
standard normal distribution. That is:

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$

Let W = X². Find the distribution of W.

First step: Find the distribution function of W

$$G(w) = P[W \le w] = P[X^2 \le w] = P[-\sqrt{w} \le X \le \sqrt{w}] \quad \text{if } w \ge 0$$

$$= \int_{-\sqrt{w}}^{\sqrt{w}} \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx = F(\sqrt{w}) - F(-\sqrt{w})$$

where

$$F'(x) = f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$

Second step: Find the density function of W

$$g(w) = G'(w) = F'(\sqrt{w})\,\frac{d\sqrt{w}}{dw} - F'(-\sqrt{w})\,\frac{d(-\sqrt{w})}{dw}$$

$$= \frac{1}{\sqrt{2\pi}}\,e^{-w/2}\,\tfrac{1}{2}w^{-1/2} + \frac{1}{\sqrt{2\pi}}\,e^{-w/2}\,\tfrac{1}{2}w^{-1/2} = \frac{1}{\sqrt{2\pi}}\,w^{-1/2}\,e^{-w/2} \quad \text{if } w \ge 0$$

Thus if X has a standard Normal distribution, then W = X² has density

$$g(w) = \frac{1}{\sqrt{2\pi}}\,w^{-1/2}\,e^{-w/2} \quad \text{if } w \ge 0.$$
This distribution is the Gamma distribution with α = 1/2 and λ = 1/2.
This distribution is the χ² distribution with 1 degree of freedom.

Using the additive properties of a gamma distribution, the sum of T
independent χ²₁ RVs produces a χ²_T distributed RV. Alternatively, the
sum of squares of T independent N(0,1) RVs produces a χ²_T distributed RV.

Note: If we add T independent N(μᵢ, σᵢ²) RVs, then Σᵢ (Xᵢ/σᵢ)² follows a
non-central χ²_T distribution, with non-centrality parameter Σᵢ (μᵢ/σᵢ)².

This distribution is common in the power analysis of statistical tests in
which the null distribution is (asymptotically) a χ².
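A small simulation sketch, assuming NumPy and SciPy, that squaring standard normal draws reproduces the χ²₁ quantiles:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
w = rng.standard_normal(100_000) ** 2  # W = X^2 with X ~ N(0, 1)

# Empirical quantiles of W vs. chi-squared(1) quantiles.
for q in (0.25, 0.50, 0.90, 0.99):
    print(q, round(float(np.quantile(w, q)), 3), round(float(stats.chi2.ppf(q, df=1)), 3))
```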
Distribution function method: Example 2

Suppose that X and Y are independent random variables, each having
an exponential distribution with parameter λ (=> E(X) = 1/λ).

Find the distribution of W = X + Y.

$$f_1(x) = \lambda e^{-\lambda x} \;\; \text{for } x \ge 0, \qquad f_2(y) = \lambda e^{-\lambda y} \;\; \text{for } y \ge 0$$

$$f(x,y) = f_1(x)\,f_2(y) = \lambda^2 e^{-\lambda(x+y)} \quad \text{for } x \ge 0,\; y \ge 0$$

First step: Find the distribution function of W = X + Y

$$G(w) = P[W \le w] = P[X + Y \le w] = \int_0^w \int_0^{w-x} f_1(x)\,f_2(y)\,dy\,dx = \int_0^w \int_0^{w-x} \lambda^2 e^{-\lambda(x+y)}\,dy\,dx$$

$$= \int_0^w \lambda^2 e^{-\lambda x}\left[\int_0^{w-x} e^{-\lambda y}\,dy\right] dx = \int_0^w \lambda^2 e^{-\lambda x}\left[\frac{-e^{-\lambda y}}{\lambda}\right]_0^{w-x} dx$$

$$= \int_0^w \lambda e^{-\lambda x}\left(1 - e^{-\lambda(w-x)}\right) dx = \int_0^w \left(\lambda e^{-\lambda x} - \lambda e^{-\lambda w}\right) dx$$

$$= \left[-e^{-\lambda x} - \lambda x\,e^{-\lambda w}\right]_0^w = 1 - e^{-\lambda w} - \lambda w\,e^{-\lambda w}$$

Second step: Find the density function of W

$$g(w) = G'(w) = \frac{d}{dw}\left(1 - e^{-\lambda w} - \lambda w\,e^{-\lambda w}\right)$$

$$= \lambda e^{-\lambda w} - \lambda e^{-\lambda w} + \lambda^2 w\,e^{-\lambda w} = \lambda^2 w\,e^{-\lambda w} \quad \text{for } w \ge 0$$

Hence if X and Y are independent random variables, each having
an exponential distribution with parameter λ, then W has density

$$g(w) = \lambda^2 w\,e^{-\lambda w} \quad \text{for } w \ge 0$$

This distribution can be recognized to be the Gamma distribution
with parameters α = 2 and λ.
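As a check, a short simulation, NumPy/SciPy assumed: the sum of two independent exponentials with rate λ should pass a goodness-of-fit test against Gamma(α = 2, λ).

```python
import numpy as np
from scipy import stats

lam = 0.5
rng = np.random.default_rng(1)
w = rng.exponential(1 / lam, 200_000) + rng.exponential(1 / lam, 200_000)

# scipy parameterizes the gamma by shape a and scale = 1 / rate.
print(stats.kstest(w, stats.gamma(a=2, scale=1 / lam).cdf))  # large p-value expected
```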
Distribution function method: Example 3 - Student's t distribution

Let Z and U be two independent random variables with:
1. Z having a Standard Normal distribution -i.e., Z ~ N(0,1)
2. U having a χ² distribution with ν degrees of freedom

Find the distribution of

$$t = \frac{Z}{\sqrt{U/\nu}}$$

$$f(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}, \qquad h(u) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\,u^{\nu/2 - 1}\,e^{-u/2}$$

Therefore the joint density of Z and U is:

$$f(z,u) = f(z)\,h(u) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)\,\sqrt{2\pi}}\,u^{\nu/2 - 1}\,e^{-(z^2+u)/2}$$

The distribution function of T is:

$$G(t) = P[T \le t] = P\!\left[\frac{Z}{\sqrt{U/\nu}} \le t\right] = P\!\left[Z \le \frac{t}{\sqrt{\nu}}\sqrt{U}\right]$$

$$= \int_0^{\infty}\!\!\int_{-\infty}^{t\sqrt{u/\nu}} \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)\,\sqrt{2\pi}}\,u^{\nu/2-1}\,e^{-(z^2+u)/2}\,dz\,du$$

Then:

$$g(t) = G'(t) = \frac{d}{dt}\int_0^{\infty}\left[\int_{-\infty}^{t\sqrt{u/\nu}} \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)\,\sqrt{2\pi}}\,u^{\nu/2-1}\,e^{-(z^2+u)/2}\,dz\right] du$$

Using the FTC:

$$F(x) = \int_a^x f(t)\,dt \;\Rightarrow\; F'(x) = f(x)$$

together with

$$\frac{d}{dt}\int_a^b F(x,t)\,dx = \int_a^b \frac{\partial}{\partial t}F(x,t)\,dx$$

we get (the inner integrand evaluated at $z = t\sqrt{u/\nu}$, times $\frac{dz}{dt} = \sqrt{u/\nu}$):

$$g(t) = \int_0^{\infty} \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)\,\sqrt{2\pi}}\,u^{\nu/2-1}\,e^{-\frac{1}{2}\left(\frac{t^2 u}{\nu}+u\right)}\,\sqrt{\frac{u}{\nu}}\;du$$

Hence

$$g(t) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)\,\sqrt{2\pi\nu}} \int_0^{\infty} u^{\frac{\nu+1}{2}-1}\,e^{-\frac{u}{2}\left(1+\frac{t^2}{\nu}\right)}\,du$$

Using

$$\int_0^{\infty} x^{\alpha-1} e^{-\lambda x}\,dx = \frac{\Gamma(\alpha)}{\lambda^{\alpha}} \qquad \left(\text{equivalently } \int_0^{\infty} \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha-1}e^{-\lambda x}\,dx = 1\right)$$

with $\alpha = \frac{\nu+1}{2}$ and $\lambda = \frac{1}{2}\left(1+\frac{t^2}{\nu}\right)$:

$$\int_0^{\infty} u^{\frac{\nu+1}{2}-1}\,e^{-\frac{u}{2}\left(1+\frac{t^2}{\nu}\right)}\,du = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)\,2^{\frac{\nu+1}{2}}}{\left(1+\frac{t^2}{\nu}\right)^{\frac{\nu+1}{2}}}$$

Thus,

$$g(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\;\Gamma\!\left(\frac{\nu}{2}\right)}\left(1+\frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} = K\left(1+\frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$$

where

$$K = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\;\Gamma\!\left(\frac{\nu}{2}\right)}$$

This is the Student's t distribution. It is denoted as t ~ t_ν, where ν is
referred to as the degrees of freedom (df).

William Gosset (1876 - 1937)

[Graph: t distribution vs. standard normal distribution]
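The derived density can be checked numerically against a library implementation; a minimal sketch assuming SciPy:

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln

nu = 5.0
t = np.linspace(-4, 4, 9)

# K = Gamma((nu+1)/2) / (sqrt(pi*nu) * Gamma(nu/2)), computed in log space.
K = np.exp(gammaln((nu + 1) / 2) - gammaln(nu / 2)) / np.sqrt(np.pi * nu)
g = K * (1 + t**2 / nu) ** (-(nu + 1) / 2)

print(np.max(np.abs(g - stats.t.pdf(t, df=nu))))  # agreement to ~1e-16
```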
Example of a t-distributed variable

Let

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma} \sim N(0,1)$$

Let

$$U = \frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2_{n-1}$$

Assume that z and U are independent (later, we will learn how
to check this assumption). Then,

$$t = \frac{z}{\sqrt{U/(n-1)}} = \frac{\sqrt{n}\,(\bar{x}-\mu)/\sigma}{\sqrt{\dfrac{(n-1)s^2}{\sigma^2}\Big/(n-1)}} = \frac{\sqrt{n}\,(\bar{x}-\mu)}{s} = \frac{\bar{x}-\mu}{s/\sqrt{n}} \sim t_{n-1}$$
Distribution function method: Example 4 - Distribution of the Max and Min Statistics

Let x₁, x₂, …, xₙ denote a sample of size n from the density f(x).

Find the distribution of M = max(xᵢ).
Repeat this computation for m = min(xᵢ).

Assume that the density is the uniform density from 0 to θ. Thus,

$$f(x) = \begin{cases} \frac{1}{\theta} & 0 \le x \le \theta \\ 0 & \text{elsewhere} \end{cases}$$

$$F(x) = P[X \le x] = \begin{cases} 0 & x \le 0 \\ \frac{x}{\theta} & 0 \le x \le \theta \\ 1 & x > \theta \end{cases}$$

Finding the distribution function of M:

$$G(t) = P[M \le t] = P[\max x_i \le t] = P[x_1 \le t, \ldots, x_n \le t] = P[x_1 \le t]\cdots P[x_n \le t]$$

$$= \begin{cases} 0 & t \le 0 \\ \left(\frac{t}{\theta}\right)^n & 0 \le t \le \theta \\ 1 & t > \theta \end{cases}$$

Differentiating, we find the density function of M:

$$g(t) = G'(t) = \begin{cases} \frac{n\,t^{n-1}}{\theta^n} & 0 \le t \le \theta \\ 0 & \text{otherwise} \end{cases}$$

[Graphs: f(x) and g(t)]

Finding the distribution function of m:

$$G(t) = P[m \le t] = P[\min x_i \le t] = 1 - P[x_1 > t, \ldots, x_n > t] = 1 - P[x_1 > t]\cdots P[x_n > t]$$

$$= \begin{cases} 0 & t \le 0 \\ 1 - \left(1 - \frac{t}{\theta}\right)^n & 0 \le t \le \theta \\ 1 & t > \theta \end{cases}$$

Differentiating, we find the density function of m:

$$g(t) = G'(t) = \begin{cases} \frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1} & 0 \le t \le \theta \\ 0 & \text{otherwise} \end{cases}$$

[Graphs: f(x) and g(t)]
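A simulation sketch, NumPy assumed: the derived densities imply E[M] = nθ/(n+1) and E[m] = θ/(n+1), which the sample maxima and minima should reproduce.

```python
import numpy as np

theta, n = 10.0, 5
rng = np.random.default_rng(7)
x = rng.uniform(0, theta, size=(200_000, n))  # 200,000 samples of size n

print(x.max(axis=1).mean(), n * theta / (n + 1))  # both ≈ 8.333
print(x.min(axis=1).mean(), theta / (n + 1))      # both ≈ 1.667
```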
The probability integral transformation

This transformation allows one to convert observations that come
from a uniform distribution from 0 to 1 to observations that come
from an arbitrary distribution.

Let U denote an observation having a uniform distribution from 0 to 1:

$$g(u) = \begin{cases} 1 & 0 \le u \le 1 \\ 0 & \text{elsewhere} \end{cases}$$

Let f(x) denote an arbitrary density function and F(x) its
corresponding cumulative distribution function. Let X = F⁻¹(U).
We want to find the distribution of X.

$$G(x) = P[X \le x] = P\!\left[F^{-1}(U) \le x\right] = P[U \le F(x)] = F(x)$$

Hence:

$$g(x) = G'(x) = F'(x) = f(x)$$

Thus if U ~ Uniform distribution on [0, 1], then X = F⁻¹(U) has density f(x).
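A minimal inverse-transform sketch, NumPy assumed: for an Exponential(λ), F(x) = 1 − e^{−λx}, so F⁻¹(u) = −ln(1−u)/λ; mapping uniform draws through F⁻¹ yields exponential draws.

```python
import numpy as np

lam = 2.0
rng = np.random.default_rng(3)
u = rng.uniform(size=100_000)   # U ~ Uniform(0, 1)
x = -np.log(1 - u) / lam        # X = F^{-1}(U)

print(x.mean(), 1 / lam)        # sample mean ≈ theoretical mean 1/lam = 0.5
```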
Method 2: The Transformation Method

Theorem

Let X denote a random variable with probability density
function f(x) and U = h(X).
Assume that h(x) is either strictly increasing (or decreasing).
Then the probability density of U is:

$$g(u) = f\!\left(h^{-1}(u)\right)\left|\frac{dh^{-1}(u)}{du}\right| = f(x)\left|\frac{dx}{du}\right|$$

Proof

Use the distribution function method.
Step 1: Find the distribution function, G(u).
Step 2: Differentiate G(u) to find the probability density function g(u).

$$G(u) = P[U \le u] = P[h(X) \le u] = \begin{cases} P[X \le h^{-1}(u)] & h \text{ strictly increasing} \\ P[X \ge h^{-1}(u)] & h \text{ strictly decreasing} \end{cases}$$

$$= \begin{cases} F\!\left(h^{-1}(u)\right) & h \text{ strictly increasing} \\ 1 - F\!\left(h^{-1}(u)\right) & h \text{ strictly decreasing} \end{cases}$$

Thus,

$$g(u) = G'(u) = \begin{cases} F'\!\left(h^{-1}(u)\right)\dfrac{dh^{-1}(u)}{du} & h \text{ strictly increasing} \\[2mm] -F'\!\left(h^{-1}(u)\right)\dfrac{dh^{-1}(u)}{du} & h \text{ strictly decreasing} \end{cases}$$

Or, since $\frac{dh^{-1}(u)}{du} > 0$ when h is increasing and $< 0$ when h is decreasing:

$$g(u) = f\!\left(h^{-1}(u)\right)\left|\frac{dh^{-1}(u)}{du}\right| = f(x)\left|\frac{dx}{du}\right|$$
Example: Log Normal Distribution

Suppose that X has a Normal distribution with mean μ and
variance σ². Find the distribution of U = h(X) = e^X.

Solution: Recall the transformation formula:

$$g(u) = f\!\left(h^{-1}(u)\right)\left|\frac{dh^{-1}(u)}{du}\right|$$

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

$$h^{-1}(u) = \ln u \qquad \text{and} \qquad \frac{dh^{-1}(u)}{du} = \frac{d\ln u}{du} = \frac{1}{u}$$

Replacing in the transformation formula:

$$g(u) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(\ln u - \mu)^2}{2\sigma^2}}\,\frac{1}{u} \quad \text{for } u > 0$$

Since the distribution of log(U) is normal, the distribution of
U is called the log-normal distribution.
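A numerical check of the transformed density against SciPy's log-normal, whose parameterization is s = σ and scale = e^μ; a sketch assuming SciPy:

```python
import numpy as np
from scipy import stats

mu, sigma = 0.5, 0.8
u = np.linspace(0.1, 5, 8)

# Density derived via the transformation method.
g = np.exp(-((np.log(u) - mu) ** 2) / (2 * sigma**2)) / (u * sigma * np.sqrt(2 * np.pi))
print(np.max(np.abs(g - stats.lognorm.pdf(u, s=sigma, scale=np.exp(mu)))))  # ~1e-16
```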
Note: It is easy to obtain the mean of a log-normal variable.

$$E[U] = \int_0^{\infty} u\,g(u)\,du = \int_0^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(\ln u - \mu)^2}{2\sigma^2}}\,du$$

Substitution:

$$y = \frac{\ln u - \mu}{\sigma} \;\Rightarrow\; u = e^{\sigma y + \mu}, \qquad dy = \frac{1}{\sigma u}\,du \;\Rightarrow\; du = \sigma\,e^{\sigma y + \mu}\,dy$$

Then,

$$E[U] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{y^2}{2}}\,e^{\sigma y + \mu}\,dy = e^{\mu}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{y^2 - 2\sigma y}{2}}\,dy$$

Completing the square, $-\frac{y^2 - 2\sigma y}{2} = -\frac{(y-\sigma)^2}{2} + \frac{\sigma^2}{2}$, so

$$E[U] = e^{\mu + \frac{\sigma^2}{2}} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{(y-\sigma)^2}{2}}\,dy = e^{\mu + \frac{\sigma^2}{2}}$$
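A Monte Carlo check of this result, NumPy assumed:

```python
import numpy as np

mu, sigma = 0.2, 0.6
rng = np.random.default_rng(11)
u = np.exp(rng.normal(mu, sigma, size=2_000_000))  # U = e^X, X ~ N(mu, sigma^2)

print(u.mean(), np.exp(mu + sigma**2 / 2))  # both ≈ 1.462
```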
Example: Under the Black-Scholes model, in a short period of time
of length Δt, the return on the stock is normally distributed.
Consider a stock whose price is S:

$$\frac{\Delta S}{S} \sim N\!\left(\mu\,\Delta t,\; \sigma^2\,\Delta t\right)$$

where μ is the expected return and σ is the volatility. It follows from this
assumption that (we derived this result in Chapter 14)

$$\ln S_T - \ln S_t \sim N\!\left[\left(\mu - \tfrac{\sigma^2}{2}\right)(T-t),\; \sigma^2\,(T-t)\right]$$

or

$$\ln S_T \sim N\!\left[\ln S_t + \left(\mu - \tfrac{\sigma^2}{2}\right)(T-t),\; \sigma^2\,(T-t)\right]$$

Since the logarithm of S_T is normal, S_T is log-normally distributed.

Graph: Log-normal distribution
Method 3: Use of moment generating functions

Review: Moment Generating Function - Definition

Let X denote a random variable with probability density function f(x) if
continuous (probability mass function p(x) if discrete). Then

m_X(t) = the moment generating function of X

$$m_X(t) = E\!\left[e^{tX}\right] = \begin{cases} \int_{-\infty}^{\infty} e^{tx} f(x)\,dx & \text{if } X \text{ is continuous} \\ \sum_x e^{tx} p(x) & \text{if } X \text{ is discrete} \end{cases}$$

The distribution of a random variable X is described by either
1. The density function f(x) if X continuous (probability mass function p(x) if X discrete), or
2. The cumulative distribution function F(x), or
3. The moment generating function m_X(t)
Review: Moment Generating Function - Properties

1. m_X(0) = 1

2. $$m_X^{(k)}(0) = \left.\frac{d^k m_X(t)}{dt^k}\right|_{t=0} = E[X^k] = \mu_k \quad \text{(the k-th derivative of } m_X \text{ at } t = 0\text{)}$$

$$\mu_k = E[X^k] = \begin{cases} \int_{-\infty}^{\infty} x^k f(x)\,dx & X \text{ continuous} \\ \sum_x x^k p(x) & X \text{ discrete} \end{cases}$$

3. $$m_X(t) = 1 + \mu_1 t + \mu_2 \frac{t^2}{2!} + \mu_3 \frac{t^3}{3!} + \cdots + \mu_k \frac{t^k}{k!} + \cdots$$

4. Let X be a RV with MGF m_X(t). Let Y = bX + a. Then

$$m_Y(t) = m_{bX+a}(t) = E\!\left(e^{[bX+a]t}\right) = e^{at}\,m_X(bt)$$

5. Let X and Y be two independent random variables with moment
generating functions m_X(t) and m_Y(t). Then

$$m_{X+Y}(t) = m_X(t)\,m_Y(t)$$

6. Let X and Y be two random variables with moment generating
functions m_X(t) and m_Y(t) and two distribution functions F_X(x) and
F_Y(y), respectively.
If m_X(t) = m_Y(t), then F_X(x) = F_Y(x).

This ensures that the distribution of a random variable can be identified
by its moment generating function.
Using moment generating functions to find the distribution of functions of Random Variables

Example: Sum of 2 independent exponentials

Suppose that X and Y are two independent exponentially distributed
random variables (with λ = 1), with pdfs given by

$$f(x) = e^{-x}, \qquad f(y) = e^{-y}$$

Find the distribution of X + Y.

Solution:

$$m_X(t) = (1-t)^{-1}, \qquad m_Y(t) = (1-t)^{-1}$$

$$m_{X+Y}(t) = m_X(t)\,m_Y(t) = (1-t)^{-1}(1-t)^{-1} = (1-t)^{-2}$$

= MGF of the gamma distribution with α = 2 (and λ = 1).
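An empirical check of this MGF identity, NumPy assumed; the evaluation points t are kept below 1/2 so that the Monte Carlo estimate of E[e^{tS}] has finite variance.

```python
import numpy as np

rng = np.random.default_rng(5)
s = rng.exponential(1.0, 500_000) + rng.exponential(1.0, 500_000)

# Empirical MGF vs. the gamma MGF (1 - t)^(-2), valid for t < 1.
for t in (-1.0, -0.5, 0.25):
    print(t, round(float(np.mean(np.exp(t * s))), 3), round((1 - t) ** -2, 3))
```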
Example: Affine Transformation of a Normal

Suppose that X ~ N(μ, σ²). What is the distribution of Y = aX + b?

Solution:

$$m_X(t) = e^{\mu t + \frac{\sigma^2 t^2}{2}}$$

$$m_{aX+b}(t) = e^{bt}\,m_X(at) = e^{bt}\,e^{\mu a t + \frac{\sigma^2 a^2 t^2}{2}} = e^{(a\mu + b)t + \frac{a^2\sigma^2 t^2}{2}}$$

= MGF of the normal distribution with mean aμ + b and variance a²σ².

Thus, Y = aX + b ~ N(aμ + b, a²σ²).

Special Case: the z transformation

$$Z = \frac{X - \mu}{\sigma} = \frac{1}{\sigma}X - \frac{\mu}{\sigma} \qquad \left(a = \tfrac{1}{\sigma},\; b = -\tfrac{\mu}{\sigma}\right)$$

$$\mu_Z = a\mu + b = \frac{\mu}{\sigma} - \frac{\mu}{\sigma} = 0, \qquad \sigma_Z^2 = a^2\sigma^2 = \frac{1}{\sigma^2}\,\sigma^2 = 1$$

Thus Z has a standard normal distribution.
Example 1: Distribution of X+Y

Suppose that X and Y are independent. Let X ~ N(μ_X, σ_X²) and
Y ~ N(μ_Y, σ_Y²). Find the distribution of S = X + Y.

Solution:

$$m_X(t) = e^{\mu_X t + \frac{\sigma_X^2 t^2}{2}}, \qquad m_Y(t) = e^{\mu_Y t + \frac{\sigma_Y^2 t^2}{2}}$$

Now,

$$m_{X+Y}(t) = m_X(t)\,m_Y(t) = e^{\mu_X t + \frac{\sigma_X^2 t^2}{2}}\,e^{\mu_Y t + \frac{\sigma_Y^2 t^2}{2}} = e^{(\mu_X + \mu_Y)t + \frac{(\sigma_X^2 + \sigma_Y^2)t^2}{2}}$$

Thus, S = X + Y ~ N(μ_X + μ_Y, σ_X² + σ_Y²).
Example 2: Distribution of aX+bY

Suppose that X and Y are independent. Let X ~ N(μ_X, σ_X²) and
Y ~ N(μ_Y, σ_Y²). Find the distribution of L = aX + bY.

Solution:

$$m_X(t) = e^{\mu_X t + \frac{\sigma_X^2 t^2}{2}}, \qquad m_Y(t) = e^{\mu_Y t + \frac{\sigma_Y^2 t^2}{2}}$$

Now,

$$m_{aX+bY}(t) = m_{aX}(t)\,m_{bY}(t) = m_X(at)\,m_Y(bt)$$

$$= e^{\mu_X at + \frac{\sigma_X^2 a^2 t^2}{2}}\,e^{\mu_Y bt + \frac{\sigma_Y^2 b^2 t^2}{2}} = e^{(a\mu_X + b\mu_Y)t + \frac{(a^2\sigma_X^2 + b^2\sigma_Y^2)t^2}{2}}$$

Thus, L = aX + bY ~ N(aμ_X + bμ_Y, a²σ_X² + b²σ_Y²).

Special Case: a = +1 and b = -1.

Thus D = X - Y has a normal distribution with mean μ_X - μ_Y and
variance:

$$(+1)^2\sigma_X^2 + (-1)^2\sigma_Y^2 = \sigma_X^2 + \sigma_Y^2$$
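A simulation sketch of the general result, NumPy assumed, with purely illustrative values of a, b and the moments:

```python
import numpy as np

a, b = 2.0, -1.0
mu_x, sig_x, mu_y, sig_y = 1.0, 2.0, 3.0, 0.5
rng = np.random.default_rng(9)
L = a * rng.normal(mu_x, sig_x, 1_000_000) + b * rng.normal(mu_y, sig_y, 1_000_000)

print(L.mean(), a * mu_x + b * mu_y)               # ≈ -1.0
print(L.var(), a**2 * sig_x**2 + b**2 * sig_y**2)  # ≈ 16.25
```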
Example 3: (Extension to n independent RVs)

Suppose that X₁, X₂, …, Xₙ are independent, each having a normal
distribution with mean μᵢ and standard deviation σᵢ (for i = 1, 2, …, n).

Find the distribution of L = a₁X₁ + a₂X₂ + … + aₙXₙ.

Solution:

$$m_{X_i}(t) = e^{\mu_i t + \frac{\sigma_i^2 t^2}{2}} \qquad (i = 1, 2, \ldots, n)$$

Now,

$$m_{a_1X_1 + \cdots + a_nX_n}(t) = m_{a_1X_1}(t)\cdots m_{a_nX_n}(t) = m_{X_1}(a_1t)\cdots m_{X_n}(a_nt)$$

$$= e^{\mu_1 a_1 t + \frac{\sigma_1^2 a_1^2 t^2}{2}}\cdots e^{\mu_n a_n t + \frac{\sigma_n^2 a_n^2 t^2}{2}} = e^{(a_1\mu_1 + \cdots + a_n\mu_n)t + \frac{(a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2)t^2}{2}}$$

= the MGF of the normal distribution with mean $a_1\mu_1 + \cdots + a_n\mu_n$
and variance $a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2$.

Thus L = a₁X₁ + … + aₙXₙ has a normal distribution with mean
a₁μ₁ + … + aₙμₙ and variance a₁²σ₁² + … + aₙ²σₙ².

Special case:

$$a_1 = a_2 = \cdots = a_n = \frac{1}{n}, \qquad \mu_1 = \mu_2 = \cdots = \mu_n = \mu, \qquad \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_n^2 = \sigma^2$$

In this case X₁, X₂, …, Xₙ is a sample from a normal distribution
with mean μ and standard deviation σ, and

$$L = \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right) = \bar{X} = \text{the sample mean}$$

Thus

$$\bar{x} = a_1x_1 + \cdots + a_nx_n = \frac{1}{n}x_1 + \cdots + \frac{1}{n}x_n$$

has a normal distribution with mean

$$\mu_{\bar{x}} = a_1\mu_1 + \cdots + a_n\mu_n = \frac{1}{n}\mu + \cdots + \frac{1}{n}\mu = \mu$$

and variance

$$\sigma_{\bar{x}}^2 = a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2 = \frac{1}{n^2}\sigma^2 + \cdots + \frac{1}{n^2}\sigma^2 = n\,\frac{\sigma^2}{n^2} = \frac{\sigma^2}{n}$$
Summary

Let x₁, x₂, …, xₙ be a sample from a normal distribution with mean μ and
standard deviation σ. Then

$$\bar{X} \sim N\!\left(\mu,\; \sigma^2/n\right)$$

[Graph: Population distribution vs. distribution of x̄]
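A simulation sketch of this summary result, NumPy assumed:

```python
import numpy as np

mu, sigma, n = 40.0, 10.0, 25
rng = np.random.default_rng(13)
xbar = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)  # 100,000 sample means

print(xbar.mean(), mu)                  # ≈ 40
print(xbar.std(), sigma / np.sqrt(n))   # ≈ 2
```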
Application: Simple Regression Framework

Let the random variable Yᵢ be determined by:

$$Y_i = \alpha + \beta X_i + \varepsilon_i, \qquad i = 1, 2, \ldots, n$$

where the Xᵢ's are (exogenous or pre-determined) numbers, and α and β are
constants. Let the random variable εᵢ follow a normal distribution with
zero mean and constant variance. That is,

$$\varepsilon_i \sim N(0, \sigma^2), \qquad i = 1, 2, \ldots, n.$$

Then,

$$E(Y_i) = E(\alpha + \beta X_i + \varepsilon_i) = \alpha + \beta X_i$$

$$Var(Y_i) = Var(\alpha + \beta X_i + \varepsilon_i) = Var(\varepsilon_i) = \sigma^2.$$

Since Yᵢ is a linear function of a normal RV => Yᵢ ~ N(α + βXᵢ, σ²).
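A simulation sketch of this framework, NumPy assumed; the parameter values are purely illustrative:

```python
import numpy as np

alpha, beta, sigma = 1.0, 2.0, 0.5
x_i = 3.0  # a fixed (pre-determined) regressor value
rng = np.random.default_rng(17)
y = alpha + beta * x_i + rng.normal(0, sigma, 100_000)  # Y_i = alpha + beta*X_i + eps_i

print(y.mean(), alpha + beta * x_i)  # ≈ 7.0
print(y.std(), sigma)                # ≈ 0.5
```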
Central and Non-central Distributions

Noncentrality parameters are parameters of families of probability
distributions which are related to other central families of distributions.
If the noncentrality parameter of a distribution is zero, the
distribution is identical to a distribution in the central family.

For example, the standard Student's t-distribution is the central family
of distributions for the noncentral t-distribution family.

Noncentrality parameters are used in the following distributions:
- t-distribution
- F-distribution
- χ² distribution
- χ distribution
- Beta distribution

In general, noncentrality parameters occur in distributions that are
transformations of a normal distribution. The central versions are
derived from zero-mean normal distributions; the noncentral versions
generalize to arbitrary means.

Example: The standard (central) χ² distribution is the distribution of a
sum of squared independent N(0,1) RVs. The noncentral χ² distribution
generalizes this to N(μ, σ²) RVs.

There are extended versions of these distributions with two
noncentrality parameters: the doubly noncentral beta distribution, the
doubly noncentral F-distribution and the doubly noncentral t-distribution.

Central and Non-central Distributions

Extended versions with two noncentrality parameters arise for
distributions that are defined as the quotient of two independent
distributions.

When both source distributions are central, the result is a central
distribution. When one is noncentral, a (singly) noncentral distribution
results, while if both are noncentral, the result is a doubly noncentral
distribution.

Example: A t-distribution is defined (ignoring constants) as the ratio
of a N(0,1) RV and the square root of an independent χ² RV. Both
RVs can have noncentrality parameters. This produces a doubly
noncentral t-distribution.
The Central Limit Theorem (CLT): Preliminaries

The proof of the CLT is very simple using moment generating
functions. We will rely on the following result:

Let m₁(t), m₂(t), …, be a sequence of moment generating functions
corresponding to the sequence of distribution functions F₁(x), F₂(x), …
Let m(t) be a moment generating function corresponding to the
distribution function F(x).

Then, if

$$\lim_{i\to\infty} m_i(t) = m(t) \quad \text{for all } t \text{ in an interval about } 0,$$

then

$$\lim_{i\to\infty} F_i(x) = F(x) \quad \text{for all } x.$$

The Central Limit Theorem (CLT)

Let x₁, x₂, …, xₙ be a sequence of independent and identically
distributed RVs with finite mean μ and finite variance σ². Then, as n
increases, x̄, the sample mean, approaches the normal distribution with
mean μ and variance σ²/n.

This theorem is sometimes stated as

$$\sqrt{n}\,\frac{(\bar{x} - \mu)}{\sigma} \;\xrightarrow{d}\; N(0,1)$$

where $\xrightarrow{d}$ means the limiting distribution (asymptotic
distribution) is N(0,1) (convergence in distribution).

Note: This CLT is sometimes referred to as the Lindeberg-Lévy CLT.
Proof: (use moment generating functions)

Let x₁, x₂, … be a sequence of independent random variables from a
distribution with moment generating function m(t) and CDF F(x).

Let Sₙ = x₁ + x₂ + … + xₙ; then

$$m_{S_n}(t) = m_{x_1}(t)\,m_{x_2}(t)\cdots m_{x_n}(t) = \left[m(t)\right]^n$$

Now

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{S_n}{n}$$

or

$$m_{\bar{x}}(t) = m_{S_n}\!\left(\frac{t}{n}\right) = \left[m\!\left(\frac{t}{n}\right)\right]^n$$

Let

$$z = \sqrt{n}\,\frac{\bar{x} - \mu}{\sigma} = \frac{\sqrt{n}}{\sigma}\,\bar{x} - \frac{\sqrt{n}\,\mu}{\sigma}$$

then

$$m_z(t) = e^{-\frac{\sqrt{n}\,\mu}{\sigma}t}\,m_{\bar{x}}\!\left(\frac{\sqrt{n}}{\sigma}\,t\right) = e^{-\frac{\sqrt{n}\,\mu}{\sigma}t}\left[m\!\left(\frac{t}{\sqrt{n}\,\sigma}\right)\right]^n$$

and

$$\ln m_z(t) = -\frac{\sqrt{n}\,\mu}{\sigma}\,t + n\,\ln m\!\left(\frac{t}{\sqrt{n}\,\sigma}\right)$$

Let

$$u = \frac{t}{\sqrt{n}\,\sigma} \quad \text{or} \quad \sqrt{n} = \frac{t}{u\,\sigma}, \qquad n = \frac{t^2}{u^2\sigma^2}$$

then

$$\ln m_z(t) = -\frac{\mu\,t^2}{u\,\sigma^2} + \frac{t^2}{u^2\sigma^2}\,\ln m(u) = \frac{t^2}{\sigma^2}\,\frac{\ln m(u) - \mu u}{u^2}$$

Now

$$\lim_{n\to\infty} \ln m_z(t) = \lim_{u\to 0} \frac{t^2}{\sigma^2}\,\frac{\ln m(u) - \mu u}{u^2}$$

$$= \frac{t^2}{\sigma^2}\,\lim_{u\to 0} \frac{\dfrac{m'(u)}{m(u)} - \mu}{2u} \quad \text{(using L'Hopital's rule)}$$

$$= \frac{t^2}{\sigma^2}\,\lim_{u\to 0} \frac{\dfrac{m''(u)\,m(u) - \left[m'(u)\right]^2}{\left[m(u)\right]^2}}{2} \quad \text{(using L'Hopital's rule again)}$$

$$= \frac{t^2}{\sigma^2}\cdot\frac{m''(0)\,m(0) - \left[m'(0)\right]^2}{2} = \frac{t^2}{\sigma^2}\cdot\frac{E\!\left[x_i^2\right] - \left(E[x_i]\right)^2}{2} = \frac{t^2}{\sigma^2}\cdot\frac{\sigma^2}{2} = \frac{t^2}{2}$$

thus

$$\lim_{n\to\infty} \ln m_z(t) = \frac{t^2}{2} \qquad \text{and} \qquad \lim_{n\to\infty} m_z(t) = e^{\frac{t^2}{2}}$$

Now $e^{t^2/2}$ is the moment generating function of the standard normal
distribution.

Thus the limiting distribution of z is the standard normal distribution, i.e.

$$\lim_{n\to\infty} F_z(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{u^2}{2}}\,du$$

Q.E.D.
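A simulation sketch of the theorem, NumPy/SciPy assumed: for a skewed parent distribution (Exponential(1), skewness 2), the skewness of z = √n(x̄ − μ)/σ shrinks like 2/√n, i.e., the standardized mean approaches N(0,1).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(21)
mu = sigma = 1.0  # Exponential(1) has mean 1 and standard deviation 1

for n in (4, 25, 400):
    xbar = rng.exponential(1.0, size=(100_000, n)).mean(axis=1)
    z = np.sqrt(n) * (xbar - mu) / sigma
    # Empirical skewness of z vs. the theoretical 2 / sqrt(n).
    print(n, round(float(stats.skew(z)), 3), round(2 / np.sqrt(n), 3))
```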
CLT: Asymptotic Distribution

The CLT states that the asymptotic distribution of the sample mean is a
normal distribution.

Q: What is meant by asymptotic distribution?

In general, the asymptotic distribution of Xₙ means a non-constant
random variable X, along with real-number sequences {aₙ} and {bₙ}, such
that aₙ(Xₙ - bₙ) has as limiting distribution the distribution of X.

In this case, the distribution of X might be referred to as the asymptotic or
limiting distribution of either Xₙ or of aₙ(Xₙ - bₙ), depending on the
context.

The real-number sequence {aₙ} plays the role of a stabilizing
transformation, to make sure the transformed RV -i.e., aₙ(Xₙ - bₙ)- does
not have zero variance as n → ∞.

Example: x̄ has zero variance as n → ∞. But n^{1/2}(x̄ - μ) has a finite
variance, σ².
CLT: Remarks

The CLT gives only an asymptotic distribution. As an approximation for a
finite number of observations, it provides a reasonable approximation only
when the observations are close to the mean; it requires a very large
number of observations to stretch into the tails.

The CLT also applies in the case of sequences that are not identically
distributed. Extra conditions need to be imposed on the RVs.

Lindeberg found a condition on the sequence {Xₙ} which guarantees that
the distribution of Sₙ is asymptotically normally distributed. W. Feller
showed that Lindeberg's condition is necessary as well (if the condition
does not hold, then the sum Sₙ is not asymptotically normally distributed).

A sufficient condition that is stronger (but easier to state) than
Lindeberg's condition is that there exists a constant A such that |Xₙ| < A
for all n.

Jarl W. Lindeberg (1876 - 1932)
Vilibald S. (Willy) Feller (1906 - 1970)
Paul Pierre Lévy (1886 - 1971)