
Course: Theory of Probability I
Term: Fall 2013
Instructor: Gordan Zitkovic

Lecture 8
Characteristic Functions

First properties
A characteristic function is simply the Fourier transform, in probabilistic language. Since we will be integrating complex-valued functions, we define (both integrals on the right need to exist)

$$\int f \, d\mu = \int \Re f \, d\mu + i \int \Im f \, d\mu,$$

where $\Re f$ and $\Im f$ denote the real and the imaginary part of a function $f : \mathbb{R} \to \mathbb{C}$. The reader will easily figure out which properties of the integral transfer from the real case.
Definition 8.1. The characteristic function of a probability measure $\mu$ on $\mathcal{B}(\mathbb{R})$ is the function $\varphi_\mu : \mathbb{R} \to \mathbb{C}$ given by

$$\varphi_\mu(t) = \int e^{itx} \, \mu(dx).$$
When we speak of the characteristic function $\varphi_X$ of a random variable $X$, we have the characteristic function $\varphi_{\mu_X}$ of its distribution $\mu_X$ in mind. Note, moreover, that

$$\varphi_X(t) = E[e^{itX}].$$
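
Since the characteristic function is just the expectation of $e^{itX}$, it can be approximated by Monte Carlo. The following minimal sketch (not part of the original notes; the sample size and grid are arbitrary choices) compares the empirical characteristic function of standard normal samples with the closed form $e^{-t^2/2}$ listed later in Example 8.6:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(100_000)            # samples of X ~ N(0,1)
t = np.linspace(-3.0, 3.0, 13)              # a small grid of arguments

# Monte Carlo estimate of phi_X(t) = E[exp(itX)], one row per value of t
phi_hat = np.exp(1j * np.outer(t, X)).mean(axis=1)
phi_exact = np.exp(-t**2 / 2)               # closed form for N(0,1)

print(np.max(np.abs(phi_hat - phi_exact)))  # typically of order 1e-2
```
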
While difficult to visualize, characteristic functions can be used to
learn a lot about the random variables they correspond to. We start
with some properties which follow directly from the definition:
Proposition 8.2. Let $X$, $Y$ and $\{X_n\}_{n\in\mathbb{N}}$ be random variables.

1. $\varphi_X(0) = 1$ and $|\varphi_X(t)| \le 1$, for all $t$.

2. $\varphi_{-X}(t) = \overline{\varphi_X(t)}$, where the bar denotes complex conjugation.

3. $\varphi_X$ is uniformly continuous.

4. If $X$ and $Y$ are independent, then $\varphi_{X+Y} = \varphi_X \, \varphi_Y$.

5. For all $t_1 < t_2 < \dots < t_n$, the matrix $A = (a_{jk})_{1 \le j,k \le n}$ given by $a_{jk} = \varphi_X(t_j - t_k)$ is Hermitian and positive semidefinite, i.e., $A = A^*$ and $\bar{\xi}^\top A \xi \ge 0$, for any $\xi \in \mathbb{C}^n$.

6. If $X_n \xrightarrow{D} X$, then $\varphi_{X_n}(t) \to \varphi_X(t)$, for each $t \in \mathbb{R}$.

Note: We do not prove (or use) it in these notes, but it can be shown that a function $\varphi : \mathbb{R} \to \mathbb{C}$, continuous at the origin with $\varphi(0) = 1$, is a characteristic function of some probability measure $\mu$ on $\mathcal{B}(\mathbb{R})$ if and only if it is positive semidefinite, i.e., if it satisfies part 5. of Proposition 8.2. This is known as Bochner's theorem.

Proof.

1. Immediate.

2. $e^{-itx} = \overline{e^{itx}}$.

3. We have $|\varphi_X(t) - \varphi_X(s)| = \left| \int (e^{itx} - e^{isx}) \, \mu(dx) \right| \le h(t-s)$, where $h(u) = \int |e^{iux} - 1| \, \mu(dx)$. Since $|e^{iux} - 1| \le 2$, the dominated convergence theorem implies that $\lim_{u \to 0} h(u) = 0$, and, so, $\varphi_X$ is uniformly continuous.

4. Independence of $X$ and $Y$ implies the independence of $\exp(itX)$ and $\exp(itY)$. Therefore,

$$\varphi_{X+Y}(t) = E[e^{it(X+Y)}] = E[e^{itX} e^{itY}] = E[e^{itX}] \, E[e^{itY}] = \varphi_X(t) \, \varphi_Y(t).$$

5. The matrix $A$ is Hermitian by 2. above. To see that it is positive semidefinite, note that $a_{jk} = E[e^{it_j X} \overline{e^{it_k X}}]$, and so

$$\sum_{j=1}^n \sum_{k=1}^n \xi_j \bar{\xi}_k a_{jk} = E\Big[ \Big( \sum_{j=1}^n \xi_j e^{it_j X} \Big) \overline{\Big( \sum_{k=1}^n \xi_k e^{it_k X} \Big)} \Big] = E\Big[ \Big| \sum_{j=1}^n \xi_j e^{it_j X} \Big|^2 \Big] \ge 0.$$

6. The functions $x \mapsto \cos(tx)$ and $x \mapsto \sin(tx)$ are bounded and continuous, so it suffices to apply the definition of weak convergence.
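
Property 4 is easy to test numerically. A small sketch (illustration only; the two distributions are arbitrary choices) checking $\varphi_{X+Y} = \varphi_X \varphi_Y$ on independent samples:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
X = rng.exponential(scale=1.0, size=n)   # X ~ Exponential(1)
Y = rng.standard_normal(n)               # Y ~ N(0,1), independent of X

def ecf(samples, t):
    # empirical characteristic function at the point t
    return np.exp(1j * t * samples).mean()

for t in (0.5, 1.0, 2.0):
    lhs = ecf(X + Y, t)
    rhs = ecf(X, t) * ecf(Y, t)
    print(t, abs(lhs - rhs))             # small, up to Monte Carlo error
```
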
Here is a simple problem you can use to test your understanding of the definitions:

Problem 8.1. Let $\mu$ and $\nu$ be two probability measures on $\mathcal{B}(\mathbb{R})$, and let $\varphi_\mu$ and $\varphi_\nu$ be their characteristic functions. Show that Parseval's identity holds:

$$\int_{\mathbb{R}} e^{-its} \varphi_\mu(t) \, \nu(dt) = \int_{\mathbb{R}} \varphi_\nu(t - s) \, \mu(dt), \quad \text{for all } s \in \mathbb{R}.$$
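
For discrete measures both sides of Parseval's identity are finite sums, so the identity can be checked directly. A sketch with two hypothetical toy measures (the atoms, weights, and the value of $s$ are arbitrary choices):

```python
import numpy as np

# two toy discrete probability measures: atoms and weights (arbitrary)
x_mu, w_mu = np.array([-1.0, 0.5, 2.0]), np.array([0.2, 0.5, 0.3])
x_nu, w_nu = np.array([0.0, 1.0]), np.array([0.6, 0.4])

phi_mu = lambda t: np.sum(w_mu * np.exp(1j * t * x_mu))
phi_nu = lambda t: np.sum(w_nu * np.exp(1j * t * x_nu))

s = 0.7
lhs = sum(w * np.exp(-1j * t * s) * phi_mu(t) for t, w in zip(x_nu, w_nu))
rhs = sum(w * phi_nu(x - s) for x, w in zip(x_mu, w_mu))
print(abs(lhs - rhs))                    # ~0 up to floating-point error
```
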

Our next result shows that $\mu$ can be recovered from its characteristic function $\varphi_\mu$:

Theorem 8.3 (Inversion theorem). Let $\mu$ be a probability measure on $\mathcal{B}(\mathbb{R})$, and let $\varphi = \varphi_\mu$ be its characteristic function. Then, for $a < b \in \mathbb{R}$, we have

$$\mu((a,b)) + \tfrac{1}{2} \mu(\{a,b\}) = \lim_{T \to \infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it} \, \varphi(t) \, dt. \qquad (8.1)$$

Proof. We start by picking $a < b$ and noting that

$$\frac{e^{-ita} - e^{-itb}}{it} = \int_a^b e^{-ity} \, dy,$$

so that, by Fubini's theorem, the integral in (8.1) is well-defined:

$$F(a,b,T) = \int_{[-T,T] \times [a,b]} \exp(-ity) \, \varphi(t) \, dy \, dt, \quad \text{where} \quad F(a,b,T) = \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it} \, \varphi(t) \, dt.$$

Another use of Fubini's theorem yields:

$$F(a,b,T) = \int_{\mathbb{R}} \int_{[-T,T] \times [a,b]} \exp(-ity) \exp(itx) \, dy \, dt \, \mu(dx) = \int_{\mathbb{R}} \int_{[-T,T] \times [a,b]} \exp(-it(y-x)) \, dy \, dt \, \mu(dx) = \int_{\mathbb{R}} \int_{[-T,T]} \tfrac{1}{it} \left( e^{-it(a-x)} - e^{-it(b-x)} \right) dt \, \mu(dx).$$

Set

$$f(a,b,T;x) = \int_{-T}^{T} \tfrac{1}{it} \left( e^{-it(a-x)} - e^{-it(b-x)} \right) dt \quad \text{and} \quad K(T;c) = \int_0^T \frac{\sin(ct)}{t} \, dt,$$

and note that, since $\cos$ is an even and $\sin$ an odd function, we have

$$f(a,b,T;x) = -2 \int_0^T \frac{\sin((a-x)t)}{t} \, dt + 2 \int_0^T \frac{\sin((b-x)t)}{t} \, dt = -2K(T; a-x) + 2K(T; b-x).$$

Note: The integral $\int_{-T}^{T} \frac{1}{it} \exp(-it(a-x)) \, dt$ is not defined; we really need to work with the full $f(a,b,T;x)$ to get the right cancellation.
Since

$$K(T;c) = \int_0^T \frac{\sin(ct)}{ct} \, d(ct) = \begin{cases} \int_0^{cT} \frac{\sin(s)}{s} \, ds = K(cT; 1), & c > 0, \\ 0, & c = 0, \\ -K(|c|T; 1), & c < 0, \end{cases} \qquad (8.2)$$

Problem 5.11 implies that

$$\lim_{T \to \infty} K(T;c) = \begin{cases} \frac{\pi}{2}, & c > 0, \\ 0, & c = 0, \\ -\frac{\pi}{2}, & c < 0, \end{cases}$$


and so

$$\lim_{T \to \infty} f(a,b,T;x) = \begin{cases} 0, & x \in [a,b]^c, \\ \pi, & x = a \text{ or } x = b, \\ 2\pi, & a < x < b. \end{cases}$$

Observe first that the function $T \mapsto K(T;1)$ is continuous on $[0,\infty)$ and has a finite limit as $T \to \infty$, so that $\sup_{T \ge 0} |K(T;1)| < \infty$. Furthermore, (8.2) implies that $|K(T;c)| \le \sup_{T \ge 0} K(T;1)$ for any $c \in \mathbb{R}$ and $T \ge 0$, so that

$$\sup\{ |f(a,b,T;x)| : x \in \mathbb{R},\ T \ge 0 \} < \infty.$$

Therefore, we can use the dominated convergence theorem to get that

$$\lim_{T \to \infty} \frac{1}{2\pi} F(a,b,T) = \lim_{T \to \infty} \frac{1}{2\pi} \int f(a,b,T;x) \, \mu(dx) = \frac{1}{2\pi} \int \lim_{T \to \infty} f(a,b,T;x) \, \mu(dx) = \tfrac{1}{2} \mu(\{a\}) + \mu((a,b)) + \tfrac{1}{2} \mu(\{b\}).$$
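
Before moving on, here is a quick numerical sanity check of (8.1), not part of the notes: for $\mu = N(0,1)$ (with $\varphi(t) = e^{-t^2/2}$, see Example 8.6 below) and $a = -1$, $b = 1$, the truncated integral should approach $\Phi(1) - \Phi(-1) \approx 0.6827$; there are no atoms to worry about. The truncation level $T$ is an arbitrary choice.

```python
import numpy as np
from scipy import integrate, stats

phi = lambda t: np.exp(-t**2 / 2)        # characteristic function of N(0,1)
a, b, T = -1.0, 1.0, 200.0

def integrand(t):
    # real part suffices: the imaginary part integrates to 0 by symmetry
    if t == 0.0:
        return b - a                     # continuous extension at t = 0
    return (((np.exp(-1j*t*a) - np.exp(-1j*t*b)) / (1j*t)) * phi(t)).real

val, _ = integrate.quad(integrand, -T, T, points=[0.0], limit=500)
print(val / (2*np.pi))                           # ~0.6827
print(stats.norm.cdf(b) - stats.norm.cdf(a))     # reference value
```
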


Corollary 8.4. For probability measures $\mu_1$ and $\mu_2$ on $\mathcal{B}(\mathbb{R})$, the equality $\varphi_{\mu_1} = \varphi_{\mu_2}$ implies that $\mu_1 = \mu_2$.

Proof. By Theorem 8.3, we have $\mu_1((a,b)) = \mu_2((a,b))$ for all $a, b \in C$, where $C$ is the set of all $x \in \mathbb{R}$ such that $\mu_1(\{x\}) = \mu_2(\{x\}) = 0$. Since $C^c$ is at most countable, it is straightforward to see that the family $\{(a,b) : a, b \in C\}$ of intervals is a $\pi$-system which generates $\mathcal{B}(\mathbb{R})$.

Corollary 8.5. Suppose that $\int_{\mathbb{R}} |\varphi_\mu(t)| \, dt < \infty$. Then $\mu \ll \lambda$ and $\frac{d\mu}{d\lambda}$ is a bounded and continuous function given by

$$\frac{d\mu}{d\lambda} = f, \quad \text{where} \quad f(x) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{-itx} \varphi_\mu(t) \, dt \quad \text{for } x \in \mathbb{R}.$$



Proof. Since $\varphi$ is integrable and $|e^{-itx}| = 1$, $f$ is well defined. For $a < b$ we have

$$\int_a^b f(x) \, dx = \frac{1}{2\pi} \int_a^b \int_{\mathbb{R}} e^{-itx} \varphi(t) \, dt \, dx = \frac{1}{2\pi} \int_{\mathbb{R}} \varphi(t) \int_a^b e^{-itx} \, dx \, dt = \frac{1}{2\pi} \int_{\mathbb{R}} \frac{e^{-ita} - e^{-itb}}{it} \, \varphi(t) \, dt = \lim_{T \to \infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it} \, \varphi(t) \, dt = \mu((a,b)) + \tfrac{1}{2} \mu(\{a,b\}), \qquad (8.3)$$

by Theorem 8.3, where the use of Fubini's theorem above is justified by the fact that the function $(t,x) \mapsto e^{-itx} \varphi(t)$ is integrable on $[a,b] \times \mathbb{R}$, for all $a < b$. For $a, b$ such that $\mu(\{a\}) = \mu(\{b\}) = 0$, equation (8.3) implies that $\mu((a,b)) = \int_a^b f(x) \, dx$. The claim now follows by the $\pi$-$\lambda$-theorem.
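
A numerical illustration of Corollary 8.5 (a sketch, not part of the notes): recovering the standard normal density from $\varphi(t) = e^{-t^2/2}$. Since this $\varphi$ is real and even, the integrand may be replaced by $\cos(tx)\varphi(t)$; the truncation $T = 40$ is an arbitrary choice that makes the neglected tail negligible.

```python
import numpy as np
from scipy import integrate, stats

phi = lambda t: np.exp(-t**2 / 2)        # integrable char. function of N(0,1)

def density(x, T=40.0):
    # f(x) = (1/2pi) int exp(-itx) phi(t) dt; for real, even phi this
    # reduces to the cosine integral below
    val, _ = integrate.quad(lambda t: np.cos(t * x) * phi(t), -T, T)
    return val / (2 * np.pi)

for x in (0.0, 1.0, 2.0):
    print(density(x), stats.norm.pdf(x))   # the two columns agree
```
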
Example 8.6. Here is a list of some common distributions and the
corresponding characteristic functions:
1. Continuous distributions.
| Name | Parameters | Density $f_X(x)$ | Ch. function $\varphi_X(t)$ |
| --- | --- | --- | --- |
| Uniform | $a < b$ | $\frac{1}{b-a} \mathbf{1}_{[a,b]}(x)$ | $\frac{e^{itb} - e^{ita}}{it(b-a)}$ |
| Symmetric Uniform | $a > 0$ | $\frac{1}{2a} \mathbf{1}_{[-a,a]}(x)$ | $\frac{\sin(at)}{at}$ |
| Normal | $\mu \in \mathbb{R}$, $\sigma > 0$ | $\frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)$ | $\exp(i\mu t - \frac{1}{2} \sigma^2 t^2)$ |
| Exponential | $\lambda > 0$ | $\lambda \exp(-\lambda x) \mathbf{1}_{[0,\infty)}(x)$ | $\frac{\lambda}{\lambda - it}$ |
| Double Exponential | $\lambda > 0$ | $\frac{\lambda}{2} \exp(-\lambda |x|)$ | $\frac{\lambda^2}{\lambda^2 + t^2}$ |
| Cauchy | $\mu \in \mathbb{R}$, $\gamma > 0$ | $\frac{\gamma}{\pi(\gamma^2 + (x-\mu)^2)}$ | $\exp(i\mu t - \gamma |t|)$ |

2. Discrete distributions.

| Name | Parameters | Distribution $\mu_X$ | Ch. function $\varphi_X(t)$ |
| --- | --- | --- | --- |
| Dirac | $c \in \mathbb{R}$ | $\delta_c$ | $\exp(itc)$ |
| Biased Coin-toss | $p \in (0,1)$ | $p\delta_1 + (1-p)\delta_{-1}$ | $\cos(t) + (2p-1) i \sin(t)$ |
| Geometric | $p \in (0,1)$ | $\sum_{n \in \mathbb{N}_0} p^n (1-p) \, \delta_n$ | $\frac{1-p}{1 - e^{it} p}$ |
| Poisson | $\lambda > 0$ | $\sum_{n \in \mathbb{N}_0} e^{-\lambda} \frac{\lambda^n}{n!} \, \delta_n$ | $\exp(\lambda(e^{it} - 1))$ |

3. A singular distribution.

| Name | Ch. function $\varphi_X(t)$ |
| --- | --- |
| Cantor | $e^{it/2} \prod_{k=1}^{\infty} \cos\left( \frac{t}{3^k} \right)$ |
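
The Cantor characteristic function is easy to evaluate by truncating the infinite product; the factors converge to 1 rapidly, so a hypothetical truncation level like $K = 60$ is effectively exact in double precision. The sketch below (not part of the notes) also hints at Problem 8.5 (3): $|\varphi_X|$ is constant along $t_n = 2\pi \cdot 3^n$, so it does not vanish at infinity even though the Cantor measure has no atoms.

```python
import numpy as np

def phi_cantor(t, K=60):
    # truncated product exp(it/2) * prod_{k=1..K} cos(t / 3^k)
    powers = 3.0 ** np.arange(1, K + 1)
    return np.exp(1j * t / 2) * np.prod(np.cos(t / powers))

for n in range(1, 6):
    t_n = 2 * np.pi * 3.0 ** n
    print(n, abs(phi_cantor(t_n)))   # the same positive value every time
```
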

Tail behavior
We continue by describing several methods one can use to extract useful information about the tails of the underlying probability distribution from a characteristic function.
Proposition 8.7. Let $X$ be a random variable. If $E[|X|^n] < \infty$, then $\frac{d^n}{dt^n} \varphi_X(t)$ exists for all $t$ and

$$\frac{d^n}{dt^n} \varphi_X(t) = E[e^{itX} (iX)^n].$$

In particular,

$$E[X^n] = (-i)^n \frac{d^n}{dt^n} \varphi_X(0).$$


Proof. We give the proof in the case $n = 1$ and leave the general case to the reader:

$$\lim_{h \to 0} \frac{\varphi(h) - \varphi(0)}{h} = \lim_{h \to 0} \int_{\mathbb{R}} \frac{e^{ihx} - 1}{h} \, \mu(dx) = \int_{\mathbb{R}} \lim_{h \to 0} \frac{e^{ihx} - 1}{h} \, \mu(dx) = \int_{\mathbb{R}} ix \, \mu(dx),$$

where the passage of the limit under the integral sign is justified by the dominated convergence theorem which, in turn, can be used since

$$\left| \frac{e^{ihx} - 1}{h} \right| \le |x|, \quad \text{and} \quad \int_{\mathbb{R}} |x| \, \mu(dx) = E[|X|] < \infty.$$
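
Proposition 8.7 suggests a numerical route to moments: differentiate $\varphi$ at $0$. A minimal sketch using a central difference on the exponential characteristic function from Example 8.6 (the rate $\lambda$ and the step $h$ are arbitrary choices):

```python
import numpy as np

lam = 2.0
phi = lambda t: lam / (lam - 1j * t)   # char. function of Exponential(lam)

h = 1e-5
dphi0 = (phi(h) - phi(-h)) / (2 * h)   # central difference for phi'(0)
print((-1j * dphi0).real)              # ~0.5 = E[X] = 1/lam,
                                       # since E[X] = (-i) phi'(0)
```
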

Remark 8.8.

1. It can be shown that for $n$ even, the existence of $\frac{d^n}{dt^n} \varphi_X(0)$ (in the appropriate sense) implies the finiteness of the $n$-th moment $E[|X|^n]$.

2. When $n$ is odd, it can happen that $\frac{d^n}{dt^n} \varphi_X(0)$ exists, but $E[|X|^n] = \infty$; see Problem 8.6.

Finer estimates of the tails of a probability distribution can be obtained by a finer analysis of the behavior of $\varphi$ around $0$:

Proposition 8.9. Let $\mu$ be a probability measure on $\mathcal{B}(\mathbb{R})$ and let $\varphi = \varphi_\mu$ be its characteristic function. Then, for $\varepsilon > 0$ we have

$$\mu\left( \left[ -\tfrac{2}{\varepsilon}, \tfrac{2}{\varepsilon} \right]^c \right) \le \frac{1}{\varepsilon} \int_{-\varepsilon}^{\varepsilon} (1 - \varphi(t)) \, dt.$$

Proof. Let $X$ be a random variable with distribution $\mu$. We start by using Fubini's theorem to get

$$\frac{1}{2\varepsilon} \int_{-\varepsilon}^{\varepsilon} (1 - \varphi(t)) \, dt = \frac{1}{2\varepsilon} E\left[ \int_{-\varepsilon}^{\varepsilon} (1 - e^{itX}) \, dt \right] = \frac{1}{\varepsilon} E\left[ \int_0^{\varepsilon} (1 - \cos(tX)) \, dt \right] = E\left[ 1 - \frac{\sin(\varepsilon X)}{\varepsilon X} \right].$$

It remains to observe that $1 - \frac{\sin(x)}{x} \ge 0$ and $1 - \frac{\sin(x)}{x} \ge 1 - \frac{1}{|x|}$ for all $x$. Therefore, if we use the first inequality on $[-2,2]$ and the second one on $[-2,2]^c$, we get $1 - \frac{\sin(x)}{x} \ge \frac{1}{2} \mathbf{1}_{\{|x| > 2\}}$, so that

$$\frac{1}{2\varepsilon} \int_{-\varepsilon}^{\varepsilon} (1 - \varphi(t)) \, dt \ge \tfrac{1}{2} P[|\varepsilon X| > 2] = \tfrac{1}{2} \mu\left( \left[ -\tfrac{2}{\varepsilon}, \tfrac{2}{\varepsilon} \right]^c \right).$$
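
A numerical check of Proposition 8.9 (a sketch, not in the notes) for the standard Cauchy distribution, whose characteristic function $e^{-|t|}$ appears in Example 8.6 with $\mu = 0$, $\gamma = 1$; note that $1 - \varphi$ is real here:

```python
import numpy as np
from scipy import integrate, stats

phi = lambda t: np.exp(-np.abs(t))       # standard Cauchy char. function

for eps in (0.1, 0.5, 1.0):
    rhs, _ = integrate.quad(lambda t: 1.0 - phi(t), -eps, eps)
    rhs /= eps                            # (1/eps) int_{-eps}^{eps} (1 - phi)
    lhs = 2 * stats.cauchy.sf(2.0 / eps)  # mu([-2/eps, 2/eps]^c) by symmetry
    print(eps, lhs, rhs, lhs <= rhs)      # the inequality holds
```
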

Problem 8.2. Use the inequality of Proposition 8.9 to show that if $\varphi_\mu(t) = 1 + O(|t|^\alpha)$ for some $\alpha > 0$, then $\int_{\mathbb{R}} |x|^\beta \, \mu(dx) < \infty$, for all $\beta < \alpha$. Give an example where $\int_{\mathbb{R}} |x|^\alpha \, \mu(dx) = \infty$.

Note: $f(t) = g(t) + O(h(t))$ means that, for some $\delta > 0$, we have

$$\sup_{|t| \le \delta} \frac{|f(t) - g(t)|}{h(t)} < \infty.$$

Problem 8.3 (Riemann-Lebesgue theorem). Suppose that $\mu \ll \lambda$. Show that

$$\lim_{t \to \infty} \varphi_\mu(t) = \lim_{t \to -\infty} \varphi_\mu(t) = 0.$$

Hint: Use (and prove) the fact that an $f \in L^1_+(\mathbb{R})$ can be approximated in $L^1(\mathbb{R})$ by a function of the form $\sum_{k=1}^n \alpha_k \mathbf{1}_{[a_k, b_k]}$.


The continuity theorem

Theorem 8.10 (Continuity theorem). Let $\{\mu_n\}_{n\in\mathbb{N}}$ be a sequence of probability distributions on $\mathcal{B}(\mathbb{R})$, and let $\{\varphi_n\}_{n\in\mathbb{N}}$ be the sequence of their characteristic functions. Suppose that there exists a function $\varphi : \mathbb{R} \to \mathbb{C}$ such that

1. $\varphi_n(t) \to \varphi(t)$, for all $t \in \mathbb{R}$, and

2. $\varphi$ is continuous at $t = 0$.

Then, $\varphi$ is the characteristic function of a probability measure $\mu$ on $\mathcal{B}(\mathbb{R})$ and $\mu_n \xrightarrow{w} \mu$.
Proof. We start by showing that the continuity of the limit $\varphi$ implies tightness of $\{\mu_n\}_{n\in\mathbb{N}}$. Given $\varepsilon > 0$ there exists $\delta > 0$ such that $|1 - \varphi(t)| \le \varepsilon/2$ for $|t| \le \delta$. By Proposition 8.9 and the dominated convergence theorem we have

$$\limsup_n \mu_n\left( \left[ -\tfrac{2}{\delta}, \tfrac{2}{\delta} \right]^c \right) \le \limsup_n \frac{1}{\delta} \int_{-\delta}^{\delta} (1 - \varphi_n(t)) \, dt = \frac{1}{\delta} \int_{-\delta}^{\delta} (1 - \varphi(t)) \, dt \le \varepsilon.$$

By taking an even smaller $\delta' \in (0, \delta)$, we can guarantee that

$$\sup_{n \in \mathbb{N}} \mu_n\left( \left[ -\tfrac{2}{\delta'}, \tfrac{2}{\delta'} \right]^c \right) \le 2\varepsilon,$$

which, together with the arbitrariness of $\varepsilon > 0$, implies that $\{\mu_n\}_{n\in\mathbb{N}}$ is tight.

Let $\{\mu_{n_k}\}_{k\in\mathbb{N}}$ be a convergent subsequence of $\{\mu_n\}_{n\in\mathbb{N}}$, and let $\mu$ be its limit. Since $\varphi_{n_k} \to \varphi$, we conclude that $\varphi$ is the characteristic function of $\mu$. It remains to show that the whole sequence converges to $\mu$ weakly. This follows, however, directly from Problem 7.4, since any convergent subsequence $\{\mu_{n_k}\}_{k\in\mathbb{N}}$ has the same limit $\mu$.
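
A classical illustration of Theorem 8.10 (a sketch, not part of the notes) is the law of rare events: if $S_n \sim \mathrm{Binomial}(n, \lambda/n)$, then $\varphi_{S_n}(t) = (1 - p + p e^{it})^n$ with $p = \lambda/n$, which converges pointwise to the Poisson characteristic function $\exp(\lambda(e^{it} - 1))$ of Example 8.6, so $S_n$ converges weakly to Poisson($\lambda$). The Binomial characteristic function is standard but not derived in these notes.

```python
import numpy as np

lam = 3.0
t = np.linspace(-4.0, 4.0, 9)
phi_limit = np.exp(lam * (np.exp(1j * t) - 1))   # Poisson(lam)

for n in (10, 100, 1000, 10_000):
    p = lam / n
    phi_n = (1 - p + p * np.exp(1j * t)) ** n    # Binomial(n, p)
    print(n, np.max(np.abs(phi_n - phi_limit)))  # decreases to 0
```
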
Problem 8.4. Let $\varphi$ be a characteristic function of some probability measure $\mu$ on $\mathcal{B}(\mathbb{R})$. Show that $\psi(t) = e^{\varphi(t) - 1}$ is also a characteristic function of some probability measure on $\mathcal{B}(\mathbb{R})$.

Additional Problems
Problem 8.5 (Atoms from the characteristic function). Let $\mu$ be a probability measure on $\mathcal{B}(\mathbb{R})$, and let $\varphi = \varphi_\mu$ be its characteristic function.

1. Show that $\mu(\{a\}) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} e^{-ita} \varphi(t) \, dt$.

2. Show that if $\lim_{t \to \infty} |\varphi(t)| = \lim_{t \to -\infty} |\varphi(t)| = 0$, then $\mu$ has no atoms.

3. Show that the converse of (2) is false. Hint: Prove that $|\varphi(t_n)|$ stays constant (and positive) along a suitably chosen sequence $t_n \to \infty$, where $\varphi$ is the characteristic function of the Cantor distribution.
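
Part 1 can be tested numerically. A sketch (illustration only; the values of $p$ and $T$ are arbitrary choices) for the biased coin-toss measure $p\delta_1 + (1-p)\delta_{-1}$, whose characteristic function appears in Example 8.6:

```python
import numpy as np
from scipy import integrate

p = 0.3
phi = lambda t: np.cos(t) + (2*p - 1) * 1j * np.sin(t)

def atom(a, T=500.0):
    # (1/2T) * int_{-T}^{T} exp(-ita) phi(t) dt; the real part suffices,
    # since the imaginary part integrates to zero by symmetry
    val, _ = integrate.quad(lambda t: (np.exp(-1j*t*a) * phi(t)).real,
                            -T, T, limit=2000)
    return val / (2 * T)

print(atom(1.0), atom(-1.0), atom(0.0))   # ~0.3, ~0.7, ~0.0
```
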


Problem 8.6 (Existence of $\varphi_X'(0)$ does not imply that $X \in L^1$). Let $X$ be a random variable which takes values in $\mathbb{Z} \setminus \{-2,-1,0,1,2\}$ with

$$P[X = k] = P[X = -k] = \frac{C}{k^2 \log(k)}, \quad \text{for } k = 3, 4, \ldots,$$

where $C = \left( 2 \sum_{k \ge 3} \frac{1}{k^2 \log(k)} \right)^{-1} \in (0, \infty)$. Show that $\varphi_X'(0) = 0$, but $X \notin L^1$.

Hint: Argue that, in order to establish that $\varphi_X'(0) = 0$, it is enough to show that

$$\lim_{h \to 0} \frac{1}{h} \sum_{k \ge 3} \frac{\cos(hk) - 1}{k^2 \log(k)} = 0.$$

Then split the sum at $k$ close to $2/h$ and use (and prove) the inequality $|\cos(x) - 1| \le \min(x^2/2, |x|)$. Bounding sums by integrals may help, too.
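
A rough numerical sketch of the quantity in the hint (not part of the notes; the truncation level $K$ is a pragmatic, hypothetical choice, since the true sum is infinite and the convergence to $0$ is very slow):

```python
import numpy as np

K = 2_000_000
k = np.arange(3, K)
w = 1.0 / (k**2 * np.log(k))          # the weights 1 / (k^2 log k)
C = 1.0 / (2.0 * w.sum())             # (truncated) normalizing constant

for h in (1e-1, 1e-2, 1e-3):
    # (phi_X(h) - 1)/h = (2C/h) * sum_k (cos(hk) - 1) / (k^2 log k)
    val = 2 * C * np.sum((np.cos(h * k) - 1.0) * w) / h
    print(h, val)                     # tends to 0 as h -> 0, slowly
```
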

Problem 8.7 (Multivariate characteristic functions). Let $X = (X_1, \ldots, X_n)$ be a random vector. The characteristic function $\varphi = \varphi_X : \mathbb{R}^n \to \mathbb{C}$ is given by

$$\varphi(t_1, t_2, \ldots, t_n) = E\left[ \exp\left( i \sum_{k=1}^n t_k X_k \right) \right].$$

We will also use the shortcut $t$ for $(t_1, \ldots, t_n)$ and $t \cdot X$ for the random variable $\sum_{k=1}^n t_k X_k$. Prove the following statements:

1. Random variables $X$ and $Y$ are independent if and only if $\varphi_{(X,Y)}(t_1, t_2) = \varphi_X(t_1) \, \varphi_Y(t_2)$ for all $t_1, t_2 \in \mathbb{R}$.

2. Random vectors $X^1$ and $X^2$ have the same distribution if and only if the random variables $t \cdot X^1$ and $t \cdot X^2$ have the same distribution for all $t \in \mathbb{R}^n$. (This fact is known as Wald's device.)

Note: Take for granted the following statement (the proof of which is similar to the proof of the 1-dimensional case): Suppose that $X^1$ and $X^2$ are random vectors with $\varphi_{X^1}(t) = \varphi_{X^2}(t)$ for all $t \in \mathbb{R}^n$. Then $X^1$ and $X^2$ have the same distribution, i.e. $\mu_{X^1} = \mu_{X^2}$.

An $n$-dimensional random vector $X$ is said to be Gaussian (or, to have the multivariate normal distribution) if there exists a vector $\mu \in \mathbb{R}^n$ and a symmetric, positive semi-definite matrix $\Sigma \in \mathbb{R}^{n \times n}$ such that

$$\varphi_X(t) = \exp\left( i \, t \cdot \mu - \tfrac{1}{2} \, t^\top \Sigma \, t \right),$$

where $t$ is interpreted as a column vector, and $(\cdot)^\top$ is transposition. This is denoted as $X \sim N(\mu, \Sigma)$. $X$ is said to be non-degenerate if $\Sigma$ is positive definite.

3. Show that a random vector $X$ is Gaussian if and only if the random variable $t \cdot X$ is normally distributed (with some mean and variance) for each $t \in \mathbb{R}^n$. Note: Be careful, nothing in the second statement tells you what the mean and variance of $t \cdot X$ are.

4. Let $X = (X_1, X_2, \ldots, X_n)$ be a Gaussian random vector. Show that $X_k$ and $X_l$, $k \ne l$, are independent if and only if they are uncorrelated.

5. Construct a random vector $(X, Y)$ such that both $X$ and $Y$ are normally distributed, but $X = (X, Y)$ is not Gaussian.

6. Let $X = (X_1, X_2, \ldots, X_n)$ be a random vector consisting of $n$ independent random variables with $X_i \sim N(0,1)$. Let $\Sigma \in \mathbb{R}^{n \times n}$ be a given positive semi-definite symmetric matrix, and $\mu \in \mathbb{R}^n$ a given vector. Show that there exists an affine transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that the random vector $T(X)$ is Gaussian with $T(X) \sim N(\mu, \Sigma)$. (A numerical sketch of one such construction appears after this problem.)

7. Find a necessary and sufficient condition on $\mu$ and $\Sigma$ such that the converse of the previous problem holds true: for a Gaussian random vector $X \sim N(\mu, \Sigma)$, there exists an affine transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that $T(X)$ has independent components with the $N(0,1)$-distribution (i.e. $T(X) \sim N(0, I)$, where $I$ is the identity matrix).
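
For item 6 with a positive definite $\Sigma$, one concrete choice of $T$ is the Cholesky-based affine map $T(x) = \mu + Ax$ with $A A^\top = \Sigma$; for a merely positive semi-definite $\Sigma$, a symmetric matrix square root works instead. A minimal numerical sketch (the particular $\mu$ and $\Sigma$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])            # symmetric, positive definite

A = np.linalg.cholesky(Sigma)             # lower triangular, A @ A.T == Sigma
Z = rng.standard_normal((2, 100_000))     # independent N(0,1) components
X = mu[:, None] + A @ Z                   # the affine map T(Z) = mu + A Z

print(X.mean(axis=1))                     # ~mu
print(np.cov(X))                          # ~Sigma
```
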
Problem 8.8 (Slutsky's theorem). Let $X$, $Y$, $\{X_n\}_{n\in\mathbb{N}}$ and $\{Y_n\}_{n\in\mathbb{N}}$ be random variables defined on the same probability space, such that

$$X_n \xrightarrow{D} X \quad \text{and} \quad Y_n \xrightarrow{D} Y. \qquad (8.4)$$

Show that:

1. It is not necessarily true that $X_n + Y_n \xrightarrow{D} X + Y$. For that matter, we do not necessarily have $(X_n, Y_n) \xrightarrow{D} (X, Y)$ (where the pairs are considered as random elements in the metric space $\mathbb{R}^2$).

2. If, in addition to (8.4), there exists a constant $c \in \mathbb{R}$ such that $P[Y = c] = 1$, then $g(X_n, Y_n) \xrightarrow{D} g(X, c)$, for any continuous function $g : \mathbb{R}^2 \to \mathbb{R}$. Hint: It is enough to show that $(X_n, Y_n) \xrightarrow{D} (X, c)$. Use Problem 8.7.

Problem 8.9 (Convergence of a normal sequence).

1. Let $\{X_n\}_{n\in\mathbb{N}}$ be a sequence of normally-distributed random variables converging weakly towards a random variable $X$. Show that $X$ must be a normal random variable itself.

2. Let $\{X_n\}_{n\in\mathbb{N}}$ be a sequence of normal random variables such that $X_n \xrightarrow{a.s.} X$. Show that $X_n \xrightarrow{L^p} X$ for all $p \ge 1$.

Hint: Use this fact: for a sequence $\{\xi_n\}_{n\in\mathbb{N}}$ of real numbers, the following two statements are equivalent:

(a) $\xi_n \to \xi \in \mathbb{R}$, and

(b) $\exp(it\xi_n) \to \exp(it\xi)$, for all $t$.

You don't need to prove it, but feel free to try.

Last Updated: December 8, 2013
