
Econ 325 Section 003/004

Notes on Variance, Covariance, and Summation Operator


By Hiro Kasahara

Properties of Summation Operator


For a sequence of values {x_1, x_2, ..., x_n}, we write the sum of x_1, x_2, ..., x_{n-1}, and x_n using the
summation operator as

    x_1 + x_2 + ... + x_n = \sum_{i=1}^n x_i.                                      (1)

Given a constant c,

    \sum_{i=1}^n c x_i = c x_1 + c x_2 + ... + c x_n = c (x_1 + x_2 + ... + x_n) = c \sum_{i=1}^n x_i.   (2)

• For example, consider the case that n = 2 with the values of {x_1, x_2} given by x_1 = 0 and
  x_2 = 1. Suppose that c = 4. Then, \sum_{i=1}^2 4 x_i = 4 × 0 + 4 × 1 = 4 × (0 + 1) = 4 \sum_{i=1}^2 x_i.

• In the special case of x_1 = x_2 = ... = x_n = 1, we have \sum_{i=1}^n c x_i = \sum_{i=1}^n c × 1 = c × \sum_{i=1}^n 1 =
  c × (1 + 1 + ... + 1) = nc.

Consider another sequence {y_1, y_2, ..., y_m} in addition to {x_1, x_2, ..., x_n}. Then, we may consider
double summations over possible values of x's and y's. For example, consider the case of n = m = 2.
Then, \sum_{i=1}^2 \sum_{j=1}^2 x_i y_j is equal to x_1 y_1 + x_1 y_2 + x_2 y_1 + x_2 y_2 because

    x_1 y_1 + x_1 y_2 + x_2 y_1 + x_2 y_2
      = x_1 (y_1 + y_2) + x_2 (y_1 + y_2)              (by factorization)
      = \sum_{i=1}^2 x_i (y_1 + y_2)                   (by def. of the summation operator, setting c = (y_1 + y_2) in (2))
      = \sum_{i=1}^2 x_i ( \sum_{j=1}^2 y_j )          (because y_1 + y_2 = \sum_{j=1}^2 y_j)
      = \sum_{i=1}^2 ( \sum_{j=1}^2 x_i y_j )          (because x_i \sum_{j=1}^2 y_j = x_i (y_1 + y_2) = x_i y_1 + x_i y_2 = \sum_{j=1}^2 x_i y_j)
      = \sum_{i=1}^2 \sum_{j=1}^2 x_i y_j.

• Note that \sum_{i=1}^2 \sum_{j=1}^2 x_i y_j = \sum_{j=1}^2 \sum_{i=1}^2 x_i y_j. In the general case of {x_1, x_2, ..., x_n} and {y_1, y_2, ..., y_m},
  we have \sum_{i=1}^n \sum_{j=1}^m x_i y_j = \sum_{j=1}^m \sum_{i=1}^n x_i y_j.

• Note that \sum_{j=1}^2 x_i y_j = x_i \sum_{j=1}^2 y_j using (2), because x_i is treated as a constant in the sum-
  mation operator over j's. Hence, we can write

    \sum_{i=1}^2 \sum_{j=1}^2 x_i y_j = \sum_{i=1}^2 x_i \sum_{j=1}^2 y_j = \sum_{j=1}^2 y_j \sum_{i=1}^2 x_i.

In general, we have

    \sum_{i=1}^n \sum_{j=1}^m x_i y_j = \sum_{i=1}^n x_i \sum_{j=1}^m y_j = \sum_{j=1}^m y_j \sum_{i=1}^n x_i.   (3)

That is, when we have double summations, we can take xi ’s out of the summation over j’s.
Similarly, we can take yj ’s out of the summation over i’s.
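Properties (2) and (3) are easy to check numerically. Below is a minimal Python sketch; the values of x, y, and c are illustrative choices of our own, not taken from the notes.

```python
# Numerical check of (2) and (3) with illustrative values.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0]
c = 4.0

# Property (2): sum_i c*x_i equals c * sum_i x_i.
lhs2 = sum(c * xi for xi in x)
rhs2 = c * sum(x)
assert abs(lhs2 - rhs2) < 1e-9

# Property (3): the double sum over x_i*y_j factorizes into a
# product of the two single sums.
lhs3 = sum(xi * yj for xi in x for yj in y)
rhs3 = sum(x) * sum(y)
assert abs(lhs3 - rhs3) < 1e-9
```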

Expectation, Variance, and Covariance


Let X and Y be two discrete random variables. The set of possible values for X is {x1 , . . . , xn };
and the set of possible values for Y is {y1 , . . . , ym }. The joint probability function is given by

    p_{ij}^{X,Y} = P(X = x_i, Y = y_j),   i = 1, ..., n;  j = 1, ..., m.

The marginal probability function of X is

    p_i^X = P(X = x_i) = \sum_{j=1}^m p_{ij}^{X,Y},   i = 1, ..., n,

and the marginal probability function of Y is

    p_j^Y = P(Y = y_j) = \sum_{i=1}^n p_{ij}^{X,Y},   j = 1, ..., m.

1. If c is a constant, then

       E[cX] = cE[X].                                                              (4)

   Proof: By definition of the expected value of cX,

       E[cX] = \sum_{i=1}^n (c x_i) p_i^X                                  (by def. of the expected value)
             = c x_1 p_1^X + c x_2 p_2^X + ... + c x_{n-1} p_{n-1}^X + c x_n p_n^X   (by def. of the summation operator)
             = c × (x_1 p_1^X + x_2 p_2^X + ... + x_{n-1} p_{n-1}^X + x_n p_n^X)     (because c is a common factor)
             = c × ( \sum_{i=1}^n x_i p_i^X )                              (by def. of the summation operator)
             = c × E[X]                                                    (by def. of the expected value of X)
             = cE[X].
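Property (4) can be verified numerically. A short Python sketch follows; the pmf below is an illustrative choice of our own, not from the notes.

```python
# Numerical check of E[cX] = c*E[X] for a discrete X.
xs = [0.0, 1.0, 2.0]    # support {x_1, x_2, x_3} (illustrative)
px = [0.2, 0.5, 0.3]    # p_i^X, summing to 1 (illustrative)
c = 4.0

E_cX = sum(c * x * p for x, p in zip(xs, px))  # E[cX] by definition
E_X = sum(x * p for x, p in zip(xs, px))       # E[X] by definition
assert abs(E_cX - c * E_X) < 1e-9
```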

2.
E[X + Y ] = E[X] + E[Y ]. (5)

Proof:

    E(X + Y) = \sum_{i=1}^n \sum_{j=1}^m (x_i + y_j) p_{ij}^{X,Y}
             = \sum_{i=1}^n \sum_{j=1}^m (x_i p_{ij}^{X,Y} + y_j p_{ij}^{X,Y})
             = \sum_{i=1}^n \sum_{j=1}^m x_i p_{ij}^{X,Y} + \sum_{i=1}^n \sum_{j=1}^m y_j p_{ij}^{X,Y}               (6)
             = \sum_{i=1}^n x_i · ( \sum_{j=1}^m p_{ij}^{X,Y} ) + \sum_{j=1}^m y_j · ( \sum_{i=1}^n p_{ij}^{X,Y} )   (7)
               (because we can take x_i out of \sum_{j=1}^m, since x_i does not depend on j)
             = \sum_{i=1}^n x_i · p_i^X + \sum_{j=1}^m y_j · p_j^Y
               (because p_i^X = \sum_{j=1}^m p_{ij}^{X,Y} and p_j^Y = \sum_{i=1}^n p_{ij}^{X,Y})
             = E(X) + E(Y).

Equation (6): To understand \sum_{i=1}^n \sum_{j=1}^m (x_i p_{ij}^{X,Y} + y_j p_{ij}^{X,Y}) = \sum_{i=1}^n \sum_{j=1}^m x_i p_{ij}^{X,Y} + \sum_{i=1}^n \sum_{j=1}^m y_j p_{ij}^{X,Y},
consider the case of n = m = 2. Then,

    \sum_{i=1}^2 \sum_{j=1}^2 (x_i p_{ij}^{X,Y} + y_j p_{ij}^{X,Y})
      = (x_1 p_{11}^{X,Y} + y_1 p_{11}^{X,Y}) + (x_1 p_{12}^{X,Y} + y_2 p_{12}^{X,Y}) + (x_2 p_{21}^{X,Y} + y_1 p_{21}^{X,Y}) + (x_2 p_{22}^{X,Y} + y_2 p_{22}^{X,Y})
      = (x_1 p_{11}^{X,Y} + x_1 p_{12}^{X,Y} + x_2 p_{21}^{X,Y} + x_2 p_{22}^{X,Y}) + (y_1 p_{11}^{X,Y} + y_2 p_{12}^{X,Y} + y_1 p_{21}^{X,Y} + y_2 p_{22}^{X,Y})
      = \sum_{i=1}^2 \sum_{j=1}^2 x_i p_{ij}^{X,Y} + \sum_{i=1}^2 \sum_{j=1}^2 y_j p_{ij}^{X,Y}.

Equation (7): This is a generalization of (3). To understand \sum_{i=1}^n \sum_{j=1}^m x_i p_{ij}^{X,Y} = \sum_{i=1}^n x_i ·
( \sum_{j=1}^m p_{ij}^{X,Y} ), consider the case of n = m = 2. Then,

    \sum_{i=1}^2 \sum_{j=1}^2 x_i p_{ij}^{X,Y} = x_1 p_{11}^{X,Y} + x_1 p_{12}^{X,Y} + x_2 p_{21}^{X,Y} + x_2 p_{22}^{X,Y}
      = x_1 (p_{11}^{X,Y} + p_{12}^{X,Y}) + x_2 (p_{21}^{X,Y} + p_{22}^{X,Y})
      = \sum_{i=1}^2 x_i (p_{i1}^{X,Y} + p_{i2}^{X,Y})
      = \sum_{i=1}^2 x_i ( \sum_{j=1}^2 p_{ij}^{X,Y} ).

Similarly, we may show that \sum_{i=1}^2 \sum_{j=1}^2 y_j p_{ij}^{X,Y} = \sum_{j=1}^2 y_j · ( \sum_{i=1}^2 p_{ij}^{X,Y} ).
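The n = m = 2 case of property (5) can also be checked numerically. In the sketch below, the joint pmf p_{ij}^{X,Y} and the supports are illustrative choices of our own, not from the notes.

```python
# Numerical check of E[X + Y] = E[X] + E[Y] for n = m = 2.
xs = [0.0, 1.0]
ys = [1.0, 3.0]
p = [[0.1, 0.2],   # p[i][j] = P(X = x_i, Y = y_j); entries sum to 1
     [0.3, 0.4]]

# E(X + Y) directly from the joint pmf.
E_sum = sum(p[i][j] * (xs[i] + ys[j]) for i in range(2) for j in range(2))

# E(X) and E(Y) from the marginal pmfs (row and column sums).
E_X = sum(xs[i] * sum(p[i]) for i in range(2))
E_Y = sum(ys[j] * (p[0][j] + p[1][j]) for j in range(2))
assert abs(E_sum - (E_X + E_Y)) < 1e-9
```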

3. If a and b are constants, then E[a + bX] = a + bE[X].

Proof:

    E(a + bX) = \sum_{i=1}^n (a + b x_i) p_i^X
              = \sum_{i=1}^n (a p_i^X + b x_i p_i^X)
              = \sum_{i=1}^n a p_i^X + \sum_{i=1}^n b x_i p_i^X                      (8)
              = a \sum_{i=1}^n p_i^X + b \sum_{i=1}^n x_i p_i^X                      (by using (2))
              = a · 1 + bE(X),  where \sum_{i=1}^n p_i^X = \sum_{i=1}^n P(X = x_i) = 1 and \sum_{i=1}^n x_i p_i^X = E(X)
              = a + bE(X).

Equation (8): This is similar to (6). To understand \sum_{i=1}^n (a p_i^X + b x_i p_i^X) = \sum_{i=1}^n a p_i^X +
\sum_{i=1}^n b x_i p_i^X, consider the case of n = 2. Then, \sum_{i=1}^2 (a p_i^X + b x_i p_i^X) = (a p_1^X + b x_1 p_1^X) + (a p_2^X +
b x_2 p_2^X) = (a p_1^X + a p_2^X) + (b x_1 p_1^X + b x_2 p_2^X) = \sum_{i=1}^2 a p_i^X + \sum_{i=1}^2 b x_i p_i^X.
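The affine rule E[a + bX] = a + bE[X] can be checked the same way. The pmf and constants below are illustrative choices of our own.

```python
# Numerical check of E[a + bX] = a + b*E[X] for a discrete X.
xs = [0.0, 1.0, 2.0]    # support (illustrative)
px = [0.2, 0.5, 0.3]    # p_i^X, summing to 1 (illustrative)
a, b = 3.0, -2.0        # constants (illustrative)

E_affine = sum((a + b * x) * p for x, p in zip(xs, px))  # E[a + bX] by definition
E_X = sum(x * p for x, p in zip(xs, px))                 # E[X]
assert abs(E_affine - (a + b * E_X)) < 1e-9
```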

4. If c is a constant, then Cov(X, c) = 0.

Proof: According to the definition of covariance,

Cov(X, c) = E[(X − E(X))(c − E(c))].

Since the expectation of a constant is itself, i.e., E(c) = c,

    Cov(X, c) = E[(X − E(X))(c − c)]
              = E[(X − E(X)) · 0]
              = E[0]
              = \sum_{i=1}^n 0 × p_i^X
              = \sum_{i=1}^n 0
              = 0 + 0 + ... + 0
              = 0.

5. Cov(X, X) = Var(X).

Proof: According to the definition of covariance, we can expand Cov(X, X) as follows:

    Cov(X, X) = E[(X − E(X))(X − E(X))]
              = \sum_{i=1}^n [x_i − E(X)][x_i − E(X)] · P(X = x_i),  where E(X) = \sum_{i=1}^n x_i p_i^X
              = \sum_{i=1}^n [x_i − E(X)][x_i − E(X)] · p_i^X
              = \sum_{i=1}^n [x_i − E(X)]^2 · p_i^X
              = E[(X − E(X))^2]                       (by def. of the expected value)
              = Var(X).
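Property 5 can be confirmed numerically by computing both sides from the same pmf. The pmf below is an illustrative choice of our own.

```python
# Numerical check of Cov(X, X) = Var(X) for a discrete X.
xs = [0.0, 1.0, 4.0]    # support (illustrative)
px = [0.5, 0.3, 0.2]    # p_i^X, summing to 1 (illustrative)

E_X = sum(x * p for x, p in zip(xs, px))
# Cov(X, X): product of two identical deviations, weighted by the pmf.
cov_XX = sum((x - E_X) * (x - E_X) * p for x, p in zip(xs, px))
# Var(X): squared deviation, weighted by the pmf.
var_X = sum((x - E_X) ** 2 * p for x, p in zip(xs, px))
assert abs(cov_XX - var_X) < 1e-9
```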

6. Cov(X, Y) = Cov(Y, X).

   Proof: According to the definition of covariance, we can expand Cov(X, Y) as follows:

       Cov(X, Y) = E[(X − E(X))(Y − E(Y))]
                 = \sum_{i=1}^n \sum_{j=1}^m [x_i − E(X)][y_j − E(Y)] · p_{ij}^{X,Y},  where E(X) = \sum_{i=1}^n x_i p_i^X and E(Y) = \sum_{j=1}^m y_j p_j^Y
                 = \sum_{j=1}^m \sum_{i=1}^n [y_j − E(Y)][x_i − E(X)] · p_{ij}^{X,Y}
                 = E[(Y − E(Y))(X − E(X))]            (by def. of the expected value)
                 = Cov(Y, X).                          (by def. of the covariance)

7. Cov(a_1 + b_1 X, a_2 + b_2 Y) = b_1 b_2 Cov(X, Y), where a_1, a_2, b_1, and b_2 are some constants.

Proof: Using E(a_1 + b_1 X) = a_1 + b_1 E(X) and E(a_2 + b_2 Y) = a_2 + b_2 E(Y), we can expand
Cov(a_1 + b_1 X, a_2 + b_2 Y) as follows:

    Cov(a_1 + b_1 X, a_2 + b_2 Y)
      = E[(a_1 + b_1 X − E(a_1 + b_1 X))(a_2 + b_2 Y − E(a_2 + b_2 Y))]
      = E[(a_1 + b_1 X − (a_1 + b_1 E(X)))(a_2 + b_2 Y − (a_2 + b_2 E(Y)))]
      = E[(a_1 − a_1 + b_1 X − b_1 E(X))(a_2 − a_2 + b_2 Y − b_2 E(Y))]
      = E[(b_1 X − b_1 E(X))(b_2 Y − b_2 E(Y))]
      = E[b_1 (X − E(X)) · b_2 (Y − E(Y))]
      = E[b_1 b_2 (X − E(X))(Y − E(Y))]
      = \sum_{i=1}^n \sum_{j=1}^m b_1 b_2 (x_i − E(X))(y_j − E(Y)) · p_{ij}^{X,Y}
      = b_1 b_2 \sum_{i=1}^n \sum_{j=1}^m [x_i − E(X)][y_j − E(Y)] · p_{ij}^{X,Y}    (by using (2))
      = b_1 b_2 Cov(X, Y).
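The scaling rule in property 7 can be checked numerically by transforming the supports and reusing the same joint pmf. All values below are illustrative choices of our own.

```python
# Numerical check of Cov(a1 + b1*X, a2 + b2*Y) = b1*b2*Cov(X, Y).
xs, ys = [0.0, 1.0], [1.0, 3.0]
p = [[0.1, 0.2], [0.3, 0.4]]           # p[i][j] = P(X = x_i, Y = y_j)
a1, b1, a2, b2 = 1.0, 2.0, -1.0, 3.0   # constants (illustrative)

E_X = sum(xs[i] * sum(p[i]) for i in range(2))
E_Y = sum(ys[j] * (p[0][j] + p[1][j]) for j in range(2))
cov_XY = sum(p[i][j] * (xs[i] - E_X) * (ys[j] - E_Y)
             for i in range(2) for j in range(2))

# Covariance of the transformed pair, computed from first principles;
# the transformed supports inherit the same joint pmf.
E_U = a1 + b1 * E_X
E_V = a2 + b2 * E_Y
cov_UV = sum(p[i][j] * (a1 + b1 * xs[i] - E_U) * (a2 + b2 * ys[j] - E_V)
             for i in range(2) for j in range(2))
assert abs(cov_UV - b1 * b2 * cov_XY) < 1e-9
```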

8. If X and Y are independent, then Cov(X, Y) = 0.

Proof: If X and Y are independent, then by definition of stochastic independence, P(X = x_i, Y =
y_j) = P(X = x_i)P(Y = y_j) = p_i^X p_j^Y for any i = 1, ..., n and j = 1, ..., m. Then, we may
expand Cov(X, Y) as follows.

    Cov(X, Y) = E[(X − E(X))(Y − E(Y))]
              = \sum_{i=1}^n \sum_{j=1}^m [x_i − E(X)][y_j − E(Y)] · P(X = x_i, Y = y_j)
              = \sum_{i=1}^n \sum_{j=1}^m [x_i − E(X)][y_j − E(Y)] p_i^X p_j^Y
                (because X and Y are independent)
              = \sum_{i=1}^n \sum_{j=1}^m { [x_i − E(X)] p_i^X } { [y_j − E(Y)] p_j^Y }
              = \sum_{i=1}^n [x_i − E(X)] p_i^X { \sum_{j=1}^m [y_j − E(Y)] p_j^Y }        (9)
                (because we can move [x_i − E(X)] p_i^X outside of \sum_{j=1}^m,
                 since [x_i − E(X)] p_i^X does not depend on the index j)
              = { \sum_{j=1}^m [y_j − E(Y)] p_j^Y } { \sum_{i=1}^n [x_i − E(X)] p_i^X }    (10)
                (because we can move { \sum_{j=1}^m [y_j − E(Y)] p_j^Y } outside of \sum_{i=1}^n,
                 since { \sum_{j=1}^m [y_j − E(Y)] p_j^Y } does not depend on the index i)
              = { \sum_{i=1}^n x_i p_i^X − \sum_{i=1}^n E(X) p_i^X } · { \sum_{j=1}^m y_j p_j^Y − \sum_{j=1}^m E(Y) p_j^Y }
              = { E(X) − \sum_{i=1}^n E(X) p_i^X } · { E(Y) − \sum_{j=1}^m E(Y) p_j^Y }
                (by definition of E(X) and E(Y))
              = { E(X) − E(X) \sum_{i=1}^n p_i^X } · { E(Y) − E(Y) \sum_{j=1}^m p_j^Y }
                (because we can move E(X) and E(Y) outside of \sum_{i=1}^n and \sum_{j=1}^m, respectively)
              = {E(X) − E(X) · 1} · {E(Y) − E(Y) · 1}
              = 0 · 0 = 0.

Equations (9) and (10): These are similar to equations (3) and (7). Please consider the case of
n = m = 2 and convince yourself that (9) and (10) hold.
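Property 8 can be checked numerically by constructing a joint pmf that factors into its marginals. The marginals below are illustrative choices of our own.

```python
# Numerical check of property 8: under independence the joint pmf
# factors as p_i^X * p_j^Y, and the covariance is zero.
xs, ys = [0.0, 1.0], [1.0, 3.0]
px, py = [0.4, 0.6], [0.7, 0.3]        # marginals (illustrative)
p = [[px[i] * py[j] for j in range(2)] for i in range(2)]  # independence

E_X = sum(x * q for x, q in zip(xs, px))
E_Y = sum(y * q for y, q in zip(ys, py))
cov_XY = sum(p[i][j] * (xs[i] - E_X) * (ys[j] - E_Y)
             for i in range(2) for j in range(2))
assert abs(cov_XY) < 1e-9
```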

9. Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).

Proof: By the definition of variance,

    Var(X + Y) = E[(X + Y − E(X + Y))^2].

Then,

    Var(X + Y) = E[(X + Y − E(X + Y))^2]
               = E[((X − E(X)) + (Y − E(Y)))^2]
               = E[(X − E(X))^2 + (Y − E(Y))^2 + 2(X − E(X))(Y − E(Y))]
                 (because for any a and b, (a + b)^2 = a^2 + b^2 + 2ab)
               = E[(X − E(X))^2] + E[(Y − E(Y))^2] + 2E[(X − E(X))(Y − E(Y))]   (by using (5))
               = Var(X) + Var(Y) + 2Cov(X, Y)
                 (by definition of variance and covariance)
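Property 9 can be checked numerically against a dependent joint pmf, computing every moment from first principles. The pmf below is an illustrative choice of our own.

```python
# Numerical check of Var(X + Y) = Var(X) + Var(Y) + 2*Cov(X, Y).
xs, ys = [0.0, 1.0], [1.0, 3.0]
p = [[0.1, 0.2], [0.3, 0.4]]   # joint pmf (illustrative, dependent)

# Expectation of any function f(X, Y) under the joint pmf.
E = lambda f: sum(p[i][j] * f(xs[i], ys[j])
                  for i in range(2) for j in range(2))

E_X, E_Y = E(lambda x, y: x), E(lambda x, y: y)
var_X = E(lambda x, y: (x - E_X) ** 2)
var_Y = E(lambda x, y: (y - E_Y) ** 2)
cov_XY = E(lambda x, y: (x - E_X) * (y - E_Y))
var_sum = E(lambda x, y: (x + y - (E_X + E_Y)) ** 2)
assert abs(var_sum - (var_X + var_Y + 2 * cov_XY)) < 1e-9
```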

10. Var(X − Y) = Var(X) + Var(Y) − 2Cov(X, Y).

    Proof: The proof of Var(X − Y) = Var(X) + Var(Y) − 2Cov(X, Y) is similar to the proof
    of Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y). First, we may show that E(X − Y) =
    E(X) − E(Y). Then,

        Var(X − Y) = E[(X − Y − E(X − Y))^2]
                   = E[((X − E(X)) − (Y − E(Y)))^2]
                   = E[(X − E(X))^2 + (Y − E(Y))^2 − 2(X − E(X))(Y − E(Y))]
                   = E[(X − E(X))^2] + E[(Y − E(Y))^2] − 2E[(X − E(X))(Y − E(Y))]   (by using (5))
                   = Var(X) + Var(Y) − 2Cov(X, Y).
11. Define W = (X − E(X))/\sqrt{Var(X)} and Z = (Y − E(Y))/\sqrt{Var(Y)}. Show that Cov(W, Z) =
    Corr(X, Y).
Proof: Expanding Cov(W, Z), we have

    Cov(W, Z) = E[(W − E(W))(Z − E(Z))]
              = E[W Z]                          (because E[W] = E[Z] = 0)
              = E[ (X − E(X))/\sqrt{Var(X)} · (Y − E(Y))/\sqrt{Var(Y)} ]
                (by definition of W and Z)
              = E[ 1/\sqrt{Var(X)} · 1/\sqrt{Var(Y)} · [X − E(X)][Y − E(Y)] ]
              = 1/\sqrt{Var(X)} · 1/\sqrt{Var(Y)} · E{[X − E(X)][Y − E(Y)]}   (by using (2) and (4))
                (because both 1/\sqrt{Var(X)} and 1/\sqrt{Var(Y)} are constants)
              = E{[X − E(X)][Y − E(Y)]} / ( \sqrt{Var(X)} \sqrt{Var(Y)} )
              = Cov(X, Y) / ( \sqrt{Var(X)} \sqrt{Var(Y)} )                   (by definition of covariance)
              = Corr(X, Y).                                                   (by definition of the correlation coefficient)
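Property 11 can be checked numerically by standardizing the supports directly. The joint pmf below is an illustrative choice of our own.

```python
# Numerical check that Cov(W, Z) = Corr(X, Y) for standardized W, Z.
from math import sqrt

xs, ys = [0.0, 1.0], [1.0, 3.0]
p = [[0.1, 0.2], [0.3, 0.4]]   # joint pmf (illustrative)

E = lambda f: sum(p[i][j] * f(xs[i], ys[j])
                  for i in range(2) for j in range(2))
E_X, E_Y = E(lambda x, y: x), E(lambda x, y: y)
sd_X = sqrt(E(lambda x, y: (x - E_X) ** 2))   # sqrt(Var(X))
sd_Y = sqrt(E(lambda x, y: (y - E_Y) ** 2))   # sqrt(Var(Y))
cov_XY = E(lambda x, y: (x - E_X) * (y - E_Y))

# Cov(W, Z) computed directly on the standardized values.
cov_WZ = E(lambda x, y: ((x - E_X) / sd_X) * ((y - E_Y) / sd_Y))
assert abs(cov_WZ - cov_XY / (sd_X * sd_Y)) < 1e-9
```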

12. Let {x_i : i = 1, ..., n} and {y_i : i = 1, ..., n} be two sequences. Define the averages

        \bar{x} = (1/n) \sum_{i=1}^n x_i,
        \bar{y} = (1/n) \sum_{i=1}^n y_i.

    (a) \sum_{i=1}^n (x_i − \bar{x}) = 0.

Proof:

    \sum_{i=1}^n (x_i − \bar{x}) = \sum_{i=1}^n x_i − \sum_{i=1}^n \bar{x}
                                 = \sum_{i=1}^n x_i − n\bar{x}
                                   (because \sum_{i=1}^n \bar{x} = \bar{x} + \bar{x} + ... + \bar{x} = n\bar{x})
                                 = n ( \sum_{i=1}^n x_i ) / n − n\bar{x}
                                   (because \sum_{i=1}^n x_i = (n/n) \sum_{i=1}^n x_i = n ( \sum_{i=1}^n x_i ) / n)
                                 = n\bar{x} − n\bar{x}
                                   (because \bar{x} = ( \sum_{i=1}^n x_i ) / n)
                                 = 0.
(b) \sum_{i=1}^n (x_i − \bar{x})^2 = \sum_{i=1}^n x_i (x_i − \bar{x}).

Proof: We use the result of 12.(a) above.

    \sum_{i=1}^n (x_i − \bar{x})^2 = \sum_{i=1}^n (x_i − \bar{x})(x_i − \bar{x})
                                   = \sum_{i=1}^n x_i (x_i − \bar{x}) − \sum_{i=1}^n \bar{x} (x_i − \bar{x})
                                   = \sum_{i=1}^n x_i (x_i − \bar{x}) − \bar{x} \sum_{i=1}^n (x_i − \bar{x})
                                     (because \bar{x} is a constant and does not depend on i)
                                   = \sum_{i=1}^n x_i (x_i − \bar{x}) − \bar{x} · 0
                                     (because \sum_{i=1}^n (x_i − \bar{x}) = 0, as shown above)
                                   = \sum_{i=1}^n x_i (x_i − \bar{x}).
(c) \sum_{i=1}^n (x_i − \bar{x})(y_i − \bar{y}) = \sum_{i=1}^n y_i (x_i − \bar{x}) = \sum_{i=1}^n x_i (y_i − \bar{y}).

Proof: The proof is similar to the proof of 12.(b) above.

    \sum_{i=1}^n (x_i − \bar{x})(y_i − \bar{y}) = \sum_{i=1}^n (x_i − \bar{x}) y_i − \sum_{i=1}^n (x_i − \bar{x}) \bar{y}
                                                = \sum_{i=1}^n (x_i − \bar{x}) y_i − \bar{y} \sum_{i=1}^n (x_i − \bar{x})
                                                = \sum_{i=1}^n (x_i − \bar{x}) y_i − \bar{y} · 0
                                                = \sum_{i=1}^n y_i (x_i − \bar{x}).

Also,

    \sum_{i=1}^n (x_i − \bar{x})(y_i − \bar{y}) = \sum_{i=1}^n x_i (y_i − \bar{y}) − \sum_{i=1}^n \bar{x} (y_i − \bar{y})
                                                = \sum_{i=1}^n x_i (y_i − \bar{y}) − \bar{x} \sum_{i=1}^n (y_i − \bar{y})
                                                = \sum_{i=1}^n x_i (y_i − \bar{y}) − \bar{x} · 0
                                                = \sum_{i=1}^n x_i (y_i − \bar{y}).
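The three sample identities in 12.(a)-(c) can be checked numerically on any pair of equal-length sequences. The sequences below are illustrative choices of our own.

```python
# Numerical check of the sample identities 12.(a)-(c).
xs = [1.0, 2.0, 4.0, 5.0]   # illustrative sequence
ys = [2.0, 1.0, 3.0, 6.0]   # illustrative sequence
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n

# (a) deviations from the mean sum to zero
assert abs(sum(x - xbar for x in xs)) < 1e-9

# (b) sum of squared deviations equals sum of x_i*(x_i - xbar)
assert abs(sum((x - xbar) ** 2 for x in xs)
           - sum(x * (x - xbar) for x in xs)) < 1e-9

# (c) cross-deviation identities
s = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
assert abs(s - sum(y * (x - xbar) for x, y in zip(xs, ys))) < 1e-9
assert abs(s - sum(x * (y - ybar) for x, y in zip(xs, ys))) < 1e-9
```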