
ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes

ACTL2002/ACTL5101 Probability and Statistics


© Katja Ignatieva

School of Risk and Actuarial Studies
Australian School of Business
University of New South Wales
k.ignatieva@unsw.edu.au

Week 3 Video Lecture Notes




Special sampling distributions & sample mean and variance

Numerical methods to summarize data


Introduction
Measures of location & spread
Numerical example

Graphical procedures to summarize data


Summarizing data


Population vs sample
Population: the large body of data;
Sample: a subset of the population.

Question: For each of the following four cases, would we refer to a population or a sample?
1. All the actuaries in Australia;
2. The temperature on 5 randomly chosen days;
3. All NSW cars;
4. The basket of goods of every fifth customer on a given day.

Solution: 1. Population; 2. Sample; 3. Population; 4. Sample.



Summarising data: Numerical approaches


Given a set of observations x1 , x2 , x3 , . . . , xn selected from a
population (usually assumed i.i.d. (independent and identically
distributed)).
Sorted data in ascending order: x(1) , x(2) , . . . , x(n) , such that
x(1) is the smallest and x(n) is the largest.
Objectives:
- Understand the main features of data and to summarise data
(essential first step in analysing data);
- Make inferences about the population
(more on this later in the course).


Measures of location
Used to estimate the central point of the sample, also called
measures of central tendency:
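The sample mean is given by:
\[ \bar{x} = \frac{1}{n} \sum_{k=1}^{n} x_k . \]
The population mean is given by:
\[ \mu_X = \sum_{\text{all } x} p_X(x) \cdot x . \]
The $100\alpha\%$ trimmed mean is the average of the observations after discarding the lowest $100\alpha\%$ and the highest $100\alpha\%$:
\[ \tilde{x}_\alpha = \frac{x_{(\lfloor n\alpha \rfloor + 1)} + \ldots + x_{(n - \lfloor n\alpha \rfloor)}}{n - 2\lfloor n\alpha \rfloor}, \]
where $\lfloor n\alpha \rfloor$ is the greatest integer less than or equal to $n\alpha$.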


Measures of spread
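The sample variance:
\[ s^2 = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})^2 = \frac{1}{n-1} \left( \sum_{k=1}^{n} x_k^2 + \sum_{k=1}^{n} \bar{x}^2 - 2 \sum_{k=1}^{n} x_k \bar{x} \right) = \frac{1}{n-1} \left( \sum_{k=1}^{n} x_k^2 - n \bar{x}^2 \right). \]
The population variance:
\[ \sigma^2 = Var(X) = \sum_{\text{all } x} p_X(x) (x - \mu_X)^2 = \sum_{\text{all } x} p_X(x) \, x^2 - \mu_X^2 . \]
Sample standard deviation: $s = \sqrt{s^2}$.
Population standard deviation: $\sigma = \sqrt{\sigma^2}$.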


Quantiles
$P_\alpha$, the $\alpha$th quantile or $(\alpha \times 100)$th percentile, satisfies:
\[ \frac{1}{n} [\text{number of } x_k < P_\alpha] \le \alpha \le \frac{1}{n} [\text{number of } x_k \le P_\alpha], \]
approximated by linear interpolation as the $(\alpha (n-1) + 1)$th observation.
Quartiles: $Q_1$ (25th percentile) and $Q_3$ (75th percentile).
Quantile function: $F_X^{-1}(u)$, $u \in [0, 1]$, where $F_X(x) = u$.

Question: What are the 0.025, 0.16, 0.5, 0.84 and 0.975 quantiles of the N(0,1) distribution?

Solution: They are approximately -1.96, -1, 0, 1 and 1.96, respectively.
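The quantiles in the solution can be checked numerically. The sketch below is not part of the original slides; it assumes Python's standard-library `statistics.NormalDist` and evaluates the standard normal quantile function at the five levels.

```python
from statistics import NormalDist

# Standard normal distribution N(0, 1)
std_normal = NormalDist(mu=0.0, sigma=1.0)

levels = [0.025, 0.16, 0.5, 0.84, 0.975]
# inv_cdf is the quantile function F^{-1}(u)
quantiles = [std_normal.inv_cdf(u) for u in levels]

for u, q in zip(levels, quantiles):
    print(f"u = {u:5.3f}  ->  quantile = {q:8.4f}")
```

The 0.16 and 0.84 quantiles come out close to, but not exactly, -1 and 1, which is why the slide values are best read as rounded.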


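Mode: The mode $m$ is the value that maximises the p.m.f. $p_X(x)$ in the discrete case or the p.d.f. $f_X(x)$ in the continuous case.
Median, $M$:
\[ M = \begin{cases} x_{\left(\frac{n+1}{2}\right)}, & \text{if } n \text{ is odd;} \\ \frac{1}{2} \left( x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)} \right), & \text{if } n \text{ is even.} \end{cases} \]
Median absolute deviation:
\[ MAD = \text{median of the numbers } \{ |x_i - M| \}. \]
Range:
\[ R = x_{(n)} - x_{(1)} . \]
Interquartile range:
\[ IQR = Q_3 - Q_1 . \]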

Numerical example
An insurance company has incurred 26 claims with the following amounts:

1 120   1 000     760     348   1 548   3 400     588
  990     975     346   1 100     752     335   1 245
  450   1 000   2 430   1 245     850     605     540
  478     584   1 406     760   1 000

with
\[ \sum_{i=1}^{26} x_i = 25\,855; \qquad \sum_{i=1}^{26} x_i^2 = 36\,904\,873. \]


Numerical example

First step: arrange in ascending order (read down the columns):

  335     588     990   1 245
  346     605   1 000   1 406
  348     752   1 000   1 548
  450     760   1 000   2 430
  478     760   1 100   3 400
  540     850   1 120
  584     975   1 245


Numerical example
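Some statistics:
Mean:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{25\,855}{26} = 994.42. \]
Variance:
\[ s^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \right) = \frac{1}{25} \left( 36\,904\,873 - 26 \cdot (994.42)^2 \right) = 447\,762.6. \]
Standard deviation:
\[ s = \sqrt{447\,762.6} = 669.2. \]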


Numerical example

Determine $P_{0.35}$, i.e., the 35th percentile (0.35 quantile).
This is the $(25 \cdot 0.35) + 1 = 9.75$th observation.
Then, linear interpolation gives:
\[ P_{0.35} = x_{(9)} + 0.75 \left( x_{(10)} - x_{(9)} \right) = 0.25 \, x_{(9)} + 0.75 \, x_{(10)} = 605 + 0.75 \cdot 147 = 715.25. \]


Numerical example
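Recall: $x_{(1)} = 335$ and $x_{(26)} = 3\,400$.

Range:
\[ R = 3\,400 - 335 = 3\,065. \]
Quartiles:
\[ Q_1 = x_{(25 \cdot 0.25 + 1)} = x_{(7.25)} = 0.75 \, x_{(7)} + 0.25 \, x_{(8)} = 0.75 \cdot 584 + 0.25 \cdot 588 = 585, \]
\[ Q_2 = M = x_{(25 \cdot 0.5 + 1)} = x_{(13.5)} = \left( x_{(13)} + x_{(14)} \right) / 2 = 912.5, \]
\[ Q_3 = x_{(25 \cdot 0.75 + 1)} = x_{(19.75)} = 0.25 \, x_{(19)} + 0.75 \, x_{(20)} = 1\,115. \]
Interquartile range:
\[ IQR = Q_3 - Q_1 = 1\,115 - 585 = 530. \]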


Numerical example

Determine the 10% trimmed mean:
Step 1: Compute $\lfloor n\alpha \rfloor = \lfloor 26 \cdot 0.1 \rfloor = 2$.
Hence, we should discard the 2 smallest and the 2 largest of the observations.
Step 2: Compute the trimmed mean:
\[ \tilde{x}_{0.10} = \frac{x_{(3)} + \ldots + x_{(24)}}{22} = 879.27. \]
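The numerical example can be reproduced in a few lines. This is a minimal sketch rather than part of the original notes: the helper names `quantile` and `trimmed_mean` are hypothetical, and the quantile uses the interpolation rule stated earlier (the $(\alpha(n-1)+1)$th observation).

```python
import math

# The 26 claim amounts from the numerical example
claims = [1120, 990, 450, 478, 1000, 975, 1000, 584, 760, 346, 2430, 1406,
          348, 1100, 1245, 760, 1548, 752, 850, 1000, 3400, 335, 605, 588,
          1245, 540]
x = sorted(claims)
n = len(x)

mean = sum(x) / n
# Shortcut formula: s^2 = (sum of squares - n * mean^2) / (n - 1)
var = (sum(v * v for v in x) - n * mean ** 2) / (n - 1)
sd = math.sqrt(var)


def quantile(sorted_data, alpha):
    """alpha-quantile as the (alpha*(n-1)+1)th observation, linearly interpolated."""
    pos = alpha * (len(sorted_data) - 1)          # 0-based fractional position
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return (1 - frac) * sorted_data[lo] + frac * sorted_data[hi]


def trimmed_mean(sorted_data, alpha):
    """Mean after discarding floor(n*alpha) observations from each end."""
    k = math.floor(len(sorted_data) * alpha)
    kept = sorted_data[k:len(sorted_data) - k]
    return sum(kept) / len(kept)


q1, median, q3 = (quantile(x, a) for a in (0.25, 0.5, 0.75))
print(f"mean = {mean:.2f}, variance = {var:.1f}, sd = {sd:.1f}")
print(f"P_0.35 = {quantile(x, 0.35):.2f}")
print(f"Q1 = {q1}, median = {median}, Q3 = {q3}, IQR = {q3 - q1}")
print(f"range = {x[-1] - x[0]}, 10% trimmed mean = {trimmed_mean(x, 0.10):.2f}")
```

Other software may use a different interpolation convention for quantiles, so quartiles from a library routine need not match these values exactly.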


Empirical cumulative distribution function (ecdf)


Given a set of observations $x_1, \ldots, x_n$, the empirical cumulative distribution function is given by:
\[ F_n(x) = \frac{1}{n} \left( \text{number of observations with } X_k \le x \right) = \frac{1}{n} \sum_{k=1}^{n} I_{\{X_k \le x\}}, \]
with
\[ E[F_n(x)] = F_X(x), \qquad Var(F_n(x)) = \frac{1}{n} F_X(x) \left( 1 - F_X(x) \right). \]
Note: $p_n(x) = \sum_{k=1}^{n} I_{\{X_k = x\}} / n$, the proportion of observations equal to $x$.

Proofs for $E[F_n(x)]$ and $Var(F_n(x))$ are not part of the course.
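As an illustration of the definition, the following sketch (not from the original slides; `make_ecdf` is a hypothetical helper name) builds the e.c.d.f. of the 26 claim amounts from the numerical example and evaluates it at a few points.

```python
# The 26 claim amounts from the numerical example
claims = [1120, 990, 450, 478, 1000, 975, 1000, 584, 760, 346, 2430, 1406,
          348, 1100, 1245, 760, 1548, 752, 850, 1000, 3400, 335, 605, 588,
          1245, 540]


def make_ecdf(data):
    """Return F_n, where F_n(x) is the fraction of observations <= x."""
    ordered = sorted(data)
    n = len(ordered)

    def F_n(x):
        return sum(1 for value in ordered if value <= x) / n

    return F_n


F26 = make_ecdf(claims)
for point in (300, 335, 1000, 3400):
    print(f"F26({point}) = {F26(point):.4f}")
```

The function is a step function: it jumps by $1/26$ at each distinct observation and by a multiple of $1/26$ at tied values such as 1 000.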


E.c.d.f.
[Figure: empirical c.d.f. $F_{26}(x)$ of the 26 claim amounts; horizontal axis: claim amount (0 to 3500); vertical axis: $F_{26}(x)$ (0 to 1).]


Histogram
[Figure: histogram of the 26 claim amounts; horizontal axis: claim amount (0 to 3500); vertical axis: frequency (0 to 14).]

Count the number of observations in each bin (0, 500], (500, 1000], (1000, 1500], (1500, 2000], (2000, 2500], (2500, 3000], (3000, 3500].
Bin widths are chosen so that the histogram provides a good summary of the data, i.e., the bins are neither too narrow nor too wide.
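The bin counts behind the histogram can be tabulated directly. This is a small sketch under the assumption that the bins are the half-open intervals listed on the slide, each of width 500.

```python
import math

# The 26 claim amounts from the numerical example
claims = [1120, 990, 450, 478, 1000, 975, 1000, 584, 760, 346, 2430, 1406,
          348, 1100, 1245, 760, 1548, 752, 850, 1000, 3400, 335, 605, 588,
          1245, 540]

bin_width = 500
n_bins = 7
counts = [0] * n_bins
for amount in claims:
    # amount lies in (500*(k-1), 500*k]  <=>  k = ceil(amount / 500)
    k = math.ceil(amount / bin_width)
    counts[k - 1] += 1

for k, c in enumerate(counts, start=1):
    print(f"({bin_width * (k - 1)}, {bin_width * k}]: {c}")
```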


Stem-and-leaf display

Stem-and-leaf (data: the 26 claim amounts from the numerical example):

0 | 333
0 | 5556668889
1 | 0000011224
1 | 5
2 | 4
2 |
3 | 4

Each row corresponds to a bin.
The number before | displays the number of thousands (or hundreds/tens etc.).
Each number after | displays the 3rd (or 2nd/1st) digit of an observation.
Note: rounding!


Boxplot (Box-and-Whiskers plot)


[Figure: boxplot of the 26 claim amounts; vertical axis: claim size (500 to 3500).]

Red line: median; Blue box: Q1 and Q3 (height of box: IQR).
Black lines: 10th and 90th percentile.
Red circles: outliers.


Q-Q plot calculations


This is done by plotting the quantile function of your chosen distribution against the order statistics, $x_{(i)}$.
A small continuity adjustment is made, too.
For the example above, a standard normal Q-Q plot, we have:

i                          1         2        ...   25       26
(i - 0.5)/26               0.0192    0.0577   ...   0.9423   0.9808
Φ^{-1}((i - 0.5)/26)       -2.0699   -1.5744  ...   1.5744   2.0699
x_(i)                      335       346      ...   2 430    3 400
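The Q-Q plot coordinates can be generated as follows. This sketch is an addition to the notes; it assumes the continuity adjustment $(i - 0.5)/n$ shown above and uses the standard-library `statistics.NormalDist` for the standard normal quantile function.

```python
from statistics import NormalDist

# The 26 claim amounts from the numerical example
claims = [1120, 990, 450, 478, 1000, 975, 1000, 584, 760, 346, 2430, 1406,
          348, 1100, 1245, 760, 1548, 752, 850, 1000, 3400, 335, 605, 588,
          1245, 540]

ordered = sorted(claims)
n = len(ordered)
std_normal = NormalDist()  # N(0, 1)

# Pair the theoretical quantile at (i - 0.5)/n with the i-th order statistic
qq_points = [(std_normal.inv_cdf((i - 0.5) / n), x_i)
             for i, x_i in enumerate(ordered, start=1)]

for z, x_i in qq_points[:2] + qq_points[-2:]:
    print(f"normal quantile = {z:8.4f}, claim = {x_i}")
```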


Q-Q plot (quantile-quantile plot)


[Figure: Q-Q plot of the 26 claim amounts; horizontal axis: claim size (0 to 3500); vertical axis: standard normal quantiles (-2.5 to 2.5).]

A Q-Q plot shows whether a distribution is a good approximation to the data and where it is not (e.g., in the tails).
Calculations: see previous slide.

ACTL2002/ACTL5101 Probability and Statistics: Week 3



Last two weeks


Introduction to probability;
Definition of probability measure, events;
Calculating with probabilities; Multiplication rule,
permutation, combination & multinomial;
Distribution function;
Moments: (non)-central moments, mean, variance (standard
deviation), skewness & kurtosis;
Generating functions;
Special (parametric) univariate distributions.

This week
Joint probabilities:
- Discrete and continuous random variables;
- Bivariate and multivariate random variables;

Covariance;
Correlation;
Law of iterated expectations;
Conditional variance identity.


Joint & Multivariate Distributions


The Bivariate Case
Introduction
Exercises
Means, Variances, Covariances
Correlation coefficient
Conditional Distributions
The Bivariate Normal Distribution
Laws
Law of Iterated Expectations
Conditional variance identity
Application & Exercise
The Multivariate Case
Introduction
Summarizing data
Exercises
Summary
Summary


The Bivariate Case


We are often interested in the joint behavior of two (or more)
random variables.
Denote a bivariate random vector by a pair as follows: $X = [X_1, X_2]^\top$. The joint distribution function of $X$ is:
\[ F_{X_1,X_2}(x_1, x_2) = \Pr(X_1 \le x_1, X_2 \le x_2). \]
We can write:
\[ \Pr(a_1 < X_1 \le b_1, a_2 < X_2 \le b_2) = F_{X_1,X_2}(b_1, b_2) - F_{X_1,X_2}(b_1, a_2) - F_{X_1,X_2}(a_1, b_2) + F_{X_1,X_2}(a_1, a_2). \]


Discrete Random Variables


In the case where $X_1$ and $X_2$ are both discrete random variables which can take values $x_{11}, x_{12}, \ldots$ and $x_{21}, x_{22}, \ldots$ respectively, we define:
\[ p_{X_1,X_2}(x_{1i}, x_{2j}) = \Pr(X_1 = x_{1i}, X_2 = x_{2j}), \quad \text{for } i, j = 1, 2, \ldots \]
as the joint probability mass function of $X$; then:
\[ \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} p_{X_1,X_2}(x_{1i}, x_{2j}) = 1. \]


Discrete Random Variables


The marginal p.m.f.s of $X_1$ and $X_2$ are, respectively,
\[ p_{X_1}(x_{1i}) = \sum_{j=1}^{\infty} p_{X_1,X_2}(x_{1i}, x_{2j}) \]
and
\[ p_{X_2}(x_{2j}) = \sum_{i=1}^{\infty} p_{X_1,X_2}(x_{1i}, x_{2j}) \]
(sum over the other random variable(s)).
Proof: use the Law of Total Probability.


Example discrete random variables (slide 506)

An insurer offers both disability insurance (DI) and unemployment insurance (UI) to small companies.
Most companies buy DI and UI, because of a large discount.
The claims are categorized as no claims, mild claims, and severe claims.
Last year the 100 insured fell into the following categories:

DI    no    no     no       mild   mild   mild     severe   severe   severe
UI    no    mild   severe   no     mild   severe   no       mild     severe
#     74    6      2        3      2      4        1        3        5

Question: Find the marginal p.m.f. of DI and UI.
Solution:

x          no                     mild                   severe
p_DI(x)    (74+6+2)/100 = 0.82    (3+2+4)/100 = 0.09     (1+3+5)/100 = 0.09
p_UI(x)    (74+3+1)/100 = 0.78    (6+2+3)/100 = 0.11     (2+4+5)/100 = 0.11
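The marginal p.m.f.s above are obtained by summing the joint counts over the other variable. A minimal sketch of that calculation (the dictionary layout is an illustrative choice, not from the slides):

```python
# Joint counts for the 100 insured, keyed by (DI category, UI category)
counts = {
    ("no", "no"): 74, ("no", "mild"): 6, ("no", "severe"): 2,
    ("mild", "no"): 3, ("mild", "mild"): 2, ("mild", "severe"): 4,
    ("severe", "no"): 1, ("severe", "mild"): 3, ("severe", "severe"): 5,
}
total = sum(counts.values())
levels = ["no", "mild", "severe"]

# Marginal of DI: sum over UI; marginal of UI: sum over DI
p_di = {d: sum(counts[(d, u)] for u in levels) / total for d in levels}
p_ui = {u: sum(counts[(d, u)] for d in levels) / total for u in levels}

print("p_DI:", p_di)
print("p_UI:", p_ui)
```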


Continuous Random Variables


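In the case where $X_1$ and $X_2$ are both continuous random variables, we set the joint density function of $X$ as
\[ f_{X_1,X_2}(x_1, x_2) = \frac{\partial^2}{\partial x_1 \partial x_2} F_{X_1,X_2}(x_1, x_2) \]
and therefore the joint cumulative distribution function is given by:
\[ F_{X_1,X_2}(x_1, x_2) = \int_{-\infty}^{x_2} \int_{-\infty}^{x_1} f_{X_1,X_2}(z_1, z_2) \, dz_1 \, dz_2 . \]
Note:
\[ F_{X_1,X_2}(\infty, \infty) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X_1,X_2}(z_1, z_2) \, dz_1 \, dz_2 = 1, \]
\[ F_{X_1,X_2}(-\infty, -\infty) = \int_{-\infty}^{-\infty} \int_{-\infty}^{-\infty} f_{X_1,X_2}(z_1, z_2) \, dz_1 \, dz_2 = 0. \]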


Continuous Random Variables


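The marginal density functions of $X_1$ and $X_2$ are, respectively:
\[ f_{X_1}(x_1) = \int_{-\infty}^{\infty} f_{X_1,X_2}(x_1, z_2) \, dz_2 \quad \text{and} \quad f_{X_2}(x_2) = \int_{-\infty}^{\infty} f_{X_1,X_2}(z_1, x_2) \, dz_1 . \]
The marginal cumulative distribution functions of $X_1$ and $X_2$ are then, respectively:
\[ F_{X_1}(x_1) = \int_{-\infty}^{x_1} f_{X_1}(u) \, du \quad \text{and} \quad F_{X_2}(x_2) = \int_{-\infty}^{x_2} f_{X_2}(u) \, du, \]
or, alternatively:
\[ F_{X_1}(x_1) = \int_{-\infty}^{\infty} \int_{-\infty}^{x_1} f_X(u_1, u_2) \, du_1 \, du_2 \quad \text{and} \quad F_{X_2}(x_2) = \int_{-\infty}^{x_2} \int_{-\infty}^{\infty} f_X(u_1, u_2) \, du_1 \, du_2 . \]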


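Continuous Random Variables: example (slide 509)

The joint p.d.f. of $X$ and $Y$ is given by:
\[ f_{X,Y}(x, y) = 4x(1 - y), \quad \text{for } 0 \le x, y \le 1, \text{ and 0 otherwise.} \]
a. The marginal p.d.f. of $X$ is:
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dy = \int_0^1 4x(1 - y) \, dy = 4x \left[ y - \tfrac{1}{2} y^2 \right]_0^1 = 2x. \]
b. The marginal c.d.f. of $X$ is:
\[ F_X(x) = \int_{-\infty}^{x} f_X(z) \, dz = \int_0^x 2z \, dz = \left[ z^2 \right]_0^x = x^2, \quad \text{if } 0 \le x \le 1, \]
and zero if $x < 0$ and one if $x > 1$.
c. The marginal p.d.f. of $Y$ is:
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dx = \int_0^1 4x(1 - y) \, dx = \left[ \tfrac{1}{2} \cdot 4x^2 (1 - y) \right]_0^1 = 2(1 - y). \]
d. The marginal c.d.f. of $Y$ is:
\[ F_Y(y) = \int_{-\infty}^{y} f_Y(z) \, dz = \int_0^y 2(1 - z) \, dz = \left[ 2z - z^2 \right]_0^y = 2y - y^2, \quad \text{if } 0 \le y \le 1, \]
and zero if $y < 0$ and one if $y > 1$.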


Exercise: Discrete case


Let X be the random variable taking one if there is a positive
return on the asset portfolio and zero otherwise.
Let Y be the random variable for the claims for home
insurance, which can take value 0, 1, 2, and 3 for few, normal,
many claims and a large number of claims due to floods,
respectively.
The marginal probability mass functions of $X$ and $Y$ are:

X = x         0      1
Pr(X = x)     1/2    1/2

and

Y = y         0      1      2      3
Pr(Y = y)     1/8    3/8    3/8    1/8

Question: What would be the joint probability mass function if $X$ and $Y$ are independent?


Exercise: Discrete case


Solution: If the two are independent, we would have:
\[ \Pr(X = x, Y = y) = \Pr(X = x) \cdot \Pr(Y = y). \]
For all $X = x$ and $Y = y$ the joint distribution, if they are independent, is described in the table below:

Pr(X = x, Y = y)    Y = 0    Y = 1    Y = 2    Y = 3
X = 0               1/16     3/16     3/16     1/16
X = 1               1/16     3/16     3/16     1/16


Exercise: Discrete case


Suppose instead they are not independent and their joint distribution could be described as:

Pr(X = x, Y = y)    Y = 0    Y = 1    Y = 2    Y = 3
X = 0               0        3/16     3/16     1/8
X = 1               1/8      3/16     3/16     0

Question: Prove that $X$ and $Y$ are dependent.
Solution: We have $\Pr(Y = 3) = 1/8$ and $\Pr(X = 1) = 1/2$, so independence would require $\Pr(X = 1, Y = 3) = 1/16$. However, when $Y$ takes the value 3, the probability that $X$ takes the value 1 is zero (the joint probability of $Y = 3$ and $X = 1$ is zero).
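Dependence can also be checked mechanically by comparing every joint probability with the product of its marginals. The sketch below is an illustration added to the notes; it uses exact fractions so that the comparison is not affected by rounding.

```python
from fractions import Fraction as Fr

# Joint p.m.f. of the dependent case, keyed by (x, y)
joint = {
    (0, 0): Fr(0),    (0, 1): Fr(3, 16), (0, 2): Fr(3, 16), (0, 3): Fr(1, 8),
    (1, 0): Fr(1, 8), (1, 1): Fr(3, 16), (1, 2): Fr(3, 16), (1, 3): Fr(0),
}

# Marginals: sum the joint p.m.f. over the other variable
p_x = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in (0, 1, 2, 3)}

# Independence requires joint == product of marginals in every cell
violations = [(x, y) for (x, y), p in joint.items() if p != p_x[x] * p_y[y]]

print("p_X:", p_x)
print("p_Y:", p_y)
print("cells violating independence:", violations)
```

A single violating cell is enough to conclude that the variables are dependent; here the violations occur in the columns $Y = 0$ and $Y = 3$.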


Example: Multinomial distribution


Suppose we have $n$ independent trials with $r$ outcomes with probabilities $p_1, p_2, \ldots, p_r$.
The joint frequency distribution is given by:
\[ p_{N_1,N_2,\ldots,N_r}(n_1, n_2, \ldots, n_r) = \frac{n!}{n_1! \, n_2! \cdots n_r!} \, p_1^{n_1} p_2^{n_2} \cdots p_r^{n_r} . \]
The marginal distribution is (Binomial distribution!) given by:
\[ p_{N_i}(n_i) = \sum_{N_1} \cdots \sum_{N_{i-1}} \sum_{N_{i+1}} \cdots \sum_{N_r} p_{N_1,N_2,\ldots,N_r}(n_1, n_2, \ldots, n_r) \overset{*}{=} \binom{n}{n_i} p_i^{n_i} (1 - p_i)^{n - n_i} . \]
This can be done by summing over the other variables.
* Using Binomial expansion (proof not required).


Exercise: Continuous case


Now consider an example of a bivariate random vector $[X, Y]^\top$ whose joint density function is:
\[ f_{X,Y}(x, y) = c \left( x^2 + xy \right), \quad \text{for } 0 \le x \le 1 \text{ and } 0 \le y \le 1, \]
and zero otherwise. To find the constant $c$, it must be a valid density so that:
\[ 1 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dx \, dy = \int_0^1 \int_0^1 c \left( x^2 + xy \right) dx \, dy = c \int_0^1 \left[ \tfrac{1}{3} x^3 + \tfrac{1}{2} x^2 y \right]_0^1 dy = c \left[ \tfrac{1}{3} y + \tfrac{1}{4} y^2 \right]_0^1 = c \cdot \frac{7}{12} . \]
Hence, $c = 12/7$, then also $f_{X,Y}(x, y) \ge 0$ for all $x, y$.

a. Question: Find the marginal densities.
b. Question: Find the joint distribution function.


Exercise: Continuous case


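a. Solution: Knowing the constant, we can then determine the marginal densities. First the marginal density for $X$:
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dy = \int_0^1 \frac{12}{7} \left( x^2 + xy \right) dy = \frac{12}{7} \left( x^2 + \frac{1}{2} x \right), \quad \text{for } 0 \le x \le 1, \]
and zero otherwise, and for $Y$:
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dx = \int_0^1 \frac{12}{7} \left( x^2 + xy \right) dx = \frac{12}{7} \left( \frac{1}{3} + \frac{1}{2} y \right), \quad \text{for } 0 \le y \le 1, \]
and zero otherwise.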


Exercise: Continuous case


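b. Solution: You can also determine the joint distribution function if $0 \le x \le 1$, $0 \le y \le 1$ by:
\begin{align*}
F_{X,Y}(x, y) &= \int_{-\infty}^{y} \int_{-\infty}^{x} f_{X,Y}(u, v) \, du \, dv = \int_0^y \int_0^x \frac{12}{7} \left( u^2 + uv \right) du \, dv \\
&= \int_0^y \frac{12}{7} \left[ \tfrac{1}{3} u^3 + \tfrac{1}{2} u^2 v \right]_0^x dv = \int_0^y \frac{12}{7} \left( \tfrac{1}{3} x^3 + \tfrac{1}{2} x^2 v \right) dv \\
&= \frac{12}{7} \left[ \tfrac{1}{3} x^3 v + \tfrac{1}{4} x^2 v^2 \right]_0^y = \frac{12}{7} \left( \tfrac{1}{3} x^3 y + \tfrac{1}{4} x^2 y^2 \right).
\end{align*}
Hence:
\[ F_{X,Y}(x, y) = \begin{cases} 0, & \text{if } x < 0 \text{ or } y < 0; \\ \frac{12}{7} \left( \tfrac{1}{3} x^3 y + \tfrac{1}{4} x^2 y^2 \right), & \text{if } 0 \le x \le 1, \ 0 \le y \le 1; \\ F_X(x), & \text{if } y > 1; \\ F_Y(y), & \text{if } x > 1. \end{cases} \]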


Exercise: Continuous case (slide 517)

[Figure, slide 517: panels titled "joint p.d.f." (surface over x and y) and "marginal p.d.f." (one each for X and Y); the lower right panel, drawn in the (x, y) plane, is labelled "slide 519".]


Exercise: Continuous case


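You can then determine the marginal distributions:
\[ F_X(x) = F_{X,Y}(x, 1) = \begin{cases} 0, & \text{if } x < 0; \\ \frac{12}{7} \left( \tfrac{1}{3} x^3 + \tfrac{1}{4} x^2 \right), & \text{if } 0 \le x \le 1; \\ 1, & \text{if } x > 1, \end{cases} \]
and
\[ F_Y(y) = F_{X,Y}(1, y) = \begin{cases} 0, & \text{if } y < 0; \\ \frac{12}{7} \left( \tfrac{1}{3} y + \tfrac{1}{4} y^2 \right), & \text{if } 0 \le y \le 1; \\ 1, & \text{if } y > 1. \end{cases} \]
Can you confirm the marginal densities are correct?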


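Exercise: Continuous case (slide 519)

It becomes straightforward to compute probability statements such as (using the lower right panel on slide 517):
\begin{align*}
\Pr(X < Y) &= \int_0^1 \int_0^y \frac{12}{7} \left( x^2 + xy \right) dx \, dy \\
&= \int_0^1 \frac{12}{7} \left[ \frac{x^3}{3} + \frac{x^2 y}{2} \right]_0^y dy \\
&= \int_0^1 \frac{12}{7} \left( \frac{y^3}{3} + \frac{y^3}{2} \right) dy \\
&= \frac{12}{7} \int_0^1 \frac{5}{6} y^3 \, dy = \frac{12}{7} \cdot \frac{5}{6} \left[ \frac{y^4}{4} \right]_0^1 = \frac{5}{14},
\end{align*}
so that $\Pr(X > Y) = \int_0^1 \int_y^1 f_{X,Y}(x, y) \, dx \, dy = 9/14$.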
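The constant $c = 12/7$ and the probabilities 5/14 and 9/14 can be sanity-checked by numerical integration. The following is a rough sketch using a midpoint rule written from scratch (the helper `double_midpoint` and the grid size are illustrative assumptions, not part of the course material).

```python
def f(x, y):
    """Joint density (12/7)(x^2 + x*y) on the unit square."""
    return 12.0 / 7.0 * (x * x + x * y)


def double_midpoint(g, x_lo, x_hi, n=200):
    """Approximate the integral over y in [0, 1] and x in [x_lo(y), x_hi(y)]
    of g(x, y), using the midpoint rule with n steps in each direction."""
    total = 0.0
    hy = 1.0 / n
    for j in range(n):
        y = (j + 0.5) * hy
        a, b = x_lo(y), x_hi(y)
        hx = (b - a) / n
        inner = sum(g(a + (i + 0.5) * hx, y) for i in range(n)) * hx
        total += inner * hy
    return total


total_mass = double_midpoint(f, lambda y: 0.0, lambda y: 1.0)  # whole square
p_x_lt_y = double_midpoint(f, lambda y: 0.0, lambda y: y)      # region x < y
p_x_gt_y = double_midpoint(f, lambda y: y, lambda y: 1.0)      # region x > y

print(f"total mass ~ {total_mass:.5f}")
print(f"Pr(X < Y) ~ {p_x_lt_y:.5f}  (5/14 = {5 / 14:.5f})")
print(f"Pr(X > Y) ~ {p_x_gt_y:.5f}  (9/14 = {9 / 14:.5f})")
```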


Means
Consider the bivariate random vector $X = [X_1 \ X_2]^\top$.
The mean of $X$ is the vector whose elements are the means of $X_1$ and $X_2$, that is,
\[ E[X] = \begin{bmatrix} E[X_1] \\ E[X_2] \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} . \]
If $X_1, X_2, \ldots, X_n$ are jointly distributed random variables with expectations $E[X_i]$ for $i = 1, \ldots, n$ and $Y$ is an affine function of the $X_i$, i.e.,
\[ Y = a + \sum_{i=1}^{n} b_i X_i , \]
then we have the additivity rule:
\[ E[Y] = E\left[ a + \sum_{i=1}^{n} b_i X_i \right] = a + \sum_{i=1}^{n} E[b_i X_i] = a + \sum_{i=1}^{n} b_i E[X_i] . \]


Variances, Covariances
Recall: variance of $X$ is a measure for the spread of $X$.
Covariance is a measure of the spread between $X_1$ and $X_2$.
The variance of the random vector $X$ is also called the variance-covariance matrix:
\[ Var(X) = \begin{bmatrix} Var(X_1) & Cov(X_1, X_2) \\ Cov(X_1, X_2) & Var(X_2) \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{bmatrix}, \]
where the covariance is defined as:
\begin{align*}
Cov(X_1, X_2) \equiv \sigma_{12} &= E[(X_1 - \mu_1)(X_2 - \mu_2)] \\
&= E[X_1 X_2 - X_1 \mu_2 - \mu_1 X_2 + \mu_1 \mu_2] \\
&= E[X_1 X_2] - E[X_1] E[X_2] .
\end{align*}
Note: $Cov(X_i, X_i) = \sigma_{ii} = \sigma_i^2$, and covariance is only defined for two r.v.


Example (slide 522): Consider the example from slide 506.
Let no = 0, mild = 1, and severe = 2.

[Figure: bar chart of the joint p.m.f. over DI (No, Mild, Severe) and UI (No, Mild, Severe); vertical axis from 0 to 0.8.]

Question: Calculate the mean of $X_1 = DI$ and $X_2 = UI$.
Solution:
\[ E[X_1] = \frac{3+2+4}{100} \cdot 1 + \frac{1+3+5}{100} \cdot 2 = 0.27, \qquad E[X_2] = \frac{6+2+3}{100} \cdot 1 + \frac{2+4+5}{100} \cdot 2 = 0.33. \]
Question: Is the covariance positive or negative?

Question: Calculate the covariance between $X_1$ and $X_2$.
Solution:
\[ E[X_1 X_2] = 0.02 \cdot 1 \cdot 1 + 0.04 \cdot 1 \cdot 2 + 0.03 \cdot 2 \cdot 1 + 0.05 \cdot 2 \cdot 2 = 0.36, \]
\[ Cov(X_1, X_2) = E[X_1 X_2] - E[X_1] \cdot E[X_2] = 0.36 - 0.27 \cdot 0.33 = 0.2709. \]


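Example: Consider the example from slide 509 (with $X_1 = X$ and $X_2 = Y$).

[Figure: surface plot of the joint p.d.f. $f_{X,Y}(x, y)$ over $0 \le x, y \le 1$.]

Question: Calculate the means.
Solution:
\[ E[X_1] = \int x \cdot f_X(x) \, dx = \int_0^1 2x^2 \, dx = \left[ \tfrac{2}{3} x^3 \right]_0^1 = 2/3, \]
\[ E[X_2] = \int y \cdot f_Y(y) \, dy = \int_0^1 y \cdot 2(1 - y) \, dy = \left[ y^2 - \tfrac{2}{3} y^3 \right]_0^1 = 1/3. \]
Question: Is the covariance positive or negative?

Question: Calculate the covariance between $X_1$ and $X_2$.
Solution:
\[ E[X_1 X_2] = \int \int f_{X,Y}(x, y) \cdot x \cdot y \, dx \, dy = \int_0^1 \int_0^1 4x^2 (y - y^2) \, dx \, dy = \int_0^1 \tfrac{4}{3} (y - y^2) \, dy = 4/6 - 4/9 = 4/18, \]
\[ Cov(X_1, X_2) = E[X_1 X_2] - E[X_1] \cdot E[X_2] = 4/18 - 2/3 \cdot 1/3 = 0. \]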


Let $X \sim \text{Beta}(0.2, 1)$ (probability of a claim) and $Y | X \sim \text{NB}(3, X)$ (so $Y \sim$ Beta-Negative-Binomial). Home insurance: an insured is qualified as a bad risk if there are 3 claims within 50 quarters.
Question: Does it have a negative or positive covariance?


Properties of Covariance
If $X$ and $Y$ are jointly distributed random variables with expectations $\mu_X$ and $\mu_Y$, the covariance of $X$ and $Y$ is
\begin{align*}
Cov(X, Y) &= E[(X - \mu_X)(Y - \mu_Y)] \\
&= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\
&= E[XY] - \mu_X \mu_Y .
\end{align*}
If $X$ and $Y$ are independent:
\[ Cov(X, Y) = E[XY] - \mu_X \mu_Y \overset{*}{=} E[X] E[Y] - \mu_X \mu_Y = 0. \]
* using independence of $X$, $Y$.


Properties of Covariance
Let $X, Y, Z$ be random variables, and $a, b \in \mathbb{R}$; we have:
\begin{align*}
Cov(a + X, Y) &= E[(a + X - (a + \mu_X))(Y - \mu_Y)] \\
&= E[(X - \mu_X)(Y - \mu_Y)] \\
&= Cov(X, Y), \\
Cov(aX, bY) &= E[(aX - a\mu_X)(bY - b\mu_Y)] \\
&= E[a (X - \mu_X) \cdot b (Y - \mu_Y)] \\
&= ab \, E[(X - \mu_X)(Y - \mu_Y)] = ab \, Cov(X, Y), \\
Cov(X, Y + Z) &= E[(X - \mu_X)(Y + Z - \mu_Y - \mu_Z)] \\
&= E[(X - \mu_X)((Y - \mu_Y) + (Z - \mu_Z))] \\
&= E[(X - \mu_X)(Y - \mu_Y) + (X - \mu_X)(Z - \mu_Z)] \\
&= Cov(X, Y) + Cov(X, Z) .
\end{align*}


Properties of Covariance
Suppose $X_1$, $X_2$, $Y_1$ and $Y_2$ are r.v., and $a, b, c, d \in \mathbb{R}$, then:
\begin{align*}
Cov(aX_1 + bX_2, cY_1 + dY_2) &\overset{*}{=} Cov(aX_1 + bX_2, cY_1) + Cov(aX_1 + bX_2, dY_2) \\
&\overset{*}{=} Cov(aX_1, cY_1) + Cov(aX_1, dY_2) + Cov(bX_2, cY_1) + Cov(bX_2, dY_2) \\
&\overset{**}{=} ac \, Cov(X_1, Y_1) + ad \, Cov(X_1, Y_2) + bc \, Cov(X_2, Y_1) + bd \, Cov(X_2, Y_2) .
\end{align*}
* using: $Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z)$.
** using: $Cov(aX, bY) = ab \, Cov(X, Y)$.


Properties of Covariance
Let $X_i$, $Y_j$ be r.v., $a, b_i, c, d_j \in \mathbb{R}$ for $i = 1, \ldots, n$ and $j = 1, \ldots, m$.
We can generalize this as follows:
Suppose:
\[ U = a + \sum_{i=1}^{n} b_i X_i \quad \text{and} \quad V = c + \sum_{j=1}^{m} d_j Y_j . \]
Then:
\[ Cov(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{m} b_i d_j \, Cov(X_i, Y_j) . \]


Properties of Covariance
Note that $Cov(X, X) = Var(X)$, so the variance of the sum of r.v. is:
\begin{align*}
Var(X_1 + X_2) &= Cov(X_1 + X_2, X_1 + X_2) \\
&= Cov(X_1, X_1) + Cov(X_2, X_2) + 2 Cov(X_1, X_2) \\
&= Var(X_1) + Var(X_2) + 2 Cov(X_1, X_2) .
\end{align*}
Also,
\[ Var(aX_1) = Cov(aX_1, aX_1) = a^2 Cov(X_1, X_1) = a^2 Var(X_1), \]
using the result that we can take constants out of a covariance.


Example Covariance
Consider the example from slides 506 and 522.
The costs for disability insurance are $1 million if mild and $2
million if severe.
The costs for unemployment insurance are $0.5 million if mild
and $1 million if severe.
The price of the contract is the expected value plus half the
standard deviation.

Question: What is the price for DI, UI, and DI and UI combined?
Solution:
\[ E[X_1^2] = \frac{3+2+4}{100} \cdot 1^2 + \frac{1+3+5}{100} \cdot 2^2 = 0.45 \quad \text{and} \quad E[X_2^2] = \frac{6+2+3}{100} \cdot 1^2 + \frac{2+4+5}{100} \cdot 2^2 = 0.55. \]
\[ Var(X_1) = E[X_1^2] - (E[X_1])^2 = 0.45 - 0.27^2 = 0.3771 \quad \text{and} \quad Var(X_2) = E[X_2^2] - (E[X_2])^2 = 0.55 - 0.33^2 = 0.4411. \]


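Solution (cont.)
Price DI (= 1 million $\cdot X_1$):
\begin{align*}
\text{Price DI} &= E[X_1 \cdot \text{million}] + \sqrt{Var(X_1 \cdot \text{million})} / 2 \\
&= E[X_1] \cdot \text{million} + \sqrt{Var(X_1) \cdot \text{million}^2} / 2 \\
&= 0.27 \text{ million} + \sqrt{0.3771 \text{ million}^2} / 2 = 0.5770 \text{ million}.
\end{align*}
Price UI (= 0.5 million $\cdot X_2$):
\begin{align*}
\text{Price UI} &= E[X_2 \cdot 0.5 \text{ million}] + \sqrt{Var(X_2 \cdot 0.5 \text{ million})} / 2 \\
&= E[X_2] \cdot 0.5 \text{ million} + \sqrt{Var(X_2) \cdot 0.25 \text{ million}^2} / 2 \\
&= 0.165 \text{ million} + \sqrt{0.4411 \cdot 0.25 \text{ million}^2} / 2 \\
&= 0.3310 \text{ million}.
\end{align*}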


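Solution (cont.)
Price DI and UI combined (= 1 million $\cdot X_1$ + 0.5 million $\cdot X_2$):
\begin{align*}
\text{Price UI and DI} &= E[X_1 \cdot \text{million} + X_2 \cdot 0.5 \text{ million}] + \sqrt{Var(X_1 \cdot \text{million} + X_2 \cdot 0.5 \text{ million})} / 2 \\
&= E[X_1] \cdot \text{million} + E[X_2] \cdot 0.5 \text{ million} \\
&\quad + \sqrt{Var(X_1) \cdot \text{million}^2 + Var(X_2) \cdot 0.25 \text{ million}^2 + 2 \, Cov(X_1, X_2) \cdot 0.5 \text{ million}^2} / 2 \\
&= 0.27 \text{ million} + 0.165 \text{ million} + \sqrt{(0.3771 + 0.4411/4 + 0.2709) \text{ million}^2} / 2 \\
&= 0.8704 \text{ million}.
\end{align*}
This gives a 4.15% discount relative to buying DI and UI separately!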
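The pricing calculation can be retraced from the joint counts on slide 506. The sketch below is an illustrative addition, assuming the severity coding no = 0, mild = 1, severe = 2 and the pricing rule "expected value plus half the standard deviation"; all amounts are in $ million.

```python
import math

# Joint counts keyed by (DI severity, UI severity); no = 0, mild = 1, severe = 2
counts = {
    (0, 0): 74, (0, 1): 6, (0, 2): 2,
    (1, 0): 3,  (1, 1): 2, (1, 2): 4,
    (2, 0): 1,  (2, 1): 3, (2, 2): 5,
}
n = sum(counts.values())
pmf = {cell: c / n for cell, c in counts.items()}


def expect(g):
    """E[g(X1, X2)] under the joint p.m.f."""
    return sum(p * g(x1, x2) for (x1, x2), p in pmf.items())


e1, e2 = expect(lambda a, b: a), expect(lambda a, b: b)
var1 = expect(lambda a, b: a * a) - e1 ** 2
var2 = expect(lambda a, b: b * b) - e2 ** 2
cov = expect(lambda a, b: a * b) - e1 * e2

cost_di, cost_ui = 1.0, 0.5  # $ million per unit of severity

# Price = expected cost + half the standard deviation of the cost
price_di = cost_di * e1 + math.sqrt(cost_di ** 2 * var1) / 2
price_ui = cost_ui * e2 + math.sqrt(cost_ui ** 2 * var2) / 2
var_both = cost_di ** 2 * var1 + cost_ui ** 2 * var2 + 2 * cost_di * cost_ui * cov
price_both = cost_di * e1 + cost_ui * e2 + math.sqrt(var_both) / 2
discount = 1 - price_both / (price_di + price_ui)

print(f"Cov = {cov:.4f}")
print(f"DI: {price_di:.4f}, UI: {price_ui:.4f}, combined: {price_both:.4f}")
print(f"discount = {discount:.2%}")
```

The discount arises because the standard deviation of the combined cost is smaller than the sum of the two separate standard deviations, even though the covariance is positive.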


Correlation coefficient
Large covariance: high dependency or large variance?
We define the correlation coefficient between $X_1$ and $X_2$:
\[ \rho(X_1, X_2) \equiv \frac{Cov(X_1, X_2)}{\sqrt{Var(X_1) \cdot Var(X_2)}}, \]
provided $Cov(X_1, X_2)$ exists and the variances $Var(X_1)$ and $Var(X_2)$ are each non-zero.
The value of the correlation coefficient is always between $-1$ and $1$, i.e.
\[ -1 \le \rho(X_1, X_2) \le 1. \]
Note: the correlation coefficient is only defined for 2 r.v.
Proof: see next slides.


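Proof: Let $Y = \frac{X_1}{\sigma_1} - \frac{X_2}{\sigma_2}$. Since $Var(Y) \ge 0$ we have:
\begin{align*}
0 \le Var\left( \frac{X_1}{\sigma_1} - \frac{X_2}{\sigma_2} \right) &= Var\left( \frac{X_1}{\sigma_1} \right) + Var\left( \frac{X_2}{\sigma_2} \right) - 2 \, Cov\left( \frac{X_1}{\sigma_1}, \frac{X_2}{\sigma_2} \right) \\
&= \underbrace{\frac{1}{\sigma_1^2} Var(X_1)}_{=1} + \underbrace{\frac{1}{\sigma_2^2} Var(X_2)}_{=1} - 2 \underbrace{\frac{1}{\sigma_1 \sigma_2} Cov(X_1, X_2)}_{= \rho, \text{ since } \sigma_1 \sigma_2 = \sqrt{Var(X_1) Var(X_2)}} \\
&= 2 (1 - \rho) .
\end{align*}
Consequently, we see that $\rho \le 1$ because the variance of a random variable is non-negative.
The proof continues on the next slide.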


Similarly, by considering $Y = \frac{X_1}{\sigma_1} + \frac{X_2}{\sigma_2}$, since $Var(Y) \ge 0$ we have:
\[ 0 \le Var\left( \frac{X_1}{\sigma_1} + \frac{X_2}{\sigma_2} \right) = 2 (1 + \rho), \]
so we see that $\rho \ge -1$, which proves the result.

The correlation coefficient gives a measure of the linear relationship between the two variables. In fact, $\rho = \pm 1$ gives:
\[ \Pr(X_2 = aX_1 + b) = 1 \]
for some constants $a \ne 0$ and $b$, so that you can write an affine relationship between the two.

Question: Does a correlation of zero imply independence?
Solution: No (a counterexample follows).


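Note that if $X$, $Y$ are independent, then $\mathrm{Cov}(X, Y) = 0$, hence:
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{0}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = 0.$$
However, the reverse does not need to hold.
Let $X$, $Y$ be r.v. with j.p.m.f. (we have set $X = Y^2$):

Pr(X = x, Y = y)    Y = -1    Y = 0    Y = 1
X = 0               0         1/3      0
X = 1               1/3       0        1/3

We have $\mathrm{E}[Y] = 0$, $\mathrm{E}[X] = 2/3$, and $\mathrm{E}[XY] = 0$. Hence:
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{\mathrm{E}[XY] - \mathrm{E}[X]\,\mathrm{E}[Y]}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = 0.$$
The correlation is zero even though $X$ and $Y$ are dependent: for example, $\Pr(X = 0, Y = 0) = 1/3$, whereas $\Pr(X = 0)\Pr(Y = 0) = 1/9$.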
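The counterexample can be checked numerically. The following is a minimal Python sketch that recomputes the covariance from the joint p.m.f. in the table and compares the joint probability at (0, 0) with the product of the marginals. The dictionary layout and the helper name `expect` are illustrative assumptions, not part of the lecture notes.

```python
from fractions import Fraction

# Joint p.m.f. from the table: keys are (x, y) pairs with X = Y^2.
third = Fraction(1, 3)
joint = {(1, -1): third, (0, 0): third, (1, 1): third}


def expect(g):
    """Return E[g(X, Y)] under the joint p.m.f. (hypothetical helper)."""
    return sum(p * g(x, y) for (x, y), p in joint.items())


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov_xy = expect(lambda x, y: x * y) - e_x * e_y

# Marginal probabilities Pr(X = 0) and Pr(Y = 0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print("Cov(X, Y) =", cov_xy)
print("Pr(X=0, Y=0) =", p_x0_y0, "vs Pr(X=0) Pr(Y=0) =", p_x0 * p_y0)
```

Exact fractions are used so that the zero covariance is not blurred by floating-point rounding.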


Correlation coefficient
[Figure: plot against X contrasting linear dependence with quadratic dependence; the legend lists correlation values of 0.9 and 0.]


Conditional Distributions: Discrete case


Let $X$, $Y$ be random variables with j.p.m.f. $\Pr(X = x_i, Y = y_j)$.
The conditional probability of $X$ given $Y$ is:
$$\Pr(X = x_i \mid Y = y_j) = \frac{\Pr(X = x_i, Y = y_j)}{\Pr(Y = y_j)}.$$
If $\Pr(Y = y_j) = 0$, then we define $\Pr(X = x_i \mid Y = y_j) = 0$.
Example: Let $X \sim \mathrm{POI}(3)$, $Y \sim \mathrm{POI}(2)$, with $X$ and $Y$ independent.
We have:
$$\Pr(X = 2 \mid Y = 3) = \frac{\Pr(X = 2, Y = 3)}{\Pr(Y = 3)} = \frac{\Pr(X = 2)\,\Pr(Y = 3)}{\Pr(Y = 3)} = \Pr(X = 2).$$


Conditional Distributions: Continuous case


Let $X$, $Y$ be random variables with j.p.d.f. $f_{X,Y}(x, y)$.
The conditional density of $Y$ given $X$ is:
$$f_{Y|X}(y \mid x) = \frac{f_{X,Y}(x, y)}{f_X(x)}.$$
If $f_X(x) = 0$, then we define $f_{Y|X}(y \mid x) = 0$.

Example: consider the example from slide 509.
Question: Find $f_{X|Y}(x \mid y = 0.5)$.
Solution:
$$f_{X|Y}(x \mid y = 0.5) = \frac{f_{X,Y}(x, 0.5)}{f_Y(0.5)} = \frac{2x}{1} = 2x.$$


Application: an imperfect particle counter


Define the random variable $N$ as the number of incoming claims and $X$ as the number of claims paid. The probability of a fraudulent claim is $q = 1 - p$, and the number of claims paid is Binomial:
$$(X \mid N = n) \sim \mathrm{Binomial}(n, p).$$
If the number of incoming claims follows a Poisson distribution (with parameter $\lambda$), then the number of claims paid turns out to also be Poisson, with parameter $\lambda p$. This is an example of thinning of a Poisson probability.
We will see more on thinning of a Poisson probability in ACTL2003/5103 using Markov chains.
Proof: See next slides.



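Proof: the law of total probability (why can we apply it here?) gives:
$$
\begin{aligned}
\Pr(X = k) &= \sum_{n=0}^{\infty} \Pr(X = k \mid N = n)\,\Pr(N = n) \\
&= \sum_{n=k}^{\infty} \binom{n}{k}\, p^k (1 - p)^{n-k}\, \frac{\lambda^n e^{-\lambda}}{n!}, \qquad \text{since } n \ge k \\
&= \sum_{n=k}^{\infty} \frac{n!}{(n - k)!\,k!}\, p^k (1 - p)^{n-k}\, \frac{\lambda^n e^{-\lambda}}{n!}
\end{aligned}
$$
continues on next slide.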



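Now (making the change of variables $j = n - k$ in the third line):
$$
\begin{aligned}
\Pr(X = k) &= \sum_{n=k}^{\infty} \frac{n!}{(n - k)!\,k!}\, p^k (1 - p)^{n-k}\, \frac{\lambda^n e^{-\lambda}}{n!} \\
&= \frac{(\lambda p)^k}{k!}\, e^{-\lambda} \sum_{n=k}^{\infty} \frac{\lambda^{n-k} (1 - p)^{n-k}}{(n - k)!} \\
&= \frac{(\lambda p)^k}{k!}\, e^{-\lambda} \sum_{j=0}^{\infty} \frac{(\lambda (1 - p))^j}{j!} \\
&\overset{*}{=} \frac{(\lambda p)^k}{k!}\, e^{-\lambda}\, e^{\lambda (1 - p)} = \frac{(\lambda p)^k}{k!}\, e^{-\lambda p},
\end{aligned}
$$
which is the p.m.f. of a Poisson($\lambda p$) random variable.
* using the exponential function $\exp(x) = \sum_{i=0}^{\infty} x^i / i!$, with $x = \lambda (1 - p)$.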
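The thinning result can be sanity-checked numerically. The sketch below evaluates the law-of-total-probability sum directly (truncated at a finite `n_max`, which is an assumption of the sketch) and compares it with the Poisson($\lambda p$) p.m.f. The function names and the parameter values $\lambda = 4$, $p = 0.3$ are illustrative only.

```python
import math


def poisson_pmf(k, lam):
    """Pr(N = k) for N ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def binomial_pmf(k, n, p):
    """Pr(X = k | N = n) for (X | N = n) ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def thinned_pmf(k, lam, p, n_max=100):
    """Pr(X = k) by the law of total probability, truncating the sum at n_max."""
    return sum(
        binomial_pmf(k, n, p) * poisson_pmf(n, lam) for n in range(k, n_max + 1)
    )


lam, p = 4.0, 0.3  # illustrative values, not taken from the slides
max_gap = max(
    abs(thinned_pmf(k, lam, p) - poisson_pmf(k, lam * p)) for k in range(10)
)
print("largest absolute difference for k = 0..9:", max_gap)
```

The truncation at `n_max = 100` is harmless here because the Poisson(4) tail beyond 100 is negligible; a much larger $\lambda$ would need a larger cut-off.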


The Bivariate Normal Distribution


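Suppose $[X, Y]^\top$ has a bivariate normal distribution; then its density is given by:
$$f_{X,Y}(x, y) = \frac{1}{2\pi\,\sigma_X \sigma_Y \sqrt{1 - \rho^2}}\, \exp\left(-\frac{1}{2\,(1 - \rho^2)}\, A\right),$$
where
$$A = \left(\frac{x - \mu_X}{\sigma_X}\right)^2 - 2\rho \left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2.$$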


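The following results are important although quite tedious to show (see section 5.10 of W+(7ed) for some of the derivation):

1. The marginals are: $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$.
2. The conditional distributions are:
$$(Y \mid X = x) \sim N\left(\mu_Y + \rho\,\frac{\sigma_Y}{\sigma_X}\,(x - \mu_X),\; \sigma_Y^2\,(1 - \rho^2)\right)$$
and
$$(X \mid Y = y) \sim N\left(\mu_X + \rho\,\frac{\sigma_X}{\sigma_Y}\,(y - \mu_Y),\; \sigma_X^2\,(1 - \rho^2)\right).$$
3. The correlation coefficient between $X$ and $Y$ is: $\rho(X, Y) = \rho$.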


Simulating multivariate normal distribution


Bivariate case: use properties 1 & 2 to simulate from i.i.d. standard normal distributions:
$$
\begin{aligned}
X &= \mu_X + \sigma_X Z_1 \\
Y &= \mu_Y + \rho\,\sigma_Y Z_1 + \sigma_Y \sqrt{1 - \rho^2}\; Z_2,
\end{aligned}
$$
where $Z_1$ and $Z_2$ are i.i.d. $N(0, 1)$.
OPTIONAL: In the case of a multivariate normal, let $Z = [Z_1 \ldots Z_n]^\top$ be i.i.d. $N(0, 1)$; we have:
- The Cholesky decomposition: $A A^\top = \Sigma$ ($\Sigma$ is the variance-covariance matrix).
- We have: $X = \mu + A Z$.
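The bivariate recipe can be sketched in a few lines of Python using only the standard library. The function names, the seed, and the parameter values below are assumptions made for illustration; the sample correlation should land close to the chosen $\rho$.

```python
import math
import random


def simulate_bivariate_normal(mu_x, mu_y, sigma_x, sigma_y, rho, n, seed=0):
    """Draw n pairs (X, Y) from two i.i.d. N(0, 1) variables Z1, Z2."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = mu_x + sigma_x * z1
        y = mu_y + rho * sigma_y * z1 + sigma_y * math.sqrt(1.0 - rho**2) * z2
        pairs.append((x, y))
    return pairs


def sample_correlation(pairs):
    """Sample correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Illustrative parameters (not from the slides).
pairs = simulate_bivariate_normal(1.0, -2.0, 2.0, 0.5, 0.7, n=20_000)
r = sample_correlation(pairs)
print("sample correlation:", round(r, 3))
```

The optional multivariate version would replace the two explicit lines for `x` and `y` with a matrix product $\mu + AZ$, where $A$ comes from a Cholesky routine in a linear algebra library.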


Law of Iterated Expectations


Note: E[X |Y = y ] is a constant, but E[X |Y ] is a random
variable.
For any two random variables X and Y , we have the law of
iterated expectations:
E [E [Y |X ]] = E [Y ] .
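To prove this in the continuous case, first consider:
$$
\begin{aligned}
\mathrm{E}\left[\mathrm{E}[Y \mid X]\right] &= \int_{-\infty}^{\infty} \mathrm{E}[Y \mid X = x]\, f_X(x)\, dx \\
&= \int_{-\infty}^{\infty} \left(\int_{-\infty}^{\infty} y\, f_{Y|X}(y \mid x)\, dy\right) f_X(x)\, dx.
\end{aligned}
$$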


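Interchanging the order of integration, we have
$$
\begin{aligned}
\mathrm{E}\left[\mathrm{E}[Y \mid X]\right] &= \int_{-\infty}^{\infty} y\, \underbrace{\int_{-\infty}^{\infty} f_{Y|X}(y \mid x)\, f_X(x)\, dx}_{\overset{*}{=}\, f_Y(y)}\, dy \\
&= \int_{-\infty}^{\infty} y\, f_Y(y)\, dy \\
&= \mathrm{E}[Y].
\end{aligned}
$$
* using the law of total probability (why can we use it here?).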


Conditional variance identity


Another important result is the conditional variance identity:
Var (Y ) = Var (E [Y |X ]) + E [Var (Y |X )] .
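Proof (* using the law of iterated expectations):
$$
\begin{aligned}
\mathrm{Var}(Y) &= \mathrm{E}\left[Y^2\right] - (\mathrm{E}[Y])^2 \\
&\overset{*}{=} \mathrm{E}\left[\mathrm{E}\left[Y^2 \mid X\right]\right] - \left(\mathrm{E}\left[\mathrm{E}[Y \mid X]\right]\right)^2 \\
&= \mathrm{E}\left[\mathrm{E}\left[Y^2 \mid X\right]\right] - \mathrm{E}\left[(\mathrm{E}[Y \mid X])^2\right] + \mathrm{E}\left[(\mathrm{E}[Y \mid X])^2\right] - \left(\mathrm{E}\left[\mathrm{E}[Y \mid X]\right]\right)^2 \\
&= \mathrm{E}[\mathrm{Var}(Y \mid X)] + \mathrm{Var}(\mathrm{E}[Y \mid X]).
\end{aligned}
$$
Proof can also be found in section 5.11 of W+(7ed).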
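Both the law of iterated expectations and the conditional variance identity can be verified exactly on a small discrete example. The joint p.m.f. below is hypothetical and chosen only for illustration, as are the helper names `marginal_x` and `cond_moment`.

```python
from fractions import Fraction as F

# Hypothetical joint p.m.f. Pr(X = x, Y = y); keys are (x, y) pairs.
joint = {(0, 1): F(1, 8), (0, 3): F(3, 8), (1, 2): F(1, 4), (1, 6): F(1, 4)}


def marginal_x(x):
    """Pr(X = x)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def cond_moment(x, power):
    """E[Y^power | X = x]."""
    total = sum(p * y**power for (xi, y), p in joint.items() if xi == x)
    return total / marginal_x(x)


xs = sorted({x for x, _ in joint})

# Unconditional mean and variance of Y.
e_y = sum(p * y for (_, y), p in joint.items())
var_y = sum(p * y**2 for (_, y), p in joint.items()) - e_y**2

# E[E[Y|X]], Var(E[Y|X]) and E[Var(Y|X)], all as expectations over X.
e_cond_mean = sum(marginal_x(x) * cond_moment(x, 1) for x in xs)
var_cond_mean = sum(marginal_x(x) * cond_moment(x, 1) ** 2 for x in xs) - e_cond_mean**2
e_cond_var = sum(
    marginal_x(x) * (cond_moment(x, 2) - cond_moment(x, 1) ** 2) for x in xs
)

print("E[Y] =", e_y, " E[E[Y|X]] =", e_cond_mean)
print("Var(Y) =", var_y, " Var(E[Y|X]) + E[Var(Y|X)] =", var_cond_mean + e_cond_var)
```

Using `Fraction` keeps every quantity exact, so the two sides of each identity can be compared with `==` rather than a tolerance.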


Application: Random Sums


An insurance company usually has uncertainty in both the number of claims and the claim amount of each claim filed.
Denote the total claim size by $S$, the individual claim sizes by $X_i$, and the total number of claims by $N$.
We are interested in the (distribution) mean and variance of a random sum defined as:
$$S = X_1 + X_2 + \ldots + X_N,$$
where both the $X_i$'s and $N$ are random variables.
We assume all the $X_i$ are independent (and identically distributed, so that $\mathrm{E}[X_i]$ and $\mathrm{Var}(X_i)$ do not depend on $i$) and also independent of $N$.



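Mean of S: The mean of the aggregate claims is:
$$\mathrm{E}[S] = \mathrm{E}[X_i]\,\mathrm{E}[N].$$
This is straightforward:
$$
\begin{aligned}
\mathrm{E}[S] &= \mathrm{E}\left[\mathrm{E}[S \mid N]\right] \\
&= \mathrm{E}\left[\mathrm{E}\left[\sum_{i=1}^{N} X_i \,\Big|\, N\right]\right] \\
&= \mathrm{E}\left[\sum_{i=1}^{N} \mathrm{E}[X_i \mid N]\right] \\
&= \mathrm{E}\left[N\,\mathrm{E}[X_i \mid N]\right] \\
&\overset{*}{=} \mathrm{E}\left[N\,\mathrm{E}[X_i]\right] = \mathrm{E}[X_i]\,\mathrm{E}[N].
\end{aligned}
$$
* using independence of $X_i$ and $N$.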



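Variance of S: The variance of the aggregate claims is:
$$\mathrm{Var}(S) = (\mathrm{E}[X_i])^2\,\mathrm{Var}(N) + \mathrm{E}[N]\,\mathrm{Var}(X_i).$$
This is also straightforward to show:
$$
\begin{aligned}
\mathrm{Var}(S) &\overset{*}{=} \mathrm{E}[\mathrm{Var}(S \mid N)] + \mathrm{Var}(\mathrm{E}[S \mid N]) \\
&= \mathrm{E}\left[\mathrm{Var}\left(\sum_{i=1}^{N} X_i \,\Big|\, N\right)\right] + \mathrm{Var}(\mathrm{E}[X_i]\, N) \\
&\overset{**}{=} \mathrm{E}\big[N\, \underbrace{\mathrm{Var}(X_i)}_{\text{constant}}\big] + \mathrm{Var}\big(\underbrace{\mathrm{E}[X_i]}_{\text{constant}}\, N\big) \\
&= \mathrm{E}[N]\,\mathrm{Var}(X_i) + (\mathrm{E}[X_i])^2\,\mathrm{Var}(N).
\end{aligned}
$$
* using the conditional variance identity, ** using independence between $X_i$ and $N$.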
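The mean and variance formulas for a random sum can be checked exactly on a toy example by enumerating every outcome. The two p.m.f.s below are hypothetical and kept tiny so the enumeration stays small; the helper names `mean` and `var` are likewise illustrative.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical distributions, for illustration only.
pmf_n = {0: F(1, 2), 1: F(1, 4), 2: F(1, 4)}  # number of claims N
pmf_x = {1: F(2, 3), 4: F(1, 3)}              # individual claim size X_i


def mean(pmf):
    return sum(p * v for v, p in pmf.items())


def var(pmf):
    m = mean(pmf)
    return sum(p * (v - m) ** 2 for v, p in pmf.items())


# Exact p.m.f. of S: condition on N = n, then enumerate all n-tuples of claims.
pmf_s = {}
for n, p_n in pmf_n.items():
    for claims in product(pmf_x, repeat=n):  # one empty tuple when n = 0
        prob = p_n
        for c in claims:
            prob *= pmf_x[c]
        s = sum(claims)
        pmf_s[s] = pmf_s.get(s, F(0)) + prob

mean_formula = mean(pmf_x) * mean(pmf_n)
var_formula = mean(pmf_x) ** 2 * var(pmf_n) + mean(pmf_n) * var(pmf_x)

print("E[S]:", mean(pmf_s), "formula:", mean_formula)
print("Var(S):", var(pmf_s), "formula:", var_formula)
```

Enumeration grows exponentially in the largest value of $N$, so for realistic claim counts a simulation or a recursive method would be needed instead.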



Moment Generating Function of S: The m.g.f. of the aggregate claims is given by:
$$M_S(t) = M_N(\log(M_X(t))).$$
Finding the m.g.f. is also straightforward:
$$
\begin{aligned}
M_S(t) &= \mathrm{E}\left[e^{tS}\right] = \mathrm{E}\left[\mathrm{E}\left[e^{tS} \mid N\right]\right] \\
&= \mathrm{E}\left[(M_X(t))^N\right] = \mathrm{E}\left[e^{N \log(M_X(t))}\right] \\
&= M_N(\log(M_X(t))).
\end{aligned}
$$
Note that when the number of claims has a Poisson distribution, the resulting total claims $S$ is said to have a Compound Poisson distribution.
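The m.g.f. relation can be checked numerically on a toy example by computing $\mathrm{E}[e^{tS}]$ directly and comparing it with $M_N(\log M_X(t))$. The two p.m.f.s and the function names below are assumptions made for illustration.

```python
import math
from itertools import product

# Hypothetical distributions, for illustration only.
pmf_n = {0: 0.5, 1: 0.25, 2: 0.25}  # number of claims N
pmf_x = {1: 2 / 3, 4: 1 / 3}        # individual claim size X_i


def mgf(pmf, t):
    """M(t) = E[exp(t V)] for a discrete random variable V with the given p.m.f."""
    return sum(p * math.exp(t * v) for v, p in pmf.items())


def mgf_s_direct(t):
    """E[exp(t S)] by enumerating N and all claim-size combinations."""
    total = 0.0
    for n, p_n in pmf_n.items():
        for claims in product(pmf_x, repeat=n):
            prob = p_n * math.prod(pmf_x[c] for c in claims)
            total += prob * math.exp(t * sum(claims))
    return total


def mgf_s_formula(t):
    """M_N(log(M_X(t))), the compound-sum formula."""
    return mgf(pmf_n, math.log(mgf(pmf_x, t)))


for t in (-0.5, 0.0, 0.3):
    print(t, mgf_s_direct(t), mgf_s_formula(t))
```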


Exercise
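Let $X \sim \mathrm{Gamma}(\alpha, \beta)$ and $Y \mid X \sim \mathrm{EXP}(1/X)$.
a. Question: Find $\mathrm{E}[Y]$.
(Note: $\mathrm{E}[X] = \alpha/\beta$, $\mathrm{EXP}(\lambda) = \mathrm{Gamma}(1, \lambda)$)
b. Question: Find $\mathrm{Var}(Y)$. (Note: $\mathrm{Var}(X) = \alpha/\beta^2$)
a. Solution:
$$\mathrm{E}[Y] = \mathrm{E}\left[\mathrm{E}[Y \mid X]\right] = \mathrm{E}[X] = \alpha/\beta.$$
b. Solution:
$$
\begin{aligned}
\mathrm{Var}(Y) &= \mathrm{Var}(\mathrm{E}[Y \mid X]) + \mathrm{E}[\mathrm{Var}(Y \mid X)] \\
&= \mathrm{Var}(X) + \mathrm{E}\left[X^2\right] \\
&= \alpha/\beta^2 + \mathrm{Var}(X) + (\mathrm{E}[X])^2 \\
&= \alpha/\beta^2 + \alpha/\beta^2 + (\alpha/\beta)^2 = \left(2\alpha + \alpha^2\right)/\beta^2.
\end{aligned}
$$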
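A Monte Carlo check of the two answers is sketched below, using only Python's standard library. The values $\alpha = 3$, $\beta = 2$, the seed and the sample size are illustrative assumptions. Note that `random.gammavariate` takes a scale parameter, so the rate $\beta$ used in these notes is passed as `1 / beta`, while `random.expovariate` takes a rate, so `1 / x` gives mean `x`.

```python
import random


def simulate_y(alpha, beta, n, seed=0):
    """Draw n values of Y, where X ~ Gamma(alpha, rate beta) and Y | X ~ EXP(1/X)."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n):
        x = rng.gammavariate(alpha, 1.0 / beta)  # second argument is the scale 1/beta
        ys.append(rng.expovariate(1.0 / x))      # rate 1/x, i.e. mean x
    return ys


alpha, beta = 3.0, 2.0  # illustrative values, not from the slides
ys = simulate_y(alpha, beta, n=200_000)

mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / (len(ys) - 1)

print("simulated E[Y]  :", round(mean_y, 3), " theory:", alpha / beta)
print("simulated Var(Y):", round(var_y, 3), " theory:", (2 * alpha + alpha**2) / beta**2)
```

With these parameters the theoretical values are $\mathrm{E}[Y] = 1.5$ and $\mathrm{Var}(Y) = 3.75$; the simulated values should agree up to sampling noise.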


The Multivariate Case


Let $X = [X_1, X_2, \ldots, X_n]^\top$ be a random vector with $n$ elements. The joint distribution function (DF) of $X$ is denoted by:
$$F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = \Pr(X_1 \le x_1, \ldots, X_n \le x_n).$$
In the discrete case, we define the joint probability mass function as:
$$p_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = \Pr(X_1 = x_1, \ldots, X_n = x_n).$$
In the continuous case, we define the joint density function of $X$ as:
$$f_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = \frac{\partial}{\partial x_1} \cdots \frac{\partial}{\partial x_n}\, F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n).$$


The joint DF is given by:
$$F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = \int_{-\infty}^{x_n} \cdots \int_{-\infty}^{x_1} f_{X_1, X_2, \ldots, X_n}(z_1, \ldots, z_n)\, dz_1 \ldots dz_n.$$
To derive marginal p.m.f.s or densities, simply evaluate (sum or integrate) over the whole region except for the variable of interest. For example, in the continuous case, the marginal density of $X_k$, for $k = 1, 2, \ldots, n$, is given by:
$$f_{X_k}(x_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1, X_2, \ldots, X_n}(z_1, \ldots, x_k, \ldots, z_n) \prod_{j \ne k} dz_j.$$


Independent Random Variables


The random variables $X_1, X_2, \ldots, X_n$ are said to be independent if their joint distribution function can be written as the product of their marginal distribution functions:
$$F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = F_{X_1}(x_1) \cdots F_{X_n}(x_n).$$
As a consequence, their joint density can also be written as:
$$f_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n),$$
in the continuous case and for the discrete case as:
$$p_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = p_{X_1}(x_1) \cdots p_{X_n}(x_n).$$


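Also, we have (if independent):
$$\mathrm{E}[X_1 X_2 \cdots X_n] = \mathrm{E}[X_1]\,\mathrm{E}[X_2] \cdots \mathrm{E}[X_n],$$
and in general, (if independent) we have:
$$\mathrm{E}\left[g_{X_1}(X_1)\, g_{X_2}(X_2) \cdots g_{X_n}(X_n)\right] = \mathrm{E}\left[g_{X_1}(X_1)\right] \mathrm{E}\left[g_{X_2}(X_2)\right] \cdots \mathrm{E}\left[g_{X_n}(X_n)\right].$$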


If $X_1$ and $X_2$ are independent, then:

1. $\mathrm{Cov}(X_1, X_2) = 0$ and so $\rho(X_1, X_2) = 0$.
2. $\mathrm{E}[X_1 \mid X_2] = \mathrm{E}[X_1]$ and of course $\mathrm{E}[X_2 \mid X_1] = \mathrm{E}[X_2]$.
3. A very useful result about independence is that $X_1, X_2, \ldots, X_n$ are independent if and only if we can write the joint distribution as a product of functions that each involve only one random variable:
$$F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = H_{X_1}(x_1) \cdots H_{X_n}(x_n)$$
for some functions $H_{X_1}, \ldots, H_{X_n}$.


Exercise: summarizing data


An insurer assumes that the time between claims is exponentially distributed. A reinsurer pays out when the insurer has two or more claims within two years. The distribution of interest is Gamma(2,3).

Sorted observations:
1.56 1.88 2.53 3.39
3.62 3.68 5.24 5.25
5.31 5.56 5.66 6.17

Questions: Find the
a. Median;
b. Range;
c. 10% trimmed mean;
d. Interquartile range.

Solutions:
a. $M = (3.68 + 5.24)/2 = 4.46$;
b. $R = 6.17 - 1.56 = 4.61$;
c. $\tilde{x}_{0.10} = \frac{x_{(2)} + \ldots + x_{(11)}}{10} = 4.212$;
d. $Q_1 = 2.53 + 0.75 \cdot (3.39 - 2.53) = 3.18$,
   $Q_3 = 0.75 \cdot 5.31 + 0.25 \cdot 5.56 = 5.37$,
   $\mathrm{IQR} = 5.37 - 3.18 = 2.19$.
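The four summary statistics can be reproduced with a short Python sketch. The quantile rule used here (linear interpolation at 0-based position $(n - 1)q$ in the sorted data) is an assumption inferred from the interpolation weights in the solutions; other textbooks and software use different conventions. All function names are illustrative.

```python
import math

# Sorted observations from the exercise.
observations = [1.56, 1.88, 2.53, 3.39, 3.62, 3.68,
                5.24, 5.25, 5.31, 5.56, 5.66, 6.17]


def median(xs):
    """Median of sorted data."""
    n = len(xs)
    mid = n // 2
    return xs[mid] if n % 2 else (xs[mid - 1] + xs[mid]) / 2


def trimmed_mean(xs, alpha):
    """Mean after dropping floor(alpha * n) points from each end of sorted data."""
    k = math.floor(alpha * len(xs))
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)


def quantile(xs, q):
    """Quantile of sorted data by linear interpolation at position (n - 1) * q."""
    pos = (len(xs) - 1) * q
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


median_value = median(observations)
range_value = observations[-1] - observations[0]
trimmed_value = trimmed_mean(observations, 0.10)
q1 = quantile(observations, 0.25)
q3 = quantile(observations, 0.75)

print("median:", median_value)
print("range:", round(range_value, 2))
print("10% trimmed mean:", round(trimmed_value, 3))
print("Q1, Q3, IQR:", round(q1, 4), round(q3, 4), round(q3 - q1, 4))
```

The unrounded quartiles are 3.175 and 5.3725, which round to the 3.18 and 5.37 quoted in the solutions; the IQR of 2.19 comes from subtracting the rounded values.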


E.c.d.f.
[Figure: F(x) plotted against x. Colored lines: e.c.d.f.; black solid line: Gamma(2,3) c.d.f.; black dashed lines: band around the Gamma(2,3) c.d.f.]

Question: Is the Gamma(2,3) the correct distribution?
Solution: Yes.


Summary joint probabilities


Joint distribution function:
$$F_{X_1, X_2}(x_1, x_2) = \Pr(X_1 \le x_1, X_2 \le x_2).$$
Marginal p.m.f.:
$$p_{X_1}(x_{1i}) = \sum_{j=1}^{\infty} p_{X_1, X_2}(x_{1i}, x_{2j}).$$
Marginal density function:
$$f_{X_1}(x_1) = \int_{-\infty}^{\infty} f_{X_1, X_2}(x_1, z_2)\, dz_2.$$
Conditional probability:
$$\Pr(X = x_i \mid Y = y_j) = \frac{\Pr(X = x_i, Y = y_j)}{\Pr(Y = y_j)}.$$



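Covariance:
$$\mathrm{Cov}(X_1, X_2) \equiv \sigma_{12} = \mathrm{E}[X_1 X_2] - \mathrm{E}[X_1]\,\mathrm{E}[X_2].$$
Correlation:
$$\rho(X_1, X_2) = \frac{\mathrm{Cov}(X_1, X_2)}{\sqrt{\mathrm{Var}(X_1)\,\mathrm{Var}(X_2)}}.$$
Law of iterated expectations:
$$\mathrm{E}\left[\mathrm{E}[Y \mid X]\right] = \mathrm{E}[Y].$$
Conditional variance identity:
$$\mathrm{Var}(Y) = \mathrm{Var}(\mathrm{E}[Y \mid X]) + \mathrm{E}[\mathrm{Var}(Y \mid X)].$$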