
CL202: Introduction to Data Analysis

Mani Bhushan, Sachin Patwardhan
Department of Chemical Engineering,
Indian Institute of Technology Bombay
Mumbai, India- 400076
mbhushan,sachinp@iitb.ac.in

Spring 2016

MB+SCP (IIT Bombay)

CL202

Spring 2016

1 / 45

This handout

Multiple Random Variables


Joint, marginal, conditional distribution and density functions
Independence


Extension of Ideas:

Multiple (Multivariate) Random Variables: Jointly distributed random
variables

An outcome of the experiment occurs in the sample space S. Associate many
random variables X1, X2, ..., Xn with S.
Each random variable Xi, i = 1, 2, ..., n, is a valid mapping from S to R.


Bivariate Random Variables

For simplicity of notation, consider two random variables: X, Y.

Special case of multiple random variables.
Examples:
- Average number of cigarettes smoked daily and the age at which an individual gets cancer,
- Height and weight of an individual,
- Height and IQ of an individual,
- Flow-rate and pressure drop of a liquid flowing through a pipe.


Jointly distributed random variables

Often interested in answering questions on X, Y taking values in a specified
region D in R^2 (the x-y plane).
The distribution functions FX(x) and FY(y) of X and Y determine their
individual probabilities but not their joint probabilities. In particular, the
probability of the event

{X ≤ x} ∩ {Y ≤ y} = {X ≤ x, Y ≤ y}

cannot be expressed in terms of FX(x) and FY(y).
Joint probabilities of X, Y are completely determined if the probability of the
above event is known for every x and y.


Joint Probability Distribution Function or Joint Cumulative


Distribution Function

For random variables (discrete or continuous) X, Y, the joint (bivariate)
probability distribution function is:

FX,Y(x, y) = P(X ≤ x, Y ≤ y)

where x, y are two arbitrary real numbers.
Often, the subscript X, Y is omitted.


Properties of Joint Probability Distribution Function


(Papoulis and Pillai, 2002)

F(−∞, y) = F(x, −∞) = 0,  F(∞, ∞) = 1.

P(x1 < X ≤ x2, Y ≤ y) = F(x2, y) − F(x1, y)

P(X ≤ x, y1 < Y ≤ y2) = F(x, y2) − F(x, y1)

P(x1 < X ≤ x2, y1 < Y ≤ y2) = F(x2, y2) − F(x1, y2) − F(x2, y1) + F(x1, y1)
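The four-corner rectangle identity above can be sanity-checked numerically. A minimal sketch, assuming (hypothetically) that X and Y are independent exponentials with rates 1 and 2, so the joint CDF is available in closed form:

```python
import math

# Hypothetical example: independent X ~ Exp(rate 1), Y ~ Exp(rate 2),
# so the joint CDF is F(x, y) = FX(x) * FY(y) in closed form.
def F(x, y):
    return (1 - math.exp(-x)) * (1 - math.exp(-2 * y))

def rect_prob(x1, x2, y1, y2):
    """P(x1 < X <= x2, y1 < Y <= y2) via the four-corner identity."""
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)

# Direct product P(x1 < X <= x2) * P(y1 < Y <= y2) for comparison.
direct = ((math.exp(-0.5) - math.exp(-1.5))
          * (math.exp(-2 * 0.2) - math.exp(-2 * 1.0)))
print(abs(rect_prob(0.5, 1.5, 0.2, 1.0) - direct) < 1e-12)  # True
```

For independent variables the identity reduces algebraically to the product of the two one-dimensional interval probabilities, which is what the check confirms.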


Joint Density Function


The joint density of X and Y is by definition the function

f(x, y) = ∂²F(x, y) / (∂x ∂y)

It follows that

F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(α, β) dβ dα

P((X, Y) ∈ D) = ∬_D f(x, y) dx dy

In particular, as Δx → 0 and Δy → 0,

P(x < X ≤ x + Δx, y < Y ≤ y + Δy) ≈ f(x, y) Δx Δy

∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1;  f(x, y) ≥ 0 for all x, y ∈ R.


Joint Density Example: Bivariate Gaussian Random Variable

f(x, y) = α exp(−0.5 (z − μ)^T P^{−1} (z − μ))

with

z = [x; y],   μ = [1; 1],   P = [0.9  0.4; 0.4  0.3],   α = 1/(2π √|P|)
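As a sketch (not part of the original slides), the density above can be evaluated directly and checked for normalization; the determinant and inverse of the 2×2 matrix P are computed by hand:

```python
import math

# mu and P from the slide; for a 2x2 matrix, |P| and P^{-1} by hand.
MU = (1.0, 1.0)
DET = 0.9 * 0.3 - 0.4 * 0.4                  # |P| = 0.11
PINV = [[0.3 / DET, -0.4 / DET],
        [-0.4 / DET, 0.9 / DET]]

def f(x, y):
    dx, dy = x - MU[0], y - MU[1]
    quad = (dx * (PINV[0][0] * dx + PINV[0][1] * dy)
            + dy * (PINV[1][0] * dx + PINV[1][1] * dy))
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(DET))

# Riemann sum over [-4, 6]^2 (roughly +/- 5 standard deviations around mu).
h = 0.05
total = sum(f(-4 + i * h, -4 + j * h)
            for i in range(200) for j in range(200)) * h * h
print(abs(total - 1.0) < 0.01)  # True
```

The grid limits and step size are arbitrary choices for the check; the sum approaching 1 confirms α is the correct normalizing constant.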

Joint Density Visualization


Joint Distribution Visualization


Marginal Distribution or Density Functions of Individual Random Variables

Marginal Probability Distribution Functions FX(x), FY(y):

Extract FX(x) from F(x, y) as:

FX(x) = P(X ≤ x) = P(X ≤ x, Y < ∞) = F(x, ∞)

Similarly, extract FY(y) as:

FY(y) = P(Y ≤ y) = P(X < ∞, Y ≤ y) = F(∞, y)

Marginal Probability Density Functions fX(x), fY(y):

Extract these from f(x, y) as:

fX(x) = ∫_{−∞}^{∞} f(x, y) dy,   fY(y) = ∫_{−∞}^{∞} f(x, y) dx


Marginal Probability Density

fX(x) = ∫_{−∞}^{∞} f(x, y) dy

Makes sense, since

P(X ∈ A) = P(X ∈ A, Y ∈ (−∞, ∞))
         = ∫_A ∫_{−∞}^{∞} f(x, y) dy dx
         = ∫_A fX(x) dx

where fX(x) is as defined above.

Similarly,

fY(y) = ∫_{−∞}^{∞} f(x, y) dx


Example 4.3c from Ross



f(x, y) = 2 e^{−x} e^{−2y},  0 < x < ∞, 0 < y < ∞;  0 otherwise

Compute: (a) P(X > 1, Y < 1), (b) P(X < Y), (c) P(X < a)

P(X > 1, Y < 1) = ∫_0^1 ∫_1^∞ 2 e^{−x} e^{−2y} dx dy
               = ∫_0^1 2 e^{−2y} (−e^{−x} |_1^∞) dy
               = e^{−1} ∫_0^1 2 e^{−2y} dy = e^{−1} (1 − e^{−2})

P(X < Y) = ∫_0^∞ ∫_0^y 2 e^{−x} e^{−2y} dx dy = 1/3

P(X < a) = ∫_0^a ∫_0^∞ 2 e^{−x} e^{−2y} dy dx = 1 − e^{−a}
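Since the density factorizes into e^{−x} and 2e^{−2y}, the three answers can be cross-checked by Monte Carlo sampling of independent exponentials (a sketch; sample size and tolerances are arbitrary choices):

```python
import math
import random

# The density 2 e^{-x} e^{-2y} factorizes, so X ~ Exp(rate 1) and
# Y ~ Exp(rate 2) can be sampled independently.
random.seed(0)
N = 200_000
xs = [random.expovariate(1.0) for _ in range(N)]
ys = [random.expovariate(2.0) for _ in range(N)]

p_a = sum(1 for x, y in zip(xs, ys) if x > 1 and y < 1) / N
p_b = sum(1 for x, y in zip(xs, ys) if x < y) / N
p_c = sum(1 for x in xs if x < 1) / N   # P(X < a) with a = 1

print(abs(p_a - math.exp(-1) * (1 - math.exp(-2))) < 0.01)  # True
print(abs(p_b - 1 / 3) < 0.01)                              # True
print(abs(p_c - (1 - math.exp(-1))) < 0.01)                 # True
```

With N = 200,000 the Monte Carlo standard error is about 0.001, so the 0.01 tolerance is comfortably safe.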


Joint Density Visualization: Exponential


Joint Distribution Visualization: Exponential


Joint Probability Mass Function (PMF)

Given two discrete random variables X and Y in the same experiment, the
joint PMF of X and Y is

p(xi, yj) = P(X = xi, Y = yj)

for all pairs of (xi, yj) values that X and Y can take.
p(xi, yj) is also denoted pX,Y(xi, yj).
The marginal probability mass functions for X and Y are

pX(x) = P(X = x) = Σ_y pX,Y(x, y)

pY(y) = P(Y = y) = Σ_x pX,Y(x, y)


Computation of Marginal PMF from Joint PMF


Formally:

{X = xi} = ∪_j {X = xi, Y = yj}

All events on the RHS are mutually exclusive. Thus,

pX(xi) = P(X = xi) = Σ_j P(X = xi, Y = yj) = Σ_j p(xi, yj)

Similarly,

pY(yj) = P(Y = yj) = Σ_i p(xi, yj).

Note: P(X = xi, Y = yj) cannot in general be constructed from knowledge of
P(X = xi) and P(Y = yj).


Example: 4.3a, Ross


3 batteries are randomly chosen from a group of 3 new, 4 used but still working,
and 5 defective batteries. Let X, Y denote the number of new, and used but
working, batteries that are chosen, respectively. Find
p(xi, yj) = P(X = xi, Y = yj).
Solution: Let T = C(12, 3).
p(0, 0) = C(5, 3)/T
p(0, 1) = C(4, 1) C(5, 2)/T
p(0, 2) = C(4, 2) C(5, 1)/T
p(0, 3) = C(4, 3)/T
p(1, 0) = C(3, 1) C(5, 2)/T
p(1, 1) = C(3, 1) C(4, 1) C(5, 1)/T
p(1, 2) = ...
p(2, 0) = ...
p(2, 1) = ...
p(3, 0) = ...
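The counting argument can be scripted; the closed form p(i, j) = C(3, i) C(4, j) C(5, 3−i−j)/C(12, 3) below generalizes the per-cell expressions above:

```python
from math import comb

# Joint PMF of (new, used) batteries drawn, by hypergeometric-style counting.
T = comb(12, 3)  # 220 equally likely 3-battery subsets

def p(i, j):
    k = 3 - i - j            # number of defective batteries drawn
    if k < 0:
        return 0
    return comb(3, i) * comb(4, j) * comb(5, k) / T

print(p(0, 0) == 10 / 220)   # True: C(5,3)/220
print(p(1, 1) == 60 / 220)   # True: C(3,1)C(4,1)C(5,1)/220
print(abs(sum(p(i, j) for i in range(4) for j in range(4)) - 1) < 1e-12)  # True
```

The final check confirms the joint PMF sums to 1 over all feasible (i, j) pairs.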


Tabular Form

i \ j        0        1        2        3     Row sum (P(X = i))
0         10/220   40/220   30/220   4/220    84/220
1         30/220   60/220   18/220     0      108/220
2         15/220   12/220     0        0      27/220
3          1/220      0       0        0      1/220
Col sum   56/220  112/220   48/220   4/220
(P(Y = j))

i represents the row and j represents the column.

Both row and column sums add up to 1.
Marginal probabilities appear in the margins of the table.


n Random Variables
Joint cumulative probability distribution function F(x1, x2, ..., xn) of n random
variables X1, X2, ..., Xn is defined as:

F(x1, x2, ..., xn) = P(X1 ≤ x1, X2 ≤ x2, ..., Xn ≤ xn)

If the random variables are discrete: joint probability mass function

p(x1, x2, ..., xn) = P(X1 = x1, X2 = x2, ..., Xn = xn)

If the random variables are continuous: joint probability density function
f(x1, x2, ..., xn) such that for any set C in n-dimensional space

P((X1, X2, ..., Xn) ∈ C) = ∫ ... ∫_{(x1,...,xn)∈C} f(x1, x2, ..., xn) dx1 dx2 ... dxn

where

f(x1, x2, ..., xn) = ∂^n F(x1, x2, ..., xn) / (∂x1 ∂x2 ... ∂xn)

Obtaining Marginals

FX1(x1) = F(x1, ∞, ∞, ..., ∞)

fX1(x1) = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} f(x1, x2, ..., xn) dx2 dx3 ... dxn

pX1(x1) = Σ_{x2} Σ_{x3} ... Σ_{xn} p(x1, x2, ..., xn)

Independence of Random Variables

Random variables X and Y are independent if for any two sets of real
numbers A and B:

P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B)

i.e. the events EA = {X ∈ A} and EB = {Y ∈ B} are independent.
Example: height and IQ of an individual are plausibly independent.
In particular: P(X ≤ a, Y ≤ b) = P(X ≤ a) P(Y ≤ b), or,
in terms of the joint cumulative distribution function F of X and Y:

F(a, b) = FX(a) FY(b)  for all a, b ∈ R

Random variables that are not independent are dependent.


Independence: Probability Mass and Density Functions

Random variables X, Y are independent if:

Discrete random variables: probability mass function
p(xi, yj) = pX(xi) pY(yj) for all xi, yj
Continuous random variables: probability density function
f(x, y) = fX(x) fY(y) for all x, y


Independence: Equivalent Statements

1) P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B); A, B sets in R

2) F(x, y) = FX(x) FY(y); for all x, y

3) f(x, y) = fX(x) fY(y); for all x, y (continuous RVs)

4) p(xi, yj) = pX(xi) pY(yj); for all xi, yj (discrete RVs)
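Statement 4) suggests a direct test for finite joint PMFs: compute the marginals and compare every cell against their product. A sketch using made-up toy PMFs:

```python
# Independence check on a finite joint PMF: p(x, y) == pX(x) * pY(y)
# for every cell. The PMFs below are invented for illustration.
def is_independent(joint):
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    px = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(abs(joint.get((x, y), 0) - px[x] * py[y]) < 1e-12
               for x in xs for y in ys)

# A product PMF (marginals 0.4/0.6 and 0.3/0.7) versus a dependent one.
product_pmf = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}
print(is_independent(product_pmf))                 # True
print(is_independent({(0, 0): 0.5, (1, 1): 0.5}))  # False
```

The dependent example puts all mass on the diagonal, so no cell matches the product of its marginals.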


Example 5.2 (Ogunnaike, 2009)


The reliability of the temperature control system for a commercial, highly
exothermic polymer reactor is known to depend on the lifetimes (in years) of the
control hardware electronics, X1 , and of the control valve on the cooling water
line, X2 . If one component fails, the entire control system fails. The random
phenomenon in question is characterized by the two-dimensional random variable
(X1, X2) whose joint probability distribution is given as:

f(x1, x2) = (1/50) e^{−(0.2 x1 + 0.1 x2)},  0 < x1 < ∞, 0 < x2 < ∞;  0, otherwise

Establish that the above is a legitimate joint probability density function.

To show: ∫_0^∞ ∫_0^∞ f(x1, x2) dx1 dx2 = 1.

∫_0^∞ ∫_0^∞ (1/50) e^{−(0.2 x1 + 0.1 x2)} dx1 dx2 = (1/50)(−5 e^{−0.2 x1} |_0^∞)(−10 e^{−0.1 x2} |_0^∞) = (1/50)(5)(10) = 1


Example (Continued)

What's the probability of the system lasting more than 2 years?

To find: P(X1 > 2, X2 > 2) = ∫_2^∞ ∫_2^∞ (1/50) e^{−(0.2 x1 + 0.1 x2)} dx1 dx2 = 0.549.

Find the marginal density function of X1:

fX1(x1) = ∫_0^∞ (1/50) e^{−(0.2 x1 + 0.1 x2)} dx2 = (1/5) e^{−0.2 x1}

Find the marginal density function of X2:

fX2(x2) = ∫_0^∞ (1/50) e^{−(0.2 x1 + 0.1 x2)} dx1 = (1/10) e^{−0.1 x2}

Are X1, X2 independent? Yes, since f(x1, x2) = fX1(x1) fX2(x2).
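The 0.549 figure can be reproduced; a sketch comparing the closed form e^{−0.6} (from the factorized marginals) with a crude midpoint Riemann sum of the joint density, where the grid limits and step are arbitrary choices for the check:

```python
import math

# Closed form: the density factorizes, so
# P(X1 > 2, X2 > 2) = e^{-0.2*2} * e^{-0.1*2} = e^{-0.6}.
p_closed = math.exp(-0.6)
print(round(p_closed, 3))  # 0.549

# Crude midpoint Riemann sum of the joint density over [2, 60]^2.
f = lambda x1, x2: math.exp(-(0.2 * x1 + 0.1 * x2)) / 50
h = 0.1
n = 580
p_numeric = sum(f(2 + (i + 0.5) * h, 2 + (j + 0.5) * h)
                for i in range(n) for j in range(n)) * h * h
print(abs(p_numeric - p_closed) < 5e-3)  # True
```

The residual tolerance mostly reflects the truncation of the integration domain at 60 years.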


Independence of n Random Variables


Random variables X1, X2, ..., Xn are said to be independent if,
for all sets of real numbers A1, A2, ..., An:

P(X1 ∈ A1, X2 ∈ A2, ..., Xn ∈ An) = Π_{i=1}^{n} P(Xi ∈ Ai)

In particular, for all a1, a2, ..., an ∈ R:

F(a1, a2, ..., an) = P(X1 ≤ a1, X2 ≤ a2, ..., Xn ≤ an) = Π_{i=1}^{n} P(Xi ≤ ai) = Π_{i=1}^{n} FXi(ai)

For discrete random variables, the probability mass function factorizes:

p(x1, x2, ..., xn) = pX1(x1) pX2(x2) ... pXn(xn)

For continuous random variables, the probability density function factorizes:

f(x1, x2, ..., xn) = fX1(x1) fX2(x2) ... fXn(xn)

Independent, Repeated Trials

In statistics, one usually does not consider just a single experiment;
rather, the same experiment is performed several times.
Associate a separate random variable with each of those experimental
outcomes.
If the experiments are independent of each other, then we get a set of
independent random variables.
Example: Tossing a coin n times. Random variable Xi is the outcome (0 or
1) of the i-th toss.


Independent and Identically Distributed (IID) Variables

A collection of random variables is said to be IID if:

- The variables are independent.
- The variables have the same probability distribution.

Example 1: Tossing a coin n times. The probability of obtaining a head in a
single toss does not vary and all the tosses are independent.
Each toss leads to a random variable with the same probability distribution
function, and the random variables are also independent. Thus, IID.

Example 2: Measuring the temperature of a beaker at n time instances in the
day. The true water temperature changes throughout the day and the sensor is
noisy.
Each sensor reading leads to a random variable.
The variables are independent but not identically distributed.


Conditional Distributions

Remember, for two events A and B, the conditional probability of A given B is:

P(A | B) = P(A, B) / P(B)

for P(B) > 0.


Conditional Probability Mass Function

For X, Y discrete random variables, define the conditional probability mass
function of X given Y = y by

pX|Y(x|y) = P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) = p(x, y) / pY(y)

for pY(y) > 0.


Examples 4.3b,f from Ross


Question: In a community, 15% of families have no children, 20% have 1, 35% have
2, and 30% have 3 children. Each child is equally likely to be a boy or a girl. We
choose a family at random. Given that the chosen family has one girl, compute
the probability mass function of the number of boys in the family.
G: number of girls, B: number of boys, C: number of children.
To find: P(B = i | G = 1), i = 0, 1, 2, 3.

P(B = i | G = 1) = P(B = i, G = 1) / P(G = 1),  i = 0, 1, 2, 3

First find P(G = 1):

{G = 1} = {G = 1} ∩ ({C = 0} ∪ {C = 1} ∪ {C = 2} ∪ {C = 3})
P(G = 1) = P(G = 1, C = 0) + P(G = 1, C = 1) + P(G = 1, C = 2) + P(G = 1, C = 3)

since C = 0, C = 1, C = 2, C = 3 are mutually exclusive events whose union is S.
Then,

P(G = 1) = P(G = 1 | C = 0)P(C = 0) + P(G = 1 | C = 1)P(C = 1) + ...
         = 0 + (1/2)(0.2) + ... = 0.3875


Example continued

Then,

P(B = 0 | G = 1) = P(B = 0, G = 1) / P(G = 1)

Numerator = P(G = 1 and C = 1) = P(G = 1 | C = 1)P(C = 1) = (1/2)(0.2) = 0.1. Then,

P(B = 0 | G = 1) = 0.1/0.3875 = 8/31

Similarly:
P(B = 1 | G = 1) = 14/31, P(B = 2 | G = 1) = 9/31, P(B = 3 | G = 1) = 0.
Check: the conditional probabilities sum to 1.
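The whole calculation can be verified by brute-force enumeration over child-sex patterns with exact fractions (a sketch, not part of Ross's solution):

```python
from fractions import Fraction
from itertools import product

# P(C = c) from the problem statement; each child independently a girl
# with probability 1/2. Condition on exactly one girl.
pc = {0: Fraction(15, 100), 1: Fraction(20, 100),
      2: Fraction(35, 100), 3: Fraction(30, 100)}

joint = {}  # joint[(boys, girls)] accumulated over child-sex patterns
for c, p in pc.items():
    for pattern in product("BG", repeat=c):
        b, g = pattern.count("B"), pattern.count("G")
        joint[(b, g)] = joint.get((b, g), 0) + p * Fraction(1, 2) ** c

pg1 = sum(v for (b, g), v in joint.items() if g == 1)
cond = {b: joint.get((b, 1), 0) / pg1 for b in range(4)}
print(pg1)                        # 31/80 (= 0.3875)
print(cond[0], cond[1], cond[2])  # 8/31 14/31 9/31
```

Exact rationals avoid any floating-point doubt about the 8/31, 14/31, 9/31 values.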


Conditional Probability Density Function

For random variables X, Y, the conditional probability density of X given Y = y
is defined as:

fX|Y(x | y) = f(x, y) / fY(y)

for fY(y) > 0.
Hence, one can make statements on probabilities of X taking values in some set A
given the value obtained by Y as:

P(X ∈ A | Y = y) = ∫_A fX|Y(x | y) dx


Independence and Conditional Probabilities

If X , Y are independent, then


pX |Y (x|y ) = pX (x)
fX |Y (x|y ) = fX (x)


Temperature Control Example (Continued), Example 5.2 (Ogunnaike, 2009)

Find the conditional density function fX1|X2(x1|x2):

fX1|X2(x1|x2) = f(x1, x2)/fX2(x2) = (1/5) e^{−0.2 x1}

which is the same as fX1(x1) in this example.

Similarly, fX2|X1(x2|x1) = fX2(x2) in this example.

Generic question: If fX1|X2(x1|x2) = fX1(x1), then is fX2|X1(x2|x1) = fX2(x2)?
Answer: Yes.


Example 5.5 (Ogunnaike, 2009)



fX1,X2(x1, x2) = x1 − x2,  1 < x1 < 2, 0 < x2 < 1;  0 otherwise

Find: the conditional probability densities.

Answer: Compute the marginals:

fX1(x1) = x1 − 0.5,  1 < x1 < 2;  0 otherwise
fX2(x2) = 1.5 − x2,  0 < x2 < 1;  0 otherwise

Then compute the conditionals:

fX1|X2(x1|x2) = (x1 − x2)/(1.5 − x2),  1 < x1 < 2
fX2|X1(x2|x1) = (x1 − x2)/(x1 − 0.5),  0 < x2 < 1

The random variables X1, X2 are not independent.
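A quick numeric sanity check for this example: the joint density should integrate to 1 over its support, and each conditional density should integrate to 1 over its argument. Midpoint sums are exact here because the density is linear in each variable:

```python
# Joint density of Example 5.5 on (1, 2) x (0, 1).
f = lambda x1, x2: x1 - x2

h = 0.01
n = 100
total = sum(f(1 + (i + 0.5) * h, (j + 0.5) * h)
            for i in range(n) for j in range(n)) * h * h
print(abs(total - 1.0) < 1e-9)  # True

# Conditional density of X2 given X1 = 1.3 (an arbitrary point in (1, 2)):
x1 = 1.3
cond_mass = sum(f(x1, (j + 0.5) * h) / (x1 - 0.5) for j in range(n)) * h
print(abs(cond_mass - 1.0) < 1e-9)  # True
```

The same check with the conditioning roles swapped confirms fX1|X2 as well.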



Plots


Independence of Transformations

If random variables X, Y are independent, then the random variables

Z = g(X), U = h(Y)

are also independent.
Proof: Let Az denote the set of points on the x-axis such that g(x) ≤ z and Bu
denote the set of points on the y-axis such that h(y) ≤ u. Then,

{Z ≤ z} = {X ∈ Az};  {U ≤ u} = {Y ∈ Bu}

Thus, the events {Z ≤ z} and {U ≤ u} are independent because the events
{X ∈ Az} and {Y ∈ Bu} are independent.


Expected Value

By analogy with transformation of a single RV, the expected value of a transformation
of multiple RVs can be defined as:

E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy

For discrete RVs, the above becomes

E[g(X, Y)] = Σ_x Σ_y g(x, y) p(x, y)

Special Cases

g(X, Y) = X + Y. Then,

E[g(X, Y)] = E[X] + E[Y]

g(X, Y) = (X − E[X])(Y − E[Y]): covariance of X, Y; labeled Cov(X, Y).

Correlation coefficient:

ρ = Cov(X, Y) / (σX σY)

Property: ρ is dimensionless, −1 ≤ ρ ≤ 1.
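Covariance and the correlation coefficient can be computed mechanically from a finite joint PMF. A sketch with a made-up toy PMF:

```python
import math

# Covariance and correlation from a finite joint PMF (invented for illustration).
pmf = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}

ex = sum(x * p for (x, y), p in pmf.items())
ey = sum(y * p for (x, y), p in pmf.items())
cov = sum((x - ex) * (y - ey) * p for (x, y), p in pmf.items())
sx = math.sqrt(sum((x - ex) ** 2 * p for (x, y), p in pmf.items()))
sy = math.sqrt(sum((y - ey) ** 2 * p for (x, y), p in pmf.items()))
rho = cov / (sx * sy)
print(round(rho, 3))    # -0.408
print(-1 <= rho <= 1)   # True
```

The bound |ρ| ≤ 1 holds for any valid joint PMF by the Cauchy-Schwarz inequality.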


Independence versus Covariance

If X, Y are independent, then

Cov(X, Y) = 0

Independence ⇒ covariance = 0 (variables uncorrelated).
Covariance = 0 does NOT imply independence.
Example: (X, Y) takes values (0, 1), (1, 0), (0, −1), (−1, 0) with equal probability
(1/4).
Cov(X, Y) = 0, but X, Y are not independent.
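The four-point example can be checked directly (a sketch):

```python
# Four-point distribution: Cov = 0, yet X and Y are dependent
# (knowing X = 1 forces Y = 0).
pts = [(0, 1), (1, 0), (0, -1), (-1, 0)]
p = 1 / 4

ex = sum(x for x, _ in pts) * p          # 0
ey = sum(y for _, y in pts) * p          # 0
cov = sum((x - ex) * (y - ey) for x, y in pts) * p
print(cov == 0.0)  # True: uncorrelated

# Dependence: P(X=0, Y=1) = 1/4, but P(X=0) * P(Y=1) = (2/4) * (1/4) = 1/8.
p_joint = 1 / 4
p_prod = (2 / 4) * (1 / 4)
print(p_joint == p_prod)  # False: not independent
```

Every point has one coordinate equal to zero, which is exactly why each covariance term vanishes while the joint PMF fails to factorize.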


Independence Implications

g(X, Y) = XY:
If X, Y independent,

E[XY] = E[X] E[Y]

g(X, Y) = h(X) l(Y):
If X, Y independent,

E[h(X) l(Y)] = E[h(X)] E[l(Y)]


THANK YOU
