
Chapter 9

Bivariate distributions
Introduction and motivation:
Up to now, our study of RVs has been limited to considering only a single
RV, or function thereof, at a time. In many applications of probability in
science and engineering, we must deal with several RVs that are simultaneously
defined over a common probability space.
For example, we might want to compute the probability that two RVs, say
X and Y, respectively belong to real number subsets A and B at the same
time, that is: P(X ∈ A, Y ∈ B).
In this and subsequent Chapters, the previously developed theory of a single
RV, i.e. Chapters 5 to 8, is extended to handle such situations. This leads to
the notion of joint distributions.
In this and the next Chapter, we first study in detail the case of two RVs, also
known as a bivariate distribution. In a subsequent Chapter, we shall consider
the general case of n ≥ 2 RVs.
9.1 Bivariate distributions
Definition: Let X and Y be two RVs defined on the probability space (S, F, P).
We say that the mapping
s ∈ S → (X(s), Y(s)) ∈ R²   (9.1)
defines a two-dimensional random variable (or vector).
Joint events:
It can be shown that for any practical subset D ⊆ R² of the real plane,
the set of outcomes
{(X, Y) ∈ D} ≡ {s ∈ S : (X(s), Y(s)) ∈ D}   (9.2)
is a valid event.¹
We refer to (9.2) as a joint (or bivariate) event.
The situation of interest here is illustrated in Figure 9.1: a sample point
s ∈ S is mapped to the pair (X(s), Y(s)), which may or may not fall inside
a given region D of the plane.
Figure 9.1: Illustration of a mapping (X, Y) from S into R².
Note that in the special case when D = A × B, where A ⊆ R and B ⊆ R,
(9.2) reduces to
{(X, Y) ∈ D} = {X ∈ A, Y ∈ B}
¹Specifically, if D ∈ B(R²), the Borel field of R² (see Chapter 3), then {(X, Y) ∈ D} ∈ F.
Joint probability:
Since {(X, Y) ∈ D} is a valid event, the probability that (X, Y) ∈ D is
a well-defined quantity. This probability, denoted
P((X, Y) ∈ D) ≡ P({s ∈ S : (X(s), Y(s)) ∈ D}),
is called a joint probability.
In this and the following sections, we develop tools to efficiently model
and compute joint probabilities. Our first step is to introduce a bivariate
CDF that generalizes the one defined in Chapter 5.
Definition: The joint cumulative distribution function (JCDF) of RVs X and
Y is defined as
F(x, y) = P(X ≤ x, Y ≤ y), for all (x, y) ∈ R²   (9.3)
Remarks:
- Note that F(x, y) = P((X, Y) ∈ C(x, y)), where we define C(x, y) =
(−∞, x] × (−∞, y]. Region C(x, y) is sometimes referred to as a corner.
- More generally, any practical subset D ⊆ R² can be expressed as unions,
intersections and/or complements of corners. From the axioms of prob-
ability, it follows that for any D ⊆ R², the joint probability P((X, Y) ∈ D)
can be expressed in terms of F(x, y).
Example 9.1:
Let D = (0, 1] × (0, 1] ⊆ R². Express P((X, Y) ∈ D) in terms of the joint CDF
of X and Y, i.e. F(x, y).
Solution: We have
P(0 < X ≤ 1, 0 < Y ≤ 1) = P(0 < X ≤ 1, Y ≤ 1) − P(0 < X ≤ 1, Y ≤ 0)
= P(X ≤ 1, Y ≤ 1) − P(X ≤ 0, Y ≤ 1) − [P(X ≤ 1, Y ≤ 0) − P(X ≤ 0, Y ≤ 0)]
= F(1, 1) − F(0, 1) − F(1, 0) + F(0, 0)
Graphically, this corresponds to adding and subtracting the probabilities of
the overlapping corner regions C(x, y).
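This inclusion-exclusion rule is easy to check numerically. The sketch below is
not part of the original notes; it assumes, purely for illustration, that X and Y
are independent U(0, 1) RVs, so that F(x, y) = F_X(x)F_Y(y):

```python
import numpy as np

def F(x, y):
    # Assumed example JCDF: two independent U(0,1) RVs, so
    # F(x, y) = F_X(x) * F_Y(y) with F_X(x) = clip(x, 0, 1).
    return np.clip(x, 0.0, 1.0) * np.clip(y, 0.0, 1.0)

def rect_prob(F, x1, x2, y1, y2):
    """P(x1 < X <= x2, y1 < Y <= y2) via inclusion-exclusion on the JCDF."""
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)

print(rect_prob(F, 0, 1, 0, 1))      # 1.0: the whole unit square
print(rect_prob(F, 0.5, 1, 0.5, 1))  # 0.25: the upper-right quadrant
```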
Theorem 9.1:
(a) F(x, y) is a non-decreasing function of its arguments x and y.
(b) F(−∞, y) = F(x, −∞) = F(−∞, −∞) = 0
(c) F(x, ∞) = P(X ≤ x) = F_X(x) (CDF of X)
F(∞, y) = P(Y ≤ y) = F_Y(y) (CDF of Y)
F(∞, ∞) = 1
(d) F(x⁺, y) = F(x, y⁺) = F(x, y)
(e) If x₁ < x₂ and y₁ < y₂, then
F(x₂, y₂) − F(x₂, y₁) + F(x₁, y₁) − F(x₁, y₂) ≥ 0   (9.4)
Remarks:
- Proof similar to that of Theorem 5.1.
- According to (a), if y is fixed and x₁ < x₂, then F(x₂, y) ≥ F(x₁, y).
Similarly, if x is fixed and y₁ < y₂, then F(x, y₂) ≥ F(x, y₁).
- In (b) and (c), interpret ±∞ as a limiting value, e.g.:
F(−∞, y) = lim_{x→−∞} F(x, y).
- In (c), F_X(x) and F_Y(y) are the CDFs of X and Y, respectively, as defined
in Chapter 6. Here, they are often called marginal CDFs.
- In (d), x⁺ means the limit from the right, i.e.
F(x⁺, y) = lim_{t→x⁺} F(t, y),
with a similar interpretation for y⁺.
- Any function F(x, y) satisfying the above properties is called a JCDF.
What's next?
- In theory, if the JCDF F(x, y) is known for all (x, y) ∈ R², we can
compute any joint probability for X and Y. In practice, we find that
the JCDF F(x, y) is a bit difficult to handle and visualize.
- For these reasons, we usually work with equivalent but simpler repre-
sentations of the joint CDF:
  - X and Y discrete → joint PMF (Section 9.2)
  - X and Y continuous → joint PDF (Section 9.3)
9.2 Joint probability mass function
Definition: Let X and Y be discrete random variables with sets of possible
values R_X = {x₁, x₂, ...} and R_Y = {y₁, y₂, ...}, respectively. We say that X
and Y are jointly discrete and we define their joint probability mass function
(JPMF) as
p(x, y) = P(X = x, Y = y), for all (x, y) ∈ R²   (9.5)
Theorem 9.2: The JPMF p(x, y) satisfies the following basic properties:
(a) 0 ≤ p(x, y) ≤ 1
(b) x ∉ R_X or y ∉ R_Y ⟹ p(x, y) = 0
(c) Normalization property:
∑_{x∈R_X} ∑_{y∈R_Y} p(x, y) = 1   (9.6)
(d) Marginalization:
∑_{y∈R_Y} p(x, y) = P(X = x) ≡ p_X(x)   (9.7)
∑_{x∈R_X} p(x, y) = P(Y = y) ≡ p_Y(y)   (9.8)
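For finite range sets, properties (c) and (d) have a convenient matrix reading:
store p(x_i, y_j) as a matrix, and the marginals are its row and column sums.
A minimal sketch with arbitrarily chosen, hypothetical values:

```python
import numpy as np

# Hypothetical JPMF p(x_i, y_j) on R_X = {x1, x2} and R_Y = {y1, y2, y3};
# the entries are arbitrary but sum to 1 (normalization, Eq. 9.6).
p = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.15, 0.20]])

assert np.isclose(p.sum(), 1.0)   # property (c)

p_X = p.sum(axis=1)               # marginal PMF of X (Eq. 9.7)
p_Y = p.sum(axis=0)               # marginal PMF of Y (Eq. 9.8)
print(p_X, p_Y)                   # [0.4 0.6] and [0.35 0.35 0.3]
```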
Proof:
Results (a) and (b) follow trivially from the definition of the JPMF.
For (c), observe that the events {X = x, Y = y}, where x ∈ R_X and
y ∈ R_Y, form a partition of the sample space S. That is, they are
mutually exclusive and
∪_{x∈R_X} ∪_{y∈R_Y} {X = x, Y = y} = S   (9.9)
Using probability Axiom 3, we have
∑_{x∈R_X} ∑_{y∈R_Y} p(x, y) = ∑_{x∈R_X} ∑_{y∈R_Y} P(X = x, Y = y)
= P(∪_{x∈R_X} ∪_{y∈R_Y} {X = x, Y = y}) = P(S) = 1
For (d), note that for a given x, the events {X = x, Y = y}, where
y ∈ R_Y, form a partition of {X = x}. That is, they are mutually
exclusive and
∪_{y∈R_Y} {X = x, Y = y} = {X = x}   (9.10)
Again, using probability Axiom 3, we have
∑_{y∈R_Y} p(x, y) = ∑_{y∈R_Y} P(X = x, Y = y)
= P(∪_{y∈R_Y} {X = x, Y = y}) = P(X = x) = p_X(x)
A similar argument holds for p_Y(y).
Remarks:
- p_X(x) and p_Y(y) are the probability mass functions (PMF) of X and Y,
respectively, as defined in Chapter 6.
- In the present context, they are also called marginal PMFs.
Example 9.2:
Let X and Y denote the numbers showing up when rolling a magnetically coupled
pair of dice. For (i, j) ∈ {1, ..., 6}², let the JPMF be given by
p(i, j) = (1 + ε)/36 if i = j, and p(i, j) = β/36 if i ≠ j,
where 0 < ε ≤ 1.
(a) Find the constant β.
(b) Find the marginal PMF p_X(i).
(c) Find the marginal PMF p_Y(j).
(d) Find P(X = Y).
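A short numerical sketch of this example, keeping ε as the coupling parameter
and writing β for the constant sought in part (a); normalization forces
β = 1 − ε/5, as recalled later in Example 9.5:

```python
import numpy as np

eps = 0.5                          # any value with 0 < eps <= 1
beta = 1 - eps / 5                 # from the normalization property (part a)
p = np.full((6, 6), beta / 36)
np.fill_diagonal(p, (1 + eps) / 36)

print(np.isclose(p.sum(), 1.0))    # True: the JPMF sums to 1
print(p.sum(axis=1))               # part (b): p_X(i) = 1/6 for every i
print(np.trace(p))                 # part (d): P(X = Y) = (1 + eps)/6
```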
9.3 Joint probability density function (JPDF)
Definition: We say that RVs X and Y are jointly continuous if there exists
an integrable function f : R² → [0, ∞), such that for any subset D of R², we
have:
P((X, Y) ∈ D) = ∬_D f(x, y) dx dy   (9.11)
The function f(x, y) is called the joint probability density function (JPDF)
of X and Y.
Interpretations of f(x, y):
- Let Δx and Δy be sufficiently small positive numbers; then:
P(|X − x| < Δx/2, |Y − y| < Δy/2) ≈ f(x, y) Δx Δy   (9.12)
- P((X, Y) ∈ D) is equal to the volume under the graph of f(x, y) over
the region D.
Particular cases of interest:
- For any subsets A and B of R:
P(X ∈ A, Y ∈ B) = ∫_A dx ∫_B dy f(x, y)   (9.13)
- Let A = [a, b] and B = [c, d]:
P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_a^b dx ∫_c^d dy f(x, y)   (9.14)
Note that the endpoints of the intervals A and B may be removed with-
out affecting the value of the integral. Accordingly,
P(a ≤ X ≤ b, c ≤ Y ≤ d) = P(a ≤ X < b, c ≤ Y ≤ d)
= P(a ≤ X < b, c ≤ Y < d)
= etc.
- Let C be any curve in the plane R²:
P((X, Y) ∈ C) = ∬_C f(x, y) dx dy = 0   (9.15)
- For any (a, b) ∈ R²:
P(X = a, Y = b) = 0   (9.16)
Theorem 9.3: The JPDF f(x, y) satisfies the following properties:
(a) f(x, y) ≥ 0
(b) Normalization:
∬_{R²} f(x, y) dx dy = 1   (9.17)
(c) Marginalization:
∫_{−∞}^{∞} f(x, y) dy = f_X(x), the PDF of X   (9.18)
∫_{−∞}^{∞} f(x, y) dx = f_Y(y), the PDF of Y   (9.19)
(d) Connection with JCDF:
F(x, y) = ∫_{−∞}^{x} dt ∫_{−∞}^{y} du f(t, u)   (9.20)
∂²F(x, y)/∂x∂y = f(x, y)   (9.21)
Note:
In the present context, f_X(x) and f_Y(y) are also called the marginal PDFs
of X and Y, respectively.
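Properties (b) and (c) can be checked numerically for a given JPDF. A sketch
using scipy, with f(x, y) = 4xy on the unit square (the density of Example 9.3
below) as the test case:

```python
from scipy.integrate import dblquad, quad

def f(x, y):
    # Example JPDF: f(x, y) = 4xy on the unit square (cf. Example 9.3).
    return 4 * x * y if (0 <= x <= 1 and 0 <= y <= 1) else 0.0

# (b) Normalization: integrate over the whole support.
total, _ = dblquad(lambda y, x: f(x, y), 0, 1, 0, 1)
print(total)                                  # ~1.0

# (c) Marginalization: f_X(x) is the integral of f(x, y) over y.
f_X = lambda x: quad(lambda y: f(x, y), 0, 1)[0]
print(f_X(0.5))                               # ~1.0, matching f_X(x) = 2x
```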
Proof:
(a) Follows from the definition of f(x, y).
(b) Using (9.11) with D = R², we have
∬_{R²} f(x, y) dx dy = P((X, Y) ∈ R²) = 1
(c) From the definition of f_X(x) as the PDF of X, we have that
P(X ∈ A) = ∫_A f_X(x) dx   (9.22)
From the definition (9.11) of f(x, y), we also have
P(X ∈ A) = P(X ∈ A, Y ∈ R)
= ∫_A [∫_{−∞}^{∞} f(x, y) dy] dx   (9.23)
Both (9.22) and (9.23) being true for any subset A ⊆ R, it follows that
f_X(x) = ∫_{−∞}^{∞} f(x, y) dy
(d) From (9.11), we immediately obtain
F(x, y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^{x} dt ∫_{−∞}^{y} du f(t, u)
At any continuity point of f(x, y), we have:
∂²F(x, y)/∂x∂y = ∂/∂y [∂/∂x ∫_{−∞}^{x} dt ∫_{−∞}^{y} du f(t, u)]
= ∂/∂y ∫_{−∞}^{y} du f(x, u)
= f(x, y)
Example 9.3:
Let X and Y be jointly continuous with JPDF
f(x, y) = cxy if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and f(x, y) = 0 otherwise.   (9.24)
(a) Find the constant c.
(b) Find the probability that Y ≥ X.
(c) Find the marginal PDFs of X and Y, i.e. f_X(x) and f_Y(y).
Solution: (a) Constant c is obtained from the normalization condition (9.17):
∬ f(x, y) dx dy = c ∫_0^1 ∫_0^1 xy dx dy = c [∫_0^1 x dx]² = c [x²/2]_0^1 squared = c/4 = 1 ⟹ c = 4
(b) We seek
P(Y ≥ X) = P((X, Y) ∈ D) = ∬_D f(x, y) dx dy, with D = {(x, y) : y ≥ x}:
P(Y ≥ X) = ∫_0^1 [∫_x^1 4xy dy] dx = ∫_0^1 [2xy²]_{y=x}^{y=1} dx
= 2 ∫_0^1 (x − x³) dx = 2 [x²/2 − x⁴/4]_0^1 = 2 (1/2 − 1/4) = 1/2
(c) Using the marginalization property (9.18),
f_X(x) = ∫_0^1 4xy dy = 2x for 0 ≤ x ≤ 1 (and 0 otherwise),
and by symmetry, f_Y(y) = 2y for 0 ≤ y ≤ 1 (and 0 otherwise).
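A Monte Carlo sketch corroborating part (b). Since f(x, y) = (2x)(2y) factors,
X and Y can be sampled independently; the inverse-CDF draw X = √U with
U ∼ U(0, 1) (an aside, not derived in these notes) produces f_X(x) = 2x:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
x = np.sqrt(rng.uniform(size=n))  # inverse CDF: F_X(x) = x^2  =>  X = sqrt(U)
y = np.sqrt(rng.uniform(size=n))

print((y >= x).mean())            # ~0.5, matching P(Y >= X) = 1/2 in part (b)
```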
9.3.1 Uniform distribution
Definition: X and Y are jointly uniform over region D ⊆ R², or equivalently
(X, Y) ∼ U(D), if their joint PDF takes the form
f(x, y) = c for all (x, y) ∈ D, and f(x, y) = 0 otherwise,   (9.25)
where c is a constant.
Remarks:
- The value of the constant c is obtained from the requirement that f(x, y)
be properly normalized, that is:
∬_D f(x, y) dx dy = 1 ⟹ c = 1 / ∬_D dx dy = 1/Area(D)   (9.26)
- If (X, Y) ∼ U(D), then for any subset E ⊆ R²:
P((X, Y) ∈ E) = ∬_E f(x, y) dx dy
= c ∬_{E∩D} dx dy = Area(E ∩ D)/Area(D)   (9.27)
- The above concept is equivalent to the random selection of points from
a region D ⊆ R², as previously discussed in Chapter 3.
Example 9.4:
Bill and Monica decide to meet for dinner between 20:00 and 20:30 in a restau-
rant lounge. Assuming that they arrive at random during this time, find the
probability that the waiting time of either one of them is more than 15 minutes.
Solution: Let X and Y respectively denote the arrival times of Bill and Monica,
in minutes after 20:00. Assume (X, Y) ∼ U(D) where
D = {(x, y) : 0 ≤ x ≤ 30 and 0 ≤ y ≤ 30}
The event that the waiting time of Bill or Monica is more than 15 minutes can
be expressed as
E = {(x, y) ∈ D : |x − y| ≥ 15}
This event corresponds to the two corner triangles of the square D lying above
the line y = x + 15 and below the line y = x − 15. The desired probability is
P((X, Y) ∈ E) = Area(E)/Area(D) = [2 × (15 × 15)/2] / (30 × 30) = 1/4
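A quick Monte Carlo sketch of the same computation: draw both arrival times
uniformly on (0, 30) and estimate the probability of the event |X − Y| ≥ 15.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10**6
x = rng.uniform(0, 30, size=n)        # Bill's arrival, minutes after 20:00
y = rng.uniform(0, 30, size=n)        # Monica's arrival

print((np.abs(x - y) >= 15).mean())   # ~0.25, matching Area(E)/Area(D) = 1/4
```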
9.3.2 Normal Distribution
Definition: RVs X and Y are jointly normal if their joint PDF can be ex-
pressed in the form
f(x, y) = [1/(2π σ_X σ_Y √(1 − ρ²))] exp[−(1/2) Q((x − μ_X)/σ_X, (y − μ_Y)/σ_Y)], (x, y) ∈ R²   (9.28)
where
Q(u, v) = (u² − 2ρuv + v²)/(1 − ρ²)   (9.29)
and the parameters μ_X and μ_Y ∈ R, σ_X and σ_Y > 0, and −1 < ρ < 1.
Remarks:
- We also refer to (9.28) as the bivariate Gaussian distribution.
- Compact notation: (X, Y) ∼ N(μ_X, μ_Y, σ_X², σ_Y², ρ)
- It can be shown that f(x, y) is properly normalized, that is:
∬_{R²} f(x, y) dx dy = 1   (9.30)
- The precise meaning of the parameters μ_X, μ_Y, σ_X, σ_Y and ρ will become
clear as we progress in our study.
- The normal distribution (and its multi-dimensional extension) is one of
the most important joint PDFs in applications of probability.
Shape of f(x, y):
- The function Q(u, v) in (9.29) is a positive definite quadratic form:
Q(u, v) ≥ 0, for all (u, v) ∈ R²
with equality iff (u, v) = (0, 0). Therefore, f(x, y) in (9.28) attains its
absolute maximum at the point (μ_X, μ_Y).
- In the limit u → ±∞ and/or v → ±∞, the function Q(u, v) → +∞.
Accordingly, f(x, y) → 0 in the limit x → ±∞ and/or y → ±∞.
- A study of the quadratic form (9.29) shows that its level contour curves,
i.e. the locus defined by Q(u, v) = c for positive constants c, are ellipses
centered at (0, 0), with the orientation of the principal axis depending
on ρ.
- Accordingly, the graph of the function f(x, y) has the form of a bell-
shaped surface with elliptic cross-sections:
  - The bell is centered at the point (μ_X, μ_Y), where f(x, y) attains its
maximum value.
  - The level contours of the function f(x, y) have the form of ellipses
whose exact shape depends on the parameters σ_X, σ_Y and ρ.
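The bell shape is easy to explore numerically. Below is an illustrative, direct
transcription of (9.28)-(9.29) into Python (parameter values arbitrary);
evaluating it on a grid reproduces surfaces and elliptic contours like those of
Figure 9.2 at the end of the chapter.

```python
import numpy as np

def bivariate_normal_pdf(x, y, mu_x=0.0, mu_y=0.0,
                         sig_x=1.0, sig_y=1.0, rho=0.5):
    """Evaluate the jointly normal PDF of (9.28)-(9.29)."""
    u = (x - mu_x) / sig_x
    v = (y - mu_y) / sig_y
    q = (u**2 - 2 * rho * u * v + v**2) / (1 - rho**2)  # quadratic form Q(u, v)
    norm = 2 * np.pi * sig_x * sig_y * np.sqrt(1 - rho**2)
    return np.exp(-q / 2) / norm

# The maximum is attained at (mu_x, mu_y), as argued above:
print(bivariate_normal_pdf(0, 0))   # 1/(2*pi*sqrt(0.75)) ~ 0.1838
print(bivariate_normal_pdf(2, 2))   # smaller value away from the mode
```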
Theorem 9.4: Let (X, Y) ∼ N(μ_X, μ_Y, σ_X², σ_Y², ρ). The marginal PDFs of X
and Y are given by
f_X(x) = [1/(√(2π) σ_X)] e^{−(x−μ_X)²/2σ_X²},  f_Y(y) = [1/(√(2π) σ_Y)] e^{−(y−μ_Y)²/2σ_Y²}   (9.31)
That is, X ∼ N(μ_X, σ_X²) and Y ∼ N(μ_Y, σ_Y²).
Remarks:
- The proof is left as an exercise for the student.
- According to Theorem 9.4, joint normality of RVs X and Y implies that
each one of them is normal when considered individually.
- The converse of this statement is not generally true: X and Y being
normal when taken in isolation does not imply that they are jointly
normal.
- Theorem 9.4 provides a complete explanation of the meaning of the
parameters μ_X, σ_X, μ_Y and σ_Y:
μ_X = E(X) = expected value of X
σ_X² = Var(X) = variance of X
with similar interpretations for μ_Y and σ_Y.
- The exact significance of the parameter ρ, called the correlation coefficient,
will be explained in the next chapter.
9.4 Conditional distributions
It is often of interest to compute the conditional probability that RV X
belongs to a real number subset B ⊆ R, given that a certain event, say A, has
occurred, that is:
P(X ∈ B | A)
In this Section, we develop the necessary theory to handle this kind of prob-
ability computation. Special emphasis is given to the case where the event
A is itself defined in terms of a second RV, say Y.
9.4.1 Arbitrary event A
Definition: Let X be a RV (discrete or continuous) and let A be some event
with P(A) > 0. The conditional CDF of X given A is defined as
F(x | A) ≡ P(X ≤ x | A) = P(X ≤ x, A)/P(A), all x ∈ R   (9.32)
Remarks:
- The function F(x | A) is a valid CDF, in the sense that it satisfies all the
basic properties of a CDF (see Theorem 5.1).
- In theory, the function F(x | A) can be used to compute any probability
of the type P(X ∈ B | A). In practice, it is preferable to work with
closely related functions as defined below.
Definition: Let X be a discrete RV with set of possible values R_X = {x₁, x₂, ...}.
The conditional PMF of X given A is defined as
p(x | A) ≡ P(X = x | A) = P(X = x, A)/P(A), all x ∈ R   (9.33)
Remarks:
- The function p(x | A) is a valid PMF. In particular (see Theorem 6.1),
we have: p(x | A) ≥ 0, p(x | A) = 0 for all x ∉ R_X, and
∑_{x∈R_X} p(x | A) = ∑_{all i} p(x_i | A) = 1   (9.34)
- Furthermore, for any subset B ⊆ R, we have
P(X ∈ B | A) = ∑_{x_i∈B} p(x_i | A)   (9.35)
Definition: Let X be a continuous RV. The conditional PDF of X given A is
defined as
f(x | A) ≡ dF(x | A)/dx, x ∈ R   (9.36)
Remarks:
- The function f(x | A) is a valid PDF (see Theorem 7.2). In particular,
we have f(x | A) ≥ 0 for all x ∈ R and
∫_{−∞}^{∞} f(x | A) dx = 1   (9.37)
- For any subset B ⊆ R:
P(X ∈ B | A) = ∫_B f(x | A) dx   (9.38)
- As a special case of the above, we note:
F(x | A) = ∫_{−∞}^{x} f(t | A) dt   (9.39)
9.4.2 Event A related to discrete RV Y
Introduction: Let Y be a discrete RV with possible values R_Y = {y₁, y₂, ...}
and marginal PMF p_Y(y). The event A under consideration here is
A = {s ∈ S : Y(s) = y} = {Y = y}   (9.40)
for some y such that p_Y(y) > 0.
Definition: Let y ∈ R be such that p_Y(y) > 0. The conditional CDF of X
given Y = y is defined as
F_{X|Y}(x | y) ≡ P(X ≤ x | Y = y) = P(X ≤ x, Y = y)/P(Y = y), x ∈ R   (9.41)
Remarks:
- In theory, knowledge of F_{X|Y}(x | y) is sufficient to compute any condi-
tional probability of the type P(X ∈ B | Y = y), for any subset B ⊆ R.
- In practice, depending on whether X is discrete or continuous, we find it
more convenient to work with closely related functions, as explained next.
Definition: Suppose X is a discrete RV with set of possible values R_X =
{x₁, x₂, ...}. The conditional PMF of X given Y = y is defined as
p_{X|Y}(x | y) ≡ P(X = x | Y = y), x ∈ R   (9.42)
Remarks:
- Invoking the definition of conditional probability, we have
p_{X|Y}(x | y) = P(X = x, Y = y)/P(Y = y) = p(x, y)/p_Y(y)   (9.43)
where p(x, y) is the joint PMF of X and Y, as defined in (9.5).
- Since p_{X|Y}(x | y) in (9.42) is a special case of p(x | A) in (9.33), it is also
a valid PMF. In particular, it satisfies properties similar to those in
(9.34)-(9.35), with obvious modifications in the notation.
Example 9.5:
Let X and Y be defined as in Example 9.2. Find the conditional PMF of X given
Y = j, where j ∈ {1, ..., 6}.
Solution: From Example 9.2, we recall that
p(i, j) = (1 + ε)/36 if i = j, and p(i, j) = β/36 if i ≠ j,
where β = 1 − ε/5, and
p_Y(j) = 1/6, all j
The desired conditional probability is obtained as
p_{X|Y}(i | j) = p(i, j)/p_Y(j) = (1 + ε)/6 if i = j, and β/6 if i ≠ j.
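In matrix form, (9.43) says that conditioning on Y = j normalizes the j-th
column of the JPMF by p_Y(j). A sketch continuing the numerical dice model
used earlier (same ε and β):

```python
import numpy as np

eps = 0.5
beta = 1 - eps / 5
p = np.full((6, 6), beta / 36)
np.fill_diagonal(p, (1 + eps) / 36)

p_Y = p.sum(axis=0)             # marginal of Y: 1/6 in every column
p_X_given_Y = p / p_Y           # Eq. (9.43), applied column by column
print(p_X_given_Y[:, 0])        # [(1+eps)/6, beta/6, ..., beta/6]
print(p_X_given_Y[:, 0].sum())  # 1.0: each column is a valid PMF
```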
Definition: Suppose X is a continuous RV. The conditional PDF of X given
Y = y is defined as
f_{X|Y}(x | y) ≡ ∂F_{X|Y}(x | y)/∂x   (9.44)
Remarks:
- f_{X|Y}(x | y) is a special case of f(x | A) in (9.36) and, as such, it is a
valid PDF.
- It satisfies properties similar to those in (9.37)-(9.39), with obvious mod-
ifications in notation, e.g.:
P(X ∈ B | Y = y) = ∫_B f_{X|Y}(x | y) dx   (9.45)
Example 9.6: Binary communications over a noisy channel.
Let discrete RV Y ∈ {−1, +1} denote the amplitude of a transmitted binary
pulse at the input of a digital communication link. Assume that
p_Y(y) = P(Y = y) = 1/2, y ∈ {−1, 1}
Let X denote the received voltage at the output of the link. Under the so-called
additive Gaussian noise assumption, we may assume that, conditional on Y = y,
RV X is N(y, σ²). Given that a positive pulse was transmitted, find the probability
that the receiver makes an erroneous decision, that is, find P(X ≤ 0 | Y = 1).
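Although the solution is left to the reader, note that conditional on Y = 1 the
received voltage is N(1, σ²), so the error probability reduces to a single
normal-CDF evaluation, P(X ≤ 0 | Y = 1) = Φ(−1/σ). A sketch with an arbitrary
assumed noise level:

```python
from scipy.stats import norm

sigma = 0.5                               # assumed noise level, for illustration
p_err = norm.cdf(0, loc=1, scale=sigma)   # P(X <= 0 | Y = +1) = Phi(-1/sigma)
print(p_err)                              # ~0.02275 for sigma = 0.5
```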
9.4.3 Event A related to continuous RV Y
Introduction:
- Let X and Y be jointly continuous with PDF f(x, y) and consider the
event A = {Y = y}, for some y such that f_Y(y) > 0.
- How can we characterize the conditional probability distribution of RV
X, given that event A, i.e. Y = y, has been observed?
- Our previous definition of conditional CDF is not applicable here. Indeed,
F_{X|Y}(x | y) = P(X ≤ x | Y = y) = P(X ≤ x, Y = y)/P(Y = y) = 0/0 (?)
- To accommodate this situation, the following extended definition of con-
ditional CDF, based on the concept of limit, is commonly used.
Definition: Let X and Y be jointly continuous. The conditional CDF of X,
given Y = y, is defined as
F_{X|Y}(x | y) ≡ lim_{ε→0⁺} P(X ≤ x | y < Y ≤ y + ε)   (9.46)
Definition: Let X and Y be jointly continuous. The conditional PDF of X,
given Y = y, is defined as
f_{X|Y}(x | y) ≡ ∂F_{X|Y}(x | y)/∂x   (9.47)
Remarks:
- The function f_{X|Y}(x | y) is a valid PDF; it satisfies properties similar to
(9.37)-(9.39), with obvious modifications. In particular:
f_{X|Y}(x | y) ≥ 0, x ∈ R   (9.48)
∫_{−∞}^{∞} f_{X|Y}(x | y) dx = 1   (9.49)
F_{X|Y}(x | y) = ∫_{−∞}^{x} f_{X|Y}(t | y) dt   (9.50)
- In practice, the conditional PDF f_{X|Y}(x | y) is used instead of the condi-
tional CDF in the computation of probabilities:
P(X ∈ B | Y = y) = ∫_B f_{X|Y}(x | y) dx   (9.51)
Theorem 9.5: Provided f_Y(y) > 0, the conditional PDF of X given Y = y
can be expressed as
f_{X|Y}(x | y) = f(x, y)/f_Y(y)   (9.52)
Proof (optional material):
F_{X|Y}(x | y) = lim_{ε→0⁺} P(X ≤ x | y < Y ≤ y + ε)
= lim_{ε→0⁺} P(X ≤ x, y < Y ≤ y + ε)/P(y < Y ≤ y + ε)
= lim_{ε→0⁺} [∫_{−∞}^{x} dt ∫_y^{y+ε} du f(t, u)] / [∫_y^{y+ε} du f_Y(u)]
= ∫_{−∞}^{x} dt lim_{ε→0⁺} [∫_y^{y+ε} du f(t, u) / ∫_y^{y+ε} du f_Y(u)]
= ∫_{−∞}^{x} dt lim_{ε→0⁺} [(ε f(t, y) + O(ε²)) / (ε f_Y(y) + O(ε²))]
= ∫_{−∞}^{x} dt f(t, y)/f_Y(y)   (9.53)
Taking the partial derivative with respect to x, we finally obtain
f_{X|Y}(x | y) = ∂F_{X|Y}(x | y)/∂x = f(x, y)/f_Y(y)
Example 9.7:
A rope of length L is cut into three pieces in the following way:
- The first piece, of length X, is obtained by cutting the rope at random.
- The second piece, of length Y, is obtained by cutting the remaining segment
of length L − X at random.
- The third piece is the remaining segment, of length L − X − Y.
(a) Find f_{Y|X}(y | x), the conditional PDF of Y given X = x (0 < x < L).
(b) Find f(x, y), the joint PDF of X and Y, and illustrate the region of the plane
where it takes on non-zero values.
(c) What is the probability that both X and Y are less than L/2?
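Whatever closed form you obtain in (c), a Monte Carlo sketch offers a sanity
check; conditional on X = x, the second cut Y is uniform on (0, L − x), which is
exactly how the simulation draws it. (If our algebra is right, the estimate
should approach (ln 2)/2 ≈ 0.347.)

```python
import numpy as np

rng = np.random.default_rng(2)
L, n = 1.0, 10**6
x = rng.uniform(0, L, size=n)          # first cut, uniform on (0, L)
y = rng.uniform(0, L - x)              # second cut, uniform on (0, L - x)

print(((x < L/2) & (y < L/2)).mean())  # estimate of P(X < L/2, Y < L/2)
```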
9.5 Independent RVs
Definition: We say that RVs X and Y are independent if the events {X ∈ A}
and {Y ∈ B} are independent for any real number subsets A and B, that is:
P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B)   (9.54)
for any A ⊆ R and B ⊆ R.
Link with joint CDF:
- Let F(x, y) denote the joint CDF of RVs X and Y. If X and Y are
independent, we have
F(x, y) = P(X ≤ x, Y ≤ y)
= P(X ≤ x)P(Y ≤ y)
= F_X(x)F_Y(y)   (9.55)
- Conversely, it can be shown that if F(x, y) = F_X(x)F_Y(y) for all (x, y) ∈
R², then X and Y are independent.
Theorem 9.6: X and Y are independent if and only if
F(x, y) = F_X(x)F_Y(y), all (x, y) ∈ R²   (9.56)
9.5.1 Discrete case
Theorem 9.7: Let X and Y be discrete RVs with joint PMF p(x, y). X and
Y are independent if and only if
p(x, y) = p_X(x) p_Y(y), all (x, y) ∈ R²   (9.57)
Proof: Suppose that X and Y are independent. Then,
p(x, y) = P(X = x, Y = y)
= P(X = x)P(Y = y)
= p_X(x)p_Y(y)
Conversely, suppose that (9.57) is satisfied. Let R_X = {x₁, x₂, ...} and R_Y =
{y₁, y₂, ...} denote the sets of possible values of X and Y, respectively. Then,
for any real number subsets A and B, we have
P(X ∈ A, Y ∈ B) = ∑_{x_i∈A} ∑_{y_j∈B} p(x_i, y_j)
= ∑_{x_i∈A} p_X(x_i) ∑_{y_j∈B} p_Y(y_j)
= P(X ∈ A)P(Y ∈ B)
Example 9.8:
Consider 20 independent flips of a fair coin. What is the probability of 6 heads
in the first 10 flips and 4 heads in the next 10 flips?
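Since the two halves involve disjoint sets of flips, the head counts are
independent binomial RVs, and Theorem 9.7 factorizes the joint probability into
a product of two binomial PMFs. A one-line check with scipy:

```python
from scipy.stats import binom

# Independent halves: P = P(6 heads in 10) * P(4 heads in 10)
p = binom.pmf(6, 10, 0.5) * binom.pmf(4, 10, 0.5)
print(p)   # ~0.0421
```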
9.5.2 Continuous case
Theorem 9.8: Let X and Y be continuous RVs with joint PDF f(x, y). X
and Y are independent if and only if
f(x, y) = f_X(x)f_Y(y), all (x, y) ∈ R²   (9.58)
Example 9.9:
Suppose X and Y are independent RVs, each being exponentially distributed
with parameter λ = 1. Find P(Y > X + 1).
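By independence (Theorem 9.8) the joint PDF is f(x, y) = e^{−x}e^{−y} on the
positive quadrant, and integrating over {y > x + 1} gives e^{−1}/2 ≈ 0.1839.
A Monte Carlo sketch confirming this value:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10**6
x = rng.exponential(1.0, size=n)   # scale 1/lambda = 1
y = rng.exponential(1.0, size=n)

print((y > x + 1).mean())          # ~0.1839 = exp(-1)/2
```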
9.5.3 Miscellaneous results
Theorem 9.9: If RVs X and Y are independent, so are U = g(X) and V =
h(Y), for any functions g : R → R and h : R → R.
Application: Suppose that X and Y are independent RVs. Then so are the RVs
sin(X²) and e^{Y−1}.
Theorem 9.10: If X and Y are independent, then F_{X|Y}(x | y) = F_X(x). Fur-
thermore, if X and Y are jointly discrete, then p_{X|Y}(x | y) = p_X(x), while if
they are jointly continuous, then f_{X|Y}(x | y) = f_X(x).
Theorem 9.11: Let X and Y be jointly normal, as defined in (9.28)-(9.29).
Then X and Y are independent if and only if ρ = 0.
Proof: If ρ = 0 in (9.28)-(9.29), we immediately obtain
f(x, y) = [1/(√(2π) σ_X) e^{−(x−μ_X)²/2σ_X²}] × [1/(√(2π) σ_Y) e^{−(y−μ_Y)²/2σ_Y²}]
= f_X(x) f_Y(y)   (9.59)
where the result of Theorem 9.4 has been used. Conversely, it can be shown
that if f(x, y) in (9.28) is equal to the product of f_X(x) and f_Y(y) in (9.31),
then we must have ρ = 0.
9.6 Transformation of joint RVs
Introduction:
- Let X and Y be jointly continuous RVs with known PDF f(x, y).
- In applications, we are often interested in evaluating the distribution of
one or more RVs defined as a function of X and Y, as in h(X, Y).
- Here, we distinguish two relevant cases:
  - h : R² → R
  - h : R² → R²
- In each case, we present a technique that can be used to determine the
PDF of the transformed variables.
9.6.1 Transformation from R² → R
Problem formulation:
- Let Z = h(X, Y), where h : R² → R.
- We seek the PDF of RV Z, say g(z).
Method of distribution:
- For each z ∈ R, find the domain D_z ⊆ R² such that
Z ≤ z ⟺ (X, Y) ∈ D_z   (9.60)
- Express the CDF of Z as:
G(z) = P(Z ≤ z)
= P((X, Y) ∈ D_z) = ∬_{D_z} f(x, y) dx dy   (9.61)
- Find the PDF by taking the derivative of G(z):
g(z) = dG(z)/dz   (9.62)
Example 9.10:
Let X and Y be uniformly distributed over the square (0, 1)² ⊆ R². Find the
PDF of Z = X + Y.
Theorem 9.12: Let X and Y be independent RVs with marginal PDFs f_X(x)
and f_Y(y), respectively. The PDF of Z = X + Y is given by
g(z) = ∫_{−∞}^{∞} f_X(x)f_Y(z − x) dx   (9.63)
Remarks:
- That is, the PDF of Z is obtained as the convolution of the marginal
PDFs of X and Y.
- The proof is left as an exercise.
- Please note that the previous example is a special case of this theorem.
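A numerical rendering of (9.63) for Example 9.10: discretize the two U(0, 1)
densities on a grid and convolve; the result is the triangular PDF on (0, 2)
with peak value 1 at z = 1. (np.convolve approximates the convolution integral
once scaled by the grid step dx.)

```python
import numpy as np

dx = 0.001
x = np.arange(0, 1, dx)
f_X = np.ones_like(x)              # U(0,1) density on its support
f_Y = np.ones_like(x)

g = np.convolve(f_X, f_Y) * dx     # Eq. (9.63) on a grid
z = np.arange(len(g)) * dx
print(g[np.searchsorted(z, 1.0)])  # ~1.0: the triangular peak at z = 1
print(g.sum() * dx)                # ~1.0: g is properly normalized
```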
9.6.2 Transformation from R² → R²
Introduction:
- We consider the transformation (U, V) = h(X, Y), where h : R² → R².
- We seek the joint PDF of RVs U and V, say g(u, v).
- The proposed approach is based on the following theorem, which pro-
vides a generalization of the method of transformation in Section 7.2.2.
Theorem 9.13: For every (u, v) ∈ R², let x_i ≡ x_i(u, v) and y_i ≡ y_i(u, v)
(i = 1, 2, ...) denote the distinct roots of the equation (u, v) = h(x, y). The
joint PDF of U and V may be expressed as
g(u, v) = ∑_i f(x_i, y_i) |J_i|   (9.64)
where
J_i = det [ ∂x_i/∂u  ∂x_i/∂v ; ∂y_i/∂u  ∂y_i/∂v ]
= (∂x_i/∂u)(∂y_i/∂v) − (∂y_i/∂u)(∂x_i/∂v)   (9.65)
Remarks:
- If the equation (u, v) = h(x, y) has no root, then set g(u, v) = 0.
- In (9.64), x_i and y_i really stand for x_i(u, v) and y_i(u, v), respectively.
- The determinant J_i in (9.65) is the so-called Jacobian of the inverse
transformation (u, v) → (x_i, y_i).
Example 9.11:
Assume X and Y are continuous with joint PDF f(x, y). Let U = X + Y and
V = X − Y.
(a) Find the joint PDF g(u, v) of U and V.
(b) In the special case when X and Y are independent, find the marginal PDF
of U, say g_U(u).
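For part (a), the equation (u, v) = (x + y, x − y) has the single root
x₁ = (u + v)/2, y₁ = (u − v)/2, with Jacobian J₁ = (1/2)(−1/2) − (1/2)(1/2) = −1/2,
so the theorem gives g(u, v) = (1/2) f((u + v)/2, (u − v)/2). The sketch below
checks a consequence of this for the assumed special case of independent
standard normal X and Y: U = X + Y should then be N(0, 2).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n = 10**6
x = rng.standard_normal(n)
y = rng.standard_normal(n)
u = x + y                               # U = X + Y; V = X - Y analogous

# Compare the empirical density of U near u = 0 with the N(0, 2) density.
hist, edges = np.histogram(u, bins=200, density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
i = np.argmin(np.abs(mid))
print(hist[i], norm.pdf(0, scale=np.sqrt(2)))   # both ~0.282
```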
Figure 9.2: The bivariate normal PDF: surface plots of f(x, y) with matching
contour plots, for (σ_X = 1, σ_Y = 1, ρ = 0), (σ_X = 1, σ_Y = 2, ρ = 0) and
(σ_X = 1, σ_Y = 1, ρ = 0.5).