
STAT/MTHE 353: Probability II
STAT/MTHE 353: Multiple Random Variables

T. Linder
Queen's University
Winter 2012

Administrative details

Instructor: Tamas Linder
Email: linder@mast.queensu.ca
Office: Jeffery 401
Phone: 613-533-2417
Office hours: Wednesday 2:30–3:30 pm
Class web site: http://www.mast.queensu.ca/~stat353

All homework and solutions will be posted here. Check frequently for new announcements.


Text: Fundamentals of Probability with Stochastic Processes, 3rd ed., by S. Ghahramani, Prentice Hall.

Lecture slides will be posted on the class web site. The slides are not self-contained; they only cover parts of the material.

Homework: 9 HW assignments, due Friday before noon in my mailbox (Jeffery 401). No late homework will be accepted!

Evaluation: the better of
  Homework 20%, midterm 20%, final exam 60%
  Homework 20%, final exam 80%

Midterm Exam: Thursday, February 16, in class (9:30–10:30 am)

Review

S is the sample space.

P is a probability measure on S: P is a function from a collection of subsets of S (called the events) to [0, 1]; P satisfies the axioms of probability.

A random variable is a function X : S → ℝ. The distribution of X is the probability measure associated with X:

    P(X ∈ A) = P({s : X(s) ∈ A}), for any "reasonable" A ⊂ ℝ.

Here are the usual ways to describe the distribution of X:

Distribution function: F_X : ℝ → [0, 1] defined by

    F_X(x) = P(X ≤ x).

It is always well defined.

Probability mass function, or pmf: If X is a discrete random variable, then its pmf p_X : ℝ → [0, 1] is

    p_X(x) = P(X = x), for all x ∈ ℝ.

Note: since X is discrete, there is a countable set 𝒳 ⊂ ℝ such that p_X(x) = 0 if x ∉ 𝒳.

Probability density function, or pdf: If X is a continuous random variable, then its pdf f_X : ℝ → [0, ∞) is a function such that

    P(X ∈ A) = ∫_A f_X(x) dx,  for all "reasonable" A ⊂ ℝ.

Joint Distributions

If X_1, ..., X_n are random variables (defined on the same probability space), we can think of

    X = (X_1, ..., X_n)^T

as a random vector. (In this course (x_1, ..., x_n) is a row vector and its transpose, (x_1, ..., x_n)^T, is a column vector.) Thus X is a function X : S → ℝ^n.

Distribution of X: For "reasonable" A ⊂ ℝ^n, we define

    P(X ∈ A) = P({s : X(s) ∈ A}).

X is called a random vector or vector random variable.

We usually describe the distribution of X by a function on ℝ^n:

Joint cumulative distribution function (jcdf): the function defined for x = (x_1, ..., x_n) ∈ ℝ^n by

    F_X(x) = F_{X_1,...,X_n}(x_1, ..., x_n)
           = P(X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n)
           = P({X_1 ≤ x_1} ∩ {X_2 ≤ x_2} ∩ ... ∩ {X_n ≤ x_n})
           = P(X ∈ ∏_{i=1}^{n} (−∞, x_i])

If X_1, ..., X_n are all discrete random variables, then their joint probability mass function (jpmf) is

    p_X(x) = P(X = x) = P(X_1 = x_1, X_2 = x_2, ..., X_n = x_n),  x ∈ ℝ^n

The finite or countable set of x values such that p_X(x) > 0 is called the support of the distribution of X.

Properties of the joint pmf:

(1) 0 ≤ p_X(x) ≤ 1 for all x ∈ ℝ^n.

(2) ∑_{x ∈ 𝒳} p_X(x) = 1, where 𝒳 is the support of X.

If X_1, ..., X_n are continuous random variables and there exists f_X : ℝ^n → [0, ∞) such that for any "reasonable" A ⊂ ℝ^n,

    P(X ∈ A) = ∫ ... ∫_A f_X(x_1, ..., x_n) dx_1 ... dx_n

then

  - X_1, ..., X_n are called jointly continuous;
  - f_X(x) = f_{X_1,...,X_n}(x_1, ..., x_n) is called the joint probability density function (jpdf) of X.
Properties of the joint pdf:

(1) f_X(x) ≥ 0 for all x ∈ ℝ^n.

(2) ∫_{ℝ^n} f_X(x) dx = ∫ ... ∫_{ℝ^n} f_X(x_1, ..., x_n) dx_1 ... dx_n = 1.

Comments:

(a) The joint pdf can be redefined on any set in ℝ^n that has zero volume. This will not change the distribution of X.

(b) The joint pdf may not exist even when X_1, ..., X_n are all (individually) continuous random variables.

Example: ...

The distributions of the various subsets of {X_1, ..., X_n} can be recovered from the joint distribution. These distributions are called the joint marginal distributions (here "marginal" is relative to the full set {X_1, ..., X_n}).

Marginal joint probability mass functions

Assume X_1, ..., X_n are discrete. Let 0 < k < n and

    {i_1, ..., i_k} ⊂ {1, ..., n}.

Then the marginal joint pmf of (X_{i_1}, ..., X_{i_k}) can be obtained from p_X(x) = p_{X_1,...,X_n}(x_1, ..., x_n) as

    p_{X_{i_1},...,X_{i_k}}(x_{i_1}, ..., x_{i_k})
      = P(X_{i_1} = x_{i_1}, ..., X_{i_k} = x_{i_k})
      = P(X_{i_1} = x_{i_1}, ..., X_{i_k} = x_{i_k}, X_{j_1} ∈ ℝ, ..., X_{j_{n−k}} ∈ ℝ)
      = ∑_{x_{j_1}} ... ∑_{x_{j_{n−k}}} p_{X_1,...,X_n}(x_1, ..., x_n)

where {j_1, ..., j_{n−k}} = {1, ..., n} \ {i_1, ..., i_k}.

Thus the joint pmf of X_{i_1}, ..., X_{i_k} is obtained by summing p_{X_1,...,X_n} over all possible values of the complementary variables x_{j_1}, ..., x_{j_{n−k}}.

Example: In an urn there are n_i objects of type i for i = 1, ..., r. The total number of objects is n_1 + ... + n_r = N. We randomly draw n objects (n ≤ N) without replacement. Let X_i = the number of objects of type i drawn. Find the joint pmf of (X_1, ..., X_r). Also find the marginal distribution of each X_i, i = 1, ..., r.

Solution: ...
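The worked solution is left for lecture. As an unofficial numerical check, the joint pmf should be p(x_1, ..., x_r) = ∏_i C(n_i, x_i) / C(N, n) (multivariate hypergeometric), with each X_i marginally hypergeometric. A short simulation sketch with made-up counts:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)

# Hypothetical counts: r = 3 types (made-up values).
counts = [3, 4, 5]          # n_1, n_2, n_3
N, n = sum(counts), 6       # total objects, number drawn

# Candidate joint pmf: p(x) = prod_i C(n_i, x_i) / C(N, n) when sum_i x_i = n.
def joint_pmf(x):
    return np.prod([comb(m, k) for m, k in zip(counts, x)]) / comb(N, n)

# Empirical check by simulating draws without replacement.
urn = np.repeat(np.arange(3), counts)
trials = 100_000
draws = np.array([np.bincount(rng.choice(urn, size=n, replace=False),
                              minlength=3) for _ in range(trials)])

x = (2, 2, 2)
emp = np.mean((draws == x).all(axis=1))
print(f"joint:    empirical {emp:.4f} vs formula {joint_pmf(x):.4f}")

# Marginal of X_1 should match the univariate hypergeometric pmf.
emp_marg = np.mean(draws[:, 0] == 2)
hyp = comb(counts[0], 2) * comb(N - counts[0], n - 2) / comb(N, n)
print(f"marginal: empirical {emp_marg:.4f} vs hypergeometric {hyp:.4f}")
```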

Marginal joint probability density functions

Let X_1, ..., X_n be jointly continuous with pdf f_X = f. As before, let

    {i_1, ..., i_k} ⊂ {1, ..., n},  {j_1, ..., j_{n−k}} = {1, ..., n} \ {i_1, ..., i_k}.

Let B ⊂ ℝ^k. Then

    P((X_{i_1}, ..., X_{i_k}) ∈ B)
      = P((X_{i_1}, ..., X_{i_k}) ∈ B, X_{j_1} ∈ ℝ, ..., X_{j_{n−k}} ∈ ℝ)
      = ∫ ... ∫_B ( ∫ ... ∫_{ℝ^{n−k}} f(x_1, ..., x_n) dx_{j_1} ... dx_{j_{n−k}} ) dx_{i_1} ... dx_{i_k}

That is, we "integrate out" the variables complementary to x_{i_1}, ..., x_{i_k}.

In conclusion, for {i_1, ..., i_k} ⊂ {1, ..., n},

    f_{X_{i_1},...,X_{i_k}}(x_{i_1}, ..., x_{i_k}) = ∫ ... ∫_{ℝ^{n−k}} f(x_1, ..., x_n) dx_{j_1} ... dx_{j_{n−k}}

where {j_1, ..., j_{n−k}} = {1, ..., n} \ {i_1, ..., i_k}.

Note: In both the discrete and continuous cases it is important to always know where the joint pmf p and joint pdf f are zero and where they are positive. The latter set is called the support of p or f.

Example: Suppose X_1, X_2, X_3 are jointly continuous with jpdf

    f(x_1, x_2, x_3) = 1 if 0 ≤ x_i ≤ 1 for i = 1, 2, 3, and 0 otherwise.

Find the marginal pdfs of X_i, i = 1, 2, 3, and the marginal jpdfs of (X_i, X_j), i ≠ j.

Solution: ...

Example: With X_1, X_2, X_3 as in the previous problem, consider the quadratic equation

    X_1 y^2 + X_2 y + X_3 = 0

in the variable y. Find the probability that both roots are real.

Solution: ...
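Both solutions are worked in lecture. As an unofficial check, both roots are real exactly when the discriminant X_2^2 − 4 X_1 X_3 is nonnegative, and the probability can be estimated by Monte Carlo; the exact value, obtained by integrating the joint pdf over {x_2^2 ≥ 4 x_1 x_3}, should come out to (5 + 6 ln 2)/36 ≈ 0.2544. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# X1, X2, X3 independent Uniform(0,1) -- the joint pdf above.
x1, x2, x3 = rng.random((3, n))

# Both roots of X1*y^2 + X2*y + X3 = 0 are real iff the
# discriminant X2^2 - 4*X1*X3 is nonnegative.
p_hat = np.mean(x2**2 >= 4 * x1 * x3)
print(f"Monte Carlo estimate: {p_hat:.4f}")   # ~0.2544 = (5 + 6 ln 2)/36
```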

Marginal joint cumulative distribution functions

In all cases (discrete, continuous, or mixed),

    F_{X_{i_1},...,X_{i_k}}(x_{i_1}, ..., x_{i_k})
      = P(X_{i_1} ≤ x_{i_1}, ..., X_{i_k} ≤ x_{i_k})
      = P(X_{i_1} ≤ x_{i_1}, ..., X_{i_k} ≤ x_{i_k}, X_{j_1} < ∞, ..., X_{j_{n−k}} < ∞)
      = lim_{x_{j_1}→∞} ... lim_{x_{j_{n−k}}→∞} F_{X_1,...,X_n}(x_1, ..., x_n)

That is, we let the variables complementary to x_{i_1}, ..., x_{i_k} converge to ∞.
Independence

Definition: The random variables X_1, ..., X_n are independent if for all "reasonable" A_1, ..., A_n ⊂ ℝ,

    P(X_1 ∈ A_1, ..., X_n ∈ A_n) = P(X_1 ∈ A_1) × ... × P(X_n ∈ A_n)

Remarks:

(i) Independence among X_1, ..., X_n usually arises in a probability model by assumption. Such an assumption is reasonable if the outcome of X_i "has no effect" on the outcomes of the other X_j's.

(ii) The definition also applies to any n random quantities X_1, ..., X_n. E.g., each X_i can itself be a vector r.v. In this case the A_i's have to be appropriately modified.

(iii) Suppose g_i : ℝ → ℝ, i = 1, ..., n are "reasonable" functions. If X_1, ..., X_n are independent, then so are g_1(X_1), ..., g_n(X_n).

Proof: For A_1, ..., A_n ⊂ ℝ,

    P(g_1(X_1) ∈ A_1, ..., g_n(X_n) ∈ A_n)
      = P(X_1 ∈ g_1^{−1}(A_1), ..., X_n ∈ g_n^{−1}(A_n))   where g_i^{−1}(A_i) = {x_i : g_i(x_i) ∈ A_i}
      = P(X_1 ∈ g_1^{−1}(A_1)) × ... × P(X_n ∈ g_n^{−1}(A_n))
      = P(g_1(X_1) ∈ A_1) × ... × P(g_n(X_n) ∈ A_n)

Since the sets A_i were arbitrary, we obtain that g_1(X_1), ..., g_n(X_n) are independent. □

Note: If we only know that X_i and X_j are independent for all i ≠ j, it does not follow that X_1, ..., X_n are independent; an example is sketched below.
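A standard counterexample illustrating the note (not from the slides): let X_1, X_2 be independent fair bits and X_3 = X_1 ⊕ X_2. Any two of the three are independent, but the triple is not. A quick enumeration confirms this:

```python
from itertools import product

# Uniform distribution over the four outcomes of two fair bits;
# x3 = x1 XOR x2 makes the triple pairwise but not mutually independent.
outcomes = [(x1, x2, x1 ^ x2) for x1, x2 in product((0, 1), repeat=2)]

def prob(event):
    """P(event) under the uniform distribution on `outcomes`."""
    return sum(1 for o in outcomes if event(o)) / len(outcomes)

# Pairwise: P(Xi = a, Xj = b) = P(Xi = a) P(Xj = b) for every pair.
for i, j in [(0, 1), (0, 2), (1, 2)]:
    for a, b in product((0, 1), repeat=2):
        joint = prob(lambda o: o[i] == a and o[j] == b)
        assert joint == prob(lambda o: o[i] == a) * prob(lambda o: o[j] == b)

# Not mutually independent: P(X1=1, X2=1, X3=1) = 0, not 1/8.
print(prob(lambda o: o == (1, 1, 1)))   # 0.0
print(prob(lambda o: o[0] == 1) * prob(lambda o: o[1] == 1)
      * prob(lambda o: o[2] == 1))      # 0.125
```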


Independence and cdf, pmf, pdf

Theorem 1: Let F be the joint cdf of the random variables X_1, ..., X_n. Then X_1, ..., X_n are independent if and only if F is the product of the marginal cdfs of the X_i, i.e., for all (x_1, ..., x_n) ∈ ℝ^n,

    F(x_1, ..., x_n) = F_{X_1}(x_1) F_{X_2}(x_2) ... F_{X_n}(x_n)

Proof: If X_1, ..., X_n are independent, then

    F(x_1, ..., x_n) = P(X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n)
                     = P(X_1 ≤ x_1) P(X_2 ≤ x_2) ... P(X_n ≤ x_n)
                     = F_{X_1}(x_1) F_{X_2}(x_2) ... F_{X_n}(x_n)

The converse, that F(x_1, ..., x_n) = ∏_{i=1}^{n} F_{X_i}(x_i) for all (x_1, ..., x_n) ∈ ℝ^n implies independence, is beyond the scope of this class. □

Theorem 2: Let X_1, ..., X_n be discrete r.v.'s with joint pmf p. Then X_1, ..., X_n are independent if and only if p is the product of the marginal pmfs of the X_i, i.e., for all (x_1, ..., x_n) ∈ ℝ^n,

    p(x_1, ..., x_n) = p_{X_1}(x_1) p_{X_2}(x_2) ... p_{X_n}(x_n)

Proof: If X_1, ..., X_n are independent, then

    p(x_1, ..., x_n) = P(X_1 = x_1, X_2 = x_2, ..., X_n = x_n)
                     = P(X_1 = x_1) P(X_2 = x_2) ... P(X_n = x_n)
                     = p_{X_1}(x_1) p_{X_2}(x_2) ... p_{X_n}(x_n)
Proof cont'd: Conversely, suppose that p(x_1, ..., x_n) = ∏_{i=1}^{n} p_{X_i}(x_i) for all x_1, ..., x_n. Then, for any A_1, A_2, ..., A_n ⊂ ℝ,

    P(X_1 ∈ A_1, ..., X_n ∈ A_n) = ∑_{x_1 ∈ A_1} ... ∑_{x_n ∈ A_n} p(x_1, ..., x_n)
      = ∑_{x_1 ∈ A_1} ... ∑_{x_n ∈ A_n} p_{X_1}(x_1) ... p_{X_n}(x_n)
      = (∑_{x_1 ∈ A_1} p_{X_1}(x_1)) (∑_{x_2 ∈ A_2} p_{X_2}(x_2)) ... (∑_{x_n ∈ A_n} p_{X_n}(x_n))
      = P(X_1 ∈ A_1) P(X_2 ∈ A_2) ... P(X_n ∈ A_n)

Thus X_1, ..., X_n are independent. □

Theorem 3: Let X_1, ..., X_n be jointly continuous r.v.'s with joint pdf f. Then X_1, ..., X_n are independent if and only if f is the product of the marginal pdfs of the X_i, i.e., for all (x_1, ..., x_n) ∈ ℝ^n,

    f(x_1, ..., x_n) = f_{X_1}(x_1) f_{X_2}(x_2) ... f_{X_n}(x_n).

Proof: Assume f(x_1, ..., x_n) = ∏_{i=1}^{n} f_{X_i}(x_i) for all x_1, ..., x_n. Then for any A_1, A_2, ..., A_n ⊂ ℝ,

    P(X_1 ∈ A_1, ..., X_n ∈ A_n) = ∫_{A_1} ... ∫_{A_n} f(x_1, ..., x_n) dx_1 ... dx_n
      = ∫_{A_1} ... ∫_{A_n} f_{X_1}(x_1) ... f_{X_n}(x_n) dx_1 ... dx_n
      = (∫_{A_1} f_{X_1}(x_1) dx_1) ... (∫_{A_n} f_{X_n}(x_n) dx_n)
      = P(X_1 ∈ A_1) P(X_2 ∈ A_2) ... P(X_n ∈ A_n)

so X_1, ..., X_n are independent.

Proof cont'd: For the converse, note that

    F(x_1, ..., x_n) = P(X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n)
                     = ∫_{−∞}^{x_1} ... ∫_{−∞}^{x_n} f(t_1, ..., t_n) dt_1 ... dt_n

By the fundamental theorem of calculus,

    ∂^n/(∂x_1 ... ∂x_n) F(x_1, ..., x_n) = f(x_1, ..., x_n)

(assuming f is "nice enough").

If X_1, ..., X_n are independent, then F(x_1, ..., x_n) = ∏_{i=1}^{n} F_{X_i}(x_i). Thus

    f(x_1, ..., x_n) = ∂^n/(∂x_1 ... ∂x_n) F(x_1, ..., x_n)
                     = ∂^n/(∂x_1 ... ∂x_n) F_{X_1}(x_1) ... F_{X_n}(x_n)
                     = f_{X_1}(x_1) ... f_{X_n}(x_n) □

Example: ...
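To illustrate the mixed-partial identity on a concrete case of my own choosing: for two independent Exponential(1) variables, differentiating the product cdf symbolically recovers the product pdf.

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)

# Joint cdf of two independent Exponential(1) r.v.'s: F = F_X(x) F_Y(y).
F = (1 - sp.exp(-x)) * (1 - sp.exp(-y))

# d^2 F / (dx dy) should equal the joint pdf f_X(x) f_Y(y) = e^{-x} e^{-y}.
f = sp.diff(F, x, y)
print(sp.simplify(f - sp.exp(-x) * sp.exp(-y)))   # 0
```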

Expectations Involving Multiple Random Variables

Recall that the expectation of a random variable X is

    E(X) = ∑_x x p(x)              if X is discrete
    E(X) = ∫_{−∞}^{∞} x f(x) dx    if X is continuous

if the sum or the integral exists in the sense that ∑_x |x| p(x) < ∞ or ∫_{−∞}^{∞} |x| f(x) dx < ∞.

Example: ...

If X = (X_1, ..., X_n)^T is a random vector, we sometimes use the notation

    E(X) = (E(X_1), ..., E(X_n))^T

For X_1, ..., X_n discrete, we still have E(X) = ∑_x x p(x) with the understanding that

    ∑_x x p(x) = ∑_{(x_1,...,x_n)} (x_1, ..., x_n)^T p(x_1, ..., x_n)
               = (∑_{x_1} x_1 p_{X_1}(x_1), ∑_{x_2} x_2 p_{X_2}(x_2), ..., ∑_{x_n} x_n p_{X_n}(x_n))^T
               = (E(X_1), ..., E(X_n))^T

Similarly, for jointly continuous X_1, ..., X_n,

    E(X) = ∫_{ℝ^n} x f(x) dx = ∫_{ℝ^n} (x_1, ..., x_n)^T f(x_1, ..., x_n) dx_1 ... dx_n
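A small numeric illustration with a made-up joint pmf: computing E(X) cell by cell from the joint pmf agrees with the vector of marginal means, as the display above says it should.

```python
import numpy as np

# Made-up joint pmf of (X1, X2) on {0,1,2} x {0,1}; rows index x1.
p = np.array([[0.10, 0.15],
              [0.20, 0.25],
              [0.05, 0.25]])
assert np.isclose(p.sum(), 1.0)

x1_vals = np.array([0, 1, 2])
x2_vals = np.array([0, 1])

# E(X) = sum_x x p(x), computed component by component over all cells.
E_direct = np.array([sum(x1 * p[i, j] for i, x1 in enumerate(x1_vals)
                                      for j in range(2)),
                     sum(x2 * p[i, j] for i in range(3)
                                      for j, x2 in enumerate(x2_vals))])

# The same vector via the marginal pmfs.
E_marginal = np.array([x1_vals @ p.sum(axis=1), x2_vals @ p.sum(axis=0)])
print(E_direct, E_marginal)   # identical vectors
```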


Theorem 4 ("Law of the unconscious statistician"): Suppose Y = g(X) for some function g : ℝ^n → ℝ^k. Then

    E(Y) = ∑_x g(x) p(x)          if X_1, ..., X_n are discrete
    E(Y) = ∫_{ℝ^n} g(x) f(x) dx   if X_1, ..., X_n are jointly continuous

Proof: We only prove the discrete case. Since X = (X_1, ..., X_n) can only take a countable number of values with positive probability, the same is true for

    (Y_1, ..., Y_k)^T = Y = g(X)

so Y_1, ..., Y_k are discrete random variables.

Proof cont'd: Thus

    E(Y) = ∑_y y P(Y = y) = ∑_y y P(g(X) = y)
         = ∑_y y ∑_{x : g(x) = y} P(X = x)
         = ∑_y ∑_{x : g(x) = y} g(x) P(X = x)
         = ∑_x g(x) P(X = x)
         = ∑_x g(x) p(x) □

Example: Linearity of expectation...

    E(a_0 + a_1 X_1 + ... + a_n X_n) = a_0 + a_1 E(X_1) + ... + a_n E(X_n)
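A quick check of the discrete case on a made-up joint pmf, with g(x_1, x_2) = x_1 x_2: summing g(x) p(x) over the support matches a brute-force simulation average, with no need to first find the pmf of g(X).

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up joint pmf of (X1, X2) and the function g(x1, x2) = x1 * x2.
support = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
probs = np.array([0.1, 0.2, 0.3, 0.25, 0.15])

def g(x):
    return x[0] * x[1]

# LOTUS: E g(X) = sum_x g(x) p(x).
lotus = sum(g(x) * p for x, p in zip(support, probs))

# Simulation cross-check.
idx = rng.choice(len(support), size=200_000, p=probs)
sim = np.mean([g(support[i]) for i in idx])
print(f"LOTUS: {lotus:.4f}, simulation: {sim:.4f}")
```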

Example: Expected value of a binomial random variable...

Example: Suppose we have n bar magnets, each having negative polarity at one end and positive polarity at the other end. Line up the magnets end-to-end in such a way that the orientation of each magnet is random (the two choices are equally likely, independently of the others). On average, how many segments of magnets that stick together do we obtain?

Solution: ...
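The solution is worked in lecture. One way to see the answer: under the orientation coding assumed in the sketch below, two adjacent magnets stick exactly when their facing poles are opposite, so each of the n − 1 junctions independently breaks the chain with probability 1/2, suggesting 1 + (n − 1)/2 = (n + 1)/2 segments on average. A simulation check with a made-up n:

```python
import numpy as np

rng = np.random.default_rng(3)

n, trials = 10, 100_000
o = rng.integers(0, 2, size=(trials, n))

# Orientation 0 = (-,+), 1 = (+,-). Magnets i and i+1 stick iff the
# facing poles are opposite, which under this coding happens iff
# o[i] == o[i+1]; each mismatch between neighbours starts a new segment.
num_segments = 1 + (o[:, :-1] != o[:, 1:]).sum(axis=1)
print(f"simulated mean: {num_segments.mean():.3f}, (n+1)/2 = {(n + 1) / 2}")
```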
Transformation of Multiple Random Variables

Suppose X_1, ..., X_n are jointly continuous with joint pdf f(x_1, ..., x_n).

Let h : ℝ^n → ℝ^n be a continuously differentiable and one-to-one (invertible) function whose inverse g is also continuously differentiable. Thus h is given by (x_1, ..., x_n) ↦ (y_1, ..., y_n), where

    y_1 = h_1(x_1, ..., x_n), y_2 = h_2(x_1, ..., x_n), ..., y_n = h_n(x_1, ..., x_n)

We want to find the joint pdf of the vector Y = (Y_1, ..., Y_n)^T, where

    Y_1 = h_1(X_1, ..., X_n)
    Y_2 = h_2(X_1, ..., X_n)
    ...
    Y_n = h_n(X_1, ..., X_n)

Let g : ℝ^n → ℝ^n denote the inverse of h. Let B ⊂ ℝ^n be a "nice" set. We have

    P((Y_1, ..., Y_n) ∈ B) = P(h(X) ∈ B) = P((X_1, ..., X_n) ∈ A)

where A = {x ∈ ℝ^n : h(x) ∈ B} = h^{−1}(B) = g(B).

The multivariate change of variables formula for x = g(y) implies that

    P((X_1, ..., X_n) ∈ A) = ∫ ... ∫_{g(B)} f(x_1, ..., x_n) dx_1 ... dx_n
      = ∫ ... ∫_B f(g_1(y_1, ..., y_n), ..., g_n(y_1, ..., y_n)) |J_g(y_1, ..., y_n)| dy_1 ... dy_n
      = ∫ ... ∫_B f(g(y)) |J_g(y)| dy

where J_g is the Jacobian of the transformation g.

We have shown that for "nice" B ⊂ ℝ^n,

    P((Y_1, ..., Y_n) ∈ B) = ∫_B f(g(y)) |J_g(y)| dy

This implies the following:

Theorem 5 (Transformation of Multiple Random Variables): Suppose X_1, ..., X_n are jointly continuous with joint pdf f(x_1, ..., x_n). Let h : ℝ^n → ℝ^n be a continuously differentiable and one-to-one function with continuously differentiable inverse g. Then the joint pdf of Y = (Y_1, ..., Y_n)^T = h(X) is

    f_Y(y_1, ..., y_n) = f(g_1(y_1, ..., y_n), ..., g_n(y_1, ..., y_n)) |J_g(y_1, ..., y_n)|
Example: Suppose X = (X_1, ..., X_n)^T has joint pdf f and let Y = AX, where A is an invertible n × n (real) matrix. Find f_Y.

Solution: ...
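The solution is worked in lecture via Theorem 5; since g(y) = A^{−1}y has Jacobian det(A^{−1}), the answer should be f_Y(y) = f(A^{−1}y) / |det A|. A Monte Carlo sketch for a 2 × 2 matrix with made-up entries:

```python
import numpy as np

rng = np.random.default_rng(4)

# Made-up invertible matrix; X uniform on the unit square, so f = 1 there.
A = np.array([[2.0, 1.0],
              [0.5, 1.5]])

X = rng.random((500_000, 2))   # joint pdf f(x) = 1 on [0,1]^2
Y = X @ A.T                    # Y = A X, one sample per row

# Candidate pdf from Theorem 5: f_Y(y) = f(A^{-1} y) / |det A|,
# i.e. 1/|det A| on the image of the unit square for uniform X.
# Estimate the density of Y in a small box around y0 = A x0.
y0 = A @ np.array([0.3, 0.6])
eps = 0.05
in_box = np.all(np.abs(Y - y0) < eps, axis=1)
density_hat = in_box.mean() / (2 * eps) ** 2
print(f"estimated {density_hat:.3f} vs "
      f"1/|det A| = {1 / abs(np.linalg.det(A)):.3f}")
```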
Often we are interested in the pdf of just a single function of X_1, ..., X_n, say Y_1 = h_1(X_1, ..., X_n). Two approaches:

(1) Define Y_i = h_i(X_1, ..., X_n), i = 2, ..., n in such a way that the mapping h = (h_1, ..., h_n) satisfies the conditions of the theorem (h has an inverse g which is continuously differentiable). Then the theorem gives the joint pdf f_Y(y_1, ..., y_n) and we obtain f_{Y_1}(y_1) by "integrating out" y_2, ..., y_n:

    f_{Y_1}(y_1) = ∫ ... ∫_{ℝ^{n−1}} f_Y(y_1, ..., y_n) dy_2 ... dy_n

A common choice is Y_i = X_i, i = 2, ..., n.

(2) Often it is easier to directly compute the cdf of Y_1:

    F_{Y_1}(y) = P(Y_1 ≤ y) = P(h_1(X_1, ..., X_n) ≤ y)
               = P((X_1, ..., X_n) ∈ A_y)   where A_y = {(x_1, ..., x_n) : h_1(x_1, ..., x_n) ≤ y}
               = ∫ ... ∫_{A_y} f(x_1, ..., x_n) dx_1 ... dx_n

Differentiating F_{Y_1} we obtain the pdf of Y_1.

Example: Let X_1, ..., X_n be independent with common distribution Uniform(0, 1). Determine the pdf of Y = min(X_1, ..., X_n).

Solution: ...
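The solution is worked in lecture using approach (2): by independence, P(Y > y) = ∏_i P(X_i > y) = (1 − y)^n for 0 < y < 1, so F_Y(y) = 1 − (1 − y)^n, and differentiating should give f_Y(y) = n(1 − y)^{n−1}. A quick empirical check of the cdf with a made-up n:

```python
import numpy as np

rng = np.random.default_rng(5)

n, trials = 5, 200_000
Y = rng.random((trials, n)).min(axis=1)   # min of n Uniform(0,1) draws

# Compare the empirical cdf of Y with F(y) = 1 - (1 - y)^n at a few
# points; the candidate pdf n(1 - y)^(n - 1) is its derivative.
for y in (0.1, 0.3, 0.5):
    print(f"y={y}: empirical {np.mean(Y <= y):.4f}, "
          f"1-(1-y)^n = {1 - (1 - y) ** n:.4f}")
```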

