
STAT511 Spring 2014 Lecture Notes

Chapter 5
February 19, 2014

Joint Probability Distributions and Random Samples


5.1 Jointly Distributed Random Variables
Chapter Overview
Jointly distributed rv
Joint mass function, marginal mass function for discrete rv
Joint density function, marginal density function for continuous rv
Independent random variables
Expectation, covariance and correlation between two rvs
Expectation
Covariance
Correlation
Interpretations
Statistics and their distributions
Distribution of the sample mean
Distribution of a linear combination

Joint Mass Function of Two Discrete RVs


Definition 1. Let X and Y be two discrete rvs defined on the sample space S of a
random experiment. The joint probability mass function p(x, y) is defined for each
pair of numbers (x, y) by:
\[ p(x, y) = P(X = x \text{ and } Y = y) \]
Let A be any set of (x, y) pairs. Then the probability $P[(X, Y) \in A]$
is obtained by summing the joint pmf over the pairs in A:
\[ P[(X, Y) \in A] = \sum_{(x, y) \in A} p(x, y) \]

Example of Joint PMF


Example 5.1.1 (Exercise 5.3) A market has two checkout lines. Let X be the
number of customers at the express checkout line at a particular time of day. Let Y
denote the number of customers in the super-express line at the same time. The joint
pmf of (X, Y) is given below:

p(x, y)   y = 0   y = 1   y = 2   y = 3
x = 0     0.08    0.07    0.04    0.00
x = 1     0.06    0.15    0.05    0.04
x = 2     0.05    0.04    0.10    0.06
x = 3     0.00    0.03    0.04    0.07
x = 4     0.00    0.01    0.05    0.06

What is P(X = 1, Y = 0)? What is P(X = 1, Y > 2)? What is P(X = Y)?
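The three questions can be checked numerically by summing the joint pmf over the relevant pairs. The following is a minimal Python sketch, not part of the original notes; the names `rows`, `joint`, and `prob` are illustrative.

```python
# Joint pmf of Example 5.1.1: rows[x][y] = p(x, y), x = 0..4, y = 0..3.
rows = [
    [0.08, 0.07, 0.04, 0.00],
    [0.06, 0.15, 0.05, 0.04],
    [0.05, 0.04, 0.10, 0.06],
    [0.00, 0.03, 0.04, 0.07],
    [0.00, 0.01, 0.05, 0.06],
]
joint = {(x, y): p for x, row in enumerate(rows) for y, p in enumerate(row)}


def prob(event):
    """Sum p(x, y) over all pairs (x, y) for which event(x, y) is true."""
    return sum(p for (x, y), p in joint.items() if event(x, y))


p_a = prob(lambda x, y: x == 1 and y == 0)  # P(X = 1, Y = 0)
p_b = prob(lambda x, y: x == 1 and y > 2)   # P(X = 1, Y > 2)
p_c = prob(lambda x, y: x == y)             # P(X = Y)
print(round(p_a, 2), round(p_b, 2), round(p_c, 2))
```

Each probability is the sum of the table cells belonging to the event, exactly as in Definition 1.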

Purdue University Chapter5print.tex; Last Modified: February 19, 2014 (W. Sharabati)
Marginal Probability Mass Function

Definition 2. The marginal probability mass functions of X and Y, denoted $p_X(x)$
and $p_Y(y)$, respectively, are given by
\[ p_X(x) = P(X = x) = \sum_{y} p(x, y) \]
\[ p_Y(y) = P(Y = y) = \sum_{x} p(x, y) \]

Example of Marginal Probability Mass Function


Example 5.1.1 Now let's find the marginal mass functions.

p(x, y)   y = 0   y = 1   y = 2   y = 3   p(x)
x = 0     0.08    0.07    0.04    0.00    0.19
x = 1     0.06    0.15    0.05    0.04    0.30
x = 2     0.05    0.04    0.10    0.06    0.25
x = 3     0.00    0.03    0.04    0.07    0.14
x = 4     0.00    0.01    0.05    0.06    0.12
p(y)      0.19    0.30    0.28    0.23    1.00

What is P(X = 3)? What is P(Y = 2)?
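The marginal pmfs are the row and column sums of the joint table. A short Python sketch (illustrative names, not from the original notes) that reproduces the margins:

```python
# Joint pmf of Example 5.1.1: rows[x][y] = p(x, y), x = 0..4, y = 0..3.
rows = [
    [0.08, 0.07, 0.04, 0.00],
    [0.06, 0.15, 0.05, 0.04],
    [0.05, 0.04, 0.10, 0.06],
    [0.00, 0.03, 0.04, 0.07],
    [0.00, 0.01, 0.05, 0.06],
]

# Marginal of X: sum each row over y. Marginal of Y: sum each column over x.
p_x = [sum(row) for row in rows]
p_y = [sum(col) for col in zip(*rows)]

print([round(p, 2) for p in p_x])
print([round(p, 2) for p in p_y])
```

Here `p_x[3]` is P(X = 3) and `p_y[2]` is P(Y = 2), which match the 0.14 and 0.28 shown in the table margins.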

Joint Probability Density Function of Two Continuous RVs

Definition 3. Let X and Y be continuous rvs. Then f(x, y) is the joint probability
density function for X and Y if for any two-dimensional set A:
\[ P[(X, Y) \in A] = \iint_{A} f(x, y)\,dx\,dy \]

In particular, if A is the two-dimensional rectangle $\{(x, y) : a \le x \le b,\ c \le y \le d\}$, then
\[ P[(X, Y) \in A] = \int_{a}^{b} \int_{c}^{d} f(x, y)\,dy\,dx \]

Geometrically, $P[(X, Y) \in A]$ is the volume under the density surface above A.

Marginal Probability Density Function
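Definition 4. The marginal probability density functions of X and Y, denoted
$f_X(x)$ and $f_Y(y)$, respectively, are given by:
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy, \quad -\infty < x < \infty \]
\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx, \quad -\infty < y < \infty \]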

Independent Random Variables


Definition 5 (Independence between X and Y). Two random variables X and Y are
said to be independent if for every pair of x and y values,
\[ p(x, y) = p_X(x) \cdot p_Y(y) \]
when X and Y are discrete, or
\[ f(x, y) = f_X(x) \cdot f_Y(y) \]
when X and Y are continuous.
If X and Y are independent, we have:
\[ P(a < X < b,\ c < Y < d) = P(a < X < b) \cdot P(c < Y < d) \]
If the condition fails for at least one pair (x, y), then X and Y are dependent.

Example of Independence
Example 5.1.2 X follows an exponential distribution with $\lambda = 2$, Y follows an
exponential distribution with $\lambda = 3$, and X and Y are independent. Find f(x, y).
\[ f(x, y) = f_X(x) \cdot f_Y(y) = 2e^{-2x} \cdot 3e^{-3y} = 6e^{-(2x + 3y)}, \quad x \ge 0,\ y \ge 0 \]
Example 5.1.3 Toss a fair coin and a die. Let X = 1 if the coin shows heads and X = 0 if
it shows tails. Let Y be the outcome of the die. If X and Y are independent, find p(x, y) and
find the probability that the outcome of the die is greater than 3 and the coin shows heads.

p(x, y)   y = 1       y = 2       y = 3       y = 4       y = 5       y = 6
x = 0     1/2 · 1/6   1/2 · 1/6   1/2 · 1/6   1/2 · 1/6   1/2 · 1/6   1/2 · 1/6
x = 1     1/2 · 1/6   1/2 · 1/6   1/2 · 1/6   1/2 · 1/6   1/2 · 1/6   1/2 · 1/6

\[ P(X = 1, Y > 3) = P(X = 1) \cdot P(Y > 3) = \frac{1}{2} \cdot \frac{3}{6} = \frac{1}{4} \]
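As a check on Example 5.1.3, the joint pmf of independent variables can be built as the product of the marginals. This is a small Python sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Marginal pmfs: fair coin (X = 0 tails, X = 1 heads) and a die with six
# equally likely faces, as in the table entries 1/2 * 1/6.
p_coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
p_die = {y: Fraction(1, 6) for y in range(1, 7)}

# Under independence, p(x, y) = pX(x) * pY(y) for every pair.
joint = {(x, y): p_coin[x] * p_die[y] for x in p_coin for y in p_die}

# P(X = 1, Y > 3): sum the joint pmf over the matching cells.
p_event = sum(p for (x, y), p in joint.items() if x == 1 and y > 3)
print(p_event)  # 1/4
```

Every cell equals 1/12, and the event covers three of them, which agrees with P(X = 1) · P(Y > 3).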

Examples Continued
Example 5.1.4 Given the following p(x, y), are X and Y independent?

p(x, y)   y = 0   y = 1   y = 2   y = 3   p(x)
x = 0     0.04    0.03    0.01    0.02    0.1
x = 1     0.08    0.06    0.02    0.04    0.2
x = 2     0.16    0.12    0.04    0.08    0.4
x = 3     0.04    0.03    0.01    0.02    0.1
x = 4     0.08    0.06    0.02    0.04    0.2
p(y)      0.4     0.3     0.1     0.2     1.00
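Independence of two discrete rvs can be checked cell by cell: every p(x, y) must equal the product of its row and column margins. A minimal Python sketch for the table of Example 5.1.4 (the tolerance `1e-9` is an arbitrary choice to absorb floating-point rounding):

```python
# Joint pmf of Example 5.1.4: rows[x][y] = p(x, y), x = 0..4, y = 0..3.
rows = [
    [0.04, 0.03, 0.01, 0.02],
    [0.08, 0.06, 0.02, 0.04],
    [0.16, 0.12, 0.04, 0.08],
    [0.04, 0.03, 0.01, 0.02],
    [0.08, 0.06, 0.02, 0.04],
]
p_x = [sum(row) for row in rows]
p_y = [sum(col) for col in zip(*rows)]

# X and Y are independent iff p(x, y) = pX(x) * pY(y) in every cell.
independent = all(
    abs(rows[i][j] - p_x[i] * p_y[j]) < 1e-9
    for i in range(len(p_x))
    for j in range(len(p_y))
)
print(independent)
```

For this table the check succeeds in all 20 cells, so X and Y are independent.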

Example 5.1.5 Given $f_X(x) = 0.5x$, 0 < x < 2, $f_Y(y) = 3y^2$, 0 < y < 1, and
$f(x, y) = 1.5xy^2$, 0 < x < 2 and 0 < y < 1, are X and Y independent?

\[ f_X(x) \cdot f_Y(y) = 0.5x \cdot 3y^2 = 1.5xy^2 = f(x, y) \]

so X and Y are independent.

More Than Two Random Variables


If $X_1, X_2, \ldots, X_n$ are all discrete random variables, the joint pmf of the variables
is the function
\[ p(x_1, \ldots, x_n) = P(X_1 = x_1, \ldots, X_n = x_n) \]
If the variables are continuous, the joint pdf is the function f such that for any n intervals
$[a_1, b_1], \ldots, [a_n, b_n]$,
\[ P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\,dx_n \cdots dx_1 \]

Independence of More Than Two Random Variables


The random variables $X_1, X_2, \ldots, X_n$ are independent if for every subset
$X_{i_1}, X_{i_2}, \ldots, X_{i_k}$
of the variables, the joint pmf or pdf of the subset is equal to the product of the marginal
pmfs or pdfs.

Conditional Distributions

Definition 6. Let X, Y be two continuous rvs with joint pdf f(x, y) and marginal
pdfs $f_X(x)$ and $f_Y(y)$. Then for any X value x for which $f_X(x) > 0$, the conditional
probability density function of Y given that X = x is:
\[ f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)}, \quad -\infty < y < \infty \]
If X and Y are discrete, replace pdfs by pmfs in this definition. This gives the
conditional probability mass function of Y given X = x.

Example of Conditional Mass


Example 5.1.1 The joint mass function is given below. What is the conditional mass function
of Y, given X = 1?

p(x, y)   y = 0   y = 1   y = 2   y = 3   p(x)
x = 0     0.08    0.07    0.04    0.00    0.19
x = 1     0.06    0.15    0.05    0.04    0.30
x = 2     0.05    0.04    0.10    0.06    0.25
x = 3     0.00    0.03    0.04    0.07    0.14
x = 4     0.00    0.01    0.05    0.06    0.12
p(y)      0.19    0.30    0.28    0.23    1.00
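The conditional pmf of Y given X = 1 is the X = 1 row of the table divided by its row total. A short Python sketch (illustrative names) of that computation:

```python
# Joint pmf of Example 5.1.1: rows[x][y] = p(x, y), x = 0..4, y = 0..3.
rows = [
    [0.08, 0.07, 0.04, 0.00],
    [0.06, 0.15, 0.05, 0.04],
    [0.05, 0.04, 0.10, 0.06],
    [0.00, 0.03, 0.04, 0.07],
    [0.00, 0.01, 0.05, 0.06],
]

x = 1
p_x1 = sum(rows[x])                  # marginal P(X = 1)
cond = [p / p_x1 for p in rows[x]]   # p(y | X = 1) for y = 0, 1, 2, 3

for y, p in enumerate(cond):
    print(y, round(p, 4))
```

The four conditional probabilities are 0.06/0.30, 0.15/0.30, 0.05/0.30 and 0.04/0.30, and they sum to 1 as any pmf must.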

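Example 5.1.6 Given $f(x, y) = \frac{6}{5}(x + y^2)$, $0 \le x \le 1$, 0 < y < 1, with marginal
$f_X(x) = \frac{6}{5}x + \frac{2}{5}$. What is the conditional density of Y given X = 0?
\[ f_{Y|X}(y \mid 0) = \frac{f(0, y)}{f_X(0)} = \frac{\frac{6}{5}y^2}{\frac{2}{5}} = 3y^2, \quad 0 < y < 1 \]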

5.2 Expected Values, Covariance, and Correlation


Expected Values

Definition 7. Let X and Y be jointly distributed rvs with pmf p(x, y) or pdf f(x, y)
according to whether the variables are discrete or continuous. Then the expected value
of a function h(X, Y), denoted E[h(X, Y)] or $\mu_{h(X,Y)}$, is:
\[ \mu_{h(X,Y)} = E[h(X, Y)] = \begin{cases} \sum_{x} \sum_{y} h(x, y) \cdot p(x, y), & \text{discrete;} \\ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\,dx\,dy, & \text{continuous.} \end{cases} \]

Examples of Expected Values

Example 5.2.1 The joint pmf is given below. What is E(XY)? What is
E[max(X, Y)]?

p(x, y)   y = 0   y = 1   y = 10   y = 20
x = 0     0.02    0.06    0.02     0.10
x = 1     0.04    0.15    0.20     0.10
x = 10    0.01    0.15    0.14     0.01

Example 5.2.2 The joint pdf of X and Y is f(x, y) = 4xy, 0 < x < 1, 0 < y < 1.
What is E(XY)?
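Both examples apply Definition 7 directly. The Python sketch below computes the discrete expectations exactly as weighted sums, and approximates the continuous one with a midpoint Riemann sum; the grid size `n = 200` is an arbitrary assumption, and the exact value by integration is 4/9.

```python
# Example 5.2.1: table[i][j] = p(xs[i], ys[j]).
xs = [0, 1, 10]
ys = [0, 1, 10, 20]
table = [
    [0.02, 0.06, 0.02, 0.10],
    [0.04, 0.15, 0.20, 0.10],
    [0.01, 0.15, 0.14, 0.01],
]


def expect(h):
    """E[h(X, Y)] = sum over all (x, y) of h(x, y) * p(x, y)."""
    return sum(h(x, y) * p for x, row in zip(xs, table) for y, p in zip(ys, row))


e_xy = expect(lambda x, y: x * y)
e_max = expect(lambda x, y: max(x, y))
print(round(e_xy, 2), round(e_max, 2))

# Example 5.2.2: E(XY) for f(x, y) = 4xy on the unit square,
# approximated by a midpoint rule on an n-by-n grid.
n = 200
h = 1.0 / n
mids = [(i + 0.5) * h for i in range(n)]
e_xy_cont = sum(x * y * 4 * x * y for x in mids for y in mids) * h * h
print(round(e_xy_cont, 4))
```

The discrete results are E(XY) = 21.65 and E[max(X, Y)] = 9.65; the numerical integral agrees with 4/9 to about four decimal places.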

Covariance

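Definition 8. Let E(X) and E(Y) denote the expectations of rvs X and Y. The
covariance between X and Y, denoted Cov(X, Y), is defined as:
\[ \mathrm{Cov}(X, Y) = E[(X - E(X))(Y - E(Y))] \]
i.e.,
\[ \mathrm{Cov}(X, Y) = \begin{cases} \sum_{x} \sum_{y} [x - E(X)]\,[y - E(Y)]\,p(x, y), & \text{discrete;} \\ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - E(X))(y - E(Y))\,f(x, y)\,dx\,dy, & \text{continuous.} \end{cases} \]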

Properties of Covariance and Shortcut Formula

Cov(X, X) = Var(X)

Cov(X, Y) = Cov(Y, X)

Cov(aX, bY) = ab Cov(X, Y) and Cov(X + a, Y + b) = Cov(X, Y), i.e.,
Cov(aX + b, cY + d) = ac Cov(X, Y)

Shortcut formula:
\[ \mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y) \]

If X and Y are independent, then Cov(X, Y) = 0. However, Cov(X, Y) = 0 does
not imply independence.

Interpretation of Covariance
Similar to Variance, Covariance is a measure of variation.
Covariance measures how much two random variables vary together.
As opposed to variance: a measure of variation of a single rv.
If two rvs tend to vary together, then the covariance between the two variables
will be positive.
For example, when one of them is above its expected value, then the other
variable tends to be above its expected value as well.
If two rvs tend to vary in opposite directions, then the covariance between the two
variables will be negative.
For example, when one of them is above its expected value, the other variable
tends to be below its expected value.

Examples of Covariance
Example 5.2.1 (Exercise) The joint pmf is given below. What is Cov(X, Y)?

p(x, y)   y = 0   y = 1   y = 10   y = 20
x = 0     0.02    0.06    0.02     0.10
x = 1     0.04    0.15    0.20     0.10
x = 10    0.01    0.15    0.14     0.01

Example 5.2.2 The joint pdf of X and Y is f(x, y) = 4xy, 0 < x < 1, 0 < y < 1.
What is Cov(X, Y)?
Example 5.2.3 Given the pmf below, what is Cov(X, Y)?

p(x, y)   y = 1   y = 2   pX(x)
x = 0     0.04    0.36    0.4
x = 1     0.06    0.54    0.6
pY(y)     0.1     0.9     1
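The shortcut formula Cov(X, Y) = E(XY) − E(X)E(Y) is easy to apply to a tabulated pmf. The following Python sketch (the helper name `covariance` is illustrative) evaluates it for the two discrete examples above.

```python
def covariance(xs, ys, table):
    """Cov(X, Y) = E(XY) - E(X)E(Y), where table[i][j] = p(xs[i], ys[j])."""
    e_x = sum(x * p for x, row in zip(xs, table) for p in row)
    e_y = sum(y * p for row in table for y, p in zip(ys, row))
    e_xy = sum(x * y * p for x, row in zip(xs, table) for y, p in zip(ys, row))
    return e_xy - e_x * e_y


# Example 5.2.1
cov_521 = covariance(
    [0, 1, 10],
    [0, 1, 10, 20],
    [
        [0.02, 0.06, 0.02, 0.10],
        [0.04, 0.15, 0.20, 0.10],
        [0.01, 0.15, 0.14, 0.01],
    ],
)

# Example 5.2.3: every cell equals pX(x) * pY(y), so the covariance is 0.
cov_523 = covariance([0, 1], [1, 2], [[0.04, 0.36], [0.06, 0.54]])

print(round(cov_521, 4), round(cov_523, 4))
```

For Example 5.2.1, E(X) = 3.59, E(Y) = 8.16 and E(XY) = 21.65, giving a covariance of about −7.64. In Example 5.2.2 the density factors as (2x)(2y), so X and Y are independent and the covariance is 0 there as well.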

Correlation
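Definition 9. The correlation coefficient of two rvs X and Y, denoted Corr(X, Y),
$\rho_{X,Y}$, or just $\rho$, is defined by:
\[ \mathrm{Corr}(X, Y) = \rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)}\,\sqrt{\mathrm{Var}(Y)}} \]
i.e.,
\[ \mathrm{Corr}(X, Y) = \rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\,\sigma_Y} \]
where $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y, respectively.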

Properties and Shortcut Formula of Correlation

For any two rvs X and Y, $-1 \le \mathrm{Corr}(X, Y) \le 1$

For a and c both positive or both negative, Corr(aX + b, cY + d) = Corr(X, Y)

If X and Y are linearly related, i.e., Y = aX + b with $a \ne 0$, then $\mathrm{Corr}(X, Y) = \pm 1$
(+1 when a > 0, −1 when a < 0)

Shortcut formula:
\[ \mathrm{Corr}(X, Y) = \frac{E(XY) - E(X)\,E(Y)}{\sqrt{E(X^2) - (E(X))^2}\,\sqrt{E(Y^2) - (E(Y))^2}} \]

If X and Y are independent, Corr(X, Y) = 0. However, Corr(X, Y) = 0 does not
imply independence.

Interpretation of Correlation
Correlation is a standardized measure.

The correlation coefficient indicates the strength and direction of a linear relationship
between two rvs.

If X and Y are approximately positively linearly related, Corr(X, Y) will be close to 1.

If X and Y are approximately negatively linearly related, Corr(X, Y) will be close to −1.

If X and Y are not linearly related, Corr(X, Y) = 0. In particular, when X and Y are
unrelated, i.e., independent, Corr(X, Y) = 0.

Examples of Correlation

Example 5.2.1 (Exercise) The joint pmf is given below. Find Corr(X, Y).

p(x, y)   y = 0   y = 1   y = 10   y = 20
x = 0     0.02    0.06    0.02     0.10
x = 1     0.04    0.15    0.20     0.10
x = 10    0.01    0.15    0.14     0.01

Example 5.2.2 The joint pdf of X and Y is f(x, y) = 4xy, 0 < x < 1, 0 < y < 1.
Find Corr(X, Y).

Example 5.2.3 Given the pmf below, what is Corr(X, Y)?

p(x, y)   y = −1   y = 1   y = 3
x = −1    1/9      1/9     1/9
x = 0     1/9      1/9     1/9
x = 1     1/9      1/9     1/9
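The shortcut formula for the correlation coefficient can be evaluated from five expectations. This Python sketch does so for the joint pmf of Example 5.2.1; the helper `expect` is an illustrative name, not part of the notes.

```python
from math import sqrt

# Example 5.2.1: table[i][j] = p(xs[i], ys[j]).
xs = [0, 1, 10]
ys = [0, 1, 10, 20]
table = [
    [0.02, 0.06, 0.02, 0.10],
    [0.04, 0.15, 0.20, 0.10],
    [0.01, 0.15, 0.14, 0.01],
]


def expect(h):
    """E[h(X, Y)] as a weighted sum over the joint pmf."""
    return sum(h(x, y) * p for x, row in zip(xs, table) for y, p in zip(ys, row))


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - e_x * e_y
var_x = expect(lambda x, y: x * x) - e_x ** 2
var_y = expect(lambda x, y: y * y) - e_y ** 2

corr = cov / (sqrt(var_x) * sqrt(var_y))
print(round(corr, 4))
```

The result is roughly −0.24, a weak negative linear relationship. For Examples 5.2.2 and 5.2.3 the joint distribution factors into the product of its marginals, so the correlation is 0.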

5.3 Statistics and their Distributions
Statistic
Definition 10 (Statistic). A statistic is any quantity whose value can be calculated
from sample data. Or, a statistic is a function of random variables. We denote a statistic
by an uppercase letter; a lowercase letter is used to represent the calculated or observed
value of the statistic.
Examples:
For two rvs $X_1$ and $X_2$, let $\bar{X} = \frac{X_1 + X_2}{2}$; then $\bar{X}$ is a statistic.

For five rvs $X_1, X_2, \ldots, X_5$, let $X_{\max} = \max(X_1, X_2, \ldots, X_5)$; then $X_{\max}$ is a
statistic.

2+ Independent RVs
Definition 11 (Joint pmf and pdf for more than two rvs). For n rvs $X_1, X_2, \ldots, X_n$,
the joint pmf is:
\[ p(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) \]
and the joint pdf f satisfies, for any intervals $[a_1, b_1], \ldots, [a_n, b_n]$:
\[ P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\,dx_n \cdots dx_1 \]

Definition 12 (Independence of more than two rvs). The random variables $X_1, X_2, \ldots, X_n$
are said to be independent if for any subset of the $X_i$'s, the joint pmf or pdf is the product
of the marginal pmfs or pdfs.

Random Samples
Definition 13 (Random Sample). The rvs $X_1, X_2, \ldots, X_n$ are said to form a (simple)
random sample of size n if:
1. The $X_i$'s are independent rvs.

2. Every $X_i$ has the same probability distribution.

Such $X_i$'s are said to be independent and identically distributed.

Example: Let $X_1, X_2, \ldots, X_n$ be a random sample from the standard normal distribution, i.e.,
$X_1$ follows N(0, 1), $X_2$ follows N(0, 1), ..., $X_n$ follows N(0, 1), and $X_1, X_2, \ldots, X_n$
are independent.

Examples of Statistic of a Random Sample


Example 5.3.1 (Sample mean) Take a random sample of size n from a specific distribution (say
standard normal). The random variable sample mean,
\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}, \]
is a statistic.
If $X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n$, then
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]
is a value of the rv $\bar{X}$.

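Example 5.3.2 (Sample variance) Take a random sample of size n from a specific distribution (say
standard normal), and let $\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$ be the random variable sample mean. Then
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
is a statistic.
If $X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n$, then
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]
is a value of the rv $S^2$.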
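To make the distinction between a statistic and its observed value concrete, the Python sketch below draws one sample from N(0, 1) and computes the observed sample mean and sample variance. The sample size `n = 10` and the seed are arbitrary assumptions chosen only so the run is repeatable.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, only for repeatability
n = 10          # arbitrary sample size

# One realization x_1, ..., x_n of a random sample from N(0, 1).
sample = [random.gauss(0, 1) for _ in range(n)]

# Observed sample mean and sample variance (divisor n - 1).
x_bar = sum(sample) / n
s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# The standard library uses the same n - 1 divisor in statistics.variance.
print(x_bar, statistics.mean(sample))
print(s2, statistics.variance(sample))
```

Rerunning with a different seed gives different values of the observed mean and variance, which is exactly why the sample mean and sample variance are random variables with their own distributions.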

Deriving the Sampling Distribution of a Statistic

Definition 14 (Sampling Distribution). A statistic is a random variable; the distribution
of a statistic is called the sampling distribution of the statistic.

We may use probability rules to obtain the sampling distribution of a statistic.

Example 5.3.3 (Example 5.20 in textbook)
A large automobile service center charges $40, $45, and
$50 for a tune-up of four-, six-, and eight-cylinder cars, respectively. 20% of the tune-ups
are done for four-cylinder cars, 30% for six-cylinder cars and 50% for eight-cylinder cars.
Let X be the service charge for a single tune-up. Then the distribution of X is:

x      40    45    50
p(x)   0.2   0.3   0.5

Now let $X_1$ and $X_2$ be the service charges of two randomly selected tune-ups. Find the
distribution of:

1. $\bar{X} = \frac{X_1 + X_2}{2}$

2. $S^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2}{2 - 1}$

Example 5.3.3 (continued)

x1    x2    p(x1, x2)   x̄      s²
40    40    0.04        40     0
40    45    0.06        42.5   12.5
40    50    0.10        45     50
45    40    0.06        42.5   12.5
45    45    0.09        45     0
45    50    0.15        47.5   12.5
50    40    0.10        45     50
50    45    0.15        47.5   12.5
50    50    0.25        50     0
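The sampling distributions follow from the table by adding up the probabilities of all (x1, x2) pairs that give the same value of the statistic. A minimal Python sketch of that enumeration, assuming the two tune-ups are independent as in the table (names are illustrative):

```python
from collections import defaultdict
from itertools import product

# Distribution of a single service charge X.
pmf = {40: 0.2, 45: 0.3, 50: 0.5}

dist_mean = defaultdict(float)  # sampling distribution of the sample mean
dist_var = defaultdict(float)   # sampling distribution of the sample variance

# Enumerate all ordered pairs (x1, x2); independence gives p(x1, x2) = p(x1) p(x2).
for (x1, p1), (x2, p2) in product(pmf.items(), repeat=2):
    x_bar = (x1 + x2) / 2
    s2 = ((x1 - x_bar) ** 2 + (x2 - x_bar) ** 2) / (2 - 1)
    dist_mean[x_bar] += p1 * p2
    dist_var[s2] += p1 * p2

for value in sorted(dist_mean):
    print("mean", value, round(dist_mean[value], 2))
for value in sorted(dist_var):
    print("var ", value, round(dist_var[value], 2))
```

The sample mean takes the values 40, 42.5, 45, 47.5, 50 with probabilities 0.04, 0.12, 0.29, 0.30, 0.25, and the sample variance takes the values 0, 12.5, 50 with probabilities 0.38, 0.42, 0.20.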

5.4 The Distribution of the Sample Mean


Sampling Distribution of Sample Sum and Sample Mean

Proposition (Mean and Std Dev of Sample Sum). Let $X_1, X_2, \ldots, X_n$ be a random
sample from a distribution with mean $\mu$ and standard deviation $\sigma$. The sample sum is
$T_o = X_1 + X_2 + \cdots + X_n$. Then:

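1. $E(T_o) = n\mu$

2. $\mathrm{Var}(T_o) = n\sigma^2$ and $\sigma_{T_o} = \sqrt{n}\,\sigma$

Proposition (Mean and Std Dev of Sample Mean). Let $X_1, X_2, \ldots, X_n$ be a random
sample from a distribution with mean $\mu$ and standard deviation $\sigma$. The sample mean is
$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$. Then:

1. $E(\bar{X}) = \mu$

2. $\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$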


The sampling distribution related to the sample variance $S^2$ will be introduced in Chapter 7.

The Case of Normal Population Distribution

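Proposition. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with
mean $\mu$ and standard deviation $\sigma$ ($N(\mu, \sigma)$). Then $T_o$ and $\bar{X}$ both follow normal
distributions:

1. $T_o \sim N(n\mu, \sqrt{n}\,\sigma)$

2. $\bar{X} \sim N(\mu, \frac{\sigma}{\sqrt{n}})$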

Example 5.4.1 (Example 5.25)
Let X be the time it takes a rat to find its way through a maze, with
$X \sim N(\mu = 1.5, \sigma^2 = 0.35^2)$ (in minutes). Suppose five rats are randomly selected.
Let $X_1, X_2, \ldots, X_5$ denote their times in the maze, and assume $X_1, X_2, \ldots, X_5$ form a
random sample from $N(\mu = 1.5, \sigma^2 = 0.35^2)$. Let the total time be $T_o = X_1 + X_2 + \cdots + X_5$ and
the average time be $\bar{X} = \frac{X_1 + X_2 + \cdots + X_5}{5}$. What is the probability that the total time of the 5
rats is between 6 and 8 minutes? What is the probability that the average time is at
most 2.0 minutes?

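Example 5.4.1 Continued...

\[ T_o \sim N(5 \cdot 1.5 = 7.5,\ 5 \cdot 0.35^2 = 0.6125) \]
So,
\[ P(6 < T_o < 8) = P\left(\frac{6 - 7.5}{\sqrt{0.6125}} < Z < \frac{8 - 7.5}{\sqrt{0.6125}}\right) = \Phi(0.64) - \Phi(-1.92) = 0.7115 \]

\[ \bar{X} \sim N\left(1.5,\ \frac{0.35^2}{5}\right) \]
So,
\[ P(\bar{X} \le 2.0) = P\left(Z \le \frac{2.0 - 1.5}{0.35/\sqrt{5}}\right) = P(Z \le 3.19) = \Phi(3.19) = 0.9993 \]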
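The two probabilities can be reproduced without a z-table. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which is parameterized by mean and standard deviation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.5, 0.35, 5

# Sample sum: mean n*mu, standard deviation sqrt(n)*sigma.
total = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)
# Sample mean: mean mu, standard deviation sigma/sqrt(n).
mean = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_total = total.cdf(8) - total.cdf(6)  # P(6 < T_o < 8)
p_mean = mean.cdf(2.0)                 # P(sample mean <= 2.0)
print(round(p_total, 4), round(p_mean, 4))
```

The values differ slightly from the table-based answers in the third or fourth decimal place because the table lookup rounds the z-scores to two decimals.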

Central Limit Theorem

Theorem 15. Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with
mean $\mu$ and variance $\sigma^2$. Then if n is sufficiently large, $\bar{X}$ has approximately a normal
distribution with $\mu_{\bar{X}} = \mu$ and $\sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$. $T_o$ then has approximately a normal distribution
with $\mu_{T_o} = n\mu$ and $\sigma^2_{T_o} = n\sigma^2$. The larger the value of n, the better the approximation.

Rule of Thumb: If n > 30, the Central Limit Theorem can be used.


Examples

Example 5.5.1 (Example 5.26 in textbook) The amount of a particular impurity
in a batch of some chemical product is a random variable with mean 4.0g
and standard deviation 1.5g. If 50 batches are independently prepared, what is
the (approximate) probability that the sample average amount of impurity $\bar{X}$ is
between 3.5g and 3.8g?

Example 5.5.2 (Example 5.27 in textbook) The number of major defects for a
certain model of automobile is a random variable with mean 3.2 and standard
deviation 2.4. Among 100 randomly selected cars of this model, how likely is it
that the average number of major defects exceeds 4? How likely is it that the
number of major defects of all 100 cars exceeds 20?
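Since both sample sizes exceed 30, the Central Limit Theorem lets the sample mean be treated as approximately normal with mean μ and standard deviation σ/√n. The Python sketch below applies that approximation to Example 5.5.1 and to the first question of Example 5.5.2; it is an approximate calculation under the CLT, not an exact result, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 5.5.1: mu = 4.0, sigma = 1.5, n = 50.
impurity_mean = NormalDist(mu=4.0, sigma=1.5 / sqrt(50))
p_impurity = impurity_mean.cdf(3.8) - impurity_mean.cdf(3.5)

# Example 5.5.2: mu = 3.2, sigma = 2.4, n = 100.
defects_mean = NormalDist(mu=3.2, sigma=2.4 / sqrt(100))
p_defects = 1 - defects_mean.cdf(4)

print(round(p_impurity, 4), round(p_defects, 5))
```

The first probability is about 0.16, and the second is well below 0.001, since 4 lies more than three standard deviations of the sample mean above 3.2.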

5.5 The Distribution of a Linear Combination


Linear Combinations of RVs

Definition 16. Given a collection of n random variables $X_1, X_2, \ldots, X_n$ and n
numerical constants $a_1, a_2, \ldots, a_n$, the rv
\[ Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n = \sum_{i=1}^{n} a_i X_i \]
is called a linear combination of the $X_i$'s.

1. The sample sum $T_o = X_1 + X_2 + \cdots + X_n$ is a linear combination with
$a_1 = a_2 = \cdots = a_n = 1$.

2. The sample mean $\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$ is a linear combination with
$a_1 = a_2 = \cdots = a_n = \frac{1}{n}$.

Mean and Variance of Linear Combinations

Proposition. Let $X_1, X_2, \ldots, X_n$ have expectations $\mu_1, \mu_2, \ldots, \mu_n$ respectively
and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$ respectively. Let Y be the linear combination of the $X_i$'s,
$Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$. Then:

1. $E(Y) = a_1\mu_1 + a_2\mu_2 + \cdots + a_n\mu_n$.

2. $\mathrm{Var}(Y) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j\,\mathrm{Cov}(X_i, X_j)$.

3. If $X_1, X_2, \ldots, X_n$ are independent, $\mathrm{Var}(Y) = a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_n^2\sigma_n^2$.

Corollary 17. If $X_1, X_2$ are independent, then $E(X_1 - X_2) = E(X_1) - E(X_2)$ and
$\mathrm{Var}(X_1 - X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2)$.

The Case of Normal Random Variables

Proposition. If $X_1, X_2, \ldots, X_n$ are independent, normally distributed rvs with
means $\mu_1, \mu_2, \ldots, \mu_n$ and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$, then any linear combination
of the $X_i$'s, $Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$, also has a normal distribution:

\[ Y \sim N(a_1\mu_1 + a_2\mu_2 + \cdots + a_n\mu_n,\ a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_n^2\sigma_n^2) \]

Example
Example 5.5.3 (Example 5.30 in textbook) Three grades of gasoline are priced at
$1.20, $1.35 and $1.50 per gallon, respectively. Let $X_1$, $X_2$ and $X_3$ denote the amounts of
these grades purchased (gallons) on a particular day. Suppose the $X_i$'s are independent and
normally distributed with $\mu_1 = 1000$, $\mu_2 = 500$, $\mu_3 = 100$, $\sigma_1 = 100$, $\sigma_2 = 80$, $\sigma_3 = 50$.
The total revenue of the sale of the three grades of gasoline on a particular day is
$Y = 1.2X_1 + 1.35X_2 + 1.5X_3$. Find the probability that total revenue exceeds $2500.
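Because Y is a linear combination of independent normal rvs, it is normal with the mean and variance given by the propositions above. The following Python sketch carries out that calculation with the parameter values exactly as stated in this example; the list names `a`, `mu`, and `sigma` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

a = [1.20, 1.35, 1.50]   # prices per gallon (the coefficients a_i)
mu = [1000, 500, 100]    # means of X1, X2, X3 as stated in the example
sigma = [100, 80, 50]    # standard deviations of X1, X2, X3

# E(Y) = sum of a_i * mu_i; Var(Y) = sum of a_i^2 * sigma_i^2 (independence).
mean_y = sum(ai * mi for ai, mi in zip(a, mu))
var_y = sum(ai ** 2 * si ** 2 for ai, si in zip(a, sigma))

# Y is normal, so P(Y > 2500) = 1 - F(2500).
p_exceeds = 1 - NormalDist(mu=mean_y, sigma=sqrt(var_y)).cdf(2500)
print(mean_y, var_y, round(p_exceeds, 4))
```

With these inputs, E(Y) = 2025 and Var(Y) = 31689 (standard deviation about 178), so 2500 is roughly 2.67 standard deviations above the mean and the probability is about 0.004.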
