
1 Random Variables

In applications we are interested in quantitative properties of experimental results.


Example: Toss a coin three times and count the number of heads. The sample space is

S = {(t, t, t), (t, t, h), (t, h, t), (h, t, t), (t, h, h), (h, t, h), (h, h, t), (h, h, h)}.

The random variable X counts the number of heads. Thus if w = (t, t, h) occurs, then X(t, t, h) = 1.
Probabilities are assigned to the values of a random variable by connecting the probabilities of the
outcomes of the experiment with the values the random variable takes on them.
Example: (continued) Assume all elements in S are equally likely. Then X = 1 corresponds to the set
{(t, t, h), (t, h, t), (h, t, t)}, thus

P (X = 1) = P ({(t, t, h), (t, h, t), (h, t, t)}) = 3/8.

Let B = {2, 3}. By X ∈ B we mean the subset A = {(t, h, h), (h, t, h), (h, h, t), (h, h, h)} of S such that
X(w) ∈ B for all w ∈ A, so

P(X ∈ B) = P(A) = 4/8 = 0.5.
By X = j we mean the set A = {w ∈ S : X(w) = j}. So X = 0 is the set A = {(t, t, t)} and P (X = 0) =
P (A) = 1/8, etc.
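These computations are easy to check by brute-force enumeration. Below is a minimal Python sketch (not part of the original notes; the helper name P is ours) that lists the sample space and recovers the probabilities above:

```python
from itertools import product
from fractions import Fraction

# Sample space for three tosses of a fair coin; all 8 outcomes equally likely.
S = list(product("th", repeat=3))

def X(w):
    return w.count("h")  # the random variable: number of heads in outcome w

def P(condition):
    """P({w in S : condition(w)}) under the equally-likely assumption."""
    return Fraction(sum(1 for w in S if condition(w)), len(S))

print(P(lambda w: X(w) == 1))       # 3/8
print(P(lambda w: X(w) in {2, 3}))  # 1/2
print(P(lambda w: X(w) == 0))       # 1/8
```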
Definition: A function X : S → ℝ mapping elements of the sample space into the real numbers will be
called a random variable.
Remark: A random variable is a deterministic function. What is random is the selection of w ∈ S.
Example: Suppose you select a point at random in the unit circle {(x, y) : x² + y² ≤ 1}. Let D = √(x² + y²)
be the distance of the selected point to the origin. Then D is a random variable.

1.1 Discrete Random Variables


Definition: If the possible values of X are finite or countable, we say that X is a discrete random variable.
Definition: Let x_1, x_2, . . . denote the possible values of X. Then

p(x_i) = P(X = x_i) = P({w ∈ S : X(w) = x_i})

is called the probability mass function. Notice that p(x_i) ≥ 0 and that

Σ_i p(x_i) = Σ_i P({w ∈ S : X(w) = x_i}) = P(S) = 1.

Convenient notation (not used in the book): p_X(a) = P(X = a). This notation is useful when you want to
emphasize that you are referring to the random variable X. When there is no danger of confusion we drop
the subscript.
Example: (continued) p(0) = 1/8, p(1) = 3/8, p(2) = 3/8, p(3) = 1/8.
The probability mass function summarizes all probability information associated with the random vari-
able.

Example: (continued) P (X ≤ 1) = p(0) + p(1) = 0.5. P (1 ≤ X ≤ 2) = p(1) + p(2) = 3/4, P (X > 1) =
p(2) + p(3) = 0.5.
Although the probability mass function contains all information w.r.t. a discrete random variable, the
cumulative distribution function is frequently used.
Definition: The cumulative distribution function (cdf) F of a random variable X is given by

F_X(a) = P(X ≤ a).

When it is clear that we are talking about the random variable X we may simply write F(a). The notation
X ∼ F signifies that F is the distribution of the random variable X.
If X is a discrete random variable we can write

F(a) = Σ_{t ≤ a} P(X = t) = Σ_{t ≤ a} p(t).

Example: (continued) F (0) = 1/8, F (1) = 4/8, F (2) = 7/8, F (3) = 1.


Properties of cdf:

1. F (−∞) = P (X ≤ −∞) = 0,
2. F (∞) = P (X ≤ ∞) = 1,
3. If a < b then F (a) ≤ F (b).

Any right-continuous function satisfying the above three properties is the cdf of some random variable.
Facts:
1. P (a < X ≤ b) = F (b) − F (a).
2. P (a ≤ X ≤ b) = F (b) − F (a) + p(a).
3. P (a < X < b) = F (b) − F (a) − p(b).
4. P (a ≤ X < b) = F (b) − F (a) − p(b) + p(a).
Example: (continued) P (1 ≤ X ≤ 3) = F (3) − F (1) + p(1) = 1 − 4/8 + 3/8 = 7/8.
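As a quick check, these facts can be verified mechanically from the pmf. A small Python sketch (ours, for illustration):

```python
from fractions import Fraction

# pmf of X = number of heads in three fair tosses, from the running example.
p = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def F(a):
    """F(a) = P(X <= a), summing the pmf over values <= a."""
    return sum(q for x, q in p.items() if x <= a)

# Fact 2: P(a <= X <= b) = F(b) - F(a) + p(a).
print(F(3) - F(1) + p[1])  # 7/8, matching the worked example
```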

1.2 Continuous Random Variables


Although almost all actual measurements made in engineering and scientific work are really discrete, it is often
conceptually convenient to think in terms of a continuum of possible values. Examples are measurements of
height and weight. It is also mathematically convenient to deal with such continuous random variables.
We cannot define continuous random variables in terms of their probability mass function because con-
tinuous random variables have an uncountable number of possible values. We can however work with the
concept of the cumulative distribution function and then derive from it a new concept, the probability density
function that has similar properties to the probability mass function.
Recall
FX (x) = P (X ≤ x) = P ({w ∈ S : X(w) ≤ x}).

Example: Before exploring the properties of the cdf of continuous random variables, let us work out the
cdf of the distance to the origin of a point selected at random within the unit circle. Clearly F_D(d) = 1 for
all d > 1 and F_D(d) = 0 for all d < 0. What about values of d ∈ [0, 1]?

F_D(d) = P(√(x² + y²) ≤ d)
       = P(x² + y² ≤ d²)
       = πd²/π        (the area of the disk of radius d over the area of the unit disk)
       = d².

Notice that F_D(d) is an increasing and continuous function.


Example: What is the probability that D ≤ 1/2? F_D(1/2) = (1/2)² = 1/4.
Example: What is the probability that 1/2 < D ≤ 2/3? Clearly F_D(2/3) − F_D(1/2) = 4/9 − 1/4 = 7/36.
What is the probability that the distance is exactly equal to 1/2? 0.
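These values are easy to confirm by simulation. A hedged Monte Carlo sketch (sampling uniformly in the circle by rejection from the enclosing square; the sample size and seed are arbitrary choices of ours):

```python
import random

random.seed(0)
n, hits = 100_000, 0
accepted = 0
while accepted < n:
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    if x * x + y * y <= 1:           # keep only points inside the unit circle
        accepted += 1
        if x * x + y * y <= 0.25:    # D <= 1/2  <=>  x^2 + y^2 <= (1/2)^2
            hits += 1
print(hits / n)  # close to F_D(1/2) = 0.25
```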
Now let us go back to a generic random variable X:

P (a < X ≤ b) = P (X ≤ b) − P (X ≤ a) = F (b) − F (a).


What is P(X = b)?

P(X = b) = lim_{a↑b} P(a < X ≤ b)
         = lim_{a↑b} [F(b) − F(a)]
         = F(b) − F(b⁻).

Here F(b⁻) is the limit of F(x) as x approaches b from the left.


So P(X = b) is the jump in F at b. For example, if X is the number of heads in three tosses of a fair coin,
we have F(x) = 0.5 for 1 ≤ x < 2 and F(x) = 0.125 for 0 ≤ x < 1. Thus, P(X = 1) = F(1) − F(1⁻) =
0.5 − 0.125 = 0.375.
What if F is continuous (has no jump) at b? Then P (X = b) = 0. If F is continuous everywhere then
P (X = x) = 0 for all x.
What if F is continuously differentiable? Then there exists a function f(x) = F′(x), called the probability
density function, such that

F(b) = ∫_{−∞}^{b} f(x) dx.

Consequently
P(a < X ≤ b) = F(b) − F(a) = ∫_{a}^{b} f(x) dx.

The function f(·) is called the probability density function.


Example: Find the probability density function of D. Clearly f(d) = 0 for d < 0 and for d > 1. For d ∈ [0, 1] we
have f(d) = F′(d) = 2d.
Notice that

• f(x) ≥ 0, and
• ∫_{−∞}^{∞} f(x) dx = 1.
Since, for small δ,

P(x − δ/2 < X ≤ x + δ/2) ≈ f(x)δ,

the value of f(x) is related to the probability that X takes values close to x. For example, if f(x) = 2f(y)
it is roughly twice as likely for X to fall in a small neighborhood of x as in a small neighborhood of y.
Example: Notice that for the random variable D, f(1/4) = 1/2 while f(3/4) = 3/2, so it is three times more
likely to end up at a distance near 3/4 than at a distance near 1/4.
To summarize, for continuous random variables there is a probability density function f (x) ≥ 0 such that
P(X ∈ A) = ∫_A f(u) du.

In particular, if A = (−∞, x], then

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du.

Given the cdf F (x) we can obtain the density function f (x) by differentiation. Conversely, given the
density function f (x) we can obtain the cdf F (x) by integration.
Properties of cdf: As with the case of discrete random variables:
1. F (−∞) = P (X ≤ −∞) = 0,
2. F (∞) = P (X ≤ ∞) = 1,
3. If a < b then F (a) ≤ F (b).
However, the calculation of probabilities over intervals is simpler:
1. P (a < X ≤ b) = F (b) − F (a).
2. P (a ≤ X ≤ b) = F (b) − F (a).
3. P (a < X < b) = F (b) − F (a).
4. P (a ≤ X < b) = F (b) − F (a).
Example: f(t) = 0 for t < 0, f(t) = t/2 for 0 < t ≤ 1, f(t) = 0.75 for 1 < t ≤ 2, and f(t) = 0 for t > 2.
Then F(a) = a²/4 for 0 < a ≤ 1, F(a) = 0.25 + 0.75(a − 1) for 1 < a ≤ 2, and F(a) = 1 for a > 2.
The cdf is convenient for probability calculations:
P(0.1 < X ≤ 1.2) = F(1.2) − F(0.1) = (0.25 + 0.15) − 0.0025 = 0.3975.
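The same computation in code, as a small sketch (the function name F mirrors the cdf above; the code itself is ours):

```python
def F(a):
    """cdf of the piecewise density above: f(t) = t/2 on (0, 1], 0.75 on (1, 2]."""
    if a <= 0:
        return 0.0
    if a <= 1:
        return a * a / 4
    if a <= 2:
        return 0.25 + 0.75 * (a - 1)
    return 1.0

print(F(1.2) - F(0.1))  # 0.3975
```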

2 Joint Random Variables


Often we need to deal with two or more random variables at the same time. Here we discuss the case of
discrete and continuous joint random variables.

2.1 Joint Distributions of Discrete Random Variables
Example: Toss three fair coins and let X be the number of heads in the first two tosses and Y the number
of heads in all three tosses. Writing out the mapping from outcomes to (X, Y) pairs gives p(0, 0) = p(0, 1) =
p(2, 2) = p(2, 3) = 1/8 and p(1, 1) = p(1, 2) = 1/4. Let A = {(0, 0), (1, 1), (2, 2)}; then P(A) = 0.5.
What is P(X = 1)? Well, X = 1 corresponds to the pairs (1, 1) and (1, 2), so P(X = 1) = 1/4 + 1/4 = 0.5.

2.1.1 Marginal Probability Mass Functions


In general, p_X(x) = Σ_y p(x, y) and p_Y(y) = Σ_x p(x, y) are known as the marginal probability mass functions.
If X and Y are discrete random variables, there is a joint probability mass function p(·, ·) with the
following three properties.

p(x, y) ≥ 0,

Σ_x Σ_y p(x, y) = 1,

P((X, Y) ∈ A) = Σ_{(x,y) ∈ A} p(x, y).

2.1.2 Conditional Distribution: Discrete Case


We define the conditional pmf of X given Y = y as

p_{X|Y}(x|y) = P(X = x | Y = y) = p(x, y)/p_Y(y),

provided p_Y(y) > 0; it is defined to be zero otherwise.


Example: Referring to our earlier example we see that

P(X = 1 | Y = 2) = 0.25/(0.25 + 0.125) = 2/3.
Notice also that p(x, y) = p_{X|Y}(x|y) p_Y(y) and therefore, by adding over y, we can obtain p_X(x).
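The marginals and conditionals of the coin example can be tabulated directly. A minimal sketch, with the joint pmf hard-coded from the example above:

```python
from fractions import Fraction

# Joint pmf of (X, Y): X = heads in first two tosses, Y = heads in all three.
p = {(0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
     (1, 1): Fraction(1, 4), (1, 2): Fraction(1, 4),
     (2, 2): Fraction(1, 8), (2, 3): Fraction(1, 8)}

def p_Y(y):
    """Marginal pmf of Y: sum the joint pmf over x."""
    return sum(q for (x, yy), q in p.items() if yy == y)

def p_X_given_Y(x, y):
    """Conditional pmf p_{X|Y}(x|y) = p(x, y) / p_Y(y)."""
    return p.get((x, y), Fraction(0)) / p_Y(y)

print(p_X_given_Y(1, 2))  # 2/3, as computed above
```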
Example: Suppose P(X = k | N = n) = C(n, k) p^k (1 − p)^(n−k) for k = 0, 1, . . . , n, where C(n, k) is the
binomial coefficient, and P(N = n) = e^(−λ) λ^n/n!. Then,

P(X = k, N = n) = P(X = k | N = n) P(N = n)

and

P(X = k) = Σ_{n ≥ k} P(X = k, N = n).

Working it out, it turns out that

P(X = k) = e^(−λp) (λp)^k / k!.
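One can check this numerically for particular parameter values; the sketch below truncates the sum at a large n where the Poisson tail is negligible (the parameter values are arbitrary choices of ours):

```python
from math import comb, exp, factorial, isclose

lam, p, k = 3.0, 0.4, 2  # illustrative values

# Left side: sum over n >= k of P(X = k | N = n) P(N = n).
lhs = sum(comb(n, k) * p**k * (1 - p)**(n - k) * exp(-lam) * lam**n / factorial(n)
          for n in range(k, 200))

# Right side: the claimed Poisson(lam * p) pmf at k.
rhs = exp(-lam * p) * (lam * p)**k / factorial(k)

print(isclose(lhs, rhs))  # True
```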

2.2 Joint Distribution of Continuous Random Variables
If X and Y are continuous random variables, there exists a joint density function f(x, y) with the following
three properties:

f(x, y) ≥ 0,

∫∫ f(x, y) dx dy = 1,

and, for well-defined subsets A of ℝ²,

P((X, Y) ∈ A) = ∫∫_A f(x, y) dx dy.

Example: f(x, y) = 1 on the unit square. What is the probability that (X, Y) is in the set [0, 0.5] × [0.5, 0.8]?
Since f ≡ 1, this is just the area of the set: 0.5 × 0.3 = 0.15.

2.2.1 Marginal Density Functions


Let
F (x, y) = P (X ≤ x, Y ≤ y).
This is called the joint cdf of X and Y. When X and Y are continuous, we have
F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(u, v) dv du.

Now,

P(X ≤ x) = F(x, ∞) = ∫_{−∞}^{x} [ ∫_{−∞}^{∞} f(u, v) dv ] du.

On the other hand,


P(X ≤ x) = ∫_{−∞}^{x} f_X(u) du.

So it follows that

f_X(x) = ∫_{−∞}^{∞} f(x, v) dv.

Consequently, the density of X is obtained by integrating out the second variable. Similarly, the
density of Y is obtained by integrating out the first variable:

f_Y(y) = ∫_{−∞}^{∞} f(u, y) du.

Example: f(x, y) = (12/7)(x² + xy) on the unit square. Compute f_X and f_Y.
Integrating, we obtain f_X(x) = (12/7)(x² + x/2) and f_Y(y) = (4 + 6y)/7.
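A quick numerical sanity check of f_X by a midpoint Riemann sum over y (the grid size is an arbitrary choice; this sketch is ours):

```python
N = 2000
h = 1.0 / N

def f(x, y):
    return 12 / 7 * (x * x + x * y)  # joint density on the unit square

def f_X(x):
    """Approximate f_X(x) = integral over [0, 1] of f(x, y) dy."""
    return sum(f(x, (j + 0.5) * h) for j in range(N)) * h

print(f_X(0.5))                 # about 0.857
print(12 / 7 * (0.25 + 0.25))   # closed form (12/7)(x^2 + x/2) at x = 0.5
```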
Q. How can we compute P(x_1 < X ≤ x_2, y_1 < Y ≤ y_2) from F(·, ·)?
A.
F(x_2, y_2) − F(x_1, y_2) − F(x_2, y_1) + F(x_1, y_1).

2.2.2 Conditional Distribution: Continuous Case
In the continuous case the conditional density of X given Y = y is defined as

f_{X|Y}(x|y) = f(x, y)/f_Y(y),

provided f_Y(y) > 0; it is defined to be zero otherwise.


Example: f(x, y) = λ² e^(−λy) on 0 ≤ x ≤ y.
Then f_X(x) = λ e^(−λx) and f_Y(y) = λ² y e^(−λy), and

f_{X|Y}(x|y) = f(x, y)/f_Y(y) = 1/y

on 0 ≤ x ≤ y.
Notice that f(x, y) = f_{X|Y}(x|y) f_Y(y) and that by integrating over y we can obtain f_X(x).
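This example can also be checked by simulation: f_Y(y) = λ²ye^(−λy) is a Gamma density with shape 2 and scale 1/λ, and given Y = y, X is uniform on [0, y], so the marginal of X should be exponential with rate λ. A hedged sketch (λ, sample size, and seed are arbitrary choices of ours):

```python
import random
from statistics import mean

random.seed(0)
lam = 2.0
xs = []
for _ in range(100_000):
    y = random.gammavariate(2, 1 / lam)  # f_Y(y) = lam^2 * y * exp(-lam * y)
    xs.append(random.uniform(0, y))      # f_{X|Y}(x|y) = 1/y on [0, y]

print(mean(xs), 1 / lam)  # both about 0.5, the exponential(lam) mean
```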

3 Independent Random Variables


Let X and Y be random variables with joint cdf F(x, y). Recall that F_X(x) = F(x, ∞) and F_Y(y) = F(∞, y).
X and Y are said to be independent if

F(x, y) = F_X(x) F_Y(y).

Notice that this definition is valid for both discrete and continuous random variables.
If X and Y are continuous and independent then

f(x, y) = f_X(x) f_Y(y).

If X and Y are discrete and independent then

p(x, y) = p_X(x) p_Y(y).
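For instance, the pair (X, Y) from the coin example of Section 2.1 is not independent, and the factorization test makes this concrete. A minimal sketch:

```python
from fractions import Fraction
from itertools import product

# Joint pmf of (X, Y) from the coin example (X in {0,1,2}, Y in {0,1,2,3}).
p = {(0, 0): Fraction(1, 8), (0, 1): Fraction(1, 8),
     (1, 1): Fraction(1, 4), (1, 2): Fraction(1, 4),
     (2, 2): Fraction(1, 8), (2, 3): Fraction(1, 8)}

def p_X(x):
    return sum(q for (xx, y), q in p.items() if xx == x)

def p_Y(y):
    return sum(q for (x, yy), q in p.items() if yy == y)

independent = all(p.get((x, y), Fraction(0)) == p_X(x) * p_Y(y)
                  for x, y in product(range(3), range(4)))
print(independent)  # False: e.g. p(0, 3) = 0 but p_X(0) * p_Y(3) = 1/32
```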
