In applications of probability, we deal with numerical data that are random in nature. For
example, we may consider the number of customers arriving at a service station in a particular
interval of time, or the transmission time of a message in a communication system. These random
quantities may be considered as real-valued functions on the sample space. Such a real-valued
function is called a real random variable, or simply a random variable (RV), and plays an
important role in describing random data. We shall introduce the concept of random variables in
the following sections.
Random variable
Consider the probability space $(S, F, P)$ and a function $X: S \to \mathbb{R}$ mapping the sample space
$S$ into the real line. We want to define probabilities of events involving real numbers
with the help of this mapping.
Let us define the probability $P_X(B)$ of a subset $B \subseteq \mathbb{R}$ by
$$P_X(B) = P(X^{-1}(B)) = P(\{s \mid X(s) \in B\}). \qquad (1)$$
Figure 1 illustrates a random variable as a mapping from the sample space to the real line.
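Figure 1 A random variable $X$ as a mapping from the sample points $s_1, s_2, s_3, s_4 \in S$ to the real numbers $X(s_1), \dots, X(s_4)$, with $A = X^{-1}(B)$ the inverse image in $S$ of a set $B$ on the real line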
Example 1
Consider the experiment of tossing a fair coin twice.
The sample space is $S = \{HH, HT, TH, TT\}$. Assume $F = \mathcal{P}(S)$, the power set of $S$, and define a random variable $X$
as follows.
Sample point $s$ | $X(s)$
HH | 0
HT | 1
TH | 2
TT | 3
Here
$$X^{-1}((-\infty, x]) = \begin{cases} \emptyset & x < 0 \\ \{HH\} & 0 \le x < 1 \\ \{HH, HT\} & 1 \le x < 2 \\ \{HH, HT, TH\} & 2 \le x < 3 \\ S & x \ge 3 \end{cases}$$
Example 2
Consider the sample space associated with a single toss of a fair die. The sample space is
given by $S = \{'1', '2', '3', '4', '5', '6'\}$.
Define $X$ to be the mapping that associates with each outcome the real number equal to the number on the face of the
die, and consider $F = \mathcal{P}(S)$. It is easy to verify that for any $x$, $X^{-1}((-\infty, x]) \in F$. Therefore, $X$
is a random variable with range $R_X = \{1, 2, 3, 4, 5, 6\}$.
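Remark
$P_X$ is a probability measure on the Borel sigma-algebra of the real line: it satisfies the three axioms of probability, as verified below.
(2)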
Axiom 1
$0 \le P_X(B) = P(X^{-1}(B)) \le 1$, because $X^{-1}(B) \in F$.
Axiom 2
$P_X(\mathbb{R}) = P(X^{-1}(\mathbb{R})) = P(S) = 1$
Axiom 3
Suppose $B_1, B_2, \dots$ are disjoint Borel sets. Then $X^{-1}(B_1), X^{-1}(B_2), \dots$ are also disjoint
sets in $F$. Therefore,
$$P_X\Big(\bigcup_{i=1}^{\infty} B_i\Big) = P\Big(X^{-1}\Big(\bigcup_{i=1}^{\infty} B_i\Big)\Big) = P\Big(\bigcup_{i=1}^{\infty} X^{-1}(B_i)\Big) = \sum_{i=1}^{\infty} P_X(B_i)$$
Thus a Borel set $B$ and the event
$\{s \mid X(s) \in B\} \in F$ are equivalent, and $P_X(B) = P(\{s \mid X(s) \in B\})$. The underlying sample space is
omitted in the notation, and we simply write $\{X \in B\}$ and $P(\{X \in B\})$ instead of $\{s \mid X(s) \in B\}$ and
$P(\{s \mid X(s) \in B\})$ respectively whenever there is no confusion.
Once we have a representation for PX , we can model random data on the real line without being
concerned about the original probability space. Indeed, we model random data in terms of
random variables without being aware of the underlying sample space.
The probability measure PX as defined on any Borel set is not easy to handle. We have better
representations in terms of probability functions that are defined at each point on the real line.
We first introduce one such function, namely, the probability distribution function.
Probability distribution function
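Consider the Borel set $(-\infty, x]$, where $x$ represents any real number. The equivalent event in $F$
is given by
$$X^{-1}((-\infty, x]) = \{s \mid X(s) \le x, \; s \in S\}$$
and is denoted as $\{X \le x\}$. Other events involving $X$ can be expressed in terms of events of this form:
$$\{X > x\} = \{X \le x\}^c,$$
$$\{x_1 < X \le x_2\} = \{X \le x_2\} \setminus \{X \le x_1\},$$
$$\{X = x\} = \bigcap_{n=1}^{\infty} \Big( \{X \le x\} \setminus \Big\{X \le x - \frac{1}{n}\Big\} \Big)$$
and so on.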
The probability distribution function of the random variable $X$ is a function defined by
$$F_X(x) = P_X((-\infty, x]) = P(\{s \mid X(s) \le x, \; s \in S\}) = P(\{X \le x\}) \qquad (3)$$
for all $x \in \mathbb{R}$. It is also called the cumulative distribution function, abbreviated as CDF.
The notation $F_X(x)$ is used to denote the CDF of the RV $X$ at a point $x$.
Example 3
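Consider the random variable $X$ in Example 1.
Assigning equal probabilities to each elementary event in $S$, we have
$$P(\{X = 0\}) = P(\{X = 1\}) = P(\{X = 2\}) = P(\{X = 3\}) = \frac{1}{4}.$$
$F_X(x)$ is computed as follows.
$$F_X(x) = \begin{cases} 0 & x < 0 \\ \frac{1}{4} & 0 \le x < 1 \\ \frac{1}{2} & 1 \le x < 2 \\ \frac{3}{4} & 2 \le x < 3 \\ 1 & x \ge 3 \end{cases}$$
Figure 2 The CDF of the random variable in Example 1 and Example 3 (X and Y axes are to be
scaled)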
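The CDF of Example 3 can be checked numerically. The following Python sketch sums the PMF of the random variable of Example 1 over all values not exceeding $x$; the names `pmf` and `cdf` are illustrative and do not come from the text.

```python
from fractions import Fraction

# PMF of X from Example 1: the values 0, 1, 2, 3 each carry probability 1/4.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 4)}

def cdf(x):
    """F_X(x) = P({X <= x}): sum the PMF over all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

for x in (-0.5, 0, 1.5, 2, 10):
    print(x, cdf(x))
```

Exact fractions are used so that the staircase values can be compared with the hand computation without rounding error.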
The distribution function carries all the information about the random variable. Its properties are
summarized in the following theorem.
Theorem 1
If FX ( x) is the probability distribution function of a random variable X , then
(a) FX (x) is a non-decreasing function of x.
(b) FX (x) is right continuous.
(c) $F_X(-\infty) = \lim_{x \to -\infty} F_X(x) = 0$.
(d) $F_X(\infty) = \lim_{x \to \infty} F_X(x) = 1$.
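Proof:
(a) To show that $F_X(x)$ is a non-decreasing function of $x$, suppose $x_1 < x_2$. Then
$$(-\infty, x_1] \subseteq (-\infty, x_2]$$
$$\Rightarrow P_X((-\infty, x_1]) \le P_X((-\infty, x_2])$$
$$\Rightarrow F_X(x_1) \le F_X(x_2)$$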
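(b) Suppose $\{x_n\}$ is a decreasing sequence such that $\lim_{n \to \infty} x_n = x$. Thus $x_n$ approaches $x$ from the right. Applying the continuity property of probability measures,
$$\lim_{x_n \to x^+} F_X(x_n) = \lim_{n \to \infty} F_X(x_n) = \lim_{n \to \infty} P_X((-\infty, x_n]) = P_X\Big(\bigcap_{n=1}^{\infty} (-\infty, x_n]\Big) = P_X((-\infty, x]) = F_X(x).$$
Thus $F_X(x)$ is continuous from the right and we write $F_X(x^+) = F_X(x)$.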
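(c) Suppose $\{x_n\}$ is a decreasing sequence such that $x_n \to -\infty$. Applying the continuity property of probability measures,
$$\lim_{x_n \to -\infty} F_X(x_n) = \lim_{n \to \infty} F_X(x_n) = \lim_{n \to \infty} P_X((-\infty, x_n]) = P_X\Big(\bigcap_{n=1}^{\infty} (-\infty, x_n]\Big) = P_X(\emptyset) = 0.$$
Therefore $F_X(-\infty) = \lim_{x \to -\infty} F_X(x) = 0$.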
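(d) Suppose $\{x_n\}$ is an increasing sequence such that $x_n \to \infty$. Applying the continuity property of probability measures,
$$\lim_{x_n \to \infty} F_X(x_n) = \lim_{n \to \infty} F_X(x_n) = \lim_{n \to \infty} P_X((-\infty, x_n]) = P_X\Big(\bigcup_{n=1}^{\infty} (-\infty, x_n]\Big) = P_X(\mathbb{R}) = 1.$$
Therefore $F_X(\infty) = \lim_{x \to \infty} F_X(x) = 1$.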
We can compute the probability of any Borel set in terms of the CDF. Particularly, the following
results are very important.
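(i) $P(\{x_1 < X \le x_2\}) = F_X(x_2) - F_X(x_1)$.
(4)
Proof:
$\{X \le x_1\}$ and $\{x_1 < X \le x_2\}$ are two mutually exclusive events such that
$\{X \le x_1\} \cup \{x_1 < X \le x_2\} = \{X \le x_2\}$.
Therefore,
$P(\{X \le x_1\}) + P(\{x_1 < X \le x_2\}) = P(\{X \le x_2\})$, which gives $P(\{x_1 < X \le x_2\}) = F_X(x_2) - F_X(x_1)$.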
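(ii) $P(\{x_1 \le X \le x_2\}) = F_X(x_2) - F_X(x_1) + P(\{X = x_1\})$.
(5)
Proof: We have
$\{x_1 \le X \le x_2\} = \{x_1 < X \le x_2\} \cup \{X = x_1\}$, where the two events on the right are mutually exclusive. Therefore,
$P(\{x_1 \le X \le x_2\}) = P(\{x_1 < X \le x_2\}) + P(\{X = x_1\})$
$= F_X(x_2) - F_X(x_1) + P(\{X = x_1\})$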
(6)
(iv) $P(\{X > x\}) = 1 - F_X(x)$.
(7)
Proof:
$\{X \le x\}$ and $\{X > x\}$ are two mutually exclusive events such that
$\{X \le x\} \cup \{X > x\} = S$.
Therefore,
$P(\{X \le x\}) + P(\{X > x\}) = P(S) = 1$
$\Rightarrow F_X(x) + P(\{X > x\}) = 1$
$\Rightarrow P(\{X > x\}) = 1 - F_X(x)$.
(v) $P(\{X = x\}) = \lim_{\epsilon \to 0^+} P(\{x - \epsilon < X \le x\}) = F_X(x) - F_X(x^-)$,
(8)
where $F_X(x^-) = \lim_{\epsilon \to 0^+} F_X(x - \epsilon)$.
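The results numbered (4), (5), (7) and (8) can be illustrated with a short Python sketch. It reuses the discrete random variable of Example 1; the helper names (`cdf_left`, `prob_half_open`, and so on) are hypothetical, and the left limit $F_X(x^-)$ is computed by summing the PMF over values strictly below $x$, which is valid only for a discrete random variable.

```python
from fractions import Fraction

pmf = {k: Fraction(1, 4) for k in range(4)}  # X of Example 1

def cdf(x):
    """F_X(x) = P({X <= x})."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

def cdf_left(x):
    """F_X(x^-): for a discrete RV, the PMF summed over values strictly below x."""
    return sum((p for v, p in pmf.items() if v < x), Fraction(0))

def prob_half_open(x1, x2):
    """P({x1 < X <= x2}) = F_X(x2) - F_X(x1), as in (4)."""
    return cdf(x2) - cdf(x1)

def prob_point(x):
    """P({X = x}) = F_X(x) - F_X(x^-), as in (8)."""
    return cdf(x) - cdf_left(x)

def prob_closed(x1, x2):
    """P({x1 <= X <= x2}) = F_X(x2) - F_X(x1) + P({X = x1}), as in (5)."""
    return prob_half_open(x1, x2) + prob_point(x1)

def prob_greater(x):
    """P({X > x}) = 1 - F_X(x), as in (7)."""
    return 1 - cdf(x)
```

For instance, `prob_closed(0, 2)` adds the point mass at 0 to `prob_half_open(0, 2)`, which is exactly the difference between a closed and a half-open interval.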
We have seen that given $F_X(x)$, $-\infty < x < \infty$, we can determine the probability of any event
involving values of the random variable $X$. Thus $F_X(x)$, $\forall x \in \mathbb{R}$, is a complete description of the
random variable $X$. Two random variables $X$ and $Y$ are called identically distributed if
$F_X(x) = F_Y(x) \;\; \forall x$.
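Example 4
Consider the random variable $X$ with the distribution function
$$F_X(x) = \begin{cases} 0 & x < -2 \\ \frac{1}{8}x + \frac{1}{4} & -2 \le x < 0 \\ 1 & x \ge 0 \end{cases}$$
Find
(c) $P(\{X > -2\})$
(d) $P(\{-1 < X \le 1\})$
Solution:
(c) $P(\{X > -2\}) = 1 - F_X(-2) = 1$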
We have observed that random variables are completely characterized by the distribution
function FX ( x). They can be classified into discrete, continuous and mixed-type random
variables according to the nature of the distribution function. Such classification helps in
studying the properties of random variables. For a discrete random variable $X$, $F_X(x)$ is piecewise constant with jump discontinuities at a countable number of points. If $X$ is continuous, then
$F_X(x)$ is a continuous function of $x$. In the case of a mixed-type random variable $X$, $F_X(x)$ has
jump discontinuities at a countable number of points and increases continuously over at least one
interval of values of $X$. Typical plots of $F_X(x)$ for discrete, continuous and mixed-type random
variables are shown in Figure 4. We shall give more formal definitions of these classes and
their characterisations in subsequent sections.
Figure 4 (a) $F_X(x)$ for a discrete random variable
Figure 4 (b) $F_X(x)$ for a continuous random variable
Figure 4 (c) $F_X(x)$ for a mixed-type random variable
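For a discrete random variable $X$ with range $R_X$, the probability mass function (PMF) $p_X(x_i)$ is defined by
$$p_X(x_i) = P(\{s \mid X(s) = x_i\}) = P(\{X = x_i\}) \qquad (9)$$
for each $x_i \in R_X$.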
The PMF of a discrete random variable X has the following important properties which can be easily
proved.
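1. $p_X(x_i) \ge 0 \;\; \forall x_i \in R_X$ (10)
2. $\sum_{x_i \in R_X} p_X(x_i) = 1$ (11)
3. Suppose $B \subseteq R_X$. Then $P(\{X \in B\}) = \sum_{x_i \in B} p_X(x_i)$ (12)
The CDF and the PMF are related by
$$F_X(x) = \sum_{x_i \le x} p_X(x_i) \qquad (13)$$
and
$$p_X(x_i) = F_X(x_i) - F_X(x_i^-) \qquad (14)$$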
The PMF of a random variable can be graphically represented by a column diagram as illustrated in
Figure 5.
Figure 5 The PMF of a discrete RV, with $p_X(x_i)$ plotted against $x_i$
Example 5
Consider the random variable $X$ defined by
$X(s) = c \;\; \forall s \in S$.
Then $X$ is a discrete RV with the PMF
$p_X(c) = P(\{X = c\}) = 1$.
Example 6
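Consider the random variable $X$ with the distribution function
$$F_X(x) = \begin{cases} 0 & x < 0 \\ \frac{1}{4} & 0 \le x < 1 \\ \frac{1}{2} & 1 \le x < 2 \\ 1 & x \ge 2 \end{cases}$$
The CDF is a staircase with steps at $x = 0, 1, 2$, taking the values $\frac{1}{4}$, $\frac{1}{2}$ and $1$. The PMF is obtained from the jumps of $F_X(x)$:
$$p_X(0) = F_X(0) - F_X(0^-) = \frac{1}{4}$$
$$p_X(1) = F_X(1) - F_X(1^-) = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}$$
$$p_X(2) = F_X(2) - F_X(2^-) = 1 - \frac{1}{2} = \frac{1}{2}$$
We shall describe some useful discrete probability mass functions in a later chapter.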
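Relation (14) suggests a simple way to recover a PMF from a staircase CDF. The sketch below applies it to the CDF of Example 6; the function `jump` and the step size `eps` are illustrative choices, and the approximation $F_X(x^-) \approx F_X(x - \epsilon)$ is exact here only because the jump points are more than `eps` apart.

```python
from fractions import Fraction

def cdf(x):
    """Staircase CDF of Example 6."""
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1, 4)
    if x < 2:
        return Fraction(1, 2)
    return Fraction(1)

def jump(F, x, eps=Fraction(1, 1000)):
    """Jump of F at x, F(x) - F(x - eps); stands in for F(x) - F(x^-)."""
    return F(x) - F(x - eps)

# Candidate jump points are taken from the definition of the CDF.
pmf = {x: jump(cdf, x) for x in (0, 1, 2)}
print(pmf)
```

The recovered masses sum to one, as property (11) requires.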
Continuous random variables and probability density functions
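Definition: A random variable $X$ defined on the probability space $(S, F, P)$ is said to be continuous if its distribution function can be written as
$$F_X(x) = \int_{-\infty}^{x} f_X(u)\,du \qquad (15)$$
for some function $f_X(x)$, called the probability density function (PDF) of $X$. Wherever $F_X(x)$ is differentiable,
$$f_X(x) = \frac{d}{dx} F_X(x) \qquad (16)$$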
Interpretation of f X (x)
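If $f_X(x)$ is a continuous function of $x$, then it has the following interpretation.
Consider a point $x_0$. Then
$$f_X(x_0) = \frac{d F_X(x)}{dx}\bigg|_{x = x_0} = \lim_{\Delta x \to 0} \frac{F_X(x_0 + \Delta x) - F_X(x_0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{P(\{x_0 < X \le x_0 + \Delta x\})}{\Delta x}$$
so that, for small $\Delta x$,
$$P(\{x_0 < X \le x_0 + \Delta x\}) \approx f_X(x_0)\,\Delta x.$$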
See also the illustration in Figure 7. Thus the probability of $X$ lying in a small interval
$(x_0, x_0 + \Delta x]$ is determined by $f_X(x_0)$. In that sense, $f_X(x)$ represents the concentration of
probability just as the density of an object represents the concentration of mass.
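Figure 7 Interpretation of $f_X(x_0)\,\Delta x_0$ as the probability of $X$ lying in $(x_0, x_0 + \Delta x_0]$
The PDF has the following properties.
(a) $f_X(x) \ge 0$ for all $x$, because $F_X(x)$ is non-decreasing.
(17)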
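(b) $P(\{X = x\}) = 0$
(18)
Proof: Since $F_X(x)$ is continuous, $F_X(x) = F_X(x^-) \;\; \forall x$, and therefore $P(\{X = x\}) = F_X(x) - F_X(x^-) = 0$.
(c) $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
(19)
Proof:
$$\int_{-\infty}^{\infty} f_X(x)\,dx = F_X(\infty) = 1$$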
(d) $P(\{x_1 < X \le x_2\}) = \int_{x_1}^{x_2} f_X(x)\,dx$
(20)
Proof:
$$P(\{x_1 < X \le x_2\}) = F_X(x_2) - F_X(x_1) = \int_{-\infty}^{x_2} f_X(x)\,dx - \int_{-\infty}^{x_1} f_X(x)\,dx = \int_{x_1}^{x_2} f_X(x)\,dx$$
Example 7
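Consider the random variable $X$ with the distribution function
$$F_X(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-ax} & a > 0, \; x \ge 0 \end{cases}$$
The PDF of the RV is given by
$$f_X(x) = \begin{cases} 0 & x < 0 \\ a e^{-ax} & a > 0, \; x \ge 0 \end{cases}$$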
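Properties (19) and (20) can be checked numerically for the exponential density of Example 7. The sketch below is only an illustration: the rate `a = 2.0` is an assumed value, `integrate` is a hypothetical midpoint-rule helper, and the upper limit 40 stands in for infinity because the tail beyond it is negligible.

```python
import math

a = 2.0  # assumed rate parameter, chosen only for illustration

def pdf(x):
    """f_X(x) = a * exp(-a x) for x >= 0, and 0 otherwise."""
    return a * math.exp(-a * x) if x >= 0 else 0.0

def cdf(x):
    """F_X(x) = 1 - exp(-a x) for x >= 0, and 0 otherwise."""
    return 1.0 - math.exp(-a * x) if x >= 0 else 0.0

def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Property (20): the integral of the PDF over (x1, x2] matches F_X(x2) - F_X(x1).
print(integrate(pdf, 0.5, 1.5), cdf(1.5) - cdf(0.5))
# Property (19): the total area under the PDF is (numerically) one.
print(integrate(pdf, 0.0, 40.0))
```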
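Example 8
Consider the random variable $X$ with the probability density function
$$f_X(x) = \begin{cases} \dfrac{a}{\sqrt{x}} & 0 < x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
Determine $a$ and $F_X(x)$.
Solution: We have
$$\int_{-\infty}^{\infty} f_X(x)\,dx = 1 \;\Rightarrow\; \int_{0}^{1} \frac{a}{\sqrt{x}}\,dx = 2a = 1 \;\Rightarrow\; a = \frac{1}{2}.$$
By integrating,
$$F_X(x) = \begin{cases} 0 & x \le 0 \\ \sqrt{x} & 0 < x < 1 \\ 1 & x \ge 1 \end{cases}$$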
Remark
Using the Dirac delta function, we can define the density function for a discrete random variable.
Consider the random variable $X$ defined by the PMF $p_X(x_i)$ for $i = 1, 2, \dots, N$.
The CDF $F_X(x)$ can be written as
$$F_X(x) = \sum_{i=1}^{N} p_X(x_i)\,u(x - x_i)$$
where $u(x - x_i)$ is the unit step function
$$u(x - x_i) = \begin{cases} 1 & \text{for } x \ge x_i \\ 0 & \text{otherwise} \end{cases}$$
Then the density function $f_X(x)$ can be written in terms of the Dirac delta function as
$$f_X(x) = \sum_{i=1}^{N} p_X(x_i)\,\delta(x - x_i)$$
Example 9
Consider the random variable defined in Example 6. The distribution function $F_X(x)$ can be
written as
$$F_X(x) = \frac{1}{4}u(x) + \frac{1}{4}u(x - 1) + \frac{1}{2}u(x - 2)$$
and the PDF is given by
$$f_X(x) = \frac{1}{4}\delta(x) + \frac{1}{4}\delta(x - 1) + \frac{1}{2}\delta(x - 2)$$
Definition: Consider the event $\{X \le x\}$ and any event $B$. The conditional distribution function of
$X$ given $B$ is defined as
$$F_{X/B}(x) = P(\{X \le x\} / B) \qquad (21)$$
Thus
$$F_{X/B}(x) = \frac{P(\{X \le x\} \cap B)}{P(B)}, \qquad P(B) \ne 0 \qquad (22)$$
We can verify that FX / B ( x ) satisfies all the properties of the distribution function. Particularly,
the following properties are important.
(1) $F_{X/B}(-\infty) = 0$ and $F_{X/B}(\infty) = 1$.
(2) $0 \le F_{X/B}(x) \le 1$
(3) $F_{X/B}(x)$ is a non-decreasing and right-continuous function of $x$.
(4) $P(\{x_1 < X \le x_2\} / B) = F_{X/B}(x_2) - F_{X/B}(x_1)$
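In a similar manner, we can define the conditional density function $f_{X/B}(x)$ through
$$F_{X/B}(x) = \int_{-\infty}^{x} f_{X/B}(u)\,du \qquad (23)$$
so that, wherever the derivative exists,
$$f_{X/B}(x) = \frac{d}{dx} F_{X/B}(x) \qquad (24)$$
The conditional density function satisfies the properties of a density function:
(1) $f_{X/B}(x) \ge 0$
(2) $\int_{-\infty}^{\infty} f_{X/B}(x)\,dx = F_{X/B}(\infty) = 1$
(3) $P(\{x_1 < X \le x_2\} / B) = \int_{x_1}^{x_2} f_{X/B}(x)\,dx$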
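Example 10
Suppose $X$ is a random variable with the distribution function $F_X(x)$ and $B = \{X \le b\}$. Then
$$F_{X/B}(x) = \frac{P(\{X \le x\} \cap B)}{P(B)} = \frac{P(\{X \le x\} \cap \{X \le b\})}{P(\{X \le b\})} = \frac{P(\{X \le x\} \cap \{X \le b\})}{F_X(b)}$$
Case 1: If $x < b$, then
$\{X \le x\} \cap \{X \le b\} = \{X \le x\}$ and
$$F_{X/B}(x) = \frac{P(\{X \le x\} \cap \{X \le b\})}{F_X(b)} = \frac{P(\{X \le x\})}{F_X(b)} = \frac{F_X(x)}{F_X(b)}$$
and the corresponding conditional PDF is given by
$$f_{X/B}(x) = \frac{f_X(x)}{F_X(b)}$$
Case 2: If $x \ge b$, then
$\{X \le x\} \cap \{X \le b\} = \{X \le b\}$ and
$$F_{X/B}(x) = \frac{P(\{X \le x\} \cap \{X \le b\})}{F_X(b)} = \frac{P(\{X \le b\})}{F_X(b)} = \frac{F_X(b)}{F_X(b)} = 1$$
and $f_{X/B}(x) = 0$.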
For given FX ( x) and b, typical plots of FX / B ( x ) and f X / B ( x ) are shown in Figure 8.
Figure 8 (a) Plot of $F_{X/B}(x)$ and $F_X(x)$
Figure 8 (b) Plot of $f_{X/B}(x)$ and $f_X(x)$
Example 11
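Suppose $X$ is a random variable with the distribution function $F_X(x)$ and $B = \{X > b\}$.
Then
$$F_{X/B}(x) = \frac{P(\{X \le x\} \cap B)}{P(B)} = \frac{P(\{X \le x\} \cap \{X > b\})}{P(\{X > b\})} = \frac{P(\{X \le x\} \cap \{X > b\})}{1 - F_X(b)}$$
Case 1: If $x \le b$, we have $\{X \le x\} \cap \{X > b\} = \emptyset$. Therefore
$F_{X/B}(x) = 0$.
Case 2: If $x > b$, we have $\{X \le x\} \cap \{X > b\} = \{b < X \le x\}$. Therefore
$$F_{X/B}(x) = \frac{P(\{b < X \le x\})}{1 - F_X(b)} = \frac{F_X(x) - F_X(b)}{1 - F_X(b)}$$
Thus,
$$F_{X/B}(x) = \begin{cases} 0 & x \le b \\ \dfrac{F_X(x) - F_X(b)}{1 - F_X(b)} & \text{otherwise} \end{cases}$$
and the corresponding conditional PDF is
$$f_{X/B}(x) = \begin{cases} 0 & x \le b \\ \dfrac{f_X(x)}{1 - F_X(b)} & \text{otherwise} \end{cases}$$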
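The two conditioning events of Examples 10 and 11 translate directly into code. The following Python sketch builds the conditional CDFs for $B = \{X \le b\}$ and $B = \{X > b\}$ from any CDF `F`; the function names are hypothetical, and the exponential CDF with unit rate is an assumption used only to have something concrete to evaluate.

```python
import math

def F(x):
    """Assumed CDF for illustration: exponential with unit rate."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def cdf_given_le(F, b):
    """Conditional CDF of X given B = {X <= b}, as derived in Example 10."""
    Fb = F(b)
    return lambda x: F(x) / Fb if x < b else 1.0

def cdf_given_gt(F, b):
    """Conditional CDF of X given B = {X > b}, as derived in Example 11."""
    Fb = F(b)
    return lambda x: 0.0 if x <= b else (F(x) - Fb) / (1.0 - Fb)

G = cdf_given_le(F, 1.0)
H = cdf_given_gt(F, 1.0)
print(G(0.5), G(1.0), H(1.0), H(3.0))
```

With the exponential assumption, `H(3.0)` agrees with `F(2.0)` up to rounding, which reflects the memoryless property of that particular distribution rather than anything general about conditioning.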
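Example 12
Suppose $X$ is a random variable with the probability density function
$$f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$
and $B = \{-1 \le X \le 1\}$. Then
$$F_{X/B}(x) = \frac{P(\{X \le x\} \cap B)}{P(B)} = \frac{P(\{X \le x\} \cap \{-1 \le X \le 1\})}{P(\{-1 \le X \le 1\})} = \frac{P(\{X \le x\} \cap \{-1 \le X \le 1\})}{\int_{-1}^{1} f_X(x)\,dx}$$
Therefore,
$$F_{X/B}(x) = \begin{cases} 0 & x \le -1 \\ \dfrac{F_X(x) - F_X(-1)}{\int_{-1}^{1} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx} & -1 < x < 1 \\ 1 & x \ge 1 \end{cases}$$
and
$$f_{X/B}(x) = \begin{cases} \dfrac{f_X(x)}{\int_{-1}^{1} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx} & -1 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$
where
$$f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$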
Remark
The density function $f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ is the standard Gaussian density function, which we
shall discuss in a later chapter. The conditional density function $f_{X/B}(x)$ in this case is called
the truncated Gaussian. These density functions are plotted in Figure 9.
Figure 9 Plot of the standard Gaussian density $f_X(x)$ and the truncated Gaussian density $f_{X/B}(x)$
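The truncated Gaussian of Example 12 can be evaluated with the standard library alone. In the sketch below the names `phi`, `Phi`, `trunc_cdf` and `trunc_pdf` are illustrative; the Gaussian CDF is expressed through `math.erf`, and the normalising constant `P_B` is the probability of $\{-1 \le X \le 1\}$.

```python
import math

def phi(x):
    """Standard Gaussian density f_X(x)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard Gaussian CDF F_X(x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

P_B = Phi(1.0) - Phi(-1.0)  # P({-1 <= X <= 1})

def trunc_cdf(x):
    """Conditional CDF of X given B = {-1 <= X <= 1}."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return (Phi(x) - Phi(-1.0)) / P_B

def trunc_pdf(x):
    """Conditional (truncated Gaussian) density of X given B."""
    return phi(x) / P_B if -1.0 < x < 1.0 else 0.0

print(P_B, trunc_cdf(0.0), trunc_pdf(0.0), phi(0.0))
```

Inside the interval the truncated density is the Gaussian density scaled up by $1/P(B)$, so `trunc_pdf(0.0)` exceeds `phi(0.0)`.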
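Suppose the events $B_1, B_2, \dots, B_n$ form a partition of the sample space $S$, that is,
$$S = \bigcup_{i=1}^{n} B_i, \qquad B_i \cap B_j = \emptyset \text{ for } i \ne j$$
and
$$\{X \le x\} = \bigcup_{i=1}^{n} \big( \{X \le x\} \cap B_i \big)$$
Then
$$P(\{X \le x\}) = \sum_{i=1}^{n} P(B_i)\,P(\{X \le x\} / B_i) = \sum_{i=1}^{n} P(B_i)\,F_{X/B_i}(x)$$
so that
$$F_X(x) = \sum_{i=1}^{n} P(B_i)\,F_{X/B_i}(x) \qquad (25)$$
Equivalently, $f_X(x) = \sum_{i=1}^{n} P(B_i)\,f_{X/B_i}(x)$.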
Applying Bayes' rule,
$$P(B_j / \{X \le x\}) = \frac{P(B_j \cap \{X \le x\})}{P(\{X \le x\})} = \frac{P(B_j)\,P(\{X \le x\} / B_j)}{F_X(x)} = \frac{P(B_j)\,F_{X/B_j}(x)}{\sum_{i=1}^{n} P(B_i)\,F_{X/B_i}(x)} \qquad (26)$$
Conditional probability P ( B / { X = x} )
We may also be interested in finding the conditional probability of the event B given that the
event { X = x} has occurred. As P ({ X = x} ) = 0 for a continuous random variable, we cannot
find P ( B / { X = x} ) by the relation
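$$P(B / \{X = x\}) = \frac{P(B \cap \{X = x\})}{P(\{X = x\})}$$
Instead, we define it as a limit:
$$P(B / \{X = x\}) = \lim_{\Delta x \to 0} \frac{P(B \cap \{x < X \le x + \Delta x\})}{P(\{x < X \le x + \Delta x\})} = \lim_{\Delta x \to 0} \frac{P(B)\,P(\{x < X \le x + \Delta x\} / B)}{P(\{x < X \le x + \Delta x\})} = \lim_{\Delta x \to 0} \frac{P(B)\,f_{X/B}(x)\,\Delta x}{f_X(x)\,\Delta x} = \frac{f_{X/B}(x)\,P(B)}{f_X(x)}$$
Thus
$$P(B / \{X = x\}) = \frac{f_{X/B}(x)\,P(B)}{f_X(x)}$$
Multiplying both sides by $f_X(x)$ and integrating,
$$\int_{-\infty}^{\infty} P(B / \{X = x\})\,f_X(x)\,dx = \int_{-\infty}^{\infty} f_{X/B}(x)\,P(B)\,dx = P(B) \int_{-\infty}^{\infty} f_{X/B}(x)\,dx = P(B) \qquad \Big(\because \int_{-\infty}^{\infty} f_{X/B}(x)\,dx = 1\Big)$$
so that
$$P(B) = \int_{-\infty}^{\infty} P(B / \{X = x\})\,f_X(x)\,dx$$
and
$$f_{X/B}(x) = \frac{f_X(x)\,P(B / \{X = x\})}{\int_{-\infty}^{\infty} P(B / \{X = x\})\,f_X(x)\,dx}$$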
Example 13
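A random variable $X$ has the CDF
$$F_X(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-2x} & x \ge 0 \end{cases}$$
Suppose $A = \{0 \le X \le 2\}$ and $B = \{2 < X < \infty\}$. Find
(a) $F_{X/A}(x)$ and $F_{X/B}(x)$
(b) $P(A / \{X \le 5\})$ and $P(B / \{X \le 5\})$
Solution:
(a) Here $P(A) = 1 - e^{-4}$ and
$P(B) = 1 - (1 - e^{-4}) = e^{-4}$.
$$F_{X/A}(x) = \frac{P(\{X \le x\} \cap A)}{P(A)} = \begin{cases} \dfrac{F_X(x)}{P(A)} & 0 \le x \le 2 \\ 1 & x > 2 \end{cases}$$
so that
$$F_{X/A}(x) = \begin{cases} \dfrac{1 - e^{-2x}}{1 - e^{-4}} & 0 \le x \le 2 \\ 1 & x > 2 \end{cases}$$
with $F_{X/A}(x) = 0$ for $x < 0$.
Similarly
$$F_{X/B}(x) = \begin{cases} 0 & x \le 2 \\ \dfrac{e^{-4} - e^{-2x}}{e^{-4}} & x > 2 \end{cases}$$
(b) We have
$$P(A / \{X \le 5\}) = \frac{P(A)\,F_{X/A}(5)}{P(A)\,F_{X/A}(5) + P(B)\,F_{X/B}(5)} = \frac{(1 - e^{-4}) \cdot 1}{(1 - e^{-4}) \cdot 1 + e^{-4} \cdot \dfrac{e^{-4} - e^{-10}}{e^{-4}}} = \frac{1 - e^{-4}}{1 - e^{-10}}$$
and
$$P(B / \{X \le 5\}) = \frac{e^{-4} - e^{-10}}{1 - e^{-10}}$$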
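The numbers in Example 13 can be verified with a few lines of Python. This sketch evaluates the conditional CDFs and then applies (25) and (26) at $x = 5$; the function names are illustrative and not part of the text.

```python
import math

def F(x):
    """CDF of Example 13: F_X(x) = 1 - exp(-2x) for x >= 0."""
    return 1.0 - math.exp(-2.0 * x) if x >= 0 else 0.0

P_A = F(2.0) - F(0.0)  # P({0 <= X <= 2}); no point mass at 0
P_B = 1.0 - F(2.0)     # P({2 < X < infinity})

def F_given_A(x):
    """Conditional CDF of X given A."""
    if x < 0:
        return 0.0
    return F(x) / P_A if x <= 2 else 1.0

def F_given_B(x):
    """Conditional CDF of X given B."""
    return 0.0 if x <= 2 else (F(x) - F(2.0)) / P_B

def posterior(x):
    """(P(A / {X <= x}), P(B / {X <= x})) from (26); the denominator is F_X(x) by (25)."""
    num_A = P_A * F_given_A(x)
    num_B = P_B * F_given_B(x)
    total = num_A + num_B
    return num_A / total, num_B / total

p_A, p_B = posterior(5.0)
print(p_A, p_B)
```

The two posterior probabilities add up to one, as they must for a partition of the sample space.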
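For a mixed-type random variable $X$, let $R_D$ denote the countable set of points at which $F_X(x)$ has jumps and $R_C$ the remaining, continuous part of the range, so that
$$\{X \le x\} = (\{X \in R_D\} \cup \{X \in R_C\}) \cap \{X \le x\} = (\{X \in R_D\} \cap \{X \le x\}) \cup (\{X \in R_C\} \cap \{X \le x\})$$
Suppose $0 < p < 1$ such that
$$p = P(R_D) = \sum_{x_i \in R_D} p_X(x_i)$$
Then
$P(R_C) = 1 - p$.
Using the result in (25), $F_X(x)$ can now be expressed as
$$F_X(x) = P(R_D)\,F_{X/R_D}(x) + P(R_C)\,F_{X/R_C}(x) = p\,F_D(x) + (1 - p)\,F_C(x)$$
where $F_D(x) = F_{X/R_D}(x)$ and $F_C(x) = F_{X/R_C}(x)$.
The corresponding PDF is given by
$$f_X(x) = p\,f_D(x) + (1 - p)\,f_C(x)$$
where
$$f_D(x) = \sum_{x_i \in R_D} \frac{p_X(x_i)}{p}\,\delta(x - x_i)$$
is the conditional density given $R_D$ (the factor $1/p$ makes $f_D$ integrate to one)
and $f_C(x) = f_{X/R_C}(x)$.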
Example 14
Consider the random variable $X$ with the distribution function
$$F_X(x) = \begin{cases} 0 & x < 0 \\ 0.1 & x = 0 \\ 0.1 + 0.8x & 0 < x < 1 \\ 1 & x \ge 1 \end{cases}$$
Figure 10 The CDF of the RV in Example 14, with jumps of 0.1 at $x = 0$ and at $x = 1$ (from 0.9 to 1)
The plot of $F_X(x)$ is shown in Figure 10.
Here $R_D = \{0, 1\}$ and
$$p = p_X(0) + p_X(1) = 0.1 + 0.1 = 0.2$$
Therefore, $F_X(x)$ can be expressed as
$$F_X(x) = 0.2\,F_D(x) + 0.8\,F_C(x)$$
where
$$F_D(x) = F_{X/R_D}(x) = \frac{P(\{X \le x\} \cap R_D)}{P(R_D)} = \begin{cases} 0 & x < 0 \\ 0.5 & 0 \le x < 1 \\ 1 & x \ge 1 \end{cases}$$
and
$$F_C(x) = F_{X/R_C}(x) = \frac{P(\{X \le x\} \cap R_C)}{P(R_C)} = \begin{cases} 0 & x < 0 \\ x & 0 \le x \le 1 \\ 1 & x > 1 \end{cases}$$
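The decomposition in Example 14 can be checked point by point. The sketch below encodes $F_X$, $F_D$ and $F_C$ as Python functions (the names and the helper `mixture` are illustrative) and compares $F_X(x)$ with $0.2\,F_D(x) + 0.8\,F_C(x)$; it assumes the right-continuous convention, so the value at each jump point is the upper value.

```python
def F_X(x):
    """Mixed-type CDF of Example 14 (right-continuous at the jumps x = 0 and x = 1)."""
    if x < 0:
        return 0.0
    if x < 1:
        return 0.1 + 0.8 * x
    return 1.0

def F_D(x):
    """Discrete part: conditional CDF given R_D = {0, 1}, each point with mass 0.5."""
    if x < 0:
        return 0.0
    if x < 1:
        return 0.5
    return 1.0

def F_C(x):
    """Continuous part: conditional CDF given R_C, uniform on (0, 1)."""
    if x < 0:
        return 0.0
    if x <= 1:
        return x
    return 1.0

def mixture(x, p=0.2):
    """p F_D(x) + (1 - p) F_C(x), with p = P(R_D)."""
    return p * F_D(x) + (1 - p) * F_C(x)

for x in (-1.0, 0.0, 0.25, 0.5, 0.999, 1.0, 2.0):
    print(x, F_X(x), mixture(x))
```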
Example 15
$X$ is the RV representing the lifetime of a device with the CDF $F_X(x)$ for $x \ge 0$. Define the
following random variable
$$Y = \begin{cases} X & \text{if } 0 \le X \le a \\ a & \text{if } X > a \end{cases}$$
Find $F_Y(y)$.
Solution: We have
$R_D = \{a\}$
$R_C = [0, a)$
$$p = P(\{Y \in R_D\}) = P(\{X > a\}) = 1 - F_X(a)$$
$$F_Y(y) = p\,F_D(y) + (1 - p)\,F_C(y) = (1 - F_X(a))\,F_D(y) + F_X(a)\,F_C(y)$$
where FD ( y ) = {0 =