City University of Hong Kong Dept of Electrical Engineering

EE3313 Applied Queuing Systems

“Probability”

The lecture slides are largely based on the teaching materials provided by
Dr. Tim Ko.

last revised: 9/2019


Managing Uncertainty
• Most life decisions are made under uncertainty.

• In HK, the incidence of lung cancer is 88 per 100,000 (≈ 0.001). One of the tests for lung cancer has an accuracy of 99% (i.e. one wrong result for every 100 tests, or 0.01). A patient has tested positive for lung cancer; what is the probability that the patient actually has lung cancer (i.e. a true positive)?

• Approximate Solution:
The inaccuracy of the test is 0.01; assume all inaccuracies show up as false positives (tested positive but no cancer). We ignore false negatives (tested negative but have cancer) because their occurrence is negligible.

• Proportion of patients who test positive (all positives) = false positives + true positives = 0.01 + 0.001 = 0.011


• If a patient has tested positive, what is the probability that the patient has cancer (a true positive)?
= proportion of people who have cancer / all positives
= 0.001/0.011 = 0.091, i.e. less than 10%.

• Now consider how the answer changes when a patient is referred by the doctor for the test; the doctor, taking the patient's symptoms into account (e.g. coughing for a long time), judges that 30% of referrals are likely to have lung cancer.
In this case, if a patient has tested positive, what is the probability that the patient has cancer?
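A quick numerical check of both scenarios (a minimal Python sketch; it assumes the 99% accuracy applies symmetrically, i.e. a 1% false-positive rate and a 1% false-negative rate):

```python
# Posterior probability of cancer given a positive test, via Bayes' rule.
# Assumes the test's 99% accuracy gives both its sensitivity and specificity.

def posterior_positive(prior, sensitivity=0.99, specificity=0.99):
    """P(cancer | positive) = P(positive | cancer) P(cancer) / P(positive)."""
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)

print(posterior_positive(0.001))  # general population: ~0.090 (< 10%)
print(posterior_positive(0.3))    # doctor's referrals:  ~0.977
```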


Contents
• Events, Sample Space, and Random Variables (RVs)

• Probability, Conditional Probability and Independence

• Probability and Distribution Functions

• Moments

• Discrete RVs: uniform, Bernoulli, geometric, binomial, Poisson

• Continuous RVs: exponential, normal, uniform

• Relationships
– between exponential and geometric
– between binomial and Poisson
– between Poisson and exponential
– between binomial, Poisson and normal


Events, Sample Space, and Random Variables


• Consider a yet to be performed experiment with an uncertain
outcome. The term “experiment” refers to any uncertain event such as
tomorrow’s weather, tomorrow’s share price, or the result of flipping a
coin.
• The sample space is a set of all possible outcomes of an experiment.
• An event is a subset of the sample space.
• Example: an experiment of tossing a die. The sample space is
{1, 2, 3, 4, 5, 6}, and an event could be {2, 4, 6}, which can be
described in words as “the number is even”.
• Two events are mutually exclusive if their intersection is the empty set.
• A set of events is said to be exhaustive if its union is equal to the
sample space.
• A random variable is a real (discrete or continuous) valued function defined on the sample space.


• A random variable has a real value for each possible outcome of an experiment. It is the outcome of the experiment that is random, hence the name: random variable (RV).
• You learned about variables in algebra: e.g., 2X = 4. X is a variable whose value you can determine: X = 2.
• But if X is a random variable, you do not know its value. However, you do know something about X: the likelihood that X takes certain values. This likelihood is given by the probability distribution.
• If we consider an experiment of flipping a coin, the possible outcomes
are Head (H) and Tail (T). Hence the sample space is {H, T}, and a
random variable X could assign X = 1 for the outcome H, and X = 0
for the outcome T.
• If X is a random variable, then Y = g(X) for some function g(·) is also a random variable. In particular, some functions of interest are Y = cX for some constant c and Y = X^n for some integer n.
• If X_1, X_2, X_3, . . . , X_n is a sequence of random variables, then Y = ∑_{i=1}^{n} X_i is also a random variable.


Probability, Conditional Probability and Independence


• S = sample space; A is a subset of S, namely, A is an event.
• The probability of A is the function on S, denoted P (A), that satisfies
the following three axioms:
1. 0 ≤ P (A) ≤ 1.
2. P (S) = 1, sum of all probabilities in S is one.
3. The probability of the union of mutually exclusive events is equal
to the sum of the probabilities of these events.
• Higher probability means higher likelihood of occurrence.
• If we conduct a very large number of experiments and generate a histogram by counting how many times each possible outcome occurs, then normalizing the histogram (dividing each count by the total number of experiments) gives the relative frequencies. These relative frequencies approximate the probability of each outcome.


• P(A | B) = conditional probability of A given B, which is the probability of the event A given that we know event B has occurred.
• If event B has occurred, it becomes our new sample space, and for A to occur, the relevant outcomes of the experiment must be in A ∩ B.
• P(A | B) (the probability of A knowing that B has occurred) is the ratio between the probability of A ∩ B and the probability of B:

P(A | B) = P(A ∩ B) / P(B).   (1)

• Since P(A | B)P(B) = P(A ∩ B) = P(B | A)P(A),

P(A | B) = P(B | A)P(A) / P(B).   (2)

P(B | A) = P(A | B)P(B) / P(A).   (3)
• Now if events A and B are independent (a special case), meaning that if one of them occurs, the probability of the other occurring is not affected, then

P(A ∩ B) = P(A)P(B),
P(A | B) = P(A);  P(B | A) = P(B).
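As a small illustration of Eq. (1) and of (in)dependence, here is a Python sketch using the die example from earlier; the events A = “even” and B = “at least 4” are chosen purely for illustration:

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}      # fair die, each outcome has probability 1/6
A = {2, 4, 6}                          # "the number is even"
B = {4, 5, 6}                          # "the number is at least 4"

def prob(event):
    return Fraction(len(event), len(sample_space))

p_A_given_B = prob(A & B) / prob(B)    # Eq. (1)
print(p_A_given_B)                     # 2/3

# A and B are not independent: P(A ∩ B) != P(A) P(B)
print(prob(A & B), prob(A) * prob(B))  # 1/3 vs 1/4
```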


Probability and Distribution Functions


• Random variables (RVs) are related to outcomes of experiments. {X = x} means the RV X takes the value x, where x is a value assigned to an outcome of an experiment.
• Therefore, we may assign probabilities to all possible values of the random variable. This function, denoted P_X(x) = P(X = x), is called the probability function.
• The distribution function (also called the Cumulative Distribution Function (CDF)) of a random variable X, defined for all x ∈ R (R being the set of all real numbers), is

F_X(x) = P(X ≤ x).

• The complementary distribution function F̄_X(x) is defined by

F̄_X(x) = P(X > x).


• Consequently, for any random variable and every x ∈ R,

F_X(x) + F̄_X(x) = 1.

• The CDF always “starts” at F_X(−∞) = 0.

• The CDF always “finishes” at F_X(+∞) = 1.

• The CDF never decreases: P(a < X ≤ b) = F_X(b) − F_X(a) ≥ 0 for b > a.


Probability Function for a Discrete RV


• P{X = x_i} = probability that random variable X takes value x_i.

• Sum of probabilities over all possible outcomes of the experiment = 1:

∑_{i=0}^{n} P{X = x_i} = P(x_0) + P(x_1) + · · · + P(x_n) = 1

• The CDF at b gives the probability that the random variable X takes a value less than or equal to b:

F_X(b) = P{X ≤ b} = ∑_{i=0}^{b} P{X = x_i} = P(x_0) + P(x_1) + · · · + P(x_b).


Probability Density Function (pdf) for a Continuous RV

• The pdf f_X(u) is the derivative of the CDF:

f_X(u) = dF_X(u)/du,   ∫_{u=−∞}^{u=∞} f_X(u) du = 1

• The probability that a continuous random variable takes a value between a and b:

P(a < X ≤ b) = F_X(b) − F_X(a) = ∫_{u=a}^{u=b} f_X(u) du

• The probability that a continuous random variable takes the value exactly a:

P(X = a) = ∫_{u=a}^{u=a} f_X(u) du = 0
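To see numerically that P(a < X ≤ b) = F_X(b) − F_X(a) equals the integral of the pdf, here is a short sketch; it borrows the exponential density f(u) = µe^{−µu} (introduced later) with the illustrative choice µ = 1:

```python
import math

mu = 1.0
f = lambda u: mu * math.exp(-mu * u)   # pdf
F = lambda u: 1 - math.exp(-mu * u)    # CDF

# Midpoint Riemann-sum approximation of the integral of f over (a, b]
a, b, n = 0.5, 2.0, 100_000
du = (b - a) / n
integral = sum(f(a + (i + 0.5) * du) for i in range(n)) * du

print(F(b) - F(a), integral)           # both ~0.4712
```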


Moments
• For a discrete random variable (RV), the 1st moment, average or mean:

E(X) = X̄ = x_1 P{X = x_1} + · · · + x_n P{X = x_n} = ∑_{i=1}^{n} x_i P{X = x_i}

• For a continuous RV, the 1st moment, average or mean:

E(X) = X̄ = ∫_{−∞}^{∞} u f_X(u) du

• The second moment:

E(X²) = ∑_{i=1}^{n} x_i² P{X = x_i}   (discrete)

or

E(X²) = ∫_{−∞}^{∞} u² f_X(u) du   (continuous)


• The second central moment is called the variance:

Var(X) = σ_X² = E((X − m)²) = E(X²) − m²,  where m = E(X)

• Standard Deviation: σ_X = √Var(X)


• If a and b are constants:

E(aX + b) = aE(X) + b,   Var(aX + b) = a² Var(X)

• If X_1 and X_2 are two RVs:

E(X_1 + X_2) = E(X_1) + E(X_2)

• If X_1 and X_2 are two independent RVs:

Var(X_1 + X_2) = Var(X_1) + Var(X_2)


Example of calculating Mean and Variance


• An RV takes the values x = 0, 1, 2 with equal probabilities (i.e. P{X = x} = 1/3).

• To find the mean,

E(X) = ∑_{i=1}^{n} x_i P{X = x_i} = 0 + 1/3 + 2/3 = 1

• For a discrete RV, the variance is given by

Var(X) = ∑_{i=1}^{n} (x_i − E(X))² P{X = x_i}

• The variance is the sum of 1/3, 0, 1/3, so Var(X) = 2/3
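The same numbers can be verified in a few lines of Python (a sketch of the computation above):

```python
xs = [0, 1, 2]
ps = [1/3, 1/3, 1/3]                   # P{X = x} = 1/3 for each value

mean = sum(x * p for x, p in zip(xs, ps))
var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))

print(mean, var)                       # 1.0  0.666...
```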


Uniform Discrete Random Variable


• All integers in a range of n consecutive integers are equally probable.

• Consider the experiment of tossing a die – the outcomes are 1, 2, 3, 4, 5, 6, each occurring with probability 1/6.

• Find the mean and the variance of this random variable.

• Sketch the probability function and CDF of this random variable.
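A sketch of one way to check the die's answers numerically (mean = 7/2, variance = 35/12 ≈ 2.917):

```python
from fractions import Fraction

outcomes = range(1, 7)                 # die faces, each with probability 1/6
p = Fraction(1, 6)

mean = sum(x * p for x in outcomes)
var = sum((x - mean) ** 2 * p for x in outcomes)

print(mean, var)                       # 7/2  35/12
```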


Bernoulli Random Variable (Discrete)


• Sample Space = {“Success”, “Failure”}

• Success = 1, P(Success) = p

• Failure = 0, P(Failure) = 1 − p

• For a Bernoulli random variable X with parameter p, derive the results for the mean and the variance:

E[X] = p

Var[X] = p(1 − p)


Geometric Distribution
• The geometric random variable X represents the number of independent Bernoulli trials (each with probability p of success) that are required until the first success.
• For X to be equal to i, we must have i − 1 consecutive failures and then one success in i independent Bernoulli trials.
• The probability function is

P(X = i) = (1 − p)^{i−1} p   for i = 1, 2, 3, . . .

• The geometric random variable possesses an important property called memorylessness.
• In particular, a discrete random variable X is memoryless if

P(X > m + n | X > m) = P(X > n),   m, n = 0, 1, 2, . . .

• The geometric random variable is memoryless because it is based on independent Bernoulli trials: the fact that we had m failures does not affect the probability that the next n trials will be failures or not.

• The geometric random variable is the only discrete random variable that is memoryless.

E[X] = 1/p

Var[X] = (1 − p)/p²
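Because P(X > k) = (1 − p)^k, memorylessness can be verified directly; a sketch with the illustrative choice p = 0.3:

```python
p = 0.3                                # success probability per trial

def tail(k):
    """P(X > k) = probability of k consecutive failures."""
    return (1 - p) ** k

m, n = 5, 4
lhs = tail(m + n) / tail(m)            # P(X > m+n | X > m)
rhs = tail(n)                          # P(X > n)
print(lhs, rhs)                        # both 0.2401
```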


Binomial Distribution
• Binomial – the number of successes in S independent Bernoulli trials.

• It is also the sum of S independent, identically distributed (iid) Bernoulli RVs with parameter p.

• Example: probability of finding r busy subscribers out of a total of S subscribers:

P(X = r) = C_r^S p^r (1 − p)^{S−r},   r = 0, 1, 2, . . . , S

where p is the probability that a given subscriber is making a call during the period considered, and

C_r^S = S! / ((S − r)! r!)


• For a binomial random variable X with parameters p and S, derive the results for the mean and the variance:

E[X] = Sp

Var[X] = Sp(1 − p).

Hint: observe that the binomial random variable is a sum of independent Bernoulli random variables.
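A sketch that computes E[X] and Var[X] directly from the probability function and compares them with Sp and Sp(1 − p), using illustrative parameters S = 20, p = 0.25:

```python
from math import comb

S, p = 20, 0.25

pmf = [comb(S, r) * p**r * (1 - p)**(S - r) for r in range(S + 1)]

mean = sum(r * q for r, q in enumerate(pmf))
var = sum((r - mean) ** 2 * q for r, q in enumerate(pmf))

print(mean, S * p)                     # 5.0   5.0
print(var, S * p * (1 - p))            # 3.75  3.75
```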


Figure 1: Example of Binomial Probability Density


(Negative) Exponential Distribution (Continuous RV)

• The density of the exponential random variable is given by

f(x) = µe^{−µx},  x ≥ 0

• The cumulative distribution function is given by

F(x) = ∫_0^x µe^{−µt} dt = 1 − e^{−µx}

• The complementary distribution function is given by

C(x) = 1 − F(x) = e^{−µx}

• Verify that it is a probability distribution:

F(∞) = ∫_0^∞ µe^{−µr} dr = µ(−1/µ)[e^{−µr}]_0^∞ = −[0 − 1] = 1


• If X is an exponential random variable, then:

E[X] = 1/µ,

Var[X] = 1/µ²   (= (E[X])²)

• The exponential distribution has the memoryless property:

P(X > t + s | X > t) = P(X > s).

Proof: from Eq. (1), the definition of conditional probability,

P(X > t + s | X > t) = P({X > t + s} ∩ {X > t}) / P(X > t)
= P(X > t + s) / P(X > t) = e^{−µ(t+s)} / e^{−µt} = e^{−µs} = P(X > s).
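A Monte Carlo sketch of the memoryless property, with illustrative parameters µ = 0.5, t = 1, s = 2:

```python
import math
import random

random.seed(1)
mu, t, s = 0.5, 1.0, 2.0
samples = [random.expovariate(mu) for _ in range(1_000_000)]

survived = [x for x in samples if x > t]
lhs = sum(x > t + s for x in survived) / len(survived)  # P(X > t+s | X > t)
rhs = sum(x > s for x in samples) / len(samples)        # P(X > s)

print(lhs, rhs, math.exp(-mu * s))                      # all ~0.368
```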


Scaling

P(a ≤ r ≤ b) = µ ∫_a^b e^{−µr} dr = µ(−1/µ)[e^{−µr}]_a^b = −[e^{−µr}]_a^b

• Example: when µ is expressed in different units. Let 1/µ be 3 min or 180 sec; then µ is 1/3 customer/min or 1/180 customer/sec.
Find the probability that r is between 60 sec and 61 sec (i.e. between 1 min and 1.0167 min):

P(1 ≤ r ≤ 1.0167) = −[e^{−r/3}]_1^{1.0167} = 0.003978

P(60 ≤ r ≤ 61) = −[e^{−r/180}]_{60}^{61} = 0.003970
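Both computations can be reproduced directly (a sketch); the tiny discrepancy between the two answers comes only from rounding 61/60 to 1.0167:

```python
import math

# per-minute units: mu = 1/3, interval [1, 1.0167] min
p_min = math.exp(-1 / 3) - math.exp(-1.0167 / 3)
# per-second units: mu = 1/180, interval [60, 61] sec
p_sec = math.exp(-60 / 180) - math.exp(-61 / 180)

print(p_min, p_sec)                    # 0.003978  0.003970
```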


Relationship between Exponential and Geometric RVs


• Both are memoryless: the geometric is the only memoryless discrete RV, and the exponential is the only memoryless continuous RV.

• Let X_exp be an exponential random variable with parameter λ and let X_geo be a geometric random variable with parameter p.

• Let δ be an “interval” size used to discretize the continuous values that X_exp takes; we are interested in finding δ such that

F_{X_exp}(nδ) = F_{X_geo}(n),   n = 1, 2, 3, . . .

• To find such a δ, it is more convenient to consider the complementary distributions.

• We aim to find δ that satisfies

P(X_exp > nδ) = P(X_geo > n),   n = 1, 2, 3, . . . ,


or

e^{−λnδ} = (1 − p)^n,   n = 1, 2, 3, . . . ,

which holds when e^{−λδ} = 1 − p. Thus,

δ = −ln(1 − p)/λ   and   p = 1 − e^{−λδ}.

• We can see that as the interval size δ approaches zero, the probability of success p also approaches zero, and under these conditions the two distributions approach each other.
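A quick sketch confirming that with δ = −ln(1 − p)/λ the two complementary distributions agree at every grid point nδ (illustrative values λ = 2, p = 0.3):

```python
import math

lam, p = 2.0, 0.3
delta = -math.log(1 - p) / lam         # matching interval size

for n in (1, 2, 5, 10):
    exp_tail = math.exp(-lam * n * delta)  # P(X_exp > n * delta)
    geo_tail = (1 - p) ** n                # P(X_geo > n)
    print(n, exp_tail, geo_tail)           # identical for every n
```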


Poisson Process
• Example: an experiment generates a random variable X representing the number of telephone calls per minute received by a switch (or the number of outcomes occurring in a given time interval). This is a Poisson process if it has the following properties:
1. The number of outcomes occurring in one time interval is independent of the number that occur in any other disjoint time interval – memoryless.
2. The probability that a single outcome will occur during a very short time interval is proportional to the length of the time interval, and does not depend on the number of outcomes occurring outside this time interval.
3. The probability that more than one outcome will occur in such a short time interval is negligible.
• In this case, the number X of telephone calls received is a Poisson RV, and its probability distribution is the Poisson distribution.


Poisson Distribution
• Poisson Distribution:

P(X = r) = (λ^r / r!) e^{−λ},   r = 0, 1, 2, . . .

• Derive the mean and variance of the Poisson random variable to obtain

E[X] = λ,   Var[X] = λ.

• The probability of r telephone calls arriving within an interval t can be modelled by the Poisson distribution:

P(r, t) = ((λt)^r / r!) e^{−λt}

where λ = call arrival rate [number of calls per sec], and E[X] = Var[X] = λt.
• Consider a sequence of binomial random variables X_n, n = 1, 2, . . . with parameters (n, p), where λ = np, or p = λ/n. (Notice: we use n instead of S because n serves both as a parameter of the binomial distribution and as an index.)


Then the random variable Y with probability function

P(Y = k) = lim_{n→∞} P(X_n = k)

has a Poisson distribution with parameter λ.

Proof:
We write:

lim_{n→∞} P(X_n = k) = lim_{n→∞} (n choose k) p^k (1 − p)^{n−k}.

Substituting p = λ/n, we obtain

lim_{n→∞} P(X_n = k) = lim_{n→∞} [n! / ((n − k)! k!)] (λ/n)^k (1 − λ/n)^{n−k},

or

lim_{n→∞} P(X_n = k) = lim_{n→∞} [n! / ((n − k)! n^k)] (λ^k / k!) (1 − λ/n)^n (1 − λ/n)^{−k}.

Now notice that

lim_{n→∞} (1 − λ/n)^n = e^{−λ},

lim_{n→∞} (1 − λ/n)^{−k} = 1,

lim_{n→∞} n! / ((n − k)! n^k) = 1.

Therefore,

P(Y = k) = lim_{n→∞} P(X_n = k) = λ^k e^{−λ} / k!.

• We have proved that the Poisson random variable accurately approximates the binomial random variable when n is very large and p is very small, so that np is neither too large nor too small.
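A numerical sketch of this limit: the binomial probability P(X_n = k) with p = λ/n approaches the Poisson probability as n grows (illustrative values λ = 4, k = 3):

```python
from math import comb, exp, factorial

lam, k = 4.0, 3

def binom_pmf(n):
    p = lam / n
    return comb(n, k) * p**k * (1 - p)**(n - k)

poisson = lam**k * exp(-lam) / factorial(k)
for n in (10, 100, 1000, 10000):
    print(n, binom_pmf(n))             # converges to the Poisson value
print("Poisson:", poisson)             # ~0.1954
```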


Figure 2: Example of Poisson Probability Density (the parameter here is A instead of λ)


Relation between Poisson and Exponential Distributions

• When there are no arrivals in a Poisson process, i.e. r = 0: P(0, τ) = e^{−λτ} = probability that no calls arrive during τ, where τ is the value of the interarrival time.
• If T is the RV representing the call interarrival time:

P(0, τ) = P(T > τ) = e^{−λτ}

• F(τ) = P(T ≤ τ) = 1 − e^{−λτ}
• The pdf of the call interarrival time is then given by

f(τ) = dF(τ)/dτ = λe^{−λτ};   E(T) = 1/λ

• Poisson arrival process ⇔ exponential interarrival time
• The number of arrivals with negative exponential interarrival times during time t = r arrivals in the Poisson distribution during time t
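A simulation sketch of this equivalence: generate arrivals with exponential interarrival times and check that the count in (0, t] has mean and variance ≈ λt (illustrative values λ = 3, t = 2):

```python
import random

random.seed(1)
lam, t = 3.0, 2.0                      # arrival rate, observation window

def count_arrivals():
    """Count arrivals in (0, t] with exponential interarrival times."""
    clock, n = 0.0, 0
    while True:
        clock += random.expovariate(lam)
        if clock > t:
            return n
        n += 1

counts = [count_arrivals() for _ in range(100_000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(mean, var)                       # both ~ lam * t = 6.0
```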


Normal Distribution
• An RV commonly used in many applications is the normal (also called Gaussian) RV.
• We say that the random variable X has a Gaussian distribution with parameters m and σ² if its density is given by

f_X(x) = (1 / (√(2π) σ)) e^{−(x−m)² / (2σ²)},   −∞ < x < ∞.

• This density is symmetric and bell shaped.
• The wide use of the Gaussian random variable is rooted in the so-called central limit theorem.
• This theorem is the most important result in probability theory.
• Loosely speaking, it says that the sum of a large number of independent random variables (not necessarily of the same distribution, but each with a finite variance) has a Gaussian (normal) distribution.

Probability —37—
City University of Hong Kong Dept of Electrical Engineering

• This is also true if the distributions of these random variables are very different from Gaussian.

• This theorem explains why so many applications in nature have bell-shaped Gaussian histograms, and justifies the use of the Gaussian distribution as their model.
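A small sketch of the central limit theorem in action: standardized sums of iid uniform (0,1) RVs (each with mean 1/2 and variance 1/12) behave like a standard normal, e.g. about 68.3% fall within one standard deviation:

```python
import math
import random

random.seed(1)
n, trials = 30, 100_000
mean, var = n * 0.5, n / 12.0          # mean and variance of the sum

def standardized_sum():
    s = sum(random.random() for _ in range(n))
    return (s - mean) / math.sqrt(var)

zs = [standardized_sum() for _ in range(trials)]
within_one_sigma = sum(abs(z) < 1 for z in zs) / trials
print(within_one_sigma)                # ~0.683, as for N(0, 1)
```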


Binomial, Poisson and Normal


• Binomial, Poisson and Normal approach the same shape under certain
conditions:
– Binomial: S large, p small, Sp(= A) reasonably large (> 10)
– Poisson: large mean A (A > 10)
– Normal: mean = A, variance = A


Figure 3: Binomial, Poisson and Normal


Uniform Continuous Random Variable


• The probability density function of the uniform random variable takes nonnegative values over the interval (a, b) and is given by

f(x) = 1/(b − a) if a < x < b, and f(x) = 0 otherwise.

• Special case – the uniform (0,1) random variable. Its probability density function is given by

f(x) = 1 if 0 < x < 1, and f(x) = 0 otherwise.

• The uniform (0,1) random variable is very important in simulations.

• Almost all computer languages have a function that can generate uniform (0,1) random values.


Applications in Simulations
• You want to find an exponentially distributed (f(r) = µe^{−µr}) random value from a number generated by a uniformly distributed random number generator.

• Let b be a randomly selected number from a uniformly distributed random number generator on [0, 1].

• b can be transformed into an exponentially distributed random number:

τ = −(1/µ) ln(1 − b).
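A sketch of this inverse-transform method with the illustrative choice µ = 2; the sample mean should be close to 1/µ:

```python
import math
import random

random.seed(1)
mu = 2.0

def exp_from_uniform():
    b = random.random()                # uniform (0,1) value
    return -math.log(1 - b) / mu       # exponentially distributed value

samples = [exp_from_uniform() for _ in range(100_000)]
print(sum(samples) / len(samples))     # ~1/mu = 0.5
```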

