Discrete Random Variables and Probability Distributions

3 Discrete Random Variables and Probability Distributions
3.1 Random Variables
Random Variables
Definition
A random variable (rv) is any rule that associates a number with each outcome in the sample space S.
Mathematically, a random variable is a function
(i) whose domain is the sample space S, and
(ii) whose value range is a set of real numbers.
Random Variables
Random variables are denoted by X, Y, etc.
Use x, y to represent specific values of rvs X, Y.
X(s) = x means that x is the value of rv X associated with the outcome s.
Example 3.1
A student calling a help desk for computer support faces two possible outcomes: (i) reaching someone immediately (S, for success), or (ii) being put on hold (F, for failure).
With sample space {S, F}, define rv X by
X(S) = 1, X(F) = 0.
The rv X indicates whether the student can immediately speak to someone (1) or not (0).
Random Variables
Definition
A random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.
Example
Let X denote the outcome of tossing a quarter, with X(W) = 1 and X(E) = 0, where W is the Washington side and E is the Eagle side.
Then X is a Bernoulli rv.
Example 3.3
Observe the number of pumps in use at each of two 6-pump gas stations.
Define rvs X, Y, and U by
X = the total # of pumps in use at the two stations
Y = the difference between the # of pumps in use at station 1 and the # in use at station 2
U = the maximum of the numbers of pumps in use at the two stations
X, Y, and U are rvs.
Example 3.3 contd
For the observation (2, 3), determine the values of X, Y, and U.
X((2, 3)) = 2 + 3 = 5, so the observed value of X is x = 5.
Y((2, 3)) = 2 − 3 = −1, so the observed value of Y is y = −1.
U((2, 3)) = max(2, 3) = 3, so the observed value of U is u = 3.
Example 3.3 contd
X, Y, & U all have the same domain: The sample space
S = {(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6),
(1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 0), (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
There are 49 possible outcomes in S.

Example 3.3 contd
They have different value ranges.
Value range of rv X: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.
Value range of rv Y: {−6, −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, 6}.
Value range of rv U: {0, 1, 2, 3, 4, 5, 6}.
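Example 3.3 can be checked by brute force. The sketch below treats each rv literally as a function on the 49-outcome sample space; the Python names (`sample_space`, `value_range`) are illustrative choices, not part of the slides.

```python
from itertools import product

# Sample space of Example 3.3: (pumps in use at station 1, pumps in use at station 2).
sample_space = list(product(range(7), repeat=2))

# Each random variable is a function that maps an outcome to a number.
def X(s):
    return s[0] + s[1]   # total number of pumps in use

def Y(s):
    return s[0] - s[1]   # station 1 minus station 2

def U(s):
    return max(s)        # larger of the two counts

# The value range of an rv is the set of values it takes over the sample space.
value_range = {rv.__name__: sorted({rv(s) for s in sample_space}) for rv in (X, Y, U)}

print(len(sample_space))
print(X((2, 3)), Y((2, 3)), U((2, 3)))
print(value_range)
```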
Two Types of Random Variables
Definitions
A discrete random variable is one whose value range is a finite set or a countably infinite set of numbers.
X, Y, and U in Example 3.3 are all discrete random variables.
A continuous random variable is one
(1) whose value range includes all numbers in a single interval or in the union of several disjoint intervals (e.g., [0, 10] ∪ [20, 30]), and
(2) for which P(X = c) = 0 for any c in the range of rv X.
Remark: P(X = c) = 0 also holds for any c NOT in the range of rv X.

Example 3.5
Randomly select a spot in the continental US and find its height above sea level, denoted by Y.
Y is a continuous rv.
The domain of Y is the sample space S = continental US.
The highest spot in the continental US is 14,494 ft, at Mt. Whitney.
The lowest spot is −282 ft, at Death Valley.
Therefore, the value range of Y is [−282 ft, 14,494 ft].
Example 3.6
Married couples are selected at random, and a blood test is done on both partners until we find a couple with the same Rh factor.
Define rv X = # of blood tests to be performed. The value range of X is D = {2, 4, 6, 8, …}.
Since the value range is a countably infinite set of numbers, X is a discrete rv.
What is the domain of X?
It depends on what is of interest! For example:
All married couples in the state of Florida
All married couples of age 30-50 in Hillsborough County
More Examples
Randomly select a student at USF. Define
X = 0 if the selected student is male, and 1 otherwise.
Range: {0, 1}
Y = the weight of the selected student.
Range: [70, 400] lbs
Z = the height in feet of the selected student.
Range: (4, 7) ft
U = 0, 1, or 2 if the selected student is from Florida, out-of-state, or a foreign country, respectively.
Range: {0, 1, 2}
X, Y, Z, and U have the same domain: the USF student body.
Exercise Problem 4
X = # of non-zero digits in a randomly selected zip code.
Domain = Sample space S = {all US zip codes}.
A zip code has 5 digits
In theory, the value range can be {0, 1, 2, 3, 4, 5}.
But 00000 is not a valid zip code, & there are no zip codes
with 4 zeros.
Value range of X is {2, 3, 4, 5}.
3 zip code examples: 33647, 33620, 90210
X(33647) = 5, X(33620) = 4, X(90210) = 3

Section Summary
Concepts discussed
Random Variables
Bernoulli Random Variables
Discrete Random Variables
Continuous Random Variables
3.2 Probability Distributions for Discrete Random Variables
Probability Distributions for Discrete Random Variables
Probabilities assigned to the various outcomes in the sample space S in turn determine probabilities associated with the values of any particular rv X.
The probability distribution of X says how the total probability of 1 is distributed among (allocated to) the various possible X values.
Suppose, for example, that a business has just purchased four laser printers, and let X be the number among these that require service during the warranty period.
P(X = x) = the probability that rv X takes the value x, denoted by p(x).
p(0) = P(X = 0) = the probability that X equals 0.
p(1) = P(X = 1) = the probability that X equals 1.
Example 3.7
The Statistics Dept. at Cal Poly has a lab with 6 computers.
Let X = the # of computers in use at a particular time of day.
Suppose the probability distribution of X is as given below.
x     0     1     2     3     4     5     6
p(x)  0.05  0.10  0.15  0.25  0.20  0.15  0.10
p(0) = 0.05: 5% of the time no one uses the computers.
p(3) = 0.25: 25% of the time 3 computers are in use.

Example 3.7 contd
P(X ≤ 2) = P(X = 0 or X = 1 or X = 2)
= P(X=0) + P(X=1) + P(X=2)
= p(0) + p(1) + p(2)
= 0.05 + 0.10 + 0.15
= 0.30
Example 3.7 contd
A = {X ≥ 3}: the event that at least 3 computers are in use.
B = {X ≤ 2}: the event that at most 2 computers are in use.
Then B = A'.
P(X ≥ 3) = 1 − P(X ≤ 2)
= 1 − 0.30
= 0.70
Example 3.7 contd
P(2 ≤ X ≤ 5) = P(X = 2, X = 3, X = 4, or X = 5)
= 0.15 + 0.25 + 0.20 + 0.15
= 0.75
P(2 < X < 5) = P(X = 3 or X = 4)
= 0.25 + 0.20
= 0.45
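The event probabilities in Example 3.7 are all sums of pmf values, which a short sketch can reproduce. The values p(0) through p(5) are the ones used in the worked sums; p(6) = 0.10 is an assumption chosen so that the probabilities total 1, and the helper name `prob` is illustrative.

```python
# pmf of X from Example 3.7. p(0)..p(5) appear in the worked sums;
# p(6) = 0.10 is an assumption chosen so that the total probability is 1.
pmf = {0: 0.05, 1: 0.10, 2: 0.15, 3: 0.25, 4: 0.20, 5: 0.15, 6: 0.10}

def prob(event):
    """Return P(event), where event is a true/false test on the value of X."""
    return sum(p for x, p in pmf.items() if event(x))

p_at_most_2 = prob(lambda x: x <= 2)
p_at_least_3 = 1 - p_at_most_2                  # complement rule, B = A'
p_2_to_5 = prob(lambda x: 2 <= x <= 5)
p_strictly_between = prob(lambda x: 2 < x < 5)

print(round(p_at_most_2, 2), round(p_at_least_3, 2),
      round(p_2_to_5, 2), round(p_strictly_between, 2))
```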
Definition
The probability mass function (pmf) f of a discrete rv X is defined for every number x by f(x) = P(X = x), such that
(i) f(x) ≥ 0, and
(ii) $\sum_{\text{all possible } x} f(x) = 1$.
Example 3.8
6 lots of components are ready for shipment.
lot        1  2  3  4  5  6
# defects  0  2  0  1  2  0
X = # of defects in the lot selected for shipment.
The value range of X is {0, 1, 2}. With each lot equally likely to be selected, its pmf f is
f(0) = P(X=0) = P(lot 1, 3, or 6 is shipped) = 3/6 = 0.500
f(1) = P(X=1) = P(lot 4 is shipped) = 1/6 = 0.167
f(2) = P(X=2) = P(lot 2 or 5 is shipped) = 2/6 = 0.333
A Parameter of a Probability Distribution
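The pmf p of the Bernoulli rv X:
$p(x; \alpha) = \begin{cases} 1 - \alpha & \text{if } x = 0 \\ \alpha & \text{if } x = 1 \\ 0 & \text{otherwise} \end{cases}$  (3.1)
where α is a parameter (0 < α < 1) of the above pmf p.
Each different value of α between 0 and 1 determines a different member of the Bernoulli family of distributions.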
Example 3.12
Observe the gender of each newborn at a certain hospital, stopping when a boy (B) is born. Let G be the event that a girl is born.
Let p = P(B). Then P(G) = 1 − p.
Assume that successive births are independent.
Define rv X = # of births observed.
Then the range of X is {1, 2, 3, 4, 5, 6, …}.
p(1) = P(X = 1) = P(B) = p.
Example 3.12 contd
p(2) = P(X = 2)
= P(GB)
= P(G) P(B)
= (1 − p)p
and
p(3) = P(X = 3)
= P(GGB)
= P(G) P(G) P(B)
= (1 − p)²p
Example 3.12 contd
The pmf p is
$p(x) = \begin{cases} (1 - p)^{x-1} p & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$  (3.2)
where the parameter p is in the open interval (0, 1).
The pmf given by (3.2) defines the family of geometric distributions.
In the gender example, p = 0.51 might be appropriate.
But if we were looking for the first child with Rh-positive blood, then we might have p = 0.85.
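The pattern p(1) = p, p(2) = (1 − p)p, p(3) = (1 − p)²p from Example 3.12 can be written as one function. This is a minimal sketch; the function name `geometric_pmf` is illustrative, and the partial sum is only a numerical check that the probabilities add up to (essentially) 1.

```python
def geometric_pmf(x, p):
    """P(X = x), where X counts births up to and including the first boy."""
    if x < 1 or x != int(x):
        return 0.0                      # the pmf is zero off the value range {1, 2, 3, ...}
    return (1 - p) ** (x - 1) * p       # x - 1 girls followed by one boy

p = 0.51
first_three = [geometric_pmf(x, p) for x in (1, 2, 3)]
partial_sum = sum(geometric_pmf(x, p) for x in range(1, 101))

print(first_three)
print(partial_sum)
```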
The Cumulative Distribution Function
We often wish to compute the probability
P(the observed value of X will be at most x) = P(X ≤ x).
For the pmf of Example 3.8,
P(X is at most 1)
= P(X ≤ 1) = p(0) + p(1) = 0.500 + 0.167 = 0.667
Similarly,
P(X ≤ 1.5) = P(X ≤ 1) = 0.667
P(X ≤ 0) = P(X = 0) = 0.5
P(X ≤ 0.75) = P(X = 0) = 0.5
In fact, for any x satisfying 0 ≤ x < 1, P(X ≤ x) = 0.5.
The largest possible X value is 2, so
P(X ≤ 2) = 1, P(X ≤ 3.7) = 1, P(X ≤ 20.5) = 1,
and so on.
Definition
The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is defined by
$F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y)$  (3.3)
F(x) = P(the observed value of X is at most x).
Example 3.13
A store has flash drives with 1, 2, 4, 8, or 16 GB of memory.
Let Y = the memory size of a purchased drive.
The pmf of Y is given below.
y     1     2     4     8     16
p(y)  0.05  0.10  0.35  0.40  0.10
Example 3.13 contd
Then (assuming that all purchases are independent)
F(1) = P(Y ≤ 1)
= P(Y = 1)
= p(1)
= 0.05
F(2) = P(Y ≤ 2)
= P(Y = 1 or Y = 2)
= p(1) + p(2)
= 0.15
Example 3.13 contd
F(4) = P(Y ≤ 4)
= P(Y = 1 or Y = 2 or Y = 4)
= p(1) + p(2) + p(4)
= 0.50
F(8) = P(Y ≤ 8)
= p(1) + p(2) + p(4) + p(8)
= 0.90
F(16) = P(Y ≤ 16)
= 1.00
Example 3.13 contd
For any y in (1, 16), F(y) = F(y*), where y* is the largest possible value of Y that does not exceed y.
For example,
F(2.7) = P(Y ≤ 2.7)
= P(Y ≤ 2)
= F(2)
= 0.15
F(7.999) = P(Y ≤ 7.999)
= P(Y ≤ 4)
= F(4)
= 0.50
Example 3.13 contd
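If y < 1, F(y) = 0 [e.g., F(0.58) = 0], and
if y > 16, F(y) = 1 [e.g., F(25) = 1].
The cdf is
$F(y) = \begin{cases} 0 & y < 1 \\ 0.05 & 1 \le y < 2 \\ 0.15 & 2 \le y < 4 \\ 0.50 & 4 \le y < 8 \\ 0.90 & 8 \le y < 16 \\ 1 & 16 \le y \end{cases}$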
Example 3.13 contd
[Figure: graph of the cdf F(y) from Example 3.13]
The cdf F(x) of a discrete rv X is a step function, and the size of the step at each possible value x of X is equal to p(x).
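The step-function behaviour can be seen by coding the cdf of Example 3.13 directly. In this sketch the pmf values 0.05, 0.10, 0.35, 0.40, 0.10 are an assumption recovered from the differences of the F values worked out above, and the use of `bisect_right` is just one convenient way to find the largest possible value not exceeding y.

```python
from bisect import bisect_right
from itertools import accumulate

values = [1, 2, 4, 8, 16]                 # possible values of Y, in increasing order
probs = [0.05, 0.10, 0.35, 0.40, 0.10]    # assumption: recovered from F(1), F(2), F(4), F(8), F(16)
cumulative = list(accumulate(probs))      # running totals of the pmf

def F(y):
    """cdf of Y: the sum of p(v) over all possible values v <= y."""
    k = bisect_right(values, y)           # how many possible values are <= y
    return cumulative[k - 1] if k else 0.0

for y in (0.58, 1, 2.7, 4, 7.999, 16, 25):
    print(y, round(F(y), 2))
```

The jump of F at a possible value, for instance F(4) − F(3.999), equals the pmf value p(4).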
Proposition
For any a and b with a ≤ b,
P(a ≤ X ≤ b)
= P(X ≤ b) − P(X < a)
= F(b) − F(a−)
where a− represents the largest possible X value that is strictly less than a.
If all possible values of X, as well as a and b, are integers, then
P(a ≤ X ≤ b) = P(X = a or X = a + 1 or . . . or X = b)
= F(b) − F(a − 1).
Taking a = b yields
P(X = a) = F(a) − F(a − 1)
in this case.
Example 3.15
X = # of days of sick leave taken by a randomly selected
employee of a large company per year.
If the maximum # of allowable sick days per year is 14, the value range of X is {0, 1, 2, 3, 4, . . . , 14}.
Example 3.15 contd
Given F(0) = 0.58, F(1) = 0.72, F(2) = 0.76, F(3) = 0.81, F(4) = 0.88, F(5) = 0.94,
P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5)
= F(5) − F(1)
= 0.94 − 0.72
= 0.22
and
P(X = 3) = F(3) − F(2)
= 0.81 − 0.76
= 0.05
Exercise Problem 12
55 tickets are sold for a 50-seat flight.
Y = # of ticketed passengers who actually show up.
y     45   46   47   48   49   50   51   52   53   54   55
p(y)  .05  .10  .12  .14  .25  .17  .06  .05  .03  .02  .01
a. P(the flight can accommodate all ticketed passengers who show up) = ?
b. P(not all ticketed passengers who show up can be accommodated) = ?
c. P(the 1st standby passenger can take the flight) = ?
P(the 3rd standby passenger can take the flight) = ?
Exercise Problem 12
a. P(the flight can accommodate all ticketed passengers who show up)
= P(Y ≤ 50)
= P(Y = 45 or 46 or 47 or 48 or 49 or 50)
= P(Y=45) + P(Y=46) + … + P(Y=50)
= 0.05 + 0.10 + 0.12 + 0.14 + 0.25 + 0.17 = 0.83
b. P(not all ticketed passengers who show up can be accommodated)
= P(Y ≥ 51)
= 1 − P(Y ≤ 50)
= 0.17
Exercise Problem 12
c. P(the 1st standby passenger can take the flight)
= P(Y ≤ 49)
= 0.66
P(the 3rd standby passenger can take the flight)
= P(Y ≤ 47)
= 0.27
Exercise Problem 23
The cdf of rv X is given.
a. p(2) = P(X=2)
= F(2) − F(1)
= 0.39 − 0.19
= 0.20
b. P(X > 3) = 1 − P(X ≤ 3)
= 1 − F(3)
= 1 − 0.67
= 0.33
Exercise Problem 23
c. P(2 ≤ X ≤ 5) = P(X ≤ 5) − P(X ≤ 1)
= F(5) − F(1)
= 0.92 − 0.19
= 0.73
d. P(2 < X < 5) = P(3 ≤ X ≤ 4)
= P(X ≤ 4) − P(X ≤ 2)
= F(4) − F(2)
= 0.92 − 0.39
= 0.53
Section Summary
Concepts discussed
Probability Mass (Distribution) Function (pmf)

Parameters of a Probability Distribution
Cumulative Distribution Function (cdf)
3.3 Expected Values
The Expected Value of X
Definition
Let rv X have value range D and pmf p(x). The expected value or mean of X, denoted by E(X) or μ_X or just μ, is
$E(X) = \mu_X = \sum_{x \in D} x \cdot p(x)$
Example 3.16
A university has 15,000 students.
X = # of courses for which a student is registered. The pmf of X follows.
x     1     2     3     4     5     6     7
p(x)  0.01  0.03  0.13  0.25  0.39  0.17  0.02
μ = 1p(1) + 2p(2) + 3p(3) + 4p(4) + 5p(5) + 6p(6) + 7p(7)
= 1(.01) + 2(.03) + 3(.13) + 4(.25) + 5(.39) + 6(.17) + 7(.02)
= 0.01 + 0.06 + 0.39 + 1.00 + 1.95 + 1.02 + 0.14
= 4.57
The Expected Value of a Function
Given a function h(X) of rv X, determine E[h(X)].
Remark: h(X) is also a random variable.
Proposition
For rv X with value range D and pmf p(x), the expected value of any function h(X), denoted by E[h(X)] or μ_{h(X)}, is computed by
$E[h(X)] = \sum_{x \in D} h(x) \cdot p(x)$
Example 3.23
A store has purchased 3 computers at $500 apiece, and will sell them for $1000 apiece.
The manufacturer has agreed to repurchase any computers unsold after a specified period at $200 apiece.
Let X denote the number of computers sold.
Suppose that
p(0) = 0.1, p(1) = 0.2, p(2) = 0.3, and p(3) = 0.4.
Then E(X) = 0(0.1) + 1(0.2) + 2(0.3) + 3(0.4) = 2.0
Example 3.23 contd
h(X) = profit from selling X units. Then
h(X) = revenue − cost
= 1000X + 200(3 − X) − 1500
= 800X − 900
The value range of rv h(X) is {−900, −100, 700, 1500}.
The expected profit is then
E[h(X)] = h(0) p(0) + h(1) p(1) + h(2) p(2) + h(3) p(3)
= (−900)(.1) + (−100)(.2) + (700)(.3) + (1500)(.4)
= $700
Rules of Expected Value
Proposition E(aX + b) = a E(X) + b.
In Example 3.23,
h(X) = 800X − 900, and E(X) = 2.
E[h(X)] = 800E(X) − 900
= 800(2) − 900
= $700.
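The two routes to the expected profit in Example 3.23 (summing h(x)p(x) directly, and applying E(aX + b) = aE(X) + b) can be compared numerically. This is a small sketch; the helper names `expected` and `profit` are illustrative.

```python
# pmf of X = number of computers sold (Example 3.23).
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def expected(h):
    """E[h(X)] = sum of h(x) * p(x) over the value range of X."""
    return sum(h(x) * p for x, p in pmf.items())

def profit(x):
    # Revenue from sales plus the buy-back of unsold units, minus the purchase cost.
    return 1000 * x + 200 * (3 - x) - 1500

mean_x = expected(lambda x: x)
direct = expected(profit)                 # definition of E[h(X)]
via_linearity = 800 * mean_x - 900        # rule E(aX + b) = a E(X) + b

print(mean_x, direct, via_linearity)
```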
The Variance of X
Definition
Let X have pmf p(x) and expected value μ. Then the variance of X, denoted by V(X) or σ²_X or just σ², is
$V(X) = \sum_{x \in D} (x - \mu)^2 \cdot p(x) = E[(X - \mu)^2]$
The standard deviation (SD) of X is
$\sigma_X = \sqrt{\sigma_X^2}$
Example 3.24
A library has an upper limit of 6 on the # of videos that can be checked out by an individual.
Let X = # of videos checked out by an individual.
The pmf of X is as follows:
x     1     2     3     4     5     6
p(x)  0.30  0.25  0.15  0.05  0.10  0.15
The expected value of X is easily computed as μ = 2.85.
Example 3.24 contd
The variance of X is then
V(X) = σ² = $\sum_{x=1}^{6} (x - 2.85)^2 \cdot p(x)$
= (1 − 2.85)²(0.30) + (2 − 2.85)²(0.25) + ... + (6 − 2.85)²(0.15)
= 3.2275
The standard deviation of X is σ = √3.2275 ≈ 1.797.
A Shortcut Formula for σ²
Proposition
$V(X) = \sigma^2 = \left[\sum_{x \in D} x^2 \cdot p(x)\right] - \mu^2 = E(X^2) - [E(X)]^2$
Rules of Variance
The variance of h(X) is the expected value of the squared difference between h(X) and its expected value:
$V[h(X)] = \sigma^2_{h(X)} = \sum_{x \in D} \{h(x) - E[h(X)]\}^2 \cdot p(x)$  (3.13)
For a linear function h(X) = aX + b,
h(x) − E[h(X)] = ax + b − (aμ + b) = a(x − μ), so
V(aX + b) = a²V(X)
Example 3.26
In Example 3.23, E(X) = 2, and
E(X²) = (0)²(.1) + (1)²(.2) + (2)²(.3) + (3)²(.4) = 5
V(X) = 5 − (2)² = 1.
The profit function is h(X) = 800X − 900, so
V[h(X)] = (800)² V(X) = (640,000)(1) = 640,000
The standard deviation of the profit h(X) is 800.
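As a check on Example 3.26, the sketch below computes the variance three ways: from the definition, from the shortcut formula, and for the profit function via V(aX + b) = a²V(X). Helper names are illustrative.

```python
from math import sqrt

# pmf of X = number of computers sold (Example 3.23).
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def expected(h):
    """E[h(X)] = sum of h(x) * p(x) over the value range of X."""
    return sum(h(x) * p for x, p in pmf.items())

mu = expected(lambda x: x)
var_definition = expected(lambda x: (x - mu) ** 2)      # E[(X - mu)^2]
var_shortcut = expected(lambda x: x ** 2) - mu ** 2     # E(X^2) - [E(X)]^2

def profit(x):
    return 800 * x - 900

mu_profit = expected(profit)
var_profit = expected(lambda x: (profit(x) - mu_profit) ** 2)   # definition applied to h(X)
sd_profit = sqrt(var_profit)

print(var_definition, var_shortcut, var_profit, sd_profit)
```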
Another Example
a. What value of c will make the following a pmf?
f(x) = c(1/2)^x, x = 1, 2, 3.
b. Find E(X).
c. Find V(X).
Another Example contd
a. c(1/2)¹ + c(1/2)² + c(1/2)³ = 1 ⇒ c = 8/7
b. E(X) = 1[(8/7)(1/2)] + 2[(8/7)(1/2)²] + 3[(8/7)(1/2)³]
= 11/7
c. V(X)
= 1²[(8/7)(1/2)] + 2²[(8/7)(1/2)²] + 3²[(8/7)(1/2)³] − (11/7)²
= 26/49
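The fractions in this example can be verified with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction`; the variable names are illustrative.

```python
from fractions import Fraction

xs = (1, 2, 3)
half = Fraction(1, 2)

# a. c must make the probabilities c * (1/2)^x sum to 1.
c = 1 / sum(half ** x for x in xs)
pmf = {x: c * half ** x for x in xs}

# b. and c. Mean, then variance by the shortcut formula E(X^2) - [E(X)]^2.
mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean ** 2

print(c, mean, variance)
```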
Section Summary
Concepts discussed
Expected Value of a rv
Expected Value of Function of a rv
Expected Value of a Linear Function of a rv
Variance of a rv
Variance of Function of a rv
Variance of a Linear Function of a rv
3.4 The Binomial Probability Distribution
Copyright Cengage Learning. All rights reserved.

The Binomial Probability Distribution
Many experiments conform to the following:
1. The experiment consists of a sequence of n smaller experiments called trials, with n fixed in advance.
2. Each trial results in one of the same two possible outcomes, denoted by success (S) and failure (F).
3. The trials are independent.
4. The probability of success P(S), denoted by p, is constant from trial to trial, and P(F) = 1 − p.
The Binomial Probability Distribution
Definition
An experiment satisfying Conditions 1 through 4 is called a binomial experiment.
Example 3.27
A quarter is tossed successively and independently n times.
Use S to denote the outcome W (Washington) and F the outcome E (Eagle), for each toss.
Then this results in a binomial experiment with
p = P(S) = 0.5, and
P(F) = 1 − p = 0.5.
Another Example
A die is tossed n times.
Use S to denote outcome 1, 2, 3, or 4, and F outcome 5 or 6, for each toss.
Then this also results in a binomial experiment with
p = P(S) = 4/6 = 2/3, and
P(F) = 1 − p = 1/3.
The Binomial Random Variable and Distribution
Definition
Let X = # of successes (S) resulting from a binomial experiment (i.e., the # of times S is observed in the n trials).
Then X is a rv, called a binomial random variable, and its probability distribution is called a binomial distribution, denoted as Bin(n, p).
Any binomial random variable is discrete!
Consider n = 3.
There are 8 possible outcomes:
SSS SSF SFS SFF FSS FSF FFS FFF
By definition, X(SSF) = 2, X(SFS) = 2, X(SFF) = 1, & so on.
The value range of X for n = 3 is {0, 1, 2, 3}.
Range of X for an n-trial experiment is {0, 1, 2, . . . , n}.

Write X ~ Bin(n, p) to indicate that
X is a binomial rv with n trials and success probability of p.
Use b(x; n, p) to denote the pmf of a binomial rv X.
Use B(x; n, p) to denote the cdf of a binomial rv X.
Formulas to compute b(x; n, p) and B(x; n, p)
$b(x; n, p) = \binom{n}{x} p^x (1 - p)^{n - x}$ for x = 0, 1, 2, 3, …, n, and
b(x; n, p) = 0 for all other x values.
$B(x; n, p) = P(X \le x) = \sum_{y=0}^{x} b(y; n, p)$ for x = 0, 1, . . . , n
Example 3.31
Randomly select 6 cola drinkers. Each is given a glass of cola S and a glass of cola F. The glasses are identical except for a code on the bottom that identifies the cola.
Suppose a cola drinker has no tendency to prefer one cola to the other.
Then p = P(a cola drinker prefers S) = 0.5.
Let X = the # of cola drinkers (of the six) who prefer S. Then X ~ Bin(6, 0.5).
P(X = 3) = b(3; 6, 0.5) = $\binom{6}{3}$(0.5)³(0.5)³ = 20(0.5)⁶ = 0.313
Example 3.31 contd
The probability that at least three prefer S is
P(3 ≤ X) = $\sum_{x=3}^{6} b(x; 6, 0.5)$
= $\sum_{x=3}^{6} \binom{6}{x} (0.5)^x (1 - 0.5)^{6 - x}$
= 0.656
and the probability that at most one prefers S is
P(X ≤ 1) = $\sum_{x=0}^{1} b(x; 6, 0.5)$
= 0.109
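The binomial pmf and cdf are easy to evaluate without tables. The following sketch reproduces the three probabilities of Example 3.31; the function names `binom_pmf` and `binom_cdf` are illustrative, and x is assumed to be an integer.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) for integer x; zero outside the value range {0, 1, ..., n}."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for integer x."""
    return sum(binom_pmf(y, n, p) for y in range(0, min(x, n) + 1))

p_exactly_3 = binom_pmf(3, 6, 0.5)
p_at_least_3 = 1 - binom_cdf(2, 6, 0.5)
p_at_most_1 = binom_cdf(1, 6, 0.5)

print(round(p_exactly_3, 3), round(p_at_least_3, 3), round(p_at_most_1, 3))
```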
Using Binomial Tables (Table A.1)
Consider n = 20, and p = 0.25.
P(X ≤ 10) = B(10; 20, 0.25)
= 0.996
P(5 ≤ X ≤ 10) = P(X ≤ 10) − P(X ≤ 4)
= B(10; 20, 0.25) − B(4; 20, 0.25)
= 0.996 − 0.415
= 0.581
The Mean and Variance of X
If X ~ Bin(n, p), then
mean E(X) = np,
variance V(X) = np(1 − p) = npq, and
sd σ_X = √(npq),
where q = 1 − p.
Example 3.34
75% of purchases at a store are made with a credit card. 10 purchases are observed; let X = # of purchases made with a credit card. Then
X ~ Bin(10, 0.75).
Thus E(X) = np = (10)(0.75) = 7.5,
V(X) = npq = 10(0.75)(0.25)
= 1.875,
and σ = √1.875
= 1.37.
Example 3.34 contd
The probability that X is within 1 standard deviation of its mean value is
P(7.5 − 1.37 ≤ X ≤ 7.5 + 1.37) = P(6.13 ≤ X ≤ 8.87)
= P(X = 7 or X = 8)
= P(X = 7) + P(X = 8)
= b(7; 10, 0.75) + b(8; 10, 0.75)
= 0.532.
Equivalently, using the cdf,
P(7.5 − 1.37 ≤ X ≤ 7.5 + 1.37) = P(7 ≤ X ≤ 8)
= B(8; 10, 0.75) − B(6; 10, 0.75)
= 0.532
Exercise Problem 54 contd
P(a customer chooses an oversize tennis racket) = 0.6.
X = # of customers among the 10 who choose an oversize one.
Then X ~ Bin(10, 0.6).
a. P(X ≥ 6) = 1 − P(X ≤ 5)
= 1 − B(5; 10, 0.6)
= 1 − 0.367
= 0.633
b. μ = np = 10(0.6) = 6, σ = √[10(0.6)(0.4)] = 1.55
μ − σ = 4.45, and μ + σ = 7.55
P(4.45 ≤ X ≤ 7.55) = P(5 ≤ X ≤ 7) = P(X ≤ 7) − P(X ≤ 4)
= B(7; 10, 0.6) − B(4; 10, 0.6) = 0.833 − 0.166 = 0.667
c. All 10 customers can get what they want iff 3 ≤ X ≤ 7.
P(3 ≤ X ≤ 7)
= P(X ≤ 7) − P(X ≤ 2)
= B(7; 10, 0.6) − B(2; 10, 0.6)
= 0.833 − 0.012
= 0.821
3.5 Hypergeometric and Negative Binomial Distributions
The Hypergeometric Distribution
The assumptions leading to the hypergeometric distribution are as follows:
1. The population to be sampled consists of N elements (a finite population).
2. Each element is characterized either as a success (S) or a failure (F), and there are M successes in the population.
3. A sample of n elements is randomly selected without replacement.
X = # of successes in the random sample of n elements.
Then X is a rv, called a hypergeometric rv.
Its value range is
{t, t+1, t+2, …, min(n, M)}, where t = max{0, n − N + M}.
pmf: $h(x; n, M, N) = P(X = x) = \frac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}}$  (3.15)
Example 3.35
We received 20 service calls for printer problems (8 laser printers and 12 inkjet printers).
A random sample of 5 is selected for inclusion in a customer satisfaction survey.
P(exactly 0, 1, 2, 3, 4, or 5 inkjet printers are selected) = ?
Let X = # of inkjet printers in the sample of 5 printers.
Then X is a hypergeometric rv.
Example 3.35 contd
N = 20, n = 5, and M = 12.
Value range: {0, 1, 2, 3, 4, 5}
Consider X = 2.
P(X = 2) = h(2; 5, 12, 20) = $\frac{\binom{12}{2}\binom{8}{3}}{\binom{20}{5}}$
= 77/323 = 0.238
Example 3.35 contd
If M = 18 (not 12), P(X = 2) = ?
The value range is given by
N = 20, n = 5, and M = 18.
t = max(0, 5 − 20 + 18) = 3, and min(n, M) = 5
Value range: {3, 4, 5}
P(X = 2) = 0.
Example 3.35 contd
If the sample size is increased to n = 15, P(X = 14) = ?
The value range is given by
N = 20, n = 15, and M = 12.
t = max(0, 15 − 20 + 12) = 7 and min(n, M) = min(15, 12) = 12
Value range: {7, 8, 9, 10, 11, 12}
P(X = 14) = 0.
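The three cases of Example 3.35 can be checked with a short sketch. It assumes that Eq. (3.15) is the standard hypergeometric pmf, C(M, x)·C(N − M, n − x)/C(N, n), and uses the value range {max(0, n − N + M), …, min(n, M)} stated earlier; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def hyper_range(n, M, N):
    """Possible values of X: max(0, n - N + M) through min(n, M)."""
    return range(max(0, n - N + M), min(n, M) + 1)

def hyper_pmf(x, n, M, N):
    """h(x; n, M, N), assuming the standard hypergeometric form for Eq. (3.15)."""
    if x not in hyper_range(n, M, N):
        return Fraction(0)
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

print(hyper_pmf(2, 5, 12, 20))            # sample of 5 with 12 inkjets among 20
print(list(hyper_range(5, 18, 20)))       # value range when M = 18
print(hyper_pmf(2, 5, 18, 20))            # 2 lies outside that range
print(list(hyper_range(15, 12, 20)))      # value range when n = 15
print(hyper_pmf(14, 15, 12, 20))          # 14 exceeds min(n, M)
```

Note that for n = 15 the lower end of the range is max(0, 15 − 20 + 12) = 7: with only 8 laser printers, a sample of 15 must contain at least 7 inkjets.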
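For a hypergeometric rv X with pmf h(x; n, M, N), the mean and variance are
$E(X) = n \cdot \frac{M}{N}$ and $V(X) = \left(\frac{N - n}{N - 1}\right) \cdot n \cdot \frac{M}{N} \cdot \left(1 - \frac{M}{N}\right)$
In Example 3.35, N = 20, n = 5, and M = 12.
E(X) = 5(12)/20 = 3.0
V(X) = (15/19)(5)(0.6)(0.4) = 0.9474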
The Negative Binomial Distribution
The negative binomial rv and distribution are based on an experiment satisfying the following conditions:
1. The experiment consists of a sequence of independent trials.
2. Each trial can result in a success (S) or a failure (F).
3. The probability of success is constant from trial to trial.
4. The experiment continues until a total of r successes have been observed.
Let X = # of failures that precede the rth success. Then
X is called a negative binomial random variable,
its value range is {0, 1, 2, 3, 4, 5, 6, …}, and
its pmf is written as nb(x; r, p), computed by
$nb(x; r, p) = \binom{x + r - 1}{r - 1} p^r (1 - p)^x$, x = 0, 1, 2, …
Example 3.38
A pediatrician wishes to recruit 5 couples, each of whom is expecting their first child, to participate in a new natural childbirth regimen. Let
p = P(a randomly selected couple agrees to participate).
If p = 0.2, P(15 couples must be asked) = ?
With S = {agrees to participate}, r = 5, p = 0.2, and x = 10,
P(15 couples must be asked)
= nb(10; 5, 0.2) = $\binom{14}{4} (0.2)^5 (0.8)^{10}$ = 0.034
Example 3.38 contd
P(at most 10 Fs are observed)
= P(at most 15 couples are asked)
= $\sum_{x=0}^{10} nb(x; 5, 0.2)$ = 0.164
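Example 3.38 can be reproduced numerically. The sketch below assumes the failures-before-the-rth-success form of the negative binomial pmf, C(x + r − 1, r − 1) p^r (1 − p)^x; the function name `nb_pmf` is illustrative.

```python
from math import comb

def nb_pmf(x, r, p):
    """nb(x; r, p): probability of exactly x failures before the r-th success."""
    if x < 0:
        return 0.0
    return comb(x + r - 1, r - 1) * p ** r * (1 - p) ** x

# 15 couples asked means 10 refusals (failures) before the 5th agreement.
p_exactly_15_asked = nb_pmf(10, 5, 0.2)
p_at_most_15_asked = sum(nb_pmf(x, 5, 0.2) for x in range(0, 11))

print(round(p_exactly_15_asked, 3), round(p_at_most_15_asked, 3))
```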
Remark. Some books define the negative binomial rv as the # of trials, rather than the # of failures.
A special case: r = 1.
The pmf is
$nb(x; 1, p) = (1 - p)^x p$, x = 0, 1, 2, …  (3.17)
The random variable X = # of failures (or trials) until one success is observed is called a geometric random variable, and the pmf in Eq. (3.17) is called a geometric distribution.
For a geometric random variable X (defined as the # of failures until 1 success is observed, as in our textbook),
E(X) = (1 − p)/p, and V(X) = (1 − p)/p².
If a geometric random variable X is defined as the # of trials until 1 success is observed, then
E(X) = 1/p, and V(X) = (1 − p)/p².
For a negative binomial rv X with pmf nb(x; r, p), its mean and variance are
E(X) = r(1 − p)/p, and
V(X) = r(1 − p)/p².
Exercise Problem 74
An inspector checks 10 firms for possible violation of regulations.
Let X = # of firms among the 10 checked that actually violate.
a. If there are 50 firms, and 15 actually violate, what is the pmf of X?
b. If there are 500 firms, and 150 actually violate, what is the pmf of X? An approximate pmf?
c. Find E(X) and V(X) for both pmfs in Part (b).
a. X is hypergeometric with n = 10, M = 15, and N = 50.
Its pmf: p(x) = h(x; 10, 15, 50), x = 0, 1, 2, …, 10.
b. X is hypergeometric with n = 10, M = 150, and N = 500.
Its pmf: p(x) = h(x; 10, 150, 500), x = 0, 1, 2, …, 10.
Since N is very large relative to n, X can be approximated as a binomial rv with n = 10 and p = M/N = 150/500 = 0.3. That is,
h(x; 10, 150, 500) ≈ b(x; 10, 0.3)
c. For h(x; 10, 150, 500), E(X) = 10(150/500) = 3, V(X) = (490/499)(10)(0.3)(0.7) = 2.06
For b(x; 10, 0.3), E(X) = 10(0.3) = 3, V(X) = 10(0.3)(0.7) = 2.1
Exercise Problem 76
A family decides to have children until it has three children of the same sex.
Let X = # of children in the family.
With P(B) = P(G) = 0.5, what is the pmf of X?
This question relates to the negative binomial distribution, but we cannot use it directly.
Clearly, X cannot be 0, 1, or 2, and X < 6 because any 5 children must include at least 3 of the same sex. Hence 3 ≤ X ≤ 5.
P(X=3) = P(BBB ∪ GGG) = P(BBB) + P(GGG) = 2(0.5)³ = 0.25
P(X=4)
= P(GBBB ∪ BGBB ∪ BBGB ∪ BGGG ∪ GBGG ∪ GGBG)
= 6(0.5)⁴ = 0.375
P(X=5) = 1 − P(X=3) − P(X=4) = 0.375
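Because the family stops after at most 5 children, the pmf in Exercise 76 can be confirmed by enumerating all 2⁵ equally likely birth sequences and recording when the third same-sex child arrives. This is an illustrative brute-force check, not the slides' method; the function name `stopping_time` is made up for the sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def stopping_time(sequence):
    """Index (1-based) of the birth at which some sex first reaches three children."""
    counts = Counter()
    for i, child in enumerate(sequence, start=1):
        counts[child] += 1
        if counts[child] == 3:
            return i
    return None  # cannot happen for sequences of length 5

# Each of the 32 sequences of five births has probability (0.5)^5 = 1/32.
pmf = Counter()
for sequence in product("BG", repeat=5):
    pmf[stopping_time(sequence)] += Fraction(1, 32)

print(dict(pmf))
```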
3.6 The Poisson Probability Distribution
The Poisson Probability Distribution
Definition
A discrete random variable X is said to have a Poisson distribution with parameter λ (λ > 0) if the pmf of X is
$p(x; \lambda) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, x = 0, 1, 2, 3, …
Its cdf is F(x; λ) = P(X ≤ x).
X is called a Poisson random variable.
Example 3.39
Let X denote the number of creatures of a particular type captured in a trap during a given time period.
Suppose that X has a Poisson distribution with λ = 4.5, so on average traps will contain 4.5 creatures.
P(a trap contains exactly five creatures) is
P(X = 5) = $\frac{e^{-4.5} (4.5)^5}{5!}$ = 0.1708
Example 3.39 contd
P(a trap has at most five creatures) is
P(X ≤ 5) = $\sum_{x=0}^{5} \frac{e^{-4.5} (4.5)^x}{x!}$ = 0.7029
The Poisson Distribution as a Limit
Proposition
Suppose that in the binomial pmf b(x; n, p), we let n → ∞ and p → 0 in such a way that np approaches a value λ > 0.
Then b(x; n, p) → p(x; λ).
By this proposition, if n is large and p is small, then b(x; n, p) ≈ p(x; λ), where λ = np.
As a rule of thumb, this approximation can safely be applied if n > 50 and np < 5.
Example 3.40
If (i) P(any given page contains at least one typo) = 0.005 and
(ii) typos are independent from page to page,
P(a 400-page book has exactly one page with typos) = ?
P(it has at most three pages with typos) = ?
Let S be a page containing at least one typo and F be an error-free page. Then X = # of pages having at least one typo is a binomial rv with n = 400 and p = 0.005, so np = 2.
Example 3.40 contd
P(a 400-page book has exactly one page with typos)
= P(X = 1) = b(1; 400, 0.005) ≈ p(1; 2) = e^{−2}(2)¹/1! = 0.270671
The binomial value is b(1; 400, 0.005) = 0.270669, so the approximation is very good.
Example 3.40 contd
Similarly,
P(X ≤ 3) ≈ p(0; 2) + p(1; 2) + p(2; 2) + p(3; 2)
= 0.135335 + 0.270671 + 0.270671 + 0.180447
= 0.8571
This again is very close to the binomial value
P(X ≤ 3) = 0.8576
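The quality of the Poisson approximation in Example 3.40 can be checked by computing both sides. This sketch uses only the standard library; the function names and the variable `lam` (standing for the Poisson parameter np) are illustrative.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability p(x; lam) = e^(-lam) lam^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)

n, p = 400, 0.005
lam = n * p                                   # Poisson parameter np = 2

exact_one = binom_pmf(1, n, p)
approx_one = poisson_pmf(1, lam)
exact_at_most_3 = sum(binom_pmf(x, n, p) for x in range(4))
approx_at_most_3 = sum(poisson_pmf(x, lam) for x in range(4))

print(round(exact_one, 6), round(approx_one, 6))
print(round(exact_at_most_3, 4), round(approx_at_most_3, 4))
```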
[Table: the Poisson distribution for λ = 3 alongside three binomial distributions with np = 3]
Table A.2 gives the cdf F(x; λ) for λ = 0.1, 0.2, . . . , 1, 2, . . . , 10, 15, and 20.
Consider λ = 2. From Table A.2 on page A-5,
P(X ≤ 3) = F(3; 2) = 0.857.
P(X = 3) = F(3; 2) − F(2; 2) = 0.180.
The Mean and Variance of X
Proposition
For a Poisson rv X with parameter λ,
E(X) = V(X) = λ.
Example 3.41
(Example 3.39 continued)
Both the expected number of creatures per trap and the variance of the number trapped equal 4.5, and
σ_X = √4.5
= 2.12.
The Poisson Process
A very important application of the Poisson distribution
arises in connection with the occurrence of events of some
type over time.
Vehicles (driving east) passing USF main entrance

Visits to a particular website
Pulses of some sort recorded by a counter
Email messages sent to a particular address
Accidents in an industrial facility
Cosmic ray showers observed by astronomers
The Poisson Process
Given an interval, events occur at random throughout the interval. Partition the interval into subintervals of small enough length such that
1. P(more than one event in a subinterval) = 0.
2. The probability of one event in a subinterval is the same for all subintervals and proportional to the length of the subinterval.
3. The event in each subinterval is independent of the other subintervals.
The experiment of observing random events in an interval is called a Poisson process.
The Poisson Process
X = # of events in the interval is a Poisson rv with
pmf p(x; λ),
where λ is the expected number of events in the interval.
Remark: If events occur at a rate of α per unit interval, then λ = αt for a t-unit interval.
Example 3.42
Suppose pulses arrive at a counter at an average rate of 6 per minute, so that α = 6.
P(at least one pulse is received in a 0.5-min interval) = ?
The interval length is t = 0.5 minutes, so λ = αt = 3.
X = # of pulses received in a 30-sec interval is a Poisson rv with λ = 3, so
P(X ≥ 1) = 1 − P(X = 0) = 1 − e^{−3} = 0.950
Exercise Problem 85
Small aircraft arrive according to a Poisson process with a rate of 8 per hour.
a. Let X = # of aircraft that arrive in 1 hour.
P(X = 6) = ? P(X ≥ 6) = ? P(X ≥ 10) = ?
b. Let X = # of aircraft that arrive in a 90-min period.
P(X = 6) = ? P(X ≥ 6) = ? P(X ≥ 10) = ?
E(X) = ? V(X) = ?
c. Let X = # of aircraft that arrive in a 2.5-hour period.
P(X = 6) = ? P(X ≥ 6) = ? P(X ≥ 10) = ?
a. E(X) = 8 aircraft per hour, so λ = 8.
P(X = 6) = e^{−8}8⁶/6! = 0.122
P(X ≥ 6) = 1 − F(5; 8) = 1 − 0.191 = 0.809
P(X ≥ 10) = 1 − F(9; 8) = 1 − 0.717 = 0.283
b. E(X) = 8(1.5) = 12 aircraft arriving in 90 minutes, so λ = 12.
P(X = 6) = e^{−12}12⁶/6! = 0.0255
P(X ≥ 6) = 1 − F(5; 12) = 1 − 0.0203 = 0.9797
P(X ≥ 10) = 1 − F(9; 12) = 1 − 0.242 = 0.758
E(X) = 12, V(X) = 12
c. E(X) = 8(2.5) = 20 aircraft arriving in 2.5 hours, so λ = 20.
P(X = 6) = e^{−20}20⁶/6! = 0.00018
P(X ≥ 6) = 1 − F(5; 20) = 1 − 0.000072 = 0.999928
P(X ≥ 10) = 1 − F(9; 20) = 1 − 0.005 = 0.995
λ     P(X = 6)   P(X ≥ 6)   P(X ≥ 10)   t
8     0.122      0.809      0.283       1 hr
12    0.0255     0.9797     0.758       1.5 hrs
20    0.00018    0.999928   0.995       2.5 hrs
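The summary table for Exercise 85 can be regenerated from the Poisson pmf, with the parameter for each period obtained as rate × time. This is a minimal sketch; the function names and the dictionary `rows` are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """p(x; lam) = e^(-lam) lam^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)

def poisson_cdf(x, lam):
    """F(x; lam) = P(X <= x) for a nonnegative integer x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

rate = 8                       # arrivals per hour
rows = {}
for t in (1, 1.5, 2.5):        # period lengths in hours
    lam = rate * t             # expected number of arrivals in the period
    rows[t] = (
        lam,
        poisson_pmf(6, lam),           # P(X = 6)
        1 - poisson_cdf(5, lam),       # P(X >= 6)
        1 - poisson_cdf(9, lam),       # P(X >= 10)
    )

for t, (lam, p6, p_ge6, p_ge10) in rows.items():
    print(t, lam, round(p6, 5), round(p_ge6, 6), round(p_ge10, 3))
```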
Vehicles arrive according to a Poisson process at a rate of α = 10 vehicles per hour. Suppose that
p = P(an arriving vehicle has no equipment violations) = 0.5.
a. P(exactly 10 arrive in 1 hr and none has violations) = ?
b. P(y arrive in 1 hr and 10 of the y have no violations) = ?, for y ≥ 10.
c. P(10 no-violation vehicles arrive in the next hour) = ?
[Hint: Sum the probabilities in part (b) from y = 10 to ∞]
a. Let Y = # of cars arriving in the hour. Y ~ Poisson(10).
P(Y = 10 ∩ no violations)
= P(Y = 10) P(no violations | Y = 10)
= [e^{−10}10^{10}/10!][(0.5)^{10}]
= 0.000122
b. P(Y = y ∩ 10 have no violations)
= P(Y = y) P(10 have no violations | Y = y)
= [e^{−10}10^y/y!][$\binom{y}{10}$ (0.5)^{10} (0.5)^{y−10}]
= e^{−10}5^y/[10!(y − 10)!]
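c. P(exactly 10 without violations)
= $\sum_{y=10}^{\infty} \frac{e^{-10} 5^y}{10!\,(y - 10)!}$
= $\frac{e^{-10} 5^{10}}{10!} \sum_{y=10}^{\infty} \frac{5^{y-10}}{(y - 10)!}$
= $\frac{e^{-10} 5^{10}}{10!} \sum_{u=0}^{\infty} \frac{5^u}{u!}$  (with u = y − 10)
= $\frac{e^{-10} 5^{10}}{10!} \cdot e^{5}$
= e^{−5}5^{10}/10!
= p(10; 5)
X = # of vehicles without violations per hour. Then
X ~ Poisson(λ), with λ = αp = 10(0.5) = 5 vehicles/hour.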
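The algebra in part (c) can be sanity-checked numerically by summing the joint probabilities from part (b) and comparing with the Poisson(5) pmf at 10. In this sketch the infinite sum is truncated at y = 99, an assumption that is harmless here because the remaining terms are negligibly small; the function names are illustrative.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """p(x; lam) = e^(-lam) lam^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)

def joint(y, k=10, rate=10, p=0.5):
    """P(Y = y and exactly k of the y vehicles have no violations)."""
    return poisson_pmf(y, rate) * comb(y, k) * p ** k * (1 - p) ** (y - k)

part_a = joint(10)                                  # 10 arrive and all 10 are clean
part_c = sum(joint(y) for y in range(10, 100))      # truncated version of the infinite sum
thinned = poisson_pmf(10, 5)                        # p(10; 5) from the closed form

print(round(part_a, 6), part_c, thinned)
```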
Chapter Summary
Concepts and Probability Distributions Discussed
Random Variables
Bernoulli Random Variables
Probability Mass Functions (pmf)
Cumulative Distribution Functions (cdf)
Expected Values E(X), E[h(X)] and Variances V(X), V[h(X)]
Binomial Distribution
Hypergeometric Distribution
Negative Binomial Distribution
Geometric Distribution
Poisson Distribution
