
The Poisson Distribution

Attributes of a Poisson Experiment


A Poisson experiment is a statistical experiment that has the following properties:

• The experiment results in outcomes that can be classified as successes or failures.
• The average number of successes (μ) that occurs in a specified region is known.
• The probability that a success will occur is proportional to the size of the region.
• The probability that a success will occur in an extremely small region is virtually zero.

Note that the specified region could take many forms. For instance, it could be a length, an area,
a volume, a period of time, etc.

Notation
The following notation is helpful when we talk about the Poisson distribution.

• e: A constant equal to approximately 2.71828. (Actually, e is the base of the natural
logarithm system.)
• μ: The mean number of successes that occur in a specified region.
• x: The actual number of successes that occur in a specified region.
• P(x; μ): The Poisson probability that exactly x successes occur in a Poisson experiment,
when the mean number of successes is μ.

Poisson Distribution
A Poisson random variable is the number of successes that result from a Poisson experiment.
The probability distribution of a Poisson random variable is called a Poisson distribution.

Given the mean number of successes (μ) that occur in a specified region, we can compute the
Poisson probability based on the following formula:

Poisson Formula. Suppose we conduct a Poisson experiment, in which the average number of
successes within a given region is μ. Then, the Poisson probability is:

P(x; μ) = (e^−μ)(μ^x) / x!

where x is the actual number of successes that result from the experiment, and e is
approximately equal to 2.71828.
Properties of the Poisson Distribution
The Poisson distribution has the following properties:

• The mean of the distribution is equal to μ.
• The variance is also equal to μ.

Example 1
The average number of homes sold by the Acme Realty company is 2 homes per day. What is
the probability that exactly 3 homes will be sold tomorrow?

Solution: This is a Poisson experiment in which we know the following:

• μ = 2; since 2 homes are sold per day, on average.


• x = 3; since we want to find the likelihood that 3 homes will be sold tomorrow.
• e = 2.71828; since e is a constant equal to approximately 2.71828.

We plug these values into the Poisson formula as follows:

P(x; μ) = (e^−μ)(μ^x) / x!

P(3; 2) = (2.71828^−2)(2^3) / 3!
P(3; 2) = (0.13534)(8) / 6
P(3; 2) = 0.180

Thus, the probability of selling 3 homes tomorrow is 0.180 .
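As a quick check, the calculation above can be reproduced in a few lines of Python (the function name `poisson_pmf` is just an illustrative choice):

```python
import math

def poisson_pmf(x, mu):
    """P(x; mu) = (e^-mu)(mu^x) / x! -- probability of exactly x successes."""
    return math.exp(-mu) * mu**x / math.factorial(x)

# Example 1: mu = 2 homes sold per day on average; exactly 3 sold tomorrow.
p = poisson_pmf(3, 2.0)
print(round(p, 3))  # 0.18
```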

Cumulative Poisson Probability

A cumulative Poisson probability refers to the probability that the Poisson random variable is
greater than some specified lower limit and less than some specified upper limit.

Example 2
Suppose the average number of lions seen on a 1-day safari is 5. What is the probability that
tourists will see fewer than four lions on the next 1-day safari?

Solution: This is a Poisson experiment in which we know the following:

• μ = 5; since 5 lions are seen per safari, on average.


• x = 0, 1, 2, or 3; since we want to find the likelihood that tourists will see fewer than 4
lions; that is, we want the probability that they will see 0, 1, 2, or 3 lions.
• e = 2.71828; since e is a constant equal to approximately 2.71828.
To solve this problem, we need to find the probability that tourists will see 0, 1, 2, or 3 lions. Thus,
we need to calculate the sum of four probabilities: P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5). To
compute this sum, we use the Poisson formula:

P(x < 4; 5) = P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5)

P(x < 4; 5) = [ (e^−5)(5^0) / 0! ] + [ (e^−5)(5^1) / 1! ] + [ (e^−5)(5^2) / 2! ] + [ (e^−5)(5^3) / 3! ]
P(x < 4; 5) = [ (0.006738)(1) / 1 ] + [ (0.006738)(5) / 1 ] + [ (0.006738)(25) / 2 ] + [ (0.006738)(125) / 6 ]
P(x < 4; 5) = [ 0.006738 ] + [ 0.033690 ] + [ 0.084225 ] + [ 0.140375 ]
P(x < 4; 5) = 0.2650

Thus, the probability that tourists will see fewer than 4 lions is 0.2650.
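Cumulative calculations like this are easy to script. A minimal Python sketch (helper names are illustrative):

```python
import math

def poisson_pmf(x, mu):
    """P(x; mu) = (e^-mu)(mu^x) / x!"""
    return math.exp(-mu) * mu**x / math.factorial(x)

def poisson_cdf(x, mu):
    """P(X <= x): sum of P(k; mu) for k = 0, 1, ..., x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Example 2: mean of 5 lions per safari; fewer than 4 means 0, 1, 2, or 3.
p = poisson_cdf(3, 5.0)
print(round(p, 4))  # 0.265
```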

The Poisson Distribution

The Poisson Distribution arises in a number of contexts as the distribution of a random number of
points, for example the number of clicks of a Geiger counter in one second, the number of raisins
in a box of raisin bran, the number of blades of grass in a randomly chosen square inch of lawn,
and so forth.

The formula for the probability of observing k of whatever is being counted when the expected
number is m is

p(k) = m^k e^−m ⁄ k!

where e is the base of the natural logarithms and k! indicates the factorial function (use the e^x
key on a scientific calculator to calculate e^−m and the x! key to calculate k!).

Theoretically any count between zero and infinity (including zero) is possible, but the probability
of large counts falls off very rapidly.

In a lottery, the number of winners cannot have an exact Poisson distribution for two reasons.

• The number of winners cannot be more than the number of tickets sold, whereas the
Poisson distribution gives nonzero probability to arbitrarily large numbers of winners.
• The choice of lottery numbers by the players is not completely random. If you choose a
popular number, you will have to share with many other winners if you win. Conversely, if
you can figure out a number no one else likes and play that, you are guaranteed not to
have to share the jackpot if you win. The Poisson distribution assumes every player
chooses lottery numbers completely at random.

The first issue is not a serious problem. The Poisson distribution would be an extremely good
approximation if it were not for the other issue. The second is more serious. Many players (about
70%) buy quick picks which are completely random, but other players choose some number they
think is lucky and that's not random. If every player chose a quick pick, the Poisson distribution
would be an almost perfect approximation. Since they don't, it is not quite right. However, we will
assume the Poisson distribution is correct to keep things simple.
The reason why the unconditional distribution of the number of winners of the jackpot and
the conditional distribution of the number of other winners given you win are the same has
to do with the assumption of completely random choices of numbers by all the players,
which is required for the correctness of the Poisson distribution. Then whether you
win or not doesn't change the probability of anyone else winning. Everyone has the same
1 in 146.1 million chance of winning, and their ticket choice had nothing to do with
yours.

Our Expected Winnings


If we win and there are k other winners, then the jackpot gets split k + 1 ways, and the amount
we win is J ⁄ (k + 1), where J is the size of the jackpot.

Our expected winnings are calculated just like any other expectation: multiply the amount we win
in each case, which is J ⁄ (k + 1), by the probability of that case, which is mk e− m ⁄ k !, and
sum. The sum runs over k from zero to infinity, so it appears to require calculus to sum this
infinite series.

Fortunately, there is a trick that allows us to see what the expectation is without doing the infinite
sum. The terms in the infinite sum are

a_k = J ⁄ (k + 1) × m^k e^−m ⁄ k! = J m^k e^−m ⁄ (k + 1)!

Let W denote the sum of the a_k as k runs from zero to infinity, which is the expectation we are
trying to calculate.

If we multiply each term by m we get

m a_k = J m^(k+1) e^−m ⁄ (k + 1)! = J p(k + 1)

where p(k) is the Poisson probability defined above. The probabilities p(k) must sum to one as k
goes from zero to infinity by the properties of probability. Because of the k + 1 above, the first
term is J p(1). If we were to add an additional term J p(0), the series would sum to J (because the
probabilities sum to one). Thus the series sums to

J [1 − p(0)] = m W

(Recall that we multiplied by m so the sum is m W rather than W). Solving for W gives

W = J [1 − p(0)] ⁄ m = J (1 − e^−m) ⁄ m



A discrete random variable X with a probability distribution function (p.d.f.) of the form:

P(X = x) = (e^−λ)(λ^x) ⁄ x!,  x = 0, 1, 2, ...

is said to be a Poisson random variable with parameter λ. We write X ~ Po(λ)

Expectation and Variance

If X ~ Po(λ), then:

· E(X) = λ.

· Var(X) = λ.

Sums of Poisson
Suppose X and Y are independent Poisson random variables with parameters λ and μ
respectively. Then X + Y has a Poisson distribution with parameter λ + μ. In other words:

• If X ~ Po(λ) and Y ~ Po(μ), then X + Y ~ Po(λ + μ)
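One way to check this additivity numerically: the pmf of X + Y is the convolution of the two individual pmfs, and it should match the Po(λ + μ) pmf term by term. A Python sketch with illustrative parameters λ = 2 and μ = 3:

```python
import math

def po_pmf(k, lam):
    """Poisson pmf: e^-lam * lam^k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam, mu = 2.0, 3.0
for k in range(10):
    # P(X + Y = k): sum over the ways X = j and Y = k - j can combine.
    conv = sum(po_pmf(j, lam) * po_pmf(k - j, mu) for j in range(k + 1))
    assert abs(conv - po_pmf(k, lam + mu)) < 1e-12
print("convolution of Po(2) and Po(3) matches Po(5)")
```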

Random Events
The Poisson distribution is useful because many random events follow it.
If a random event has a mean number of occurrences λ in a given time period, then the number of
occurrences within that time period will follow a Poisson distribution.
For example, the occurrence of earthquakes could be considered to be a random event. If there
are 5 major earthquakes each year, then the number of earthquakes in any given year will have a
Poisson distribution with parameter 5.

Example
There are 50 misprints in a book which has 250 pages. Find the probability that page 100 has no
misprints.
The average number of misprints on a page is 50/250 = 0.2. Therefore, if we let X be the random
variable denoting the number of misprints on a page, X will follow a Poisson distribution with
parameter λ = 0.2.

P(X = 0) = (e^−0.2)(0.2^0) ⁄ 0! = 0.819 (3sf)

Binomial Approximation
The Poisson distribution can be used as an approximation to the binomial distribution.
• A Binomial distribution with parameters n and p can be approximated by a Poisson
distribution with parameter np, provided n is large and p is small.
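A short numerical comparison illustrates the approximation; the values n = 250 and p = 0.002 are arbitrary choices satisfying "n large, p small":

```python
import math

def binom_pmf(k, n, p):
    """Binomial pmf: C(n, k) p^k (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def po_pmf(k, lam):
    """Poisson pmf: e^-lam * lam^k / k!"""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 250, 0.002        # illustrative: many trials, small success probability
lam = n * p              # Poisson parameter np = 0.5
for k in range(4):
    print(k, round(binom_pmf(k, n, p), 4), round(po_pmf(k, lam), 4))
```

The two columns agree to about three decimal places, and the agreement improves as n grows with np held fixed.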

The Probability Density Function

We have shown that the k th arrival time in the Poisson process has the gamma probability
density function with shape parameter k and rate parameter r:

f_k(t) = r^k t^(k−1) e^(−rt) ⁄ (k − 1)!,  t ≥ 0

Recall also that at least k arrivals come in the interval (0, t] if and only if the k th arrival occurs by
time t:

(N_t ≥ k) ⇔ (T_k ≤ t)

1. Use integration by parts to show that


ℙ(N_t ≥ k) = ∫₀ᵗ f_k(s) ds = 1 − Σ_{j=0}^{k−1} e^(−rt) (rt)^j ⁄ j!,  k ∈ ℕ+

2. Use the result of Exercise 1 to show that the probability density function of the number of
arrivals in the interval (0, t] is

ℙ(N_t = k) = e^(−rt) (rt)^k ⁄ k!,  k ∈ ℕ

The corresponding distribution is called the Poisson distribution with parameter rt; the distribution
is named after Siméon Poisson.
3. In the Poisson experiment, vary r and t with the scroll bars and note the shape of the density
function. Now with r = 2 and t = 3, run the experiment 1000 times with an update frequency of 10
and watch the apparent convergence of the relative frequency function to the density function.
The Poisson distribution is one of the most important in probability. In general, a discrete random
variable N in an experiment is said to have the Poisson distribution with parameter c > 0 if it has
the probability density function
g(k) = e^(−c) c^k ⁄ k!,  k ∈ ℕ

4. Show directly that g is a valid probability density function.

5. Show that

a. g(n − 1) < g(n) if and only if n < c.


b. g at first increases and then decreases, and thus the distribution is unimodal
c. If c is not an integer, there is a single mode at ⌊c⌋. If c is an integer there are two modes at c −
1 and c.

6. Suppose that requests to a web server follow the Poisson model with rate r = 5 per minute.
Find the probability that there will be at least 8 requests in a 2 minute period.
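A sketch of how Exercise 6 can be computed, assuming the mean over the 2 minute period is rt = 10 (the helper name `poisson_sf` is an illustrative choice):

```python
import math

def poisson_sf(k, mean):
    """P(N >= k) = 1 - sum over j < k of e^-mean * mean^j / j!"""
    return 1 - sum(math.exp(-mean) * mean**j / math.factorial(j)
                   for j in range(k))

# Rate r = 5 per minute over t = 2 minutes, so the Poisson mean is rt = 10.
print(round(poisson_sf(8, 10.0), 4))  # 0.7798
```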

7. Defects in a certain type of wire follow the Poisson model with rate 1.5 per meter. Find the
probability that there will be no more than 4 defects in a 2 meter piece of the wire.

Moments

Suppose that N has the Poisson distribution with parameter c. The following exercises give the
mean, variance, and probability generating function of N.

8. Show that 𝔼(N) = c.

9. Show that var(N) = c.

10. Show that 𝔼(u^N) = e^(c(u−1)) for u ∈ ℝ.


Returning to the Poisson process {N_t : t ≥ 0} with rate parameter r, it follows that 𝔼(N_t) = rt
and var(N_t) = rt for t ≥ 0. Once again, we see that r can be interpreted as the average arrival
rate. In an interval of length t, we expect about rt arrivals.

11. In the Poisson experiment, vary r and t with the scroll bars and note the location and size of
the mean/standard deviation bar. Now with r = 3 and t = 4, run the experiment 1000 times with
an update frequency of 10 and watch the apparent convergence of the sample mean and
standard deviation to the distribution mean and standard deviation, respectively.

12. Suppose that customers arrive at a service station according to the Poisson model, at a rate
of r = 4. Find the mean and standard deviation of the number of customers in an 8 hour period.
Stationary, Independent Increments

Let us see what the basic regenerative assumption of the Poisson process means in terms of the
counting variables {N_t : t ≥ 0}.

13. Show that if s < t, then N_t − N_s is the number of arrivals in the interval (s, t].
Recall that our basic assumption is that the process essentially starts over at time s and the
behavior after time s is independent of the behavior before time s.

14. Argue that:


a. N_t − N_s has the same distribution as N_(t−s), namely Poisson with parameter r(t − s).
b. N_t − N_s and N_s are independent.

15. Suppose that N and M are independent random variables, and that N has the Poisson
distribution with parameter c and M has the Poisson distribution with parameter d. Show that
N + M has the Poisson distribution with parameter c + d.

a. Give a probabilistic proof, based on the Poisson process.

b. Give an analytic proof using probability density functions.

c. Give an analytic proof using probability generating functions.

16. In the Poisson experiment, select r = 1 and t = 3. Run the experiment 1000 times, updating
after each run. By computing the appropriate relative frequency functions, investigate empirically
the independence of the random variables N_1 and N_3 − N_1.

Normal Approximation

Now note that for k ∈ ℕ+,


N_k = N_1 + (N_2 − N_1) + ··· + (N_k − N_(k−1))

The random variables in the sum on the right are independent and each has the Poisson
distribution with parameter r.

17. Use the central limit theorem to show that the distribution of the standardized variable below
converges to the standard normal distribution as k → ∞:

Z_k = (N_k − kr) ⁄ √(kr)

A bit more generally, the same result is true with the integer k replaced by the positive real
number c. Thus, if N has the Poisson distribution with parameter c, and c is “large”, then the
distribution of N is approximately normal with mean c and standard deviation √c. When using the
normal approximation, we should remember to use the continuity correction, since the Poisson
is a discrete distribution.
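A sketch of the approximation with continuity correction, using the parameters from Exercise 19 (r = 5, t = 4, so c = 20) and comparing against the exact Poisson sum:

```python
import math

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def poisson_normal_approx(a, b, c):
    """Approximate P(a <= N <= b) for N ~ Poisson(c), continuity-corrected."""
    sd = math.sqrt(c)
    return phi((b + 0.5 - c) / sd) - phi((a - 0.5 - c) / sd)

approx = poisson_normal_approx(15, 22, 20.0)
exact = sum(math.exp(-20) * 20.0**k / math.factorial(k) for k in range(15, 23))
print(round(approx, 4), round(exact, 4))
```

With c = 20 the normal approximation is already within a couple of percentage points of the exact probability.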

18. In the Poisson experiment, set r = 1 and t = 1. Increase r and t and note how the graph of the
probability density function becomes more bell-shaped.

19. In the Poisson experiment, set r = 5 and t = 4. Run the experiment 1000 times with an update
frequency of 100.

Compute and compare the following:

a. ℙ(15 ≤ N_4 ≤ 22)
b. The relative frequency of the event {15 ≤ N_4 ≤ 22}.
c. The normal approximation to ℙ(15 ≤ N_4 ≤ 22).

20. Suppose that requests to a web server follow the Poisson model with rate r = 5 per minute. Compute the
normal approximation to the probability that there will be at least 280 requests in a 1 hour period.

Conditional Distributions

Consider again the Poisson model with arrival time sequence (T_1, T_2, ...) and counting process
{N_t : t ≥ 0}.

21. Let t > 0. Show that the conditional distribution of T_1 given N_t = 1 is uniform on the interval
(0, t). Interpret the result.

22. More generally, show that given N_t = n, the conditional distribution of (T_1, ..., T_n) is the
same as the distribution of the order statistics of a random sample of size n from the uniform
distribution on the interval (0, t).

Note that the conditional distribution in the last exercise is independent of the rate r. This result
means that, in a sense, the Poisson model gives the most "random" distribution of points in time.
23. Suppose that requests to a web server follow the Poisson model, and that 1 request comes in
a five minute period. Find the probability that the request came during the first 3 minutes of the
period.
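Exercise 23's answer (3/5, by the uniform conditional distribution) can be checked by simulation. The rate below is an arbitrary assumption; the conditional result should not depend on it:

```python
import random

# Simulate a Poisson process via exponential interarrival gaps, condition on
# exactly one arrival in a 5 minute window, and record whether that arrival
# fell within the first 3 minutes.  The rate 0.2/min is an arbitrary choice.
random.seed(0)
r, t, first_part = 0.2, 5.0, 3.0
early = total = 0
for _ in range(100_000):
    arrivals, s = [], random.expovariate(r)
    while s <= t:
        arrivals.append(s)
        s += random.expovariate(r)
    if len(arrivals) == 1:          # condition on exactly one arrival in (0, t]
        total += 1
        early += arrivals[0] <= first_part
print(round(early / total, 2))      # close to 3/5 = 0.6
```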

24. In the Poisson experiment, set r = 1 and t = 2. Run the experiment 1000 times, updating after
each run. Compute the appropriate relative frequency functions and investigate empirically the
theoretical result in Exercise 23.

25. Suppose that 0 < s < t and that n is a positive integer. Show that the conditional distribution of
N_s given N_t = n is binomial with trial parameter n and success parameter p = s ⁄ t. Note that the
conditional distribution is independent of the rate r. Interpret the result.

26. Suppose that requests to a web server follow the Poisson model, and that 10 requests come
during a 5 minute period. Find the probability that at least 4 requests came during the first 3
minutes of the period.
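A sketch of Exercise 26 using the binomial conditional distribution from Exercise 25, with n = 10 and p = s/t = 3/5:

```python
import math

def binom_pmf(k, n, p):
    """Binomial pmf: C(n, k) p^k (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Given N_5 = 10 arrivals in 5 minutes, the count in the first 3 minutes is
# binomial with n = 10 and p = 3/5; we want P(at least 4).
p_at_least_4 = sum(binom_pmf(k, 10, 0.6) for k in range(4, 11))
print(round(p_at_least_4, 4))  # 0.9452
```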

Estimating the Rate

In many practical situations, the rate r of the process is unknown and must be estimated based
on observing the number of arrivals in an interval.

27. In the Poisson experiment, set r = 3 and t = 5. Run the experiment 100 times, updating after
each run.

a. For each run, compute the estimate of r based on N_t.


b. Over the 100 runs, compute the average of the squares of the errors.
c. Compare the result in (b) with the variance in Exercise 28.

29. Suppose that requests to a web server follow the Poisson model with unknown rate r per
minute. In a one hour period, the server receives 342 requests. Estimate r.
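A minimal sketch of the estimate for Exercise 29, using the fact that 𝔼(N_t) = rt suggests the estimator N_t ⁄ t:

```python
# Natural estimator of the rate: observed count divided by elapsed time,
# since E(N_t) = r * t implies r is approximately N_t / t.
def estimate_rate(count, t):
    return count / t

# Exercise 29: 342 requests observed in one hour = 60 minutes.
r_hat = estimate_rate(342, 60)
print(r_hat)  # 5.7
```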