Sample Space
# Suppose a box contains 100 items of a particular sort, say 100 capacitors, and each capacitor has
a unique production number running from 1101 to 1200. If an experiment consists of randomly
selecting a single capacitor from the box, then an appropriate sample space would be
S₁ = {1101, 1102, … , 1200}
It would also be appropriate to employ the sample space
S₂ = {1100, 1101, 1102, … , 1200}
One might argue that S₂ is less suitable from a modeling perspective, since no physical observation
will correspond to the element 1100; nevertheless, S₂ still satisfies the two necessary modeling
requirements. Insofar as probability is concerned, the probability of choosing 1100 will eventually
be set to zero. A set that cannot serve as a sample space is
S₃ = {1101, 1102, … , 1199}
since no element in 𝑺𝟑 corresponds to the selection of the capacitor with production number 1200.
The elements of a sample space are called outcomes. An outcome is a logical entity and refers only to
the manner in which the phenomena are viewed by the experimenter. For instance in the foregoing
example, if
S₄ = {c₁, c₂} (two possible measured capacitance values, in μF)
Then only two outcomes are realized. While there might be all sorts of information available regarding
the chosen capacitor, once S₄ has been chosen as the sample space, only the measured capacitance is
relevant, since only its observation will result in an outcome (relative to S₄).
Events
In most probability problems, the investigator is interested not merely in the collection of outcomes
but in some subset of the sample space. A subset of a sample space is known as an event. Two
events that do not intersect are said to be mutually exclusive (disjoint). More generally, the events
E₁, E₂, … , Eₙ are said to be mutually exclusive if
Eᵢ ∩ Eⱼ = ∅
for any i ≠ j, where ∅ denotes the empty set.
Probability (Modeling Random Processes for Engineers and Managers, James J. Solberg, John Wiley & Sons Inc., 2009)
When the “probability of an event” is spoken of in everyday language, almost everyone has a rough
idea of what is meant. It is fortunate that this is so, because it would be quite difficult to introduce the
concept to someone who had never considered it before. There are at least three distinct ways to
approach the subject, none of which is wholly satisfying.
The first to appear, historically , was the frequency concept. If an experiment were to be repeated many
times, then the number of times that event was observed to occur, divided by the number of times that
the experiment was conducted, would approach a number that was defined to be the probability of the
event. The ratio of the number of chances for success to the total number of possibilities is the
concept with which most elementary treatments of probability start. This definition proved to be somewhat
limiting, however, because circumstances frequently prohibit repetition of an experiment under
precisely the same conditions, even conceptually. Imagine trying to determine the probability of global
annihilation from a meteor collision.
To extend the notion of probability to a wider class of applications, a second approach involving the idea
of “subjective” probabilities emerged. According to this idea, the probability of an event need not
relate to the frequency with which it would occur in an infinite number of trials; it is just a measure of
the degree of likelihood we believe the event to possess. This definition covers even hypothetical
events, but seems a bit too loose for engineering applications. Different people could attach different
probabilities to the same event.
Most modern texts use the third concept, which relies upon axiomatic definition. According to this
notion, probabilities are just elements of an abstract mathematical system obeying certain axioms. This
notion is at once the most powerful and the most devoid of real world meaning. Of course, the axioms
are not purely arbitrary; they were selected to be consistent with the earlier concepts of probabilities
and to provide them with all of the properties everyone would agree they should have.
We will go with the formal axiomatic system, so that we can be rigorous in the mathematics. We want
to be able to calculate probabilities to assist in making good decisions. At the same time, we want to
bear in mind the real world interpretation of probabilities as measures of the likelihood of events in the
world. The whole point of learning the mathematics is to be able to use it in everyday life.
3. If E₁, E₂, … , Eᵢ, … is a countably infinite family of mutually exclusive events, then
P(E₁ ∪ E₂ ∪ ⋯) = P(E₁) + P(E₂) + ⋯
Once S has been endowed with a probability measure, S is called a probability space.
Some of the additional basic laws of probability (which could be proved from the foregoing) are:
4. P(∅) = 0, where ∅ is the empty set (the impossible event).
5. P(Ā) = 1 − P(A). In other words, the probability that an event does not occur is 1 minus the
probability that it does occur.
6. P(A ∪ B) = P(A) + P(B) − P(A ∩ B), for any two events A and B. When the events are not
mutually exclusive (when there is some possibility for both A and B to occur), one has to subtract
off the probability that they both occur.
7. P(A | B) = P(A ∩ B)/P(B), provided P(B) ≠ 0. This “basic law” is, in reality, a definition of the
conditional probability of an event A, given that another event B has occurred.
8. P(A | B) = P(A) if and only if A and B are independent. This rule can be taken as the formal
definition of independence.
9. P(A ∩ B) = P(A)P(B) if and only if A and B are independent.
A set of events B₁, B₂, … , Bₙ constitutes a partition of the sample space S if the events are
mutually exclusive and collectively exhaustive, that is,
Bᵢ ∩ Bⱼ = ∅ for every pair i ≠ j
and
B₁ ∪ B₂ ∪ ⋯ ∪ Bₙ = S
In simple terms, a partition is just any way of grouping and listing all possible outcomes such that no
outcome appears in more than one group. When the experiment is performed, one and only one of
the Bᵢ will occur.
10. P(A) = Σᵢ P(A | Bᵢ) P(Bᵢ) for any partition Bᵢ, i = 1, 2, … , n. This is one of the most useful
relationships in modeling applications. It is one expression of the so-called law of total probability.
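The law of total probability is easy to check numerically. The sketch below uses an invented three-supplier partition; the names B1, B2, B3 and all numbers are illustrative, not from the text:

```python
from fractions import Fraction

# Hypothetical illustration -- the partition {B1, B2, B3} and every number
# below are invented for the demo, not taken from the text.
P_B = {"B1": Fraction(1, 2), "B2": Fraction(3, 10), "B3": Fraction(1, 5)}               # P(B_i)
P_A_given_B = {"B1": Fraction(1, 100), "B2": Fraction(2, 100), "B3": Fraction(5, 100)}  # P(A | B_i)

assert sum(P_B.values()) == 1  # the B_i are collectively exhaustive

# Law of total probability: P(A) = sum_i P(A | B_i) * P(B_i)
P_A = sum(P_A_given_B[b] * P_B[b] for b in P_B)
print(P_A)  # 21/1000
```

Exact rational arithmetic via `Fraction` makes it obvious that the weighted conditional probabilities sum to a legitimate probability.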
Counting
Given a finite sample space
S = {w₁, w₂, … , wₙ}
of cardinality n, the hypothesis of equal probability is the assumption that
the physical conditions are such that each outcome in S possesses equal
probability:
P(w₁) = P(w₂) = ⋯ = P(wₙ) = 1/n
In such a case, the probability space is said to be equiprobable.
# In deciding the format for a memory word in a new computer, the designer decides on a length of 16
bits. Since each bit can be 0 or 1, the problem of counting the number of possible words can be
modeled as making 16 selections from an urn containing 2 balls. Thus there are 2¹⁶ = 65,536 possible
words.
Now suppose a string of 4 symbols is formed by drawing, uniformly at random and with replacement,
from an alphabet of 9 symbols. What is the probability that a string will be formed in which no symbol
is used more than once? Let E denote the event consisting of all words with no symbol appearing more
than once; then the desired probability is
P(E) = P(9, 4)/9⁴ ≈ 0.461
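Both the exact value P(9, 4)/9⁴ and a Monte Carlo estimate of it can be computed in a few lines (a sketch, assuming Python 3.8+ for `math.perm`):

```python
import math
import random

# Exact probability that a length-4 string over a 9-symbol alphabet,
# drawn with replacement, repeats no symbol: P(9, 4) / 9**4.
exact = math.perm(9, 4) / 9**4   # 3024 / 6561
print(round(exact, 3))  # 0.461

# Monte Carlo check of the same event.
random.seed(0)
trials = 100_000
hits = sum(
    len(set(random.choices(range(9), k=4))) == 4   # all 4 symbols distinct
    for _ in range(trials)
)
print(abs(hits / trials - exact) < 0.01)  # True: the estimate agrees to ~1%
```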
Fig.1
Here 4 possible branches can be chosen for the first selection, 2 for the second,
and 2 for the third. As a result, the tree contains 4 × 2 × 2 = 16 final nodes.
It is crucial to note that at each of the three stages (selections) of the tree, the
number of branches emanating from the nodes is the same; otherwise, as
illustrated in Fig. 2, the multiplication technique of the fundamental principle
does not apply. The requirement that there be a constant number of emanating
branches at each stage corresponds to the condition in the selection protocol
that, at each component, the number of possible choices for the component is
fixed and does not depend on the particular objects chosen to fill the preceding
components.
Fig.2
Again consider the set A. It can readily be seen that there are two 2-tuple
permutations for each 2-element combination. Thus each 2-element subset of
A yields 2! = 2 permutations. This reasoning results in
P(n, k) = k! C(n, k),
or
C(n, k) = P(n, k)/k! = n!/(k! (n − k)!)
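These formulas translate directly into code. The sketch below implements P(n, k) and C(n, k) from factorials and checks the relation P(n, k) = k! C(n, k) on a small example:

```python
import math

def P(n, k):
    """Number of ordered k-tuples (k-permutations) from n objects: n!/(n-k)!"""
    return math.factorial(n) // math.factorial(n - k)

def C(n, k):
    """Number of k-element subsets (k-combinations): P(n, k) / k!"""
    return P(n, k) // math.factorial(k)

# Each k-subset yields k! orderings, so P(n, k) = k! * C(n, k).
print(P(4, 2), C(4, 2))  # 12 6
assert P(4, 2) == math.factorial(2) * C(4, 2)
```

(Python 3.8+ also ships these as `math.perm` and `math.comb`; the explicit factorial form is used here to mirror the derivation.)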
DISCRETE RANDOM VARIABLES AND THEIR DISTRIBUTIONS
(Probability and Statistics for Computer Scientists, Michael Baron, Chapman & Hall/CRC, 2007.)
A random variable is a function of an outcome,
X = f(ω).
In other words, it is a quantity that depends on chance. The domain of the random
variable is the sample space. Its range can be the set of all real numbers R, or only
the positive numbers (0, +∞), or the integers Z, or the interval (0, 1), etc.,
depending on what possible values the random variable can potentially take.
# Consider an experiment of tossing 3 fair coins and counting the number of heads. Certainly, the same
model suits the number of girls in a family with 3 children, the number of 1’s in a random binary code
consisting of 3 characters, etc.
Let X be the number of heads (girls, 1's). Prior to the experiment, its value is not known. All we can
say is that X has to be an integer between 0 and 3. Since assuming each value is an event, we can
compute probabilities:
P(X = 0) = P{three tails} = P(TTT) = (1/2)(1/2)(1/2) = 1/8
P(X = 1) = P(HTT) + P(THT) + P(TTH) = 3/8
P(X = 2) = P(HHT) + P(HTH) + P(THH) = 3/8
P(X = 3) = P(HHH) = 1/8
Summarizing,
𝒙 𝑷{𝑿 = 𝒙}
0 1/8
1 3/8
2 3/8
3 1/8
Total 1
This table contains everything that is known about the random variable X prior to the experiment.
Before we know the outcome ω, we cannot tell what X equals. However, we can list all the
possible values of X and determine the corresponding probabilities.
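The pmf table above can be reproduced by brute-force enumeration of the 8 equally likely outcomes (a minimal sketch):

```python
from fractions import Fraction
from itertools import product

# Enumerate the 2**3 = 8 equally likely outcomes of tossing 3 fair coins
# and tabulate the pmf of X = number of heads.
pmf = {}
for outcome in product("HT", repeat=3):
    x = outcome.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 8)

for x in sorted(pmf):
    print(x, pmf[x])
# 0 1/8
# 1 3/8
# 2 3/8
# 3 1/8
```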
Definition
Recall that one way to compute the probability of an event is to add probabilities of all the
outcomes in it. Hence, for any set 𝑨,
P(X ∈ A) = Σ_{x∈A} P(x).
When A is an interval, its probability can be computed directly from the cdf F(x):
P(a < X ≤ b) = F(b) − F(a).
# (Errors in independent modules). A program consists of two modules. The number of
errors 𝑋1 in the first module has the pmf 𝑃1(𝑥), and the number of errors 𝑋2 in the second
module has the pmf 𝑃2(𝑥), independently of 𝑋1 , where
𝒙 𝑃1(𝑥) 𝑃2(𝑥)
0 0.5 0.7
1 0.3 0.2
2 0.1 0.1
3 0.1 0
Find the pmf of Y = X1 + X2, the total number of errors in the program.
Sol.: We break the problem into steps. First, determine all possible values of Y; then compute
the probability of each value. Clearly, the number of errors Y is an integer that can be as low as
0 + 0 = 0 and as high as 3 + 2 = 5, since P2(3) = 0 means the second module has at most 2 errors.
Next,
PY(0) = P{Y = 0} = P{X1 = X2 = 0} = P1(0)P2(0) = 0.5 · 0.7 = 0.35
PY(1) = P{Y = 1} = P1(0)P2(1) + P1(1)P2(0) = 0.5 · 0.2 + 0.3 · 0.7 = 0.31
PY(2) = P{Y = 2} = P1(0)P2(2) + P1(1)P2(1) + P1(2)P2(0) = 0.5 · 0.1 + 0.3 · 0.2 + 0.1 · 0.7 = 0.18
PY(3) = P{Y = 3} = P1(0)P2(3) + P1(1)P2(2) + P1(2)P2(1) + P1(3)P2(0)
      = 0.5 · 0 + 0.3 · 0.1 + 0.1 · 0.2 + 0.1 · 0.7 = 0.12
PY(4) = P{Y = 4} = P1(2)P2(2) + P1(3)P2(1) = 0.1 · 0.1 + 0.1 · 0.2 = 0.03
PY(5) = P{Y = 5} = P1(3)P2(2) = 0.1 · 0.1 = 0.01
The cumulative distribution function F(y) can be computed similarly.
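The step-by-step computation above is a discrete convolution, which the sketch below performs programmatically for the same two pmfs:

```python
# Pmf of Y = X1 + X2 for independent X1, X2, computed by discrete
# convolution: P_Y(y) = sum over all x1 + x2 = y of P1(x1) * P2(x2).
P1 = {0: 0.5, 1: 0.3, 2: 0.1, 3: 0.1}
P2 = {0: 0.7, 1: 0.2, 2: 0.1}

PY = {}
for x1, p1 in P1.items():
    for x2, p2 in P2.items():
        PY[x1 + x2] = PY.get(x1 + x2, 0.0) + p1 * p2

for y in sorted(PY):
    print(y, round(PY[y], 2))
# 0 0.35
# 1 0.31
# 2 0.18
# 3 0.12
# 4 0.03
# 5 0.01
```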
Families of Discrete Distributions
Bernoulli Distribution
The simplest random variable (excluding non-random ones!) takes just two
possible values. Call them 0 and 1.
p = probability of success, q = 1 − p = probability of failure
Bernoulli distribution:
P(x) = q if x = 0, P(x) = p if x = 1
E(X) = p;  Var(X) = pq
# A computer game is released. Sixty percent of players complete all the levels, and thirty percent
of those will then buy an advanced version of the game. Among 15 users, what is the expected number
of people who will buy the advanced version, and what is the probability that at least two will?
Sol: Let X be the number of people (successes), among the mentioned 15 users (trials),
who will buy the advanced version of the game. It has Binomial distribution with n = 15
trials and the probability of success
p = P{buy advanced}
  = P{buy advanced | complete all levels} P{complete all levels}
  = 0.30 · 0.60 = 0.18
Then
E(X) = np = 15 · 0.18 = 2.7
and
P(X ≥ 2) = 1 − P(X < 2) = 1 − P(0) − P(1) = 1 − (1 − p)ⁿ − np(1 − p)ⁿ⁻¹ = 0.7813
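A quick numerical check of this solution (a sketch; the tail is computed both via the two-term complement and by summing the Binomial pmf directly):

```python
from math import comb

# Binomial(n = 15, p = 0.18) check of the game example above.
n = 15
p = 0.30 * 0.60                     # P{buy} = P{buy | complete} * P{complete}
mean = n * p                        # E(X) = np
p_ge_2 = 1 - (1 - p)**n - n * p * (1 - p)**(n - 1)   # 1 - P(0) - P(1)

# Same tail summed directly from the Binomial pmf, as a cross-check.
p_ge_2_direct = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2, n + 1))
print(round(mean, 2), round(p_ge_2, 4))  # 2.7 0.7813
```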
Geometric distribution
Definition
The number of Bernoulli trials needed to get the first success has Geometric distribution.
# A search engine goes through a list of sites looking for a given key phrase. Suppose the
search terminates as soon as the key phrase is found. The number of sites visited is
Geometric.
# A hiring manager interviews candidates , one by one, to fill a vacancy. The number of
candidates interviewed until one candidate receives an offer has Geometric distribution.
Geometric random variables can take any integer value from 1 to infinity, because one
needs at least 1 trial to have the first success, and the number of trials needed is not limited
by any specific number. (For example, there is no guarantee that among the first 10 coin
tosses there will be at least one head.) The only parameter is p, the probability of a
“success”.
The Geometric probability mass function has the form
P(x) = (1 − p)ˣ⁻¹ p,  x = 1, 2, …
Observe that
Σₓ P(x) = Σ_{x=1}^∞ (1 − p)ˣ⁻¹ p = p / (1 − (1 − p)) = 1
The mean and variance are given by (writing q = 1 − p):
E(X) = Σ_{x=1}^∞ x(1 − p)ˣ⁻¹ p = p (d/dq) Σ_{x=0}^∞ qˣ = p (d/dq) [1/(1 − q)] = p/(1 − q)² = 1/p
Var(X) = (1 − p)/p²
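These two formulas can be checked empirically by simulating Bernoulli trials until the first success (a sketch with an arbitrary p = 0.25 and a fixed seed):

```python
import random

# Empirical check that Geometric(p) has mean 1/p and variance (1-p)/p**2.
def geometric_sample(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:   # failure -- try again
        trials += 1
    return trials

rng = random.Random(42)
p = 0.25
samples = [geometric_sample(p, rng) for _ in range(200_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(abs(mean - 1 / p) < 0.05)          # True: sample mean is near 1/p = 4
print(abs(var - (1 - p) / p**2) < 0.5)   # True: sample variance is near 12
```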
# (St. Petersburg Paradox). This paradox was noticed by the Swiss mathematician
Daniel Bernoulli (1700–1782), a nephew of Jacob. It describes a gambling strategy
that enables one to win any desired amount of money with probability one.
Consider a game that can be played any number of times. Rounds are independent, and each time your
winning probability is 𝑝. The game does not have to be favorable to you or even fair. This 𝑝 can be any
positive probability. For each round, you bet some amount x. In case of a success, you win x. If you lose the
round, you lose x.
The strategy is simple. Your initial bet is the amount that you desire to win eventually. Then, if you win a
round, stop. If you lose a round, double your bet and continue.
Say the desired profit is $100. The game will progress as follows:
Round   Bet   Balance if lose   Balance if win
1       100   −100              +100 and stop
2       200   −300              +100 and stop
3       400   −700              +100 and stop
…       …     …                 …
Sooner or later the game will stop, and at this moment your balance will be $100. Guaranteed! But this is not
what D. Bernoulli called a paradox.
How many rounds should be played? Since each round is a Bernoulli trial, the number of them , 𝑋 , until the
first win is a Geometric random variable with parameter 𝑝.
Is the game endless? No; on average it will last E(X) = 1/p rounds. In a fair game with p = 1/2, one will
need 2 rounds, on average, to win the desired amount. In an “unfair” game, with p < 1/2, it will take
longer to win, but still a finite number of rounds. For example, with p = 0.2, i.e., one win in 5 rounds on
average, one stops after 1/p = 5 rounds, on average. This is not a paradox yet.
Finally, how much money does one need in order to be able to follow this strategy? Let Y be the
amount of the last bet. According to the strategy, Y = 100 · 2ˣ⁻¹. It is a discrete random variable whose
expectation is
E(Y) = Σₓ 100 · 2ˣ⁻¹ P_X(x) = 100 Σ_{x=1}^∞ 2ˣ⁻¹ (1 − p)ˣ⁻¹ p = 100p Σ_{x=1}^∞ [2(1 − p)]ˣ⁻¹
     = 100p / (1 − 2(1 − p))  if p > 1/2
     = +∞                     if p ≤ 1/2
This is the St. Petersburg Paradox! A random variable that is always finite has an infinite expectation! Even
when the game is fair (a 50–50 chance to win), one has to be (on average!) infinitely rich to follow this
strategy.
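The guaranteed win, and the exploding bets behind it, are easy to see in simulation (a sketch of the doubling strategy in a fair game, p = 1/2):

```python
import random

def play(p, rng, target=100):
    """One run of the doubling strategy; returns (profit, largest bet)."""
    bet = target
    balance = 0
    while True:
        if rng.random() < p:    # win: the round pays the current bet
            return balance + bet, bet
        balance -= bet          # loss: balance drops by the current bet
        bet *= 2                # double the bet and continue

rng = random.Random(7)
results = [play(0.5, rng) for _ in range(10_000)]
print(all(profit == 100 for profit, _ in results))  # True: +100 every single time
print(max(bet for _, bet in results))  # the largest bet required -- can be enormous
```

Every run ends with exactly the target profit, yet the largest bet observed across many runs grows without bound, which is the simulated face of the infinite expectation of Y.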
Negative Binomial distribution (Pascal)
In the foregoing, we played the game until the first win. Now suppose we keep playing until
we reach a certain number of wins. The number of games played is then Negative
Binomial.
Definition
The number of Bernoulli trials needed to obtain k successes has Negative Binomial distribution.
Such a variable can be written as X = X₁ + X₂ + ⋯ + X_k, where Xᵢ is the (Geometric) number of
trials from the (i − 1)-st success up to and including the i-th success. Therefore
E(X) = E(X₁ + X₂ + ⋯ + X_k) = k/p
Var(X) = Var(X₁ + X₂ + ⋯ + X_k) = k(1 − p)/p²
# (Sequential testing). In a recent production, 5% of certain electronic components are defective. We
need to find 12 non-defective components for our 12 new computers. Components are tested until 12
non-defective ones are found. What is the probability that more than 15 components will have to be
tested?
Sol.: Let X be the number of components tested until 12 non-defective ones are found. It is the number
of trials needed to see 12 successes, hence X has Negative Binomial distribution with k = 12 and
p = 0.95. We need
P(X > 15) = Σ_{x=16}^∞ P(x) = 1 − F(15),
but there is no convenient table of the Negative Binomial distribution to look this up in.
However, one may compute the left-hand side using the following argument.
P(X > 15) = P{more than 15 trials are needed to get 12 successes}
          = P{15 trials are not sufficient}
          = P{there are fewer than 12 successes in 15 trials}
          = P{Y < 12}
where Y is the number of successes (non-defective components) in 15 trials, which is a Binomial
variable with parameters n = 15 and p = 0.95. Therefore
P(X > 15) = P{Y < 12} = P{Y ≤ 11} = F(11) = 0.0055
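The Binomial reformulation makes this tail directly computable (a sketch using only the Binomial pmf):

```python
from math import comb

# P(X > 15) for Negative Binomial(k = 12, p = 0.95), via the equivalent
# Binomial event: fewer than 12 successes in 15 trials.
n, p, k = 15, 0.95, 12
p_tail = sum(comb(n, y) * p**y * (1 - p)**(n - y) for y in range(k))  # P(Y <= 11)
print(round(p_tail, 4))  # 0.0055
```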
Poisson distribution
This distribution is related to the concept of rare events, or Poissonian events. Essentially
it means that two such events are extremely unlikely to occur within a very short time
or simultaneously. Arrivals of jobs, telephone calls, e-mail messages, traffic accidents,
network blackouts, virus attacks, errors in software, floods, and earthquakes are examples of
rare events.
This distribution bears the name of the famous French mathematician Siméon Denis
Poisson (1781–1840).
Binomial pmf: b(k; n, p) = C(n, k) pᵏ qⁿ⁻ᵏ
Poisson approximation to the Binomial: b(k; n, p) ≈ Poisson(λ) when n ≥ 30, p ≤ 0.05, and np = λ.
# How many students should be in class so that, with probability above 50%, at least two of them
share the same birthday?
Sol.: Let n = N(N − 1)/2 be the number of pairs of students in this class. In each pair, both students
are born on the same day with probability p = 1/365. Each pair is a Bernoulli trial because the two
birthdays either match or don't match. Besides, matches in two different pairs are “nearly” independent.
Therefore, X, the number of pairs sharing birthdays, is “almost” Binomial. For N ≥ 10, n ≥ 45 is large,
and p is small, thus we shall use the Poisson approximation with λ = np = N(N − 1)/730:
P{there are two students sharing a birthday} = 1 − P{no matches}
  = 1 − P(X = 0) ≈ 1 − e^(−λ) ≈ 1 − e^(−N²/730)
Solving the inequality 1 − e^(−N²/730) > 0.5, we obtain N > √(730 ln 2) ≈ 22.5. That is, in a class of at
least N = 23 students, there is a more than 50% chance that at least two students were born on the
same day of the year.
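The approximation can be checked numerically (a sketch; `p_shared` uses the exact pair count N(N − 1)/2 rather than the cruder N² simplification):

```python
import math

def p_shared(N):
    """Poisson approximation to P{at least two of N students share a birthday}."""
    lam = N * (N - 1) / 730    # lambda = np, with n = N(N-1)/2 pairs, p = 1/365
    return 1 - math.exp(-lam)

N = math.ceil(math.sqrt(730 * math.log(2)))   # sqrt(730 ln 2) ~ 22.5, so N = 23
print(N)  # 23
print(p_shared(23) > 0.5, p_shared(22) < 0.5)  # True True
```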