
AAOC C111: PROBABILITY &
STATISTICS
BITS-PILANI HYDERABAD CAMPUS
Presented by
Dr. M.S. Radhakrishnan
Email: msr@bits-hyderabad.ac.in
Lecture 6
Hypergeometric
Distribution

Text Book: J. SUSAN MILTON and JESSE C. ARNOLD,
Introduction to Probability and Statistics,
Tata McGraw-Hill Edition, Fourth Reprint 2008.
5-Aug-19 Prepared by Dr. M.S. Radhakrishnan, BITS-Pilani 2
In this lecture we look at
• Hypergeometric distribution
• Binomial approximation to
hypergeometric distribution



The Hypergeometric Distribution
Suppose there are N items of which r are
defective. Suppose we choose, at random, n
items without replacement. Let X be the
number of defective items found among the
n items chosen. It is clear that X is a discrete
r.v. that can take values
$$\max\{0,\ n-(N-r)\} \le x \le \min\{r,\ n\}.$$
We want to find the probability distribution
of the r.v. X. That is, we want to find P(X = x).

Now n items among N items can be chosen
at random in any one of $\binom{N}{n}$ equally likely ways.
Out of the r defectives, x defectives can
be chosen in $\binom{r}{x}$ ways.
Out of the (N-r) non-defectives, n-x non-defectives
can be chosen in $\binom{N-r}{n-x}$ ways.
Hence the probability of finding x defectives
among the n items chosen is
$$h(x; n, r, N) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}, \qquad \max\{0,\ n-(N-r)\} \le x \le \min\{n,\ r\}.$$
We say the random variable X has a
hypergeometric distribution with
parameters n, r and N.
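The formula for h(x; n, r, N) can be checked numerically. The following is a minimal Python sketch, not part of the original lecture: the function name `hypergeom_pmf` and the parameter values N = 20, r = 7, n = 5 are assumptions chosen only for illustration. It evaluates the probabilities with `math.comb` and confirms that they add up to 1 over the admissible values of x.

```python
from math import comb

def hypergeom_pmf(x, n, r, N):
    """h(x; n, r, N): probability of x defectives in a sample of n items
    drawn without replacement from N items, r of which are defective."""
    lo, hi = max(0, n - (N - r)), min(n, r)
    if x < lo or x > hi:
        return 0.0  # x lies outside the admissible range
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Illustrative values (an assumption, not from the lecture).
N, r, n = 20, 7, 5
support = range(max(0, n - (N - r)), min(n, r) + 1)
total = sum(hypergeom_pmf(x, n, r, N) for x in support)
print(f"sum of h(x; {n}, {r}, {N}) over the support = {total:.6f}")
```

Returning 0 outside the admissible range keeps the function safe to call for any integer x, which is convenient when summing over 0, 1, ..., n.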
Example
Among the 12 solar collectors on display at a
trade show, 9 are flat plate collectors and the
others are concentrating collectors. If a person
visiting the show randomly selects 4 of the
solar collectors to check out, what is the
probability that 3 of them will be flat plate
collectors?
Solution: If X is the number of flat plate
collectors among the 4 solar collectors
selected, then X is a r.v. with hypergeometric
distribution with parameters n=4, r=9, N=12.
Thus
$$P(X = 3) = h(3; 4, 9, 12) = \frac{\binom{9}{3}\binom{3}{1}}{\binom{12}{4}} = \frac{8 \times 7 \times 3 \times 4}{12 \times 11 \times 10} = \frac{28}{55} = 0.5091.$$
Mean of the Hypergeometric
Distribution
Let X be a r.v. having the Hypergeometric
Distribution with parameters n, r and N.
Then the mean or expected value of X is
$$\mu = E(X) = \sum_{x} x\,P(X = x) = 0 \cdot P(X = 0) + 1 \cdot P(X = 1) + 2 \cdot P(X = 2) + \dots$$
(Note the first term is 0.)
$$\begin{aligned}
\mu &= \sum_{k=1}^{\min(n,r)} k\,P(X = k) \\
&= \sum_{k=1}^{\min(n,r)} k\,\frac{\binom{r}{k}\binom{N-r}{n-k}}{\binom{N}{n}} \\
&= n\,\frac{r}{N} \sum_{k=1}^{\min(n,r)} \frac{\binom{r-1}{k-1}\binom{(N-1)-(r-1)}{(n-1)-(k-1)}}{\binom{N-1}{n-1}} \\
&= n\,\frac{r}{N}.
\end{aligned}$$
Here we used $k\binom{r}{k} = r\binom{r-1}{k-1}$ and $\binom{N}{n} = \frac{N}{n}\binom{N-1}{n-1}$. The last sum equals 1 because its terms are the hypergeometric probabilities $h(k-1;\ n-1,\ r-1,\ N-1)$.
Variance of the Hypergeometric
Distribution
Let X be a r.v. having the Hypergeometric
Distribution with parameters n, r and N.
Then the variance of X is
$$\sigma^2 = \mathrm{Var}(X) = n\,\frac{r}{N}\left(1 - \frac{r}{N}\right)\left(\frac{N-n}{N-1}\right).$$
The factor $\frac{N-n}{N-1}$ is often referred to as the finite
population correction factor.
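The closed forms for the mean and variance can be compared with a direct summation over the distribution. The sketch below is an illustration only: the helper name `hypergeom_moments` and the parameters n = 5, r = 7, N = 20 are assumptions, not taken from the lecture. It uses exact rational arithmetic from `fractions`, so the comparison involves no rounding.

```python
from fractions import Fraction
from math import comb

def hypergeom_moments(n, r, N):
    """Return (mean, variance) of the hypergeometric distribution,
    obtained by summing over the pmf with exact rational arithmetic."""
    lo, hi = max(0, n - (N - r)), min(n, r)
    denom = comb(N, n)
    pmf = {x: Fraction(comb(r, x) * comb(N - r, n - x), denom)
           for x in range(lo, hi + 1)}
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, var

# Illustrative parameters (an assumption, chosen only for this check).
n, r, N = 5, 7, 20
mean, var = hypergeom_moments(n, r, N)

p = Fraction(r, N)
formula_mean = n * p                                  # n * r / N
formula_var = n * p * (1 - p) * Fraction(N - n, N - 1)  # with the correction factor

print("mean matches:", mean == formula_mean)
print("variance matches:", var == formula_var)
```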
Exercise 58 Section 3.7 Page 92
Production line workers assemble 15
automobiles per hour. During a given
hour, four are produced with improperly
fitted doors. Three automobiles are
selected at random and inspected. Let X
denote the number inspected that have
improperly fitted doors.


(a) Find the density of X.
(b) Find E[X] and Var[X].
(c) Find the probability that at most one will
be found with improperly fitted doors.
Solution
(a) $h(x; 3, 4, 15) = \dfrac{\binom{4}{x}\binom{11}{3-x}}{\binom{15}{3}}, \qquad 0 \le x \le 3$

(b) $E[X] = 3 \times \dfrac{4}{15} = \dfrac{4}{5} = 0.8$

$\mathrm{Var}[X] = 3 \times \dfrac{4}{15} \times \dfrac{11}{15} \times \dfrac{12}{14} = \dfrac{264}{525} = 0.5029$

(c) $P(X \le 1) = P(X = 0) + P(X = 1)$
$$= \frac{\binom{4}{0}\binom{11}{3}}{\binom{15}{3}} + \frac{\binom{4}{1}\binom{11}{2}}{\binom{15}{3}} = \frac{165 + 220}{455} = \frac{385}{455} = 0.8462$$
Example
What is the probability that an IRS auditor
will catch only 2 income tax returns with
illegitimate deductions if she randomly
selects 6 returns from among 18 returns, of
which 8 contain illegitimate deductions?
Solution: If X is the number of income tax
returns with illegitimate deductions among
the 6 returns selected, then X is a r.v. with
hypergeometric distribution with parameters
n=6, r=8, N=18.
Thus
$$P(X = 2) = h(2; 6, 8, 18) = \frac{\binom{8}{2}\binom{10}{4}}{\binom{18}{6}} = \frac{8 \times 7 \times 10 \times 9 \times 8 \times 7 \times 15}{18 \times 17 \times 16 \times 15 \times 14 \times 13} = \frac{70}{221} = 0.3167.$$
Sampling with replacement
Suppose from among N items (of which r are
defective), n items are chosen at random, one
by one, with replacement. Let X be the
number of defective items drawn among the
n items chosen.
We ask what is P (X = x)?
Now we can have x defectives among the
n items chosen in ${}^{n}C_{x}$ mutually exclusive ways.
Now the probability of finding a defective
item in any one draw is
$$\frac{r}{N} = p \ \text{(say)}$$
as we draw the items with replacement.
Thus the probability of finding x defectives
in any one of these ways is $p^{x} q^{n-x}$,
where q = 1 – p is the probability of getting a
nondefective.
Hence the probability of getting x defective
items among the n items chosen with
replacement is
$$P(X = x) = {}^{n}C_{x}\, p^{x} q^{n-x}, \qquad x = 0, 1, \dots, n.$$
Thus X has a binomial distribution with
parameters n and p = r/N.


Binomial approximation to the
Hypergeometric Distribution
If in the hypergeometric distribution, n is
small compared with N, then the
hypergeometric probability h (x; n, r, N) can
be approximated by the binomial probability,
b (x; n, r/N).



Example
A shipment of 120 burglar alarms contains 5
defective alarms. If 3 of these alarms are
randomly selected and shipped to a customer,
find the probability that the customer will get
one bad unit by using
(a) the formula for the hypergeometric
distribution;
(b) Binomial approximation to the
hypergeometric distribution.
Solution: (a) X, the number of defective
alarms among the 3 alarms shipped, is a r.v.
having a hypergeometric distribution with
parameters n = 3, r = 5 and N = 120.
And we want
$$P(X = 1) = h(1; 3, 5, 120) = \frac{\binom{5}{1}\binom{115}{2}}{\binom{120}{3}} = \frac{5 \times 115 \times 114 \times 3}{120 \times 119 \times 118} = 0.1167.$$
(b) By using the binomial approximation,
the required probability is
$$h(1; 3, 5, 120) \approx b(1; 3, 5/120) = 3 \left(\frac{1}{24}\right)\left(\frac{23}{24}\right)^{2} = 0.1148.$$
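The two answers for the burglar alarm shipment can be reproduced in a few lines. This is a minimal sketch for checking the arithmetic; the function names `hypergeom_pmf` and `binom_pmf` are our own and are not notation from the textbook.

```python
from math import comb

def hypergeom_pmf(x, n, r, N):
    """h(x; n, r, N) for x within the admissible range."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, r, N = 3, 5, 120          # values from the burglar alarm example
exact = hypergeom_pmf(1, n, r, N)
approx = binom_pmf(1, n, r / N)

print(f"hypergeometric h(1; 3, 5, 120) = {exact:.4f}")   # 0.1167
print(f"binomial       b(1; 3, 5/120)  = {approx:.4f}")  # 0.1148
```

Here n/N = 3/120 = 0.025, so the sample is small relative to the population and the two values agree to about two decimal places.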
Exercise 60 Section 3.7 Page 92
A random telephone poll is conducted to
ascertain public opinion concerning the
construction of a nuclear power plant in a
particular community. Assume that there are
150,000 numbers listed for private
individuals and that 90,000 of these would
elicit a negative response if contacted. Let
X denote the number of negative responses
obtained in 15 calls.
(a) Find the density of X.
(b) Find E[X] and Var[X].
(c) Set up the calculations needed to find
$P[X \ge 6]$.
(d) Use the binomial tables to approximate
$P[X \ge 6]$.


Solution
(a) $h(x; 15, 90000, 150000) = \dfrac{\binom{90000}{x}\binom{60000}{15-x}}{\binom{150000}{15}}, \qquad 0 \le x \le 15$

(b) $E[X] = 15 \times \dfrac{90000}{150000} = 9$

$\mathrm{Var}[X] = 15 \times \dfrac{90000}{150000} \times \dfrac{60000}{150000} \times \dfrac{149985}{149999} = 3.600$
(d)
$$\begin{aligned}
P[X \ge 6] &= 1 - P[X \le 5] \\
&= 1 - H[5; 15, 90000, 150000] \\
&\approx 1 - B[5; 15, 90000/150000] \\
&= 1 - B[5; 15, 3/5] = 1 - B[5; 15, 0.6] \\
&= 1 - 0.0338 \\
&= 0.9662
\end{aligned}$$
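With a computer the exact hypergeometric tail can be evaluated directly rather than read from tables, which gives a check on the binomial approximation. The sketch below is an illustration, not part of the original solution; the variable names are our own. It relies on Python's arbitrary-precision integers, so the very large binomial coefficients cause no overflow.

```python
from math import comb

N, r, n = 150_000, 90_000, 15   # values from Exercise 60
p = r / N

# Exact: H(5) = sum_{x=0}^{5} h(x; 15, 90000, 150000)
denom = comb(N, n)
hyper_cdf5 = sum(comb(r, x) * comb(N - r, n - x) for x in range(6)) / denom

# Binomial approximation: B(5; 15, 0.6)
binom_cdf5 = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(6))

print(f"exact   P[X >= 6] = 1 - H(5) = {1 - hyper_cdf5:.4f}")
print(f"approx. P[X >= 6] = 1 - B(5) = {1 - binom_cdf5:.4f}")
```

The numerator is summed as an exact integer and divided once at the end, which avoids accumulating rounding error across the six terms.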


Chebyshev’s Theorem
Let X be a r.v. having mean $\mu$ and s.d. $\sigma$. Then
for any positive number k, the probability of
getting a value of X which deviates from $\mu$ by
at least $k\sigma$ is at most $1/k^2$.
In symbols,
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}.$$


Proof
[Figure: number line marked at $\mu - k\sigma$, $\mu$ and $\mu + k\sigma$, divided into the regions R1, R2, R3.]
We divide the range of X into three regions:

R1, where $X \le \mu - k\sigma$

R2, where $\mu - k\sigma < X < \mu + k\sigma$

R3, where $X \ge \mu + k\sigma$

We shall denote P(X = x) by f(x).


Variance of X:
$$\begin{aligned}
\sigma^2 &= \sum_{x} (x - \mu)^2 P(X = x) = \sum_{x} (x - \mu)^2 f(x) \\
&= \sum_{x \in R_1} (x - \mu)^2 f(x) + \sum_{x \in R_2} (x - \mu)^2 f(x) + \sum_{x \in R_3} (x - \mu)^2 f(x)
\end{aligned}$$


Neglecting the middle term, which is $\ge 0$, we get
$$\sigma^2 \ge \sum_{x \in R_1} (x - \mu)^2 f(x) + \sum_{x \in R_3} (x - \mu)^2 f(x).$$
In the region R1, $x \le \mu - k\sigma$, or $x - \mu \le -k\sigma$.
Hence $(x - \mu)^2 \ge k^2\sigma^2$.

In the region R3, $x \ge \mu + k\sigma$, or $x - \mu \ge k\sigma$.
Hence $(x - \mu)^2 \ge k^2\sigma^2$.
Hence
$$\sigma^2 \ge k^2\sigma^2 \sum_{x \in R_1} f(x) + k^2\sigma^2 \sum_{x \in R_3} f(x),$$
or
$$1 \ge k^2 \left[ \sum_{x \in R_1} f(x) + \sum_{x \in R_3} f(x) \right],$$
i.e.
$$\sum_{x \in R_1} f(x) + \sum_{x \in R_3} f(x) \le \frac{1}{k^2},$$
or
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}. \qquad \text{Q.E.D.}$$
Remark: The event $\{|X - \mu| < k\sigma\}$ is
complementary to the event $\{|X - \mu| \ge k\sigma\}$.
Hence we can also say that
$$P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}.$$
Putting $k\sigma = \varepsilon$, we can also say that
$$P(|X - \mu| \ge \varepsilon) \le \frac{\sigma^2}{\varepsilon^2}.$$
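Chebyshev's bound holds for any distribution with finite variance, so it is usually far from tight. As a rough illustration, the following Python sketch computes the exact tail probability for a binomial r.v. and prints it next to 1/k². The helper names and the choice n = 50, p = 0.3 are assumptions made only for this example.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def tail_prob(n, p, k):
    """Exact P(|X - mu| >= k*sigma) for X ~ Binomial(n, p)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return sum(binom_pmf(x, n, p)
               for x in range(n + 1)
               if abs(x - mu) >= k * sigma)

# Illustrative distribution (an assumption, not from the lecture).
n, p = 50, 0.3
for k in (1.5, 2, 3):
    exact = tail_prob(n, p, k)
    bound = 1 / k**2
    print(f"k = {k}: exact tail = {exact:.4f}, Chebyshev bound = {bound:.4f}")
```

In every row the exact tail probability should sit below the bound, which is what the theorem guarantees.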
Example
In 1 out of 6 cases, material for bulletproof
vests fails to meet puncture standards. If 405
specimens are tested, what does
Chebyshev’s theorem tell us about the
probability of getting at most 30 or at least
105 cases that do not meet puncture
standards?


Solution
Let X be the number of cases that do not meet
puncture standards. Then X is a r.v. having
Binomial Distribution with parameters
n = 405 and p = 1/6.
And we want $P(X \le 30 \text{ or } X \ge 105)$.
Now the mean is $\mu = np = 405/6 = 135/2$ and
$$\sigma = \sqrt{npq} = \sqrt{405 \times \frac{1}{6} \times \frac{5}{6}} = \frac{45}{6} = \frac{15}{2}.$$
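$$X \ge 105 \iff X - \frac{135}{2} \ge 105 - \frac{135}{2}, \quad \text{i.e. } X - \frac{135}{2} \ge \frac{75}{2}$$
$$X \le 30 \iff X - \frac{135}{2} \le 30 - \frac{135}{2}, \quad \text{i.e. } X - \frac{135}{2} \le -\frac{75}{2}$$
Since $\frac{75}{2} = 5 \times \frac{15}{2} = 5\sigma$, we get
$$\begin{aligned}
P(X \le 30 \text{ or } X \ge 105) &= P\left(X - \tfrac{135}{2} \le -\tfrac{75}{2} \ \text{ or } \ X - \tfrac{135}{2} \ge \tfrac{75}{2}\right) \\
&= P(X - \mu \le -5\sigma \ \text{ or } \ X - \mu \ge 5\sigma) \\
&= P(|X - \mu| \ge 5\sigma) \\
&\le \frac{1}{5^2} = \frac{1}{25} = 0.04
\end{aligned}$$
by Chebyshev’s theorem.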
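Chebyshev's theorem gives only an upper bound here. For comparison, the exact binomial probability of the same event can be summed directly. The sketch below is a side calculation that is not part of the original solution, and the variable names are our own.

```python
from math import comb

n, p = 405, 1 / 6   # values from the bulletproof vest example

# Exact binomial probabilities b(x; 405, 1/6) for x = 0, 1, ..., 405.
# Terms far in the upper tail underflow harmlessly to 0.0.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# P(X <= 30 or X >= 105): lower tail plus upper tail.
exact = sum(pmf[:31]) + sum(pmf[105:])

print(f"exact P(X <= 30 or X >= 105) = {exact:.3e}")
print("Chebyshev upper bound        = 0.04")
```

The exact value is far smaller than 0.04, which is expected: the bound uses only the mean and standard deviation and nothing else about the distribution.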
Example
Show that for 1 million flips of a balanced
coin, the probability is at least 0.99 that the
proportion of heads will fall between 0.495
and 0.505.

Solution: Let X be the number of heads
obtained in 1 million flips. Then X is a r.v.
having Binomial Distribution with
parameters $n = 10^6$ and p = 1/2.
Now the mean is $\mu = np = 500{,}000$ and
$$\sigma = \sqrt{npq} = \sqrt{1000000 \times \frac{1}{2} \times \frac{1}{2}} = 500.$$
The proportion of heads is X/1,000,000.
And we want
$$\begin{aligned}
P\left(0.495 < \frac{X}{n} < 0.505\right) &= P(495{,}000 < X < 505{,}000) \qquad (n = 1{,}000{,}000) \\
&= P(-5{,}000 < X - \mu < 5{,}000) \\
&= P(-10\sigma < X - \mu < 10\sigma) \\
&= P(|X - \mu| < 10\sigma) \\
&\ge 1 - \frac{1}{10^2} = 0.99
\end{aligned}$$
by Chebyshev’s theorem.
Example
How many times do we have to flip a
balanced coin to be able to assert with a
probability of at most 0.01 that the difference
between the proportion of tails and 0.50 will
be at least 0.04?
Solution: Suppose we have to flip the coin
n times. Let X be the number of tails obtained
in n flips.
Then X is a r.v. having Binomial Distribution
with parameters n and p = 1/2.
And we have to find n so that
$$P\left(\left|\frac{X}{n} - 0.5\right| \ge 0.04\right) \le 0.01,$$
i.e. such that
$$P(|X - 0.5n| \ge 0.04n) \le 0.01,$$
or
$$P(|X - \mu| \ge 0.04n) \le 0.01.$$
By the third form of Chebyshev’s theorem,
$$P(|X - \mu| \ge 0.04n) \le \frac{\sigma^2}{(0.04n)^2},$$
where the variance $\sigma^2 = \dfrac{n}{4}$.
So we have to find n such that
$$\frac{\sigma^2}{(0.04n)^2} \le 0.01.$$
That is, such that
$$\frac{n}{4 \times (0.04n)^2} \le 0.01,$$
or
$$n \ge \frac{1}{4 \times (0.04)^2 \times 0.01} = 15625.$$
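The same calculation can be packaged for other tolerances. The sketch below is a hypothetical helper (the name `min_flips` and its parameters `eps` and `alpha` are our own) that returns the smallest n satisfying the Chebyshev condition for a balanced coin, using exact fractions to avoid floating-point rounding at the boundary.

```python
from fractions import Fraction

def min_flips(eps, alpha):
    """Smallest n with (n/4) / (eps*n)^2 <= alpha,
    i.e. n >= 1 / (4 * eps^2 * alpha), for a balanced coin (p = 1/2)."""
    bound = 1 / (4 * eps**2 * alpha)
    # Ceiling of an exact rational number.
    return -(-bound.numerator // bound.denominator)

# Values from the example: deviation 0.04, probability at most 0.01.
n = min_flips(Fraction(4, 100), Fraction(1, 100))
print(n)  # 15625
```

Passing `Fraction` arguments keeps the bound exact. With plain floats, 0.04 ** 2 is not represented exactly and the ceiling could be off by one.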
