Instructions for Chapter 5, by Dr. Guru-Gharana


The Binomial distribution
Random Variable: Any variable whose numerical values depend
on chance, that is, whose values have probabilities associated with them.

Example: In a toss of two coins the sample space is S = {HH, HT, TH, TT},
and the outcomes have probabilities equal to 1/4 each (equally
likely, assuming fair coins). So we have different values and associated
probabilities. Do we have a random variable here? Not quite, because the values
have to be “numerical”. Here the outcomes are in symbols. But we can convert it
to a random variable easily by defining a variable X as the number of heads (or
tails if you like). Then we have three possible values of X (not four) with the
associated probabilities shown below:
X        Probability

0        1/4
1        2/4
2        1/4
------------------------
Total    1

The X value 1 has the largest probability because it is associated with two
outcomes (TH and HT) while others have only one favorable outcome each. Also
note that the sum of probabilities is equal to 1, a fundamental rule of Probability
discussed in the previous instruction.

A Probability Distribution is simply a display of the values of a random
variable and the corresponding probabilities in any form: Tabular (as above),
Graphical (remember the relative frequency polygon), or even in the form of a
formula which enables us to find the associated probability for any value (or a
range of values for a continuous variable).

We can display the above distribution graphically too. The graphical display of
this probability distribution shows that the distribution is symmetric and has
mean value equal to 1.

There are two types of random variables and corresponding probability
distributions: Discrete and Continuous.

Discrete Random Variable and Probability Distribution: Any random
variable whose values can be counted (whether finite or infinite) is a discrete
random variable, and the corresponding distribution is a discrete probability
distribution, because there are gaps or holes (discontinuities) between values.
Thus, a discrete random variable is the result of counting: counting the number
of heads or tails, the number of dots in a roll of dice, the number of
customers, the number of cars sold or produced, the number of patients, the
number of defective products, the number of students, etc.

The above example is that of a discrete random variable. We will study more
complicated but useful and interesting examples of discrete random variables later.
Note that only in the case of discrete random variables can we talk of the
probability of a single value.

A continuous random variable is a random variable which can take any
numerical value in a given interval. There are no gaps or holes between values. A
continuous random variable is the result of a measurement (instead of counting),
such as Weight, Height, Volume, Length, Area, Time period, Income, Expenditure,
Prices, Taxes, Imports, Exports, etc. For continuous random variables, the
probability of an individual value is Zero (because there are uncountably infinite
possible values). In the continuous case we can talk only of the probability of a
range of values. It is analogous to areas: the area of a single line is zero, but
the area over a range (under a curve) is non-zero, even though the range is
theoretically made up of lines. This is the magic of the uncountably infinite,
which converts a “sum” of zeros into a non-zero value!

Let us first study Discrete Random Variables and Probability distributions. We will
study continuous random variables and distributions in Chapter 6.

In any Probability distribution we are always interested in at least three
characteristics: the Mean (or Expected value), the Variance (or its square root,
called the Standard Deviation), and the symmetric property. Let me give the formulas
for the first two characteristics (which will be similar to the formulas given in the
instructions for Chapter 3 for grouped data, except that we now use probabilities
instead of frequencies).

The mean or Expected value of a discrete random variable X is a probability-
weighted average given as:

µX = Σ pi xi   (summing over all i)

(Note that the formula will involve an integral instead of a sum in the case of a continuous random
variable.)

For the above example, µX = 0.25*0 + 0.5*1 + 0.25*2 = 1.

The formula for the Variance is the probability-weighted sum of squared deviations
from the mean:

σX² = Σ pi (xi - µX)²   (again summing over all i)

In the above example, σX² = 0.25*(0-1)² + 0.5*(1-1)² + 0.25*(2-1)² = 0.5.

Therefore, the standard deviation σX = √0.5 = 0.707.
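
If you like, you can check such hand calculations with a few lines of code. Here is a
minimal Python sketch for the two-coin example (the variable names are just illustrative):

# Mean, variance, and standard deviation of a discrete random variable,
# illustrated with the two-coin example: X = number of heads.
from math import sqrt

values = [0, 1, 2]           # possible values of X
probs = [0.25, 0.50, 0.25]   # associated probabilities (must sum to 1)

mean = sum(p * x for p, x in zip(probs, values))                    # sum of pi*xi over all i
variance = sum(p * (x - mean) ** 2 for p, x in zip(probs, values))  # sum of pi*(xi - mean)^2
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 3))  # prints: 1.0 0.5 0.707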


The third characteristic, related to the symmetric property, is measured by the third
moment, which indicates the degree of skewness. But in this course we will examine it
only graphically, by looking at the graphical display of the probability distribution.

Examples of Useful Discrete Probability distributions


The Binomial Distribution
The binomial distribution is the most popular and useful discrete probability
distribution. It is the result of a series of so-called “Bernoulli” trials. A
Bernoulli trial is a trial of an Experiment which has only two possible outcomes
and one and only one of the two outcomes must occur in any trial. One of the
outcomes is (arbitrarily) called a “success” and the other is called a “failure”.
Here the terms success and failure have no positive or negative meanings attached.
It is just a popular method of categorizing and is totally arbitrary. For example, in a
toss of a coin there are only two possible outcomes: Head or Tail (assuming that it
does not stand on its edge). Moreover, only one of the two possible outcomes can
(and must) occur in any toss of a coin. We can call Head a success or Tail a
success. But once we assign the term success to one outcome the other
automatically becomes a failure. Similarly, when we are grouping people by
Gender, we can call Male a success or Female a success without having any
negative or positive meanings attached to these terms.
So, the toss of a coin is a Bernoulli trial. The probability associated with the
outcome called success is denoted by π and the probability associated with failure
is denoted by 1- π.

If a Bernoulli trial is independently repeated n times, such that the probability
of success (and, consequently, the probability of failure) remains constant
throughout, then we have a Binomial Distribution with n trials. As also listed
in the book (page 217), an experiment must fulfill the following conditions to result
in a Binomial Distribution:

1. The experiment consists of n identical trials.
2. Each trial is a Bernoulli trial; that is, each trial has only two mutually
exclusive possible outcomes, one and only one of which can (and must) occur.
One of the outcomes is arbitrarily called a “success” and the other a “failure”.
If there are more than two possible outcomes in every trial, then it is a case of
the Multinomial distribution, which is not very popular and is not covered in this
course.
3. The probability of success π remains constant throughout the experiment. If
this condition is violated we have the so-called “Hypergeometric”
distribution, which is not very popular and is not covered in this course.
4. The trials are independent (that is, the result of one trial does not depend on or
affect the outcome of another trial).

If the above conditions are fulfilled, then the variable X defined as the number of
successes in n trials has a Binomial probability distribution, and the probability of
any X value is given by the following formula:

P(X = x) = [n! / (x!(n-x)!)] π^x (1-π)^(n-x)

In the leading coefficient, the numerator is n!, called “n factorial” and defined as
n! = n(n-1)(n-2)...1. Similarly, the denominator involves two factorials. (This
coefficient simply counts the number of different orderings in which the x successes
can occur among the n trials.) Although this formula looks a little clumsy, the
Binomial Distribution has some neat properties:

1. It is a two-parameter distribution. The number of trials, n, and the probability
of success, π, completely determine the whole probability distribution.
2. The mean or expected value of the distribution is simply nπ.
3. The Variance of the distribution is simply nπ(1-π) and, therefore, the
standard deviation is √(nπ(1-π)).
4. If the probability of success is exactly 0.5, then the resulting distribution is
symmetrical. If it is more than 0.5 the distribution is left skewed, and if it is
less than 0.5 it is right skewed. If the probability of success is very near 0.5,
the distribution is approximately symmetrical. The distribution also approximates
a symmetrical (bell-shaped) form more closely as the number of trials increases for
a given probability of success. We will talk about this again in the next chapter,
when we study the normal approximation to the binomial distribution.

Let us work on some examples:

Example 1. Suppose a fair coin is tossed six times. Let us define X as the number
of heads which appear in the six tosses. Is it a binomial random variable? Can we
build the whole probability distribution? What is the mean or the expected value of
X? What is its Variance and standard deviation? What is the probability that there
will be at least one head? Exactly two heads? At most three heads? Less than four
heads? No heads? Heads in exactly five tosses? All heads? OK, let me help with this example.

First of all, a toss of a coin is obviously a Bernoulli trial with only two outcomes,
one and only one of which must occur. Now six tosses of the same coin (or
similarly fair coins) can be safely assumed to be independent of each other
(whether done simultaneously with six coins or done successively in a row with the
same coin). And in each trial the probability of success (head) is the same. So X is
clearly a Binomial random variable. Here, n=6 and π =0.5.

As mentioned above, these are the only two parameters needed to determine the
whole distribution. But before we derive the whole distribution we can easily
summarize it with two key characteristics. The mean µX = 6*0.5 = 3 (three heads is
the expected value). The Variance σX² = 6*0.5*(1-0.5) = 1.5, and the Standard
Deviation σX = √1.5 = 1.225. Now the whole distribution:

P(X = 0) = [6!/(0!(6-0)!)] 0.5^0 (1-0.5)^(6-0) = 0.5^6 = 0.0156, or less than a 1.6%
chance.

This low probability is not surprising, because the event requires all six tosses to
result in tails. In the above calculation I used the rule that 0! = 1! = 1 (a little
strange result, but that’s a basic mathematical rule). Also, any number to the power
zero is equal to one. Now I will quickly derive the other probabilities (leaving the
details for your exercise).

P(X=1)=……………= 0.0938
P(X=2)=……………= 0.2344
P(X=3)=……………= 0.3125
P(X=4)=……………= 0.2344
P(X=5)=……………= 0.0938
P(X=6)=……………= 0.0156

Thus the probability distribution is:


X Probability
0 0.0156
1 0.0938
2 0.2344
3 0.3125
4 0.2344
5 0.0938
6 0.0156

Note that the distribution is symmetrical around the mean value 3. The total of all
probabilities should be exactly one except for rounding (in the fourth decimal place
here). If you want, you can practice drawing the graphical distribution with bars or
columns or a probability polygon.
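
As a cross-check, the whole table above can be generated directly from the binomial
formula with a few lines of Python (math.comb, available in Python 3.8 and later, gives
the n!/(x!(n-x)!) coefficient); this is only a rough sketch:

# Binomial(n = 6, pi = 0.5): generate the probability table from the formula
# P(X = x) = [n!/(x!(n-x)!)] * pi^x * (1-pi)^(n-x)
from math import comb, sqrt

n, pi = 6, 0.5
for x in range(n + 1):
    p = comb(n, x) * pi**x * (1 - pi)**(n - x)
    print(x, round(p, 4))  # matches the table: 0.0156, 0.0938, 0.2344, ...

print("mean =", n * pi)                                            # n*pi = 3.0
print("variance =", n * pi * (1 - pi))                             # n*pi*(1-pi) = 1.5
print("standard deviation =", round(sqrt(n * pi * (1 - pi)), 3))   # 1.225

The last three lines simply confirm properties 2 and 3 listed earlier.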
Now that we have done the dirty work of calculating probabilities using the clumsy
formula, I will tell you the secret of getting the answers quickly. Look at Tables on
pages 855 to 859 in the book (5th ed) and 853-857 (6th ed) for probabilities for up to
20 trials. Try to locate the above probabilities in the tables. You can also use the
computer to derive probabilities for any number of trials.

With MegaStat it is a piece of cake. Open Excel, then Add-Ins, then MegaStat, then
Probability, then Discrete probability distribution (gives binomial as default), then fill the
number of observations and probability of occurrence (π) and …that’s it folks. You get
everything you want to know about this distribution including a nice graphical presentation as
follows:

Binomial distribution

6    n
0.5  p

              cumulative
X    P(X)     probability
0    0.01563  0.01563
1    0.09375  0.10938
2    0.23438  0.34375
3    0.31250  0.65625
4    0.23438  0.89063
5    0.09375  0.98438
6    0.01563  1.00000
     1.00000

3.000  expected value
1.500  variance
1.225  standard deviation

That is why I recommend MegaStat all the time.

You can do it through regular Excel too. Go to Excel. Click on Formulas (on the top
row). Then click on “More Functions” from the options which appear, then select
“Statistical”, then select BINOMDIST. In the dialogue box put the value of x you want
the probability for, put the number of trials, put the probability of success, and put
zero for cumulative (if you want individual probabilities instead of cumulative). Then
you get the probability for any value of X. You can repeat it for any number of values.
Does it sound like a little more work compared to MegaStat? Of course; MegaStat is
meant to make your life easier with the use of Excel. But one good thing about Excel
is that you can also get the cumulative probability directly (not a big deal once you
have all the probabilities). For example, when you put 1 in the cumulative row of the
dialogue box for the value four, P(X ≤ 4) comes out as 0.890625 in Excel. (Check it
by adding all the probabilities up to 4; the small difference will be because of rounding.)
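
The same cumulative figure can also be reproduced by simply adding the individual
probabilities up through 4; a minimal Python sketch:

# Cumulative binomial probability P(X <= 4) for n = 6, pi = 0.5:
# the sum of the individual probabilities for x = 0, 1, 2, 3, 4.
from math import comb

n, pi = 6, 0.5
cumulative = sum(comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(5))
print(cumulative)  # 0.890625, the same value Excel reports when cumulative is set to 1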

Now for the remaining questions asked above. Do I need to teach you how to
answer those questions once you have all the probabilities? I don’t think so.
Remember, however, that you are dealing with a discrete distribution. Therefore, is
“at most 3” the same as or different from “less than four”? Is it the same as or
different from “less than or equal to 3”? Think about it.

For your further practice, I am giving you an interesting question below.

Practice Example for bonus points: A salesperson goes door-to-door in a
residential area to demonstrate the use of a new household appliance to potential
customers. She has found from her years of experience that, after a demonstration,
the probability of purchase (long-run average) is 0.30. To perform satisfactorily on
the job, the salesperson needs at least four orders this week. If she performs 15
demonstrations this week, what is the probability of her being satisfactory? 0.2186

What is the probability of between 4 and 8 (inclusive) orders?

0.2186 + 0.2061 + 0.1472 + 0.0811 + 0.0348 = 0.6878
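
If you prefer to check a sum like this with a short program instead of the printed
tables, a minimal Python sketch along the following lines will do it (the helper name
binomial_pmf is just illustrative):

# P(4 <= X <= 8) for a Binomial(n = 15, pi = 0.30) random variable:
# add up the individual probabilities for x = 4, 5, 6, 7, 8.
from math import comb

def binomial_pmf(n, pi, x):
    # probability of exactly x successes in n independent Bernoulli trials
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

n, pi = 15, 0.30
prob_4_to_8 = sum(binomial_pmf(n, pi, x) for x in range(4, 9))
print(round(prob_4_to_8, 4))  # about 0.6879; the hand sum 0.6878 differs only by rounding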

Now the challenging questions: if the salesperson wants to be at least 90 percent
confident of getting a satisfactory evaluation in her job this week, how many
demonstrations should she perform? 7 demonstrations. How would your answers to the
above questions change if the probability of success increases (say, by training)
to 0.35? 4 demonstrations.
Binomial distribution

7     n
0.35  p

              cumulative
X    P(X)     probability
0    0.04902  0.04902
1    0.18478  0.23380
2    0.29848  0.53228
3    0.26787  0.80015
4    0.14424  0.94439
5    0.04660  0.99099
6    0.00836  0.99936
7    0.00064  1.00000
     1.00000

2.450  expected value
1.593  variance
1.262  standard deviation
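
For those using Python instead of MegaStat, a similar sketch (again, just an
illustration) reproduces this table, including the cumulative column, with a running total:

# Binomial(n = 7, pi = 0.35): individual and cumulative probabilities,
# mirroring the MegaStat output above.
from math import comb

n, pi = 7, 0.35
cumulative = 0.0
for x in range(n + 1):
    p = comb(n, x) * pi**x * (1 - pi)**(n - x)
    cumulative += p
    print(x, round(p, 5), round(cumulative, 5))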

Those who email me the correct answer to this question (with details shown)
without asking for help will get up to three bonus points in the related Assignment.

Skip the Poisson distribution in this chapter. But you can learn it for your own
enhanced knowledge, because the Poisson distribution is used in some applications
like queuing (or waiting-line) theory.

We will come back to the Binomial Distribution when we study Chapter 6.
