
Business Decision Analysis

Dr Gary Simpson, Dr Panagiotis Ganotakis


g.p.m.simpson@aston.ac.uk, ganotap1@aston.ac.uk
NB270 / NB268
Statistical Techniques & Operational Research
Week
1 Decision Analysis
2 The Binomial and Poisson Distributions
3 The normal distribution, estimation
4 Confidence limits including the use of the t-distribution
5 Hypothesis Tests for means and proportions
6 Chi-squared hypothesis tests
7 Introduction to Differentiation
8 Optimisation using Differentiation
9 Transportation Problem
10 Linear Programming (formulation & graphical solution).
11 LP (interpreting LINDO output)
12 Revision
Method of Assessment
100% by examination. The examination will be an open-book examination and lasts for 2 hours.
It is designed to test overall understanding of the module and will include formulation of models,
interpretation of computer output, use of statistical tables and calculations by-hand where appropriate.
Books to buy
Standard tables required by all students:
Lindley DV & Scott WF (1984), New Cambridge Elementary Statistical Tables, 2nd Edition, Cambridge: Cambridge University Press.
Main Recommended texts:
All students are expected to read the course notes and work through the examples on the tutorial sheets.
The following books are recommended for further reading (help for those students who feel they need to read from an additional source).
Either these two books:
A1. Bradley, T (2008), Essential Mathematics for Economics and Business, John Wiley & Sons.
&
A2. Bradley, T (2007), Essential Statistics for Economics and Business, John Wiley & Sons.
or this single book:
B. Curwin J & Slater R (2008), Quantitative Methods for Business Decisions, 6th Edition, Thomson Learning.
Both were the essential reading for BN1105 Quantitative Techniques.
Other texts:
Anderson, Sweeney, Williams, Freeman & Shoesmith (2007), Statistics for Business and Economics, Thomson Learning.
Dewhurst F (2006), Quantitative Methods for Business Management, 2nd Edition, London: McGraw-Hill Education Europe.
Oakshott L (2001), Essential Quantitative Methods for Business, 2nd Edition, Basingstoke: Palgrave.
Wisniewski M (2005), Quantitative Methods For Decision Makers, 4th Edition, London: Prentice Hall.
Useful Online Sources:
Lindo provide a demo version of their linear programming package, which can be downloaded from their website www.lindo.com

Hunt & Tyrrell (2000), http://www.coventry.ac.uk/discuss
This website provides helpful material for some of the statistics parts of this module.
Gustman G (2003) StatPrimer (Version 6.4)
http://www.sjsu.edu/faculty/gerstman/StatPrimer/
SPSS Video tutorials
http://www.stat.tamu.edu/spss.php
Statistical Techniques
The purpose of decision analysis is to enable a decision maker to choose between several decision
options in a way that is best, or optimal, in some sense. The decision will often be affected by future
events that are not under the control of the decision maker; such events are called states of nature.
We shall use d1, d2, … to denote decision options and s1, s2, … to denote states of nature.
The result of choosing a particular decision option di in conjunction with a particular state of nature sj is
called the payoff V(di,sj) for that particular pair di and sj. It is simplest to think of this payoff as a
direct monetary payment, but in some contexts it is preferable to use the concept of utility to take
account of the relative importance of different outcomes. An extra £1000 would not make much
difference to a millionaire but would to an impoverished student, so the utility of the £1000 would be
greater for the student than for the millionaire.
A firm's decision problem
A firm has to decide which of three types of equipment to buy (A, B or C) and has worked
out the payoff (in £1000) it will receive depending on whether the firm can secure a contract of type I
or a contract of type II with its most important customer. (One of the two contracts will be agreed.)
                      States of nature
Decision Option       Contract type I: s1    Contract type II: s2
Choose A: d1          250                    70
Choose B: d2          160                    150
Choose C: d3          90                     170
A decision tree, similar to a probability tree, can be used to show the same information as the table.
Within a decision tree we need to distinguish between decision nodes, which we will represent by a
square, and chance nodes, represented by a circle.
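[Figure: decision tree. The square decision node "Start" has three branches, d1, d2 and d3. Each branch leads to a circular chance node with branches s1 and s2, ending in the payoffs 250 and 70 (d1), 160 and 150 (d2), and 90 and 170 (d3).]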




Maximisation without probabilities
Although it is desirable to have some measure of the probabilities of the various possible states of
nature, it is not always possible to get reliable estimates of these probabilities. There are several
different approaches to decision making without probabilities and it is important to understand why
they often lead to different optimal decisions.
1. Optimistic approach (Maximax payoff)
This method simply chooses the decision giving the largest possible payoff in the whole table. It is
sometimes called the maximax payoff because it can be found by working down the payoff table, row
by row for each decision option di writing down Vmax(di), the maximum payoff for any of the states of
nature. The decision option chosen is the one with the largest Vmax(di), that is the maximum of the
maximum payoffs.

                      States of nature
Decision Option       Contract I: s1    Contract II: s2    Vmax(di)
Choose A: d1          250               70                 250
Choose B: d2          160               150                160
Choose C: d3          90                170                170
The largest Vmax(di) is 250, so the optimistic approach chooses d1 (option A).
2. Pessimistic approach (Maximin payoff)
Here we consider the worst outcome for each decision option di which we call Vmin(di). Then we have
to identify the maximum of these Vmin(di). This maximises the minimum payoff and so is called the
maximin payoff.
                      States of nature
Decision Option       Contract I: s1    Contract II: s2    Vmin(di)
Choose A: d1          250               70                 70
Choose B: d2          160               150                150
Choose C: d3          90                170                90
The largest Vmin(di) is 150, so the pessimistic approach chooses d2 (option B).
3. The Hurwicz criterion
This represents a balance between the maximax and maximin approaches. Define the index of
optimism α to be a number between 0 and 1, then choose the decision option di which gives the
maximum value of H(di) = α Vmax(di) + (1 − α) Vmin(di).
When α = 1, this is the same as the maximax approach, and when α = 0, this is the same as the
maximin approach. For our example, we will consider α = 0.5.
                      States of nature
Decision Option       s1     s2     Vmax(di)    Vmin(di)    H(di)
Choose A: d1          250    70     250         70          160
Choose B: d2          160    150    160         150         155
Choose C: d3          90     170    170         90          130
With α = 0.5 the largest H(di) is 160, so d1 (option A) is chosen.
4. Minimax regret
This approach considers how much we might regret choosing a decision option once the state of nature
is revealed. The regret or opportunity loss, R(di,sj), is the difference between the best of all payoffs
for a particular state of nature sj and the payoff when option di is chosen and state sj occurs:
R(di,sj) = Vmax(sj) − V(di,sj)
So in our example, Vmax(s1) = 250 and Vmax(s2) = 170, and the following table shows the regret values:
                      States of nature
Decision Option       s1                   s2                   Rmax(di)
Choose A: d1          250 − 250 = 0        170 − 70 = 100       100
Choose B: d2          250 − 160 = 90       170 − 150 = 20       90
Choose C: d3          250 − 90 = 160       170 − 170 = 0        160
The option di which minimises the (maximum) regret Rmax(di) is chosen; here this is d2 (option B), with a maximum regret of 90.
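The four criteria above can be compared side by side with a short Python sketch. The payoff table is taken from the example; the function names and the dictionary layout are illustrative choices, not part of the course material.

```python
# Payoff table from the example; each row lists the payoffs for s1 and s2.
payoffs = {"d1": [250, 70], "d2": [160, 150], "d3": [90, 170]}


def maximax(table):
    """Optimistic approach: the option with the largest maximum payoff."""
    return max(table, key=lambda d: max(table[d]))


def maximin(table):
    """Pessimistic approach: the option with the largest minimum payoff."""
    return max(table, key=lambda d: min(table[d]))


def hurwicz(table, alpha):
    """Hurwicz criterion: maximise alpha*Vmax + (1 - alpha)*Vmin."""
    def h(d):
        return alpha * max(table[d]) + (1 - alpha) * min(table[d])
    return max(table, key=h)


def minimax_regret(table):
    """Minimax regret: the option whose largest regret is smallest."""
    n_states = len(next(iter(table.values())))
    # Best payoff achievable in each state of nature, Vmax(sj).
    best = [max(row[j] for row in table.values()) for j in range(n_states)]

    def max_regret(d):
        return max(best[j] - table[d][j] for j in range(n_states))

    return min(table, key=max_regret)


choices = {
    "maximax": maximax(payoffs),
    "maximin": maximin(payoffs),
    "hurwicz (alpha = 0.5)": hurwicz(payoffs, 0.5),
    "minimax regret": minimax_regret(payoffs),
}
for name, option in choices.items():
    print(f"{name}: {option}")
```

Running the sketch shows why the approaches can disagree: the optimistic and Hurwicz (α = 0.5) criteria pick d1, while the pessimistic and minimax regret criteria pick d2.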
Decision analysis using probabilities
When we have reasonable estimates of the probabilities, P(sj), of the various states of nature, the
optimal decision option can be chosen on the basis of the expected values of the payoffs.
In the example, if the management thinks that there is a 60% chance of getting contract I, and believes
that there is a 40% chance of getting contract II then
                      States of nature
Decision Option       Contract I: s1    Contract II: s2    EV(di)
Choose A: d1          250               70                 178
Choose B: d2          160               150                156
Choose C: d3          90                170                122
P(sj)                 0.6               0.4
So the best decision is d1 (option A) with an expected return of 178.
Using a process called backward induction, we can also calculate this result using a decision tree.
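[Figure: the same decision tree with P(s1) = 0.6 and P(s2) = 0.4 on the chance branches. The chance nodes following d1, d2 and d3 carry the expected values 178, 156 and 122, and the decision node "Start" carries the value 178.]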
Backward induction runs backwards through the tree in the following way:
For each chance node, the expected value of the node is calculated by multiplying the probabilities at
the edges leaving this node by the payoffs at the end of the respective edge and summing up over all
edges belonging to this chance node.
E.g. for the chance node following decision d1:
P(s1) × V(d1,s1) + P(s2) × V(d1,s2) = 0.6 × 250 + 0.4 × 70 = 178
For a decision node, the value of the node is calculated as the maximum of the values of the nodes
following it.
E.g. for the first node, we need to find: max{178, 156, 122} = 178.
Hence, the optimal choice is d1 (option A) again, as it leads to the highest expected profit.
Expected value of perfect information
If the decision maker had a better idea of which state of nature will actually occur, this would lead to an
improvement in the expected payoff. However, if the cost of obtaining the required information
exceeded the improvement in payoff, no benefit would result.
We therefore need a method for determining an upper limit for the budget for obtaining more
information on the future state of nature.
The payoff table shows that if the company gets contract I, the optimal decision is d1 whereas d3 is the
optimal decision if the company gets contract II.
If the decision maker was able to gain perfect information about which type of contract the company
will get, the expected value of following this combined decision strategy would be
E(Vmax | PI) = Vmax(s1) × P(s1) + Vmax(s2) × P(s2) = 250 × 0.6 + 170 × 0.4 = 218
If no information was available about the actual state of nature then we have already found the best
policy is d1 with an expected value of 178 (which is £178,000).
So, the Expected Value of Perfect Information is
EVPI = E(Vmax | PI) − EVmax = 218 − 178 = 40
This £40,000 represents the additional expected payoff that could be obtained if perfect information
about the future state of nature was available. It would therefore be uneconomic to pay more than £40k
for extra information, no matter how good that information might be. However, any price below
£40k is worth paying for perfect information, as the expected payoff net of that cost would still increase.
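The expected value and EVPI calculations can be checked with a minimal Python sketch. The payoffs and probabilities come from the example; the variable and function names are illustrative assumptions.

```python
# Payoffs for states s1 and s2, and the probabilities P(s1), P(s2).
payoffs = {"d1": [250, 70], "d2": [160, 150], "d3": [90, 170]}
probs = [0.6, 0.4]


def expected_value(row, probabilities):
    """Value of a chance node: sum of probability times payoff over its edges."""
    return sum(p * v for p, v in zip(probabilities, row))


# Chance nodes: one expected value per decision option.
ev = {d: expected_value(row, probs) for d, row in payoffs.items()}

# Decision node: take the option with the largest expected value.
best_option = max(ev, key=ev.get)

# With perfect information we would pick the best payoff in each state.
ev_with_pi = sum(
    p * max(row[j] for row in payoffs.values()) for j, p in enumerate(probs)
)
evpi = ev_with_pi - ev[best_option]

print(best_option, round(ev[best_option], 2))
print(round(ev_with_pi, 2), round(evpi, 2))
```

The first two steps mirror backward induction on the tree: chance nodes are averaged and the decision node takes the maximum.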
Random Variables
In statistics, the events we are concerned about often represent numerical observations (e.g. the daily
demand for a product). In such cases we imagine the sample space to be made up of points that
correspond to that quantity, so the random variable X may take a particular value x. Note that a
capital letter is generally used for the random variable and the corresponding small letter for its value.
The probability of the event that X takes the value x is written P(X=x).
Discrete Probability Distributions
If a discrete random variable X takes some value x with probability P(X=x) = f(x)
then f(x) is called the probability function of X. The sum of individual f(x) terms for x up to and
including some value r is called the cumulative distribution function (c.d.f.) and is denoted by F(r).
Note the probability that X is less than or equal to r is P(X ≤ r) = F(r) = f(0) + f(1) + … + f(r)
Mean of a Discrete Random Variable
The theoretical mean of a discrete random variable is denoted by μ and is also called the expected
value E(X). It is calculated in a similar way to the sample mean for repeated data, but the relative
frequencies are replaced by the probabilities f(x):
$$\mu = E(X) = \sum_{\text{all } x} x\, f(x)$$
Variance of a Discrete Random Variable
The theoretical variance of a discrete random variable is denoted by σ², or Var(X), and is given by
$$\sigma^2 = \mathrm{Var}(X) = \sum_{\text{all } x} x^2 f(x) - \mu^2$$
Binomial Distribution
Suppose the random variable X represents the number of successes in n independent trials of a random
experiment in which there are only two possible outcomes from each trial: a success with probability
p and a failure with probability q = 1 − p.
Then the probability function is given by:
$$f(x) = P(X = x) = {}^{n}C_{x}\, p^{x} q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^{x} q^{n-x}$$
The expected number of successes (the mean of the binomial distribution) is μ = np. The variance is σ² = npq.
Cambridge Table 1 shows P(X ≤ r) = F(r) for different n and p values.
Example
There are 20 computers in a room; the probability that a computer is working is 0.95. Assuming that the
breakdown of a computer is independent of the breakdown of the other computers: a) What is the
probability that all 20 computers are working? b) What is the probability that exactly 17 computers are
working? c) What is the probability that 17 or more computers are working?
a) n = 20 = number of trials, success = working, so p = 0.95 and q = 1 − p = 1 − 0.95 = 0.05
$$P(\text{all working}) = P(X = 20) = {}^{20}C_{20}\,(0.95)^{20}(0.05)^{20-20} = 0.3585 \text{ (to 4 d.p.)}$$
b) n = 20, success = working, so p = 0.95 and q = 1 − p = 1 − 0.95 = 0.05
$$P(17 \text{ working}) = P(X = 17) = {}^{20}C_{17}\,(0.95)^{17}(0.05)^{20-17} = 0.0596 \text{ (to 4 d.p.)}$$
c) n = 20, success = not working, so p = 0.05 and q = 1 − p = 0.95
P(3 or fewer not working) = P(X ≤ 3) = F(3) = 0.9841 (from Table 1, page 22)
Note that the tables can also be used for parts a) and b); for example
P(17 working) = P(3 not working) = P(X ≤ 3) − P(X ≤ 2) = F(3) − F(2) = 0.9841 − 0.9245 = 0.0596
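As a cross-check on the table look-ups, the binomial probabilities in this example can be evaluated directly. The sketch below uses only the Python standard library; the helper names binom_pmf and binom_cdf are illustrative.

```python
from math import comb


def binom_pmf(x, n, p):
    """f(x) = nCx * p^x * q^(n-x) with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(r, n, p):
    """F(r) = f(0) + f(1) + ... + f(r)."""
    return sum(binom_pmf(x, n, p) for x in range(r + 1))


# a) all 20 working, b) exactly 17 working (success = working, p = 0.95)
p_all_working = binom_pmf(20, 20, 0.95)
p_17_working = binom_pmf(17, 20, 0.95)

# c) 17 or more working = 3 or fewer not working (success = not working, p = 0.05)
p_17_or_more = binom_cdf(3, 20, 0.05)

print(round(p_all_working, 4), round(p_17_working, 4), round(p_17_or_more, 4))
```

Part c) is computed from the "not working" side, as in the notes, because the cumulative table gives P(X ≤ r) rather than P(X ≥ r).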
Poisson Distribution
Suppose the random variable X represents the number of events observed in a fixed observation frame
(usually a specified period of time) and the expected number of events is λ. If the events occur at
random, in such a way that they are independent of each other, the mean rate of occurrence is constant
and the events are relatively rare, they are said to follow a Poisson process and the probability function is given
by:
$$f(x) = P(X = x) = \frac{\lambda^{x} \exp(-\lambda)}{x!}$$
The mean is μ = λ, and the variance is σ² = λ.
Cambridge Table 2 shows P(X ≤ r) = F(r) for specified μ = λ.
Example
A telephone help-line receives an average of 2 calls per minute. Assuming that the calls follow a
Poisson process, calculate the probability that the help-line receives a) no calls in a 3 minute period, b)
exactly 8 calls in a 3 minute period, c) 6 calls or fewer in a 4 minute period, d) at least 6 and no more
than 10 calls in a 4 minute period.
Mean no. of calls per minute = 2, so the mean no. of calls per 3 minutes = 2 × 3 = 6
a) $$P(X = 0) = \frac{\lambda^{x}\exp(-\lambda)}{x!} = \frac{6^{0}\exp(-6)}{0!} = \frac{1 \times 0.002479}{1} = 0.0025 \text{ (to 4 d.p.)}$$
b) $$P(X = 8) = \frac{\lambda^{x}\exp(-\lambda)}{x!} = \frac{6^{8}\exp(-6)}{8!} = \frac{1679616 \times 0.002479}{40320} = 0.1033 \text{ (to 4 d.p.)}$$
c) Mean no. of calls per 4 minutes = 2 × 4 = 8, so P(X ≤ 6) = F(6) = 0.3134 (from Table 2, page 28)
d) P(6 ≤ X ≤ 10) = P(X ≤ 10) − P(X < 6) = P(X ≤ 10) − P(X ≤ 5) = 0.8159 − 0.1912 = 0.6247
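The help-line answers can be reproduced without tables. This is a minimal sketch using the Python standard library; poisson_pmf, poisson_cdf and the parameter name lam (for the mean number of events) are illustrative names.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """f(x) = lam^x * exp(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)


def poisson_cdf(r, lam):
    """F(r) = f(0) + f(1) + ... + f(r)."""
    return sum(poisson_pmf(x, lam) for x in range(r + 1))


# Mean of 2 calls per minute: 6 per 3 minutes, 8 per 4 minutes.
p_a = poisson_pmf(0, 6)                      # no calls in 3 minutes
p_b = poisson_pmf(8, 6)                      # exactly 8 calls in 3 minutes
p_c = poisson_cdf(6, 8)                      # 6 calls or fewer in 4 minutes
p_d = poisson_cdf(10, 8) - poisson_cdf(5, 8)  # between 6 and 10 calls in 4 minutes

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), round(p_d, 4))
```

Part d) subtracts F(5), not F(6), because the event "at least 6" includes X = 6. The unrounded result for d) can differ from the table-based figure in the last decimal place, since the tables are themselves rounded.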
Poisson Approximation to the binomial
For a binomial distribution with n > 20 and p < 0.1, set λ = np and use the Poisson distribution as
an approximation.
Example
If 1% of the population have a peanut allergy, find the probability that in a randomly selected group of
200 people more than 4 have a peanut allergy.
n = 200, success = peanut allergy, so p = 0.01 and q = 0.99, λ = np = 200 × 0.01 = 2
Using Poisson tables on page 25: P(X > 4) = 1 − P(X ≤ 4) = 1 − 0.9473 = 0.0527
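To see how close the approximation is in this example, the Poisson answer can be compared with the exact binomial probability. The sketch below is illustrative only; the helper names are assumptions and the exact value is computed rather than quoted from the notes.

```python
from math import comb, exp, factorial


def binom_cdf(r, n, p):
    """Exact binomial P(X <= r)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(r + 1))


def poisson_cdf(r, lam):
    """Poisson P(X <= r) with mean lam."""
    return sum(lam**x * exp(-lam) / factorial(x) for x in range(r + 1))


n, p = 200, 0.01
lam = n * p  # mean used for the Poisson approximation

p_exact = 1 - binom_cdf(4, n, p)     # exact binomial P(X > 4)
p_approx = 1 - poisson_cdf(4, lam)   # Poisson approximation to P(X > 4)

print(round(p_approx, 4), round(p_exact, 4))
```

The two values agree to about two decimal places here, which is consistent with the rule of thumb that the approximation works when n is large and p is small.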
Continuous probability distributions.
Cumulative distribution function, F(x) = P(X ≤ x), as for discrete X, but it is a smooth function of x.
Probability density function, f(x) = rate of change of F(x) = dF(x)/dx; the pdf is not itself a
probability. Areas under the pdf curve represent probabilities:
F(x0) = area under f(x) to the left of x0.
[Figure: pdf curve f(x) against x, with the area to the left of x0 labelled F(x0).]
Mean, μ = total area under the curve of x f(x):
$$\mu = \int_{\text{all } x} x f(x)\,dx$$
Variance, σ² = total area under the curve of (x − μ)² f(x):
$$\sigma^2 = \int_{\text{all } x} (x-\mu)^2 f(x)\,dx = \int_{\text{all } x} x^2 f(x)\,dx - \mu^2$$
Standard deviation, σ = √Variance.
The normal distribution.
The Normal distribution is also called the Gaussian distribution and is the most important of all
continuous probability distributions. Many naturally occurring quantities have this kind of distribution,
and in the correct conditions it can be used to provide a good approximation of many other
distributions -including discrete distributions.
Note: Just because a random variable is not normally distributed does not make it abnormal !

[Figure: bell-shaped normal pdf f(x) plotted against x, with the x-axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ and μ + 3σ.]
It is common practice to refer to the normal distribution which has mean μ and standard deviation σ
as N(μ, σ²). Note that it is the variance, not the standard deviation, that is given. So N(75,16) stands
for a normal distribution with mean 75 and variance 16, hence a standard deviation of 4.
To avoid having to integrate the pdf of the normal distribution, tables are available of the standard
normal distribution, N(0,1) and problems for any other mean and/or standard deviation are restated as
equivalent problems with mean 0 and standard deviation 1.
Standardised Normal Deviate
A random variable which has a standard normal distribution is usually denoted by Z; its pdf is
denoted by φ(z) and its cdf by Φ(z) = P(Z < z).
If we have a problem involving some random variable X which has the distribution N(μ, σ²), then to
find the probabilities we first find the standardised deviate by subtracting the mean and then dividing by the
standard deviation.
So the standardised deviate corresponding to a typical value x is
$$z = \frac{x - \mu}{\sigma}$$
Using tables to find normal probabilities.


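To use tables of Φ(z) to find the probabilities for X, which is distributed as N(μ, σ²), note the
following relationships, where z = (x − μ)/σ:
1. $P(X < x) = \Phi\left(\frac{x-\mu}{\sigma}\right) = \Phi(z)$
2. $P(X > x) = 1 - \Phi\left(\frac{x-\mu}{\sigma}\right) = 1 - \Phi(z)$
3. $\Phi(-z) = 1 - \Phi(z)$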
The probability density function (pdf) for the normal curve
depends on two parameters, μ and σ, which represent its mean
and standard deviation.
The pdf is a symmetrical bell-shaped curve centred on x = μ.
About 68% of the probability area lies between μ − σ and μ + σ.
Nearly all the probability area lies between μ − 3σ and μ + 3σ.
Example
The average income in a country is known to be £12,800 with a standard deviation of £2400. Assuming
that incomes are normally distributed, and working in units of £1000, calculate the probability that:
a) A randomly chosen individual from that country has an income of greater than £20,000?
$$P(X > 20) = 1 - P(X < 20) = 1 - P\left(Z < \frac{20 - 12.8}{2.4}\right) = 1 - P(Z < 3) = 1 - \Phi(3) = 1 - 0.99865 = 0.00135$$
b) A randomly chosen individual from that country has an income of less than £10,000?
$$P(X < 10) = P\left(Z < \frac{10 - 12.8}{2.4}\right) = P\left(Z < \frac{-2.8}{2.4}\right) = P(Z < -1.1667) = \Phi(-1.1667) = 1 - \Phi(1.1667) = 1 - 0.8783 = 0.1217$$
c) A randomly chosen individual from that country has an income of between £10,000 and £14,000?
$$P(10 < X < 14) = P(X < 14) - P(X < 10) = P\left(Z < \frac{14 - 12.8}{2.4}\right) - P\left(Z < \frac{10 - 12.8}{2.4}\right)$$
$$= P(Z < 0.5) - P(Z < -1.1667) = \Phi(0.5) - \Phi(-1.1667) = 0.6915 - \left[1 - \Phi(1.1667)\right] = 0.6915 - 1 + 0.8783 = 0.5698$$
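The three income probabilities can be checked numerically. This sketch evaluates Φ(z) through the error function in Python's math module instead of reading Table values; the function names phi and normal_cdf are illustrative.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cdf, using the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_cdf(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma^2), via the standardised deviate."""
    return phi((x - mu) / sigma)


mu, sigma = 12.8, 2.4  # income in units of 1000

p_a = 1 - normal_cdf(20, mu, sigma)                          # income above 20
p_b = normal_cdf(10, mu, sigma)                              # income below 10
p_c = normal_cdf(14, mu, sigma) - normal_cdf(10, mu, sigma)  # between 10 and 14

print(round(p_a, 5), round(p_b, 4), round(p_c, 4))
```

Small differences from the hand calculation in the last decimal place are possible, because the tables require z to be rounded before the look-up.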
Normal Approximation to the Binomial
If X has a binomial distribution with n > 10 and p near 0.5 then you can use the normal approximation.
Set μ = np and σ² = npq. As n gets larger the approximation is valid for values of p further from
0.5. Note: Because the normal distribution describes continuous random variables and the binomial
distribution describes discrete random variables, a continuity correction is applied:
$$P(X \le x) \approx P\left(Z < \frac{x + 0.5 - \mu}{\sigma}\right) = P\left(Z < \frac{x + 0.5 - np}{\sqrt{npq}}\right)$$
Normal Approximation to the Poisson
If X has a Poisson distribution with λ > 20 then you can use the normal approximation.
Set μ = λ and σ² = λ. Note: Because the normal distribution describes continuous random
variables and the Poisson distribution describes discrete random variables, a continuity correction is
applied:
$$P(X \le x) \approx P\left(Z < \frac{x + 0.5 - \mu}{\sigma}\right) = P\left(Z < \frac{x + 0.5 - \lambda}{\sqrt{\lambda}}\right)$$

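[Figures: normal curves for parts a) to c) of the income example, each with an x-axis and a matching z-axis. They mark x = 20 (z = 3), x = 10 (z = −1.167) and x = 14 (z = 0.5) relative to the mean x = 12.8 (z = 0). A further figure illustrates the continuity correction: P(X ≤ x) is represented by the area under the normal pdf curve, with the bar for x extending from x − 0.5 to x + 0.5.]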
Estimation and confidence limits.
Appropriate sample statistics may be used to provide Point Estimates of unknown values of parameters
in theoretical models. Thus the sample mean and sample standard deviation provide point estimates of
μ and σ in a normal distribution.
However, for many purposes we also need some measure of the accuracy of this kind of point estimate.
The usual way of providing this is to calculate confidence limits for the unknown parameter value.
Broadly speaking these are the values between which the unknown parameter is expected to lie with a
specified probability. The probability is usually multiplied by 100 and expressed as a percentage,
which is called the confidence level.
The range between the confidence limits is called the confidence interval. Higher confidence
levels result in the confidence interval being wider. A sensible balance between a reasonably high
confidence level and a narrow confidence interval is often obtained by using a 95% confidence level.
With the 100(1 − α)% confidence level, we are allowing a probability of α/2 that the parameter lies
below the lower confidence limit and a probability of α/2 that it lies above the upper confidence
limit.
For the 95% confidence level, α/2 = 0.025 (or 2.5%). The confidence limits are found by considering
the distribution of some statistic that depends upon the sample data and upon the unknown parameter.
Confidence Limits for Normal μ
The distribution required here is that of the statistic
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}},$$
found in Table 10.
This is similar to the standard normal distribution but varies according to a parameter called the
degrees of freedom, ν (nu). When calculating confidence limits for μ, the degrees of freedom are
one less than the sample size, so ν = n − 1. The value of t with this ν which has probability α/2
lying beyond it in the right tail of the distribution is t_{α/2}.
The 100(1 − α)% confidence limits for μ are then given by:
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
Example
A car manufacturer wants to estimate the average miles per gallon achieved by its new model in typical
urban traffic conditions. In a random sample of 25 urban traffic trials the new model yields a sample
mean of 32.0 miles per gallon and a sample standard deviation of s = 2.6 miles per gallon. Assuming
consumption is normally distributed, calculate the 95% confidence limits for the mean.
n = 25, degrees of freedom ν = 25 − 1 = 24.
Since 100(1 − α) = 95, this tells us that 100α/2 = 2.5, so we look up the 2.5% point for ν = 24 in Table
10, which is 2.064.
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 32.0 \pm \frac{2.064 \times 2.6}{\sqrt{25}} = 32.0 \pm \frac{5.3664}{5} = 32.0 \pm 1.073$$
So the 95% confidence limits for the mean are 30.9 to 33.1 miles per gallon (to 1 d.p.)
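The arithmetic in this example can be sketched in Python. The critical value 2.064 is typed in from the example (the 2.5% point of t with 24 degrees of freedom, as read from Table 10) because the standard library has no t-distribution; the function name is illustrative.

```python
from math import sqrt


def t_confidence_limits(sample_mean, s, n, t_crit):
    """Return (lower, upper) limits: sample_mean +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# t_crit = 2.064 is the tabulated 2.5% point for 24 degrees of freedom.
lower, upper = t_confidence_limits(sample_mean=32.0, s=2.6, n=25, t_crit=2.064)
print(round(lower, 1), round(upper, 1))
```

Passing the critical value as an argument keeps the table look-up visible, which matches how the calculation is done by hand in an open-book examination.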
Confidence Limits for Binomial p.
If a series of n binomial trials yields x successes then the observed proportion of successes is
$$\hat{p} = \frac{x}{n}$$
and the observed proportion of failures is
$$\hat{q} = 1 - \hat{p}.$$
Approximate 100(1 − α)% confidence limits for p are:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
where z_{α/2} is the value of z which cuts off probability α/2 in the right tail of the standard normal
distribution (see Table 5). For 95% confidence, the required z value is the 2.5% point in Table 5,
which is 1.9600.
Example
In a survey of 100 randomly selected students, 68 stated that the mess in the kitchen belonged to their
flatmates. Find the 95% confidence interval for the proportion of students who blamed the mess on
their flatmates.
$$\hat{p} = \frac{68}{100} = 0.680, \qquad \hat{q} = 1 - \hat{p} = 1 - 0.68 = 0.320$$
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.680 \pm 1.96\sqrt{\frac{0.680 \times 0.320}{100}} = 0.680 \pm 1.96 \times 0.04665 = 0.680 \pm 0.091$$
The 95% confidence limits for the percentage of students who stated that the mess was caused by their
flatmates are 59% and 77%.
Choice of Sample Size for Specified Accuracy
In advance of taking a sample, it is not always clear how close the confidence limits for an unknown
parameter will be. However the eventual aim may be to be able to express the estimate for the
parameter within some acceptable margin of error.
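Consider the case of finding 95% confidence limits for binomial p,
which would be given by:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
Suppose that we wanted our final estimate to be in the form $\hat{p} \pm e$; then
$$e = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
so, for a 95% confidence limit,
$$e = 1.96\sqrt{\frac{pq}{n}}$$
giving
$$n = \left(\frac{1.96}{e}\right)^{2} pq$$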
Example
Suppose we wished to know to within 1% the percentage of shoppers who regularly use a Tesbury
Supermarket loyalty card. From a small sample of 200 shoppers, it has been estimated that about 35%
do use these cards. Find how many more shoppers should be sampled.
$$n = \left(\frac{1.96}{e}\right)^{2} pq = \left(\frac{1.96}{0.01}\right)^{2} \times 0.35 \times 0.65 \approx 8740$$
i.e. a further 8540 shoppers.
If we had not got the information from the small sample we would have to assume the worst case,
which would be that p = q = 0.5. This would give n = 9604. So using the initial sample will save
asking approximately 860 shoppers.
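The confidence limits for a proportion and the sample-size formula can both be sketched in a few lines of Python. The function names are illustrative, and z = 1.96 is the 95% value quoted from Table 5. The sample size is evaluated without intermediate rounding and then rounded up to a whole shopper, so it may differ slightly from a figure quoted in round numbers.

```python
from math import ceil, sqrt


def proportion_limits(x, n, z=1.96):
    """Approximate limits p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def sample_size(e, p=0.5, z=1.96):
    """Smallest whole n with n >= (z / e)^2 * p * q; p = 0.5 is the worst case."""
    n_exact = (z / e) ** 2 * p * (1 - p)
    # Round first to absorb floating-point noise before taking the ceiling.
    return ceil(round(n_exact, 6))


lower, upper = proportion_limits(68, 100)   # kitchen survey example
n_with_pilot = sample_size(0.01, p=0.35)    # using the pilot estimate of 35%
n_worst_case = sample_size(0.01)            # no prior information

print(round(lower, 2), round(upper, 2))
print(n_with_pilot, n_worst_case, n_worst_case - n_with_pilot)
```

Because pq is largest at p = 0.5, any pilot estimate away from 0.5 reduces the required sample size, which is the saving discussed above.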

Hypothesis Tests
When we considered estimation problems, we assumed that we had data generated by a known type of
model - normal, binomial etc.- but we did not know the value of some parameter(s) in that model, so
we had to estimate parameters such as μ or p by using information from sample data.
We now consider two different types of problem:
(1) Assuming that we know the type of model, we wish to test some assumption about parameter
values. This differs from estimation in that some value of the unknown parameter has to be
proposed for testing.
(2) We wish to test some assumption about the type of model itself.
In both cases, the basic idea is to check whether it is reasonable to conclude that the sample that has
actually been observed was really generated by the assumed model. We cannot be absolutely certain
about the conclusion but we try to ensure that it has a high probability of being correct.
To do this, we must first decide upon our basic assumption,
which is called the null hypothesis and is denoted by H0.
Using the sample data, we calculate the value of some test statistic
that depends upon H0 and upon the sample.
If the value obtained for the test statistic would have a high probability if H0 were true
then we accept H0 ; otherwise we reject H0 in favour of some pre-specified alternative hypothesis,
usually denoted by H1 or HA.
The procedure can be broken down into five steps:
Step 1: State H0 and H1.
Step 2: Select the test statistic.
Step 3: Specify the significance level, denoted by α (alpha),
which is the probability below which a result is judged too unlikely under H0, so that H0 is rejected.
Also find the critical value(s) of the test statistic,
which will mark the borderline between accepting and rejecting H0.
Step 4: Assuming that H0 is true, use the sample data to calculate the value of the test statistic.
Step 5: Compare calculated and critical values of the test statistic,
hence decide whether to accept or reject H0.
To illustrate the steps in more detail, we consider the following example.
Example: Testing for normal μ when σ is unknown
Suppose that the scheduled time for a particular rail journey is 120 minutes. It is thought that journey
times are normally distributed but there is uncertainty about whether the mean really is 120 minutes.
Step 1: Working in minutes, set up the null hypothesis that the mean is 120, which we write as:
H0 : μ = 120.
The simplest alternative hypothesis is that the mean might differ from this 120 in either
direction (so trains might, on average, be either faster or slower), in which case we write:
H1 : μ ≠ 120;
This is understood to cover the two possibilities: μ < 120 or μ > 120.
Step 2: As test statistic we use t, which depends upon the sample mean x̄, the sample standard deviation s,
the sample size n and the value of μ:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}.$$
We can see that t will have a large positive value if the sample mean is much larger than μ,
or a large negative value if it is much smaller than μ. As for confidence limits on μ, the
number of degrees of freedom for t is ν = n − 1.
Step 3: We usually set the significance level α at 5%, so that if the observed results have a probability
of less than 0.05 we shall reject H0.
Note: The critical values will depend upon H1.
If we have specified that μ might shift in either direction from 120, then we shall have two critical
values, marking out a critical region in two tails of the t-distribution. The total probability associated
with both tails is α, with α/2 lying to the right of +t_{α/2} (the upper critical value) and α/2 lying to the
left of −t_{α/2} (the lower critical value).
[Figure: t-distribution centred on 0 with critical values −t_{α/2} and +t_{α/2}; the central region is labelled "Accept H0" and each tail, of area α/2, is labelled "Reject H0". A matching axis for the sample mean is centred on 120 (H0 with μ = 120), with H1 with μ < 120 to the left and H1 with μ > 120 to the right.]
Using the 5% level of significance, the critical values for a two-tailed test are therefore the 2.5% points
on the t-distribution, shown in Table 10. If we take a sample of 10 actual journey times, we have ν =
10 − 1 = 9 and the critical values are:
tcrit = ± t0.025 = ± 2.262.
Step 4: Only now do we actually look at the observed journey times.
Suppose that the sample of 10 journey times gave a mean of 124 minutes and a standard
deviation of 8 minutes. Using = 120 as if H0 were true, the calculated value of the test
statistic is:
tcalc = (124 − 120)/(8/√10) = 1.58.
Step 5: Since tcalc lies between the critical values (± 2.262), we accept H0 and conclude that the
mean journey time could be 120 minutes. If we had found that tcalc < −2.262 or tcalc > 2.262, we
would have rejected H0 in favour of H1 and concluded that the mean had shifted from 120 minutes.
Note: If we had decided to test the alternative hypothesis that the shift in the mean could only have
occurred in one direction, then we would have chosen a critical value that cut off α in just one tail
of the t-distribution. If the shift could only be to the right then consider the right tail; if it could only
be to the left then consider the left tail.
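Steps 4 and 5 of the rail journey example can be written as a short Python sketch. The critical value 2.262 is typed in from the example (Table 10, 9 degrees of freedom) and the names used are illustrative.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """t = (sample_mean - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (sample_mean - mu0) / (s / sqrt(n))


# Step 4: calculate the test statistic assuming H0 (mean = 120) is true.
t_calc = t_statistic(sample_mean=124, mu0=120, s=8, n=10)

# Step 5: two-tailed comparison with the tabulated critical value.
t_crit = 2.262  # 2.5% point of t with 9 degrees of freedom, from the example
accept_h0 = abs(t_calc) < t_crit

print(round(t_calc, 2), accept_h0)
```

For a one-tailed test only the sign-appropriate comparison would be made, against the critical value that cuts off the whole of α in that tail.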
Exactly the same principles apply in carrying out any of the following tests.
A Summary of Hypothesis Tests
1. Single sample test for normal μ when σ is unknown: use
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \quad \text{with } \nu = n - 1$$
2. Single sample test for binomial p: use
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$
3. Comparing normal means from two paired samples.
Consider individual differences and carry out the test on the differences using the t statistic as in
test 1 above.
4. Comparing normal means from two independent samples.
Calculate a pooled estimate of the common variance
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
and use
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \quad \text{with } \nu = n_1 + n_2 - 2$$
where μ0 is the difference between the means under H0 (usually 0).
5. Comparing proportions from two independent samples.
Calculate a pooled estimate of the common p, which is
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
and use
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
Each of these tests is illustrated by a worked example on the following pages.
Example (Single sample test for normal μ when σ is unknown, one tail)
Mr A Thug, a mugger, believes that Aston students attending a nightclub carry on average £15 in cash
on them and so are not worth bothering with.
In a sample of 25 such students the mean amount of cash carried was found to be £16 with s = £4. You
may assume that the amount of cash carried is normally distributed.
Test Mr A Thug's hypothesis at the 5% level of significance.
Step 1: State the null and alternative hypothesis
The null hypothesis is the statement that says the claim is true. H0 : μ = μ0 = 15
The alternative hypothesis is the statement that says the claim is false. H1 : μ > 15
Step 2: Select the test statistic
This is a single sample test for normal μ when σ is unknown.
So we use the test statistic:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \quad \text{with } \nu = n - 1 = 25 - 1 = 24$$
Step 3: Specify the level of significance and find the critical value(s)
5% level of significance so α = 0.05. This is a one-tailed test so tcrit = 1.711
[Figure: t-distribution centred on 0 with the upper 5% tail beyond tcrit marked.]
Step 4: Calculate the test statistic
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{16 - 15}{4/\sqrt{25}} = \frac{1}{4/5} = 1.25$$
Step 5: Compare the sample statistic with the critical value and make a decision
tcalc = 1.25 < tcrit = 1.711, so accept H0. There is insufficient evidence to reject Mr A Thug's claim.
Example (Single sample test for binomial p two tails)
A lecturer has claimed that 80% of their students can perform a hypothesis test.
A random sample of 36 students where examined and 21 where able to perform such a test.
Is there evidence at the 5% level of significance that the claim is wrong?
Step 1: State the null and alternative hypothesis
The null hypothesis is the statement that says the claim is true. H0 : p0 = 0.8
The alternative hypothesis is the statement that says the claim is False. H1 : p0 0.0.8
Step 2: Select the test statistic
This is a single sample test for binomial p so we use the test statistic:
( )
[ ]
z
p p
p p n

0
0 0
1
Step 3: Specify the level of significance and find the critical value(s)
5% level of significance so = 0.05. This is a two tailed test zcrit = 1.960
Step 4: Calculate the test statistic
( )
[ ]
z
p p
p p n

0
0 0
1
( )
[ ]

0 58333 0 80
0 8 1 0 8 36
. .
. .
[ ]

0 21667
0 16 36
.
.



0 21667
0 0667
3 25
.
.
.
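Step 5: Compare the sample statistic with the critical value and make a decision
$|-3.25| > |z_{crit}| = 1.96$, so reject $H_0$ as the claim is sufficiently unlikely. So accept $H_1$.
[Figure: standard normal distribution with rejection regions of 2.5% in each tail, below z = -1.96 and above z = 1.96, and the acceptance region for $H_0$ between them.]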
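The same calculation can be sketched in Python. The function name `one_sample_proportion_z` is an illustrative choice, and the critical value is the tabulated one quoted above rather than something the code derives.

```python
import math

def one_sample_proportion_z(successes, n, p0):
    """z statistic for testing a claimed binomial proportion p0."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

z_calc = one_sample_proportion_z(successes=21, n=36, p0=0.8)
z_crit = 1.960  # two-tailed 5% critical value from the notes

# Two-tailed test: compare the absolute value with the critical value
reject_h0 = abs(z_calc) > z_crit
print(round(z_calc, 2), reject_h0)
```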
Example (Comparing normal means from two paired samples)
Five workers were timed on a particular task, before and after they had received training.
In the following table x denotes the time taken before training and y denotes the time after training,
both measured in minutes:
Worker A B C D E
x 240 260 270 250 255
y 231 254 272 245 253
Carry out a suitable t-test at the 5% level of significance to determine whether the mean job time after
training is significantly less than that before training.
Solution
Let the mean job time before training be $\mu_1$ and that after training be $\mu_2$. Also let the mean of the
differences, $d = x - y$, be $\mu_D$. From the rules for expectation, it can be shown that $\mu_D = \mu_1 - \mu_2$.
We can then proceed with the usual five steps for testing.
Step 1: State the hypotheses
These may be stated either in the form
(1) $H_0: \mu_1 = \mu_2$ (the means before and after are equal)
versus $H_1: \mu_1 > \mu_2$ (mean after training is smaller)
or (2) $H_0: \mu_D = 0$ (mean of differences is zero)
versus $H_1: \mu_D > 0$ (mean of differences is positive)
Whichever form is used, it is clear that H1 is concerned with a shift in one direction only,
so we shall use a one-tail test.
Step 2: Select the test statistic
We are going to carry out a version of Test 1, using the individual differences, $d = x - y$, as if they
were the x-values. Denoting their sample mean by $\bar{d}$ and their sample standard deviation by $s_d$, we
therefore use the test statistic
$$t = \frac{\bar{d} - \mu_D}{s_d/\sqrt{n}}$$
Step 3: Specify the level of significance and find the critical value(s)
The question specified the 5% level, so $\alpha = 0.05$. Since we are using a one-tail test, the critical value
is $t_{0.05}$ for $\nu = n - 1 = 5 - 1 = 4$, which is 2.132.
Step 4: Calculate the test statistic
From the table of data, find the individual differences and then calculate their mean and standard
deviation.
Worker        A    B    C    D    E
x           240  260  270  250  255
y           231  254  272  245  253
d = x - y     9    6   -2    5    2
$\bar{d} = 4.0$, $s_d = 4.183$
$t_{calc} = (4.0 - 0)/(4.183/\sqrt{5}) = 4\sqrt{5}/4.183 = 2.138$
Step 5: Compare the sample statistic with the critical value and make a decision
Our calculated value of 2.138 is only very slightly greater than the critical value 2.132. In a situation
like this we would normally try to obtain more information before making a decision. In this case it
would be sensible to observe more workers before and after training, so as to obtain a larger sample.
Note that if we had asked the vaguer question "Is there evidence of a change in the mean job time?"
this would have led to a two-tail test, for which the critical values would have been $t = \pm 2.776$.
Our calculated value would still have been 2.138, which would lie between the critical values, leading
us to accept the null hypothesis of no change in the mean.
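As a cross-check on the hand calculation, the paired test can be sketched with the standard library. The helper name `paired_t` is hypothetical; `statistics.stdev` uses the divisor n - 1, which matches the sample standard deviation used in these notes.

```python
import math
import statistics

before = [240, 260, 270, 250, 255]  # x: minutes before training
after = [231, 254, 272, 245, 253]   # y: minutes after training

def paired_t(x, y, mu_d0=0.0):
    """Mean and s.d. of the differences d = x - y, and the paired t statistic."""
    d = [xi - yi for xi, yi in zip(x, y)]
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation (divisor n - 1)
    t = (d_bar - mu_d0) / (s_d / math.sqrt(len(d)))
    return d_bar, s_d, t

d_bar, s_d, t_calc = paired_t(before, after)
print(d_bar, round(s_d, 3), round(t_calc, 3))
```

The result should agree with the values of the mean difference, its standard deviation and the t statistic found by hand above.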
Example (Comparing normal means from two independent samples)
A review in a motoring magazine claimed that cars of Type 1 have a mean fuel consumption that is 5
miles per gallon (mpg) more than that for cars of Type 2. A test was carried out using 8 cars of Type 1
and 10 cars of Type 2. The sample mean was found to be 43 mpg for Type 1 and 40 mpg for Type 2;
the sample standard deviation was 3.0 mpg for Type 1 and 2.5 mpg for Type 2. Carry out a suitable t-test at
the 5% level of significance to determine whether the difference in the underlying mean consumption
for the two types of car really is 5 mpg as claimed in the magazine.
Solution
Let the mean consumption be $\mu_1$ for Type 1 and $\mu_2$ for Type 2. Similarly, use labels 1 and 2 for
sample statistics relating to Type 1 and Type 2 respectively. For this type of problem it is useful to
introduce the notation $\delta_0$ for the difference $\mu_1 - \mu_2$ when $H_0$ is true. We then proceed with the usual
five steps.
Step 1: State the hypotheses
$H_0: \mu_1 - \mu_2 = \delta_0 = 5.0$ versus $H_1: \mu_1 - \mu_2 \neq \delta_0 = 5.0$ (2-tail test).
Step 2: Select the test statistic
In order to do this, we assume that the population variances for the two types are equal, so that
$\sigma_1^2 = \sigma_2^2 =$ some unknown $\sigma^2$. A formal test can be carried out to check that this assumption is
valid but for small samples it nearly always does prove to be valid, so we omit the test. However, we
need to estimate the unknown common variance $\sigma^2$; to do this, we pool information about variance
from both samples to find the pooled estimate of the common variance, $s_p^2$:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
The test statistic is then
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
with $\nu = n_1 + n_2 - 2$.
Step 3: Level of significance and critical value(s)
Using the 5% level for a two-tail test with $\nu = n_1 + n_2 - 2 = 8 + 10 - 2 = 16$,
the critical values for t are the 2.5% points for 16 degrees of freedom, $t = \pm 2.120$.
Step 4: Calculate the test statistic
We know that $n_1 = 8$, $\bar{x}_1 = 43$, $s_1 = 3.0$, $n_2 = 10$, $\bar{x}_2 = 40$, $s_2 = 2.5$,
hence $s_p^2 = [(8 - 1)(3.0)^2 + (10 - 1)(2.5)^2]/(8 + 10 - 2) = 119.25/16 = 7.4531$
and so $s_p = 2.73$.
Then $t_{calc} = [(43 - 40) - 5]/[2.73\sqrt{1/8 + 1/10}] = -2/1.295 = -1.54$.
Step 5: Compare the sample statistic with the critical value and make a decision
Since $-1.54$ lies between the critical values, $t = \pm 2.120$, we accept the null hypothesis and conclude that
there is insufficient evidence to reject the magazine's claim.
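The pooled-variance calculation can be sketched as follows. The function name `pooled_t` and its argument names are illustrative; the hypothesised difference is passed in as `delta0`.

```python
import math

def pooled_t(n1, mean1, s1, n2, mean2, s2, delta0=0.0):
    """Pooled variance, t statistic and degrees of freedom for two independent samples."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    std_err = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    t = ((mean1 - mean2) - delta0) / std_err
    return sp2, t, n1 + n2 - 2

sp2, t_calc, dof = pooled_t(8, 43, 3.0, 10, 40, 2.5, delta0=5.0)
t_crit = 2.120  # two-tailed 5% critical value for 16 d.f., from the notes

reject_h0 = abs(t_calc) > t_crit
print(round(sp2, 4), round(t_calc, 2), dof, reject_h0)
```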
Example (Comparing proportions from two independent samples)
Samples of male and female students were selected as part of a survey of holiday destinations.
Out of 120 male students, 50 had been abroad (outside UK), and out of 80 female students, 42 had
been abroad. Is it reasonable to conclude that the same proportion of male and female students in
general had been abroad?
Solution
Let suffix 1 denote male students and suffix 2 denote female students. Let the proportions of
the male and female student populations who had been abroad be $p_1$ and $p_2$ respectively and let
the corresponding observed proportions be $\hat{p}_1$ and $\hat{p}_2$.
Step 1: $H_0: p_1 = p_2 =$ some common unknown p
versus $H_1: p_1 \neq p_2$ (two-tail test).
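Step 2: Calculate a pooled estimate of the common p,
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
where $x_1$ and $x_2$ are the observed numbers in each sample who had been abroad, and then the test statistic is
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})[(1/n_1) + (1/n_2)]}}$$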
Step 3: Two-tailed test at 5% level so critical values are the 2.5% points of the standard normal
distribution, $z = \pm 1.96$ (from Table 5).
Step 4: To evaluate the test statistic, we first need to calculate the observed proportions for the two
samples and the pooled estimate of the common proportion.
$\hat{p}_1 = 50/120 = 0.4167$; $\hat{p}_2 = 42/80 = 0.5250$; $\hat{p} = (50 + 42)/(120 + 80) = 92/200 = 0.46$
Hence
$$z_{calc} = \frac{0.4167 - 0.5250}{\sqrt{(0.46)(0.54)[(1/120) + (1/80)]}} = \frac{-0.1083}{0.0719} = -1.505$$
Step 5: Since $z_{calc}$ lies between the critical values, we accept the null hypothesis and conclude that the
underlying proportions of male students and female students who have been abroad are the same.
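A short Python sketch of this two-sample proportion test is given below. The function name `two_proportion_z` is an assumption for illustration. Because the code carries full precision, its z value may differ from the hand calculation in the third decimal place.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled estimate of the common p."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return p_pooled, (p1_hat - p2_hat) / std_err

p_pooled, z_calc = two_proportion_z(50, 120, 42, 80)
z_crit = 1.96  # two-tailed 5% critical value from the notes

reject_h0 = abs(z_calc) > z_crit
print(round(p_pooled, 2), round(z_calc, 2), reject_h0)
```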
Chi-squared Tests for Expected and Observed Frequencies
We now consider a family of tests that are based on the following idea. If p is the probability of a
particular event happening when some random experiment is performed once, then the expected
number of such events when the experiment is performed n times will be np. This expected frequency
will usually be different from the frequency that is actually observed when the experiment is performed
n times. If the expected and observed frequencies are very different, this may indicate that the wrong
probability has been used to calculate the expected frequency.
If we consider an experiment that results in just one type of event (success) or its complement
(failure), hypotheses about the value of p for one or two samples can be tested using an appropriate
z-test (Test 2 or Test 5 on page 19 above). However, when there are more than two possible classes of
outcome from the experiment, more general methods are needed and one of the most widely used of
these is based on the statistic $\chi^2$, which is called "chi-squared". [Note that, in English, the "ch" in
"chi" is pronounced like a "k", as in chaos, chorus, echo and other words derived from Greek.]
Suppose that there are k classes of possible outcome from the experiment and that class i has
probability pi for i = 1 to k. If we perform the experiment n times, the expected frequency of a result in
class i will be npi , which we shall write as Ei . Writing Oi for the corresponding observed frequency,
we then define the chi-squared statistic as follows:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
The calculated value of this statistic is compared with a critical value found in Table 8. Since any
difference between $O_i$ and $E_i$ will cause $\chi^2$ to increase, we always use a one-tail test, so the critical
value for 5% significance will be the value of $\chi^2$ that cuts off 5% of the probability area in the right
tail of the distribution. As in the case of the t-distribution, we have to consider the appropriate number
of degrees of freedom for $\chi^2$, which is again denoted by $\nu$.
The basic rule is:
No. of degrees of freedom for $\chi^2$ = (No. of separate classes) - (No. of constraints on frequencies).
In the basic $\chi^2$-test, we have k classes and just one constraint, which is that the frequencies for all
classes add up to the total number of observations, n, and so $\nu = k - 1$.
Example (Testing the Fairness of a Die)
The following table shows the result of 120 tosses of a single six-sided die:
Face of die 1 2 3 4 5 6
Obs.frequency 19 22 17 18 19 25
Carry out a $\chi^2$-test at the 5% level of significance to check whether the die is fair.
Solution
Our classes of outcome here correspond to the various faces of the die. The statement that the die is
fair implies that each face has an equal probability and hence that this probability should be 1/6. We
therefore write our null and alternative hypotheses as follows:
$H_0: p_i = 1/6$ for i = 1 to 6 versus $H_1: p_i \neq 1/6$ for at least one value of i.
The test statistic is $\chi^2$ and the critical value is the 5% point for $\nu = 6 - 1 = 5$, which is 11.07.
We can explain $\nu = 5$ by saying that values of five of the frequencies can be chosen independently but
the sixth frequency must be 120 minus the sum of the other five.
To calculate the value of $\chi^2$ if the null hypothesis is true, we first find the expected frequency for each
class, which is $E_i = np_i = 120 \times (1/6) = 20$ for each class, and so:
$$\chi^2_{calc} = \frac{(19-20)^2}{20} + \frac{(22-20)^2}{20} + \frac{(17-20)^2}{20} + \frac{(18-20)^2}{20} + \frac{(19-20)^2}{20} + \frac{(25-20)^2}{20} = \frac{44}{20} = 2.2$$
Since this calculated value is less than the critical value of 11.07, we accept the null hypothesis and
conclude that the die is fair.
Exactly the same procedure would be followed for any example where there are k classes that are each
assumed to have probability 1/k. Another example might be to check whether some kind of event was
equally likely to occur on any day of the week; the classes would then correspond to the 7 days of the
week and the probabilities would all be 1/7.
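The equal-probability case described above can be sketched in a few lines. The function name `chi2_equal_classes` is illustrative; it assumes every class has probability 1/k, as in the die and day-of-week examples.

```python
def chi2_equal_classes(observed):
    """Chi-squared statistic and degrees of freedom when all k classes have probability 1/k."""
    n = sum(observed)
    k = len(observed)
    expected = n / k  # E_i = n * (1/k) for every class
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return chi2, k - 1

die_counts = [19, 22, 17, 18, 19, 25]  # observed frequencies from the die example
chi2_calc, dof = chi2_equal_classes(die_counts)
chi2_crit = 11.07  # 5% point for 5 d.f., from the notes

reject_h0 = chi2_calc > chi2_crit
print(round(chi2_calc, 3), dof, reject_h0)
```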
WARNING
This type of $\chi^2$-test is based on an approximation that is only valid when the expected frequencies are
reasonably large. For $\nu = 1$, each expected frequency should be greater than about 10. For $\nu > 1$,
each expected frequency should be greater than about 5. In order to obtain expected frequencies that are
sufficiently large, it may be necessary to combine classes.
Example (Testing for a Poisson Distribution with Specified Mean)
This illustrates a modification of the basic test in which the class probabilities are no longer assumed to
be equal but are now given by a particular theoretical distribution. It also illustrates the idea of
combining classes to obtain sufficiently large expected frequencies.
The number of phone calls received in one hour by a sales office was monitored over 100 hours at
comparable times of day. The results are shown in the following table:
No. of calls, x 0 1 2 3 4 5 6 7 8 >8
Obs. frequency 11 30 28 17 12 0 1 0 1 0
Carry out a $\chi^2$-test at the 5% level to determine whether the hourly number of calls could reasonably
be considered to have a Poisson distribution with mean 2.0.
Solution
We shall test H0 : distribution is Poisson with mean 2.0
versus H1 : distribution is not Poisson with mean 2.0 .
Note that this is a one-tail test because, as explained above, any difference between observed and
expected frequencies causes the test statistic $\chi^2$ to increase.
The next step is to define the classes into which the observations will be arranged.
Here the classes correspond to the values of x and the class probabilities are found from the Poisson
probability function $f(x) = \lambda^x \exp(-\lambda)/x!$. Note however that we must consider the whole
distribution, so we include a class that covers all the higher values of x; the probability for this class
will be 1 - (probability of all the lower values of x). Poisson tables for $\lambda = 2$ show that F(7) = 0.999
and F(8) = 1.000 to 3 d.p., so we initially define our last class as $X \geq 8$ with probability 0.001.
Then, either using tables or the Poisson formula, we find the probabilities for classes corresponding to
x = 0, 1, . . . , 7 and multiply by 100 to find the expected frequencies as shown in the table below:
Class 1 2 3 4 5 6 7 8 9
x 0 1 2 3 4 5 6 7 8+
Prob. .135 .271 .271 .180 .090 .036 .012 .003 .001
Ei 13.5 27.1 27.1 18.0 9.0 3.6 1.2 0.3 0.1
Now we see that the $E_i$ values for classes 6 to 9 are all less than 5 but adding them together we would
get an expected frequency of 3.6 + 1.2 + 0.3 + 0.1 = 5.2 for a new class 6 which would be $X \geq 5$.
The observed frequency for this new class will be 2 = (total no. of observed values $\geq 5$).
Our final table for observed and expected frequencies is therefore:
Class 1 2 3 4 5 6 Total
Oi 11 30 28 17 12 2 100
Ei 13.5 27.1 27.1 18.0 9.0 5.2 99.9 (error from using 3 d.p.)
After combining, we have the no. of classes k = 6, so $\nu = k - 1 = 5$
and the critical value is the 5% point for $\chi^2$ with $\nu = 5$, which is 11.07 (as in the previous example).
We now calculate the contribution to $\chi^2$, $(O_i - E_i)^2/E_i$, for each class and sum these to find $\chi^2_{calc}$.
Class   Contribution to $\chi^2$
1       $(11 - 13.5)^2/13.5 = 0.463$
2       $(30 - 27.1)^2/27.1 = 0.310$
3       $(28 - 27.1)^2/27.1 = 0.030$
4       $(17 - 18.0)^2/18.0 = 0.056$
5       $(12 - 9.0)^2/9.0 = 1.000$
6       $(2 - 5.2)^2/5.2 = 1.969$
Total = $\chi^2_{calc}$ = 3.828
Since 3.828 is less than the critical value, 11.07, we accept H0 and conclude that the distribution could
be Poisson with mean 2.0 .
Note: The precise value of $\chi^2_{calc}$ will depend upon the number of decimal places used for the expected
frequencies. For hand calculation, we recommend calculating these to 1 d.p. and calculating the
contributions to $\chi^2$ to 3 d.p., as shown above.
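The Poisson goodness-of-fit procedure, including the combining of classes, can be sketched as below. The helper names `poisson_expected` and `combine_tail` are hypothetical. To mirror the hand-calculation convention above, expected frequencies are rounded to 1 d.p. before classes are combined; a full-precision calculation would give a slightly different statistic.

```python
import math

def poisson_expected(lam, n, max_x):
    """Expected frequencies (1 d.p.) for x = 0..max_x-1 plus a final 'max_x or more' class."""
    probs = [lam**x * math.exp(-lam) / math.factorial(x) for x in range(max_x)]
    probs.append(1 - sum(probs))  # upper tail class
    return [round(n * p, 1) for p in probs]

def combine_tail(observed, expected, min_expected=5.0):
    """Merge classes from the right until the last expected frequency is large enough."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < min_expected:
        last_e, last_o = exp.pop(), obs.pop()
        exp[-1] += last_e
        obs[-1] += last_o
    return obs, exp

observed = [11, 30, 28, 17, 12, 0, 1, 0, 1]  # x = 0..7 and 8+ (from the table)
expected = poisson_expected(lam=2.0, n=100, max_x=8)
obs_c, exp_c = combine_tail(observed, expected)

chi2_calc = sum((o - e) ** 2 / e for o, e in zip(obs_c, exp_c))
dof = len(obs_c) - 1  # mean specified in advance, so only one constraint
print(obs_c, exp_c, round(chi2_calc, 3), dof)
```

If the mean had been estimated from the data, one further degree of freedom would be subtracted when looking up the critical value.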
In this first example of checking whether a Poisson distribution fits the data, we assumed that the
value of the parameter $\lambda$ was known. The only constraint on the frequencies was that they sum to n,
so only one degree of freedom was lost and $\nu = k - 1$.
If we had to estimate $\lambda$, another degree of freedom would be lost, because this imposes the additional
constraint that $\sum x_i O_i = n\bar{x}$, and so $\nu$ now equals $k - 2$.
Example (Testing for a Poisson Distribution with Unknown Mean)
With the same data as in the previous example, using the observed frequencies before combining, the
sample mean is :
[0(11) + 1(30) + 2(28) + 3(17) + 4(12) + 5(0) + 6(1) + 7(0) + 8(1)] / 100 = 199 / 100 = 2.0 to 1 d.p.
Using this as our estimate of the unknown population mean $\lambda$, we would get the same probabilities as
before. Also the calculations for combining classes and evaluating $\chi^2_{calc}$ would be the same.
The only differences would be that our null and alternative hypotheses would refer to some Poisson
distribution without specifying its mean and our critical value would now be the 5% point for
$\chi^2$ with 6 - 2 = 4 degrees of freedom, which is 9.488. In this case we would again accept the null
hypothesis.
The same basic method can be extended to testing for any kind of theoretical distribution (e.g. the
normal) but care must be taken to combine classes corresponding to adjacent values of x so as to get
large enough expected frequencies and also to allow for the loss of one degree of freedom for each
unknown parameter that is estimated from the data.
Contingency Tables
The basic chi-squared test can also be extended to testing whether two or more ways of classifying data
are independent.
Example (Is choice of newspaper associated with gender?)
Suppose that a survey of newspaper readership has been carried out and the number of readers
tabulated by newspaper title and gender of reader are as follows:
Times Telegraph Guardian Row total
Male 182 215 203 600
Female 154 136 110 400
Column total 336 351 313 Grand total 1000
We want to find whether there is some statistical association between gender and choice of title.
Solution
We start with the null hypothesis that the two criteria for classification are independent.
The alternative is that they are not independent - in other words they are associated.
Our test statistic will be $\chi^2$ and we test at the usual 5% level, but to find the critical value
we first need to consider how to estimate probabilities and expected values.
To illustrate the method, consider Male readers. Selecting at random from our grand total of 1000
readers, the probability of selecting a Male reader will be P(Male) = (No.of Males)/1000 = 600/1000.
Similarly, the probability of selecting a Times reader will be P(Times) = (No.of Times readers)/1000 =
336/1000
If being Male and being a Times reader are independent, the simple multiplication rule tells us that
P(Male and Times) = P(Male) $\times$ P(Times) = (600/1000) $\times$ (336/1000) = $600 \times 336/(1000)^2$.
The expected number of "Male and Times" is $1000 \times$ P(Male and Times) = $600 \times 336/1000 = 201.6$.
In general, the expected frequency for the class corresponding to row i and column j in the table is
$E_{ij}$ = (Total for row i) $\times$ (Total for column j)/(Grand total).
Now in each row, all but one of the frequencies is independent and the last one must be such as to
make the row total come to the figure we use for calculating the expected frequencies. Similarly in
each column, all but one of the frequencies is independent but the last one must make the column total
come to the figure used to calculate the expected frequencies.
The total number of independent frequencies gives us the number of degrees of freedom, which is
$\nu$ = (No. of rows - 1) $\times$ (No. of columns - 1).
For our newspaper readership example we therefore have $\nu = (2 - 1) \times (3 - 1) = 2$
and the critical value is the 5% point for $\chi^2$ with $\nu = 2$, which is 5.991.
It is convenient to write the expected frequencies in brackets beside or below the corresponding
observed frequencies. As usual, we round the expected frequencies to 1 d.p.
               Times         Telegraph     Guardian      Row total
Male           182 (201.6)   215 (210.6)   203 (187.8)   600
Female         154 (134.4)   136 (140.4)   110 (125.2)   400
Column total   336           351           313           Grand total 1000
Contributions to $\chi^2$ are calculated in the usual way. It is convenient to arrange these by row and
column:
$\chi^2_{calc}$ = 1.906 + 0.092 + 1.230
+ 2.858 + 0.138 + 1.845 = 8.069.
Since this is greater than the critical value (5.991), we reject H0 and conclude that gender and title of
newspaper are associated. It is most important to remember that the null hypothesis for this type of
problem is always that there is no association.
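The contingency-table method can be sketched in plain Python. The function name `contingency_chi2` is illustrative, and the table is passed as a list of rows of observed frequencies without the totals.

```python
def contingency_chi2(table):
    """Expected frequencies, chi-squared statistic and d.f. for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        exp_row = []
        for j, obs in enumerate(row):
            # E_ij = (row total) x (column total) / (grand total)
            e = row_totals[i] * col_totals[j] / grand_total
            exp_row.append(e)
            chi2 += (obs - e) ** 2 / e
        expected.append(exp_row)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, dof

readers = [[182, 215, 203],   # Male: Times, Telegraph, Guardian
           [154, 136, 110]]   # Female
expected, chi2_calc, dof = contingency_chi2(readers)
chi2_crit = 5.991  # 5% point for 2 d.f., from the notes

reject_h0 = chi2_calc > chi2_crit
print([round(e, 1) for e in expected[0]], round(chi2_calc, 3), dof, reject_h0)
```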
Covariance and Correlation
The categories used in constructing contingency tables may be the values assumed by quantitative
variables, in which case the value of $\chi^2_{calc}$ is a measure of how closely the variables are associated.
A more informative measure is the covariance, which is based on the actual values of the variables
rather than just the number of values of each of them that corresponds to a particular cell in the
contingency table.
Suppose that we are interested in the relationship between two variables, X and Y. We have n pairs of
observations, $(x_i, y_i)$ for i = 1 to n. From these we can calculate the following sums of terms:
$\sum x$, $\sum x^2$, $\sum y$, $\sum y^2$ and $\sum xy$, where $\sum$ denotes summation over i = 1 to n in each case.
[Note that some calculators have special facilities for entering the $(x_i, y_i)$ as pairs and automatically
calculate each of the five sums.]
The next step for hand calculation is to evaluate each of the following:
$S_{xx} = \sum x^2 - (\sum x)^2/n$, $S_{yy} = \sum y^2 - (\sum y)^2/n$, $S_{xy} = \sum xy - (\sum x)(\sum y)/n$.
Note that $S_{xx}/(n-1)$ is the sample variance for X, denoted by $s_x^2$ or var(x),
similarly $S_{yy}/(n-1)$ is the sample variance for Y, denoted by $s_y^2$ or var(y),
and we now define $S_{xy}/(n-1)$ to be the sample covariance for X and Y, denoted by cov(x,y).
Example (Covariance between Height and Weight)
The following table shows the height and weight of six people who were selected at random from a
particular population.
Height in ft, x 5.1 5.3 5.7 5.8 6.0 6.2
Weight in lb, y 70 112 133 154 165 230
From this table we find
$\sum x = 34.1$, $\sum x^2 = 194.67$, $\sum y = 864$, $\sum y^2 = 138\,974$ and $\sum xy = 5017.9$;
then $S_{xx} = \sum x^2 - (\sum x)^2/n = 194.67 - (34.1)^2/6 = 0.8683$,
$S_{yy} = \sum y^2 - (\sum y)^2/n = 138\,974 - (864)^2/6 = 14\,558$,
$S_{xy} = \sum xy - (\sum x)(\sum y)/n = 5017.9 - (34.1)(864)/6 = 107.5$.
The sample covariance is therefore
cov(x,y) = 107.5 / 5 = 21.5 .
The difficulty about interpreting this as a measure of association between height and weight is that it
depends on the units of measurement. If we had calculated the covariance between the heights and
weights of the same individuals measured in metres (1m = 3.28ft) and kilograms (1kg = 2.20lb), we
would have found that cov(x,y) = 2.98 but the closeness of association would have been just the same.
To overcome this dependence on units, we can calculate the covariance between the standardised
values of x and y, which are obtained by subtracting the sample means and dividing by the sample
standard deviations. The resulting statistic is called the sample correlation between x and y and is
denoted by r(x,y) or simply r. The quick way to calculate r is to use the formula:
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$
For our height and weight data, $r = 107.5/\sqrt{0.8683 \times 14\,558} = 0.956$.
The correlation is a measure of linear (i.e. straight-line) relationship between two variables. It can be
shown that the value of r must lie between -1 and +1. If r = +1, an increase in one of the variables is
exactly matched by a linear increase in the other; if r = -1, an increase in one of the variables is exactly
matched by a linear decrease in the other. A value of r near zero indicates that there is little or no linear
relationship between x and y - although there may be some clear nonlinear relationship.
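The sums $S_{xx}$, $S_{yy}$ and $S_{xy}$, and from them the covariance and correlation, can be computed with a short sketch like the one below. The function name `covariance_and_correlation` is an illustrative choice. The last lines rescale the data to metres and kilograms using the conversion factors quoted above, to show that the covariance changes with the units while r does not.

```python
import math

def covariance_and_correlation(x, y):
    """Sample covariance and correlation via S_xx, S_yy and S_xy."""
    n = len(x)
    sxx = sum(v * v for v in x) - sum(x) ** 2 / n
    syy = sum(v * v for v in y) - sum(y) ** 2 / n
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return sxy / (n - 1), sxy / math.sqrt(sxx * syy)

height_ft = [5.1, 5.3, 5.7, 5.8, 6.0, 6.2]
weight_lb = [70, 112, 133, 154, 165, 230]
cov_xy, r = covariance_and_correlation(height_ft, weight_lb)

# Same people measured in metres and kilograms (1 m = 3.28 ft, 1 kg = 2.20 lb)
height_m = [h / 3.28 for h in height_ft]
weight_kg = [w / 2.20 for w in weight_lb]
cov_metric, r_metric = covariance_and_correlation(height_m, weight_kg)

print(round(cov_xy, 2), round(r, 3), round(cov_metric, 2), round(r_metric, 3))
```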
A Test for Zero Correlation between Normal Variables
We have seen that r measures linear relationship between sample values. There is a corresponding
quantity denoted by $\rho$ (rho) which measures linear relationship between the populations of X and
Y. In the same way that the sample mean provides an estimate of the population mean ($\mu$) and the
sample variance provides an estimate of the population variance ($\sigma^2$), so r provides an estimate of $\rho$.
The interpretation of this estimate is complicated by the effect of sample size, since small samples quite
often give high values of r even when the underlying value of $\rho$ is small.
When X and Y can reasonably be assumed to be normally distributed, we can carry out a simple test
using the t-distribution to check whether $\rho$ is really zero.
We test $H_0: \rho = 0$ versus $H_1: \rho \neq 0$ (two-tail test).
The test statistic is $t = r\sqrt{(n - 2)/(1 - r^2)}$ with degrees of freedom $\nu = n - 2$.
Example (Testing for Correlation between Height and Weight)
Consider the last example, in which we found that the sample correlation between height and weight
for six people was 0.956. Assuming that height and weight are normally distributed (which has often
been checked and confirmed), we now want to see whether this result indicates a real (non-zero)
correlation between height and weight for the population from which the sample of six was drawn.
Solution
We test $H_0: \rho = 0$ versus $H_1: \rho \neq 0$ (two-tail test).
Note that the null hypothesis assumes zero population correlation.
The degrees of freedom $\nu = n - 2 = 6 - 2 = 4$, so for a 2-tail test at the 5% level
the critical values are $t = \pm 2.776$.
Then $t_{calc} = r\sqrt{(n - 2)/(1 - r^2)} = 0.956\sqrt{(6 - 2)/(1 - 0.956^2)} = 6.52$.
Since this lies far beyond the upper critical value, we reject H0 and conclude that there is indeed a non-
zero correlation between height and weight for the population from which the sample was drawn.
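The test statistic for zero correlation is simple to evaluate in code. This is a minimal sketch with an illustrative function name, `correlation_t`; the critical value is the tabulated one quoted above.

```python
import math

def correlation_t(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    return r * math.sqrt((n - 2) / (1 - r**2)), n - 2

t_calc, dof = correlation_t(r=0.956, n=6)
t_crit = 2.776  # two-tailed 5% critical value for 4 d.f., from the notes

reject_h0 = abs(t_calc) > t_crit
print(round(t_calc, 2), dof, reject_h0)
```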
The Distinction between Association and Causal Relationship
When looking at measures of association, such as correlation or $\chi^2$ in contingency tables, it is most
important to understand that association does not necessarily imply cause and effect - thus being male
does not cause someone to read a particular newspaper, nor does being tall cause someone to be heavy.
Statistical methods can only highlight association; other forms of analysis must then be used to
establish causal mechanisms. For example, statistical methods showed that there is an association
between smoking tobacco products and developing various forms of cancer, but biomedical research
was needed to establish how the cancers are caused.
Decision Analysis
Introduction to Differentiation
Differentiation is the process of finding the rate at which change takes place. For example, the rate of
change of distance with respect to time is called speed.
The result of the process of differentiation is called a derivative.
Hence, the speed can be found by calculating the derivative of the distance covered with respect to
time. In a graph of distance against time, it is represented by the slope of the lines.
Business examples of derivatives are:
Marginal Cost - the rate of change of cost with respect to the level of production.
Marginal Profit - the rate of change of profit with respect to the level of production.
Price Elasticity of Demand - the rate of change in demand with respect to price.
The slope of a straight line
[Figure: a straight line plotted on x and y axes.]
In general, the slope, a, of a straight line
which is given by a function y = ax + b is the rate of change of y with respect to a change in x.
$$\text{slope} = \frac{dy}{dx} = \frac{\Delta y}{\Delta x} = a$$
where $\Delta x$ is the difference between any two values of x and $\Delta y$ is the difference between the
corresponding values of y. The slope of the straight line is equivalent to the derivative of the
corresponding linear function.
[Figure: distance (km) against time (minutes). The distance covered in the first 5 minutes is 2.5 km, at a constant speed of 30 km/h = 0.5 km/min. The distance covered in the next 5 minutes is 5 km, at a constant speed of 60 km/h = 1 km/min.]
Consider the straight line y = 3x - 5
which goes through the points (2, 1) and (5, 10).
So the change in x is $\Delta x = 5 - 2 = 3$
and the corresponding change in y is $\Delta y = 10 - 1 = 9$.
Hence, the slope of the line is given by the ratio
$$\text{slope} = \frac{\Delta y}{\Delta x} = a = \frac{9}{3} = 3$$
Finding the slope of a curve
The method that was used for straight lines will not work for curves because the slope of a curve
changes. Only a straight line has a constant slope, i.e. the same slope for all values of x.
Instead of looking at the curve itself, we need to look at a linear tangent to the curve:
[Figure: curve y = f(x) with points P and Q, the chord PQ and the tangent at P.]
A chord is a straight line joining any two points on a curve.
A tangent to a curve is a straight line which just touches but does not cross the curve.
The slope, or gradient, of the curve at point P is the same as the gradient of the tangent drawn at point P.
If we draw shorter chords on the curve then as point Q becomes closer to P the gradient of the chord will
increasingly approach the gradient of the tangent at point P. By this method, the slope of the curve can be
found analytically.
Definition of Differentiation
This process of finding the limit of the ratio of the change in y to the change in x, $\Delta y/\Delta x$, as the
change in x, $\Delta x$, tends towards zero is called differentiation. So, the derivative of y with respect to x
is defined as the limit of the gradient of the chord PQ as h, the change in the x value, tends to zero.
$$\frac{dy}{dx} = \lim_{\Delta x \to 0}\left[\frac{\Delta y}{\Delta x}\right] = \lim_{h \to 0}\left[\frac{f(x + h) - f(x)}{h}\right]$$
The result of the process of differentiation is called the derivative of y with respect to x (or simply the
derivative if no other variable is involved).
Rules for derivatives
Rather than calculate every derivative from first principles, i.e. calculating the limit value defined
above, a number of rules are used. These are summarised in the following table:
f(x) = y        dy/dx
$a$             $0$
$ax$            $a$
$ax^n$          $anx^{n-1}$
$\ln(x)$        $1/x$
$\ln(ax^b)$     $b/x$
$e^{ax}$        $ae^{ax}$
$be^{ax}$       $abe^{ax}$
If the function to be differentiated contains the sum or difference of a number of terms, the derivative is
the sum or difference of the derivatives of the individual terms.
Examples
1. Differentiate the function $y = 2x^3$ and hence find the slope of the curve at x = 3.
Recall that if $y = ax^n$ then $\frac{dy}{dx} = anx^{n-1}$,
so $\frac{dy}{dx} = 2 \times 3x^{3-1} = 6x^2$.
So the slope of the curve at x = 3 is: $\frac{dy}{dx} = 6 \times 3^2 = 54$.
2. Differentiate the function $y = x^3 - 2x^2 + 15x + 76$.
Recall: if the function to be differentiated contains the sum of a number of terms, the derivative is the
sum of the derivatives of the individual terms.
Hence:
$$\frac{dy}{dx} = 1 \times 3x^{3-1} - 2 \times 2x^{2-1} + 15 \times 1x^{1-1} + 0 = 3x^2 - 4x + 15$$
3. Differentiate the function $y = \ln(x) + e^{3x}$:
$$\frac{dy}{dx} = \frac{1}{x} + 3e^{3x}$$
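The three results above can be checked numerically against the limit definition of the derivative. The sketch below evaluates the difference quotient $[f(x + h) - f(x)]/h$ for a small h and compares it with the derivative given by the rules. The function name `difference_quotient`, the step size and the evaluation points for examples 2 and 3 are illustrative assumptions.

```python
import math

def difference_quotient(f, x, h=1e-6):
    """Gradient of the chord from x to x + h; approaches dy/dx as h tends to zero."""
    return (f(x + h) - f(x)) / h

# (function, derivative from the rules, point at which to compare)
examples = [
    (lambda x: 2 * x**3, lambda x: 6 * x**2, 3.0),
    (lambda x: x**3 - 2 * x**2 + 15 * x + 76, lambda x: 3 * x**2 - 4 * x + 15, 2.0),
    (lambda x: math.log(x) + math.exp(3 * x), lambda x: 1 / x + 3 * math.exp(3 * x), 1.0),
]

for f, rule, x in examples:
    approx = difference_quotient(f, x)
    exact = rule(x)
    print(round(approx, 3), round(exact, 3))
```

The two columns should agree closely; the small remaining gap comes from h being small but not zero.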
Second & higher order derivatives
Recall that the rate of change of distance with respect to time is called speed. It can be found by
calculating the derivative of the distance covered with respect to time. Similarly, the acceleration is the
rate of change of speed with respect to time. The acceleration is called the second derivative of
distance with respect to time.
In general, the second derivative gives the rate of change of the first derivative, i.e. the rate by which
the rate of change - and hence the slope of the original function - itself changes.
The second derivative is denoted by $\frac{d^2y}{dx^2}$, the third derivative is denoted by $\frac{d^3y}{dx^3}$, and they are
calculated using the same rules as for the first derivative.
Example
Find the second and third derivatives of the function $y = x^3 - 2x^2 + 15x + 76$ with respect to x.
First derivative (see above): $\frac{dy}{dx} = 3x^2 - 4x + 15$
Second derivative: $\frac{d^2y}{dx^2} = 3 \times 2x^{2-1} - 4 = 6x - 4$
Third derivative: $\frac{d^3y}{dx^3} = 6$
Note that for polynomials the degree, i.e. the highest exponent, decreases by one for each
derivative that is calculated. For example, a quadratic function like $y = 2x^2$ has a linear first derivative (4x) and a
constant second derivative (4).
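Repeated differentiation of a polynomial can be sketched by applying the power rule to a list of coefficients. The representation below (position k holds the coefficient of $x^k$) and the function name `poly_derivative` are illustrative assumptions.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] meaning c0 + c1*x + c2*x^2 + ..."""
    # Power rule: c * x^k becomes k * c * x^(k-1); the constant term drops out.
    return [k * c for k, c in enumerate(coeffs)][1:]

y = [76, 15, -2, 1]          # 76 + 15x - 2x^2 + x^3
d1 = poly_derivative(y)      # coefficients of 15 - 4x + 3x^2
d2 = poly_derivative(d1)     # coefficients of -4 + 6x
d3 = poly_derivative(d2)     # the constant 6
print(d1, d2, d3)
```

Each call shortens the list by one entry, which is the "degree decreases by one" observation in code form.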
Optimisation using Differentiation
In Business and Management, we are often interested in optimising, i.e. in maximising or minimising
some objective, as for example:
Maximising profit
Maximising revenue
Minimising cost (e.g., when producing a fixed amount of goods)
Minimising risk
Hence, it is important to be able to find the output which maximises profit or the sales price which
maximises revenue. This can be done by using differentiation.
Maximum and minimum values of a function occur when the slope of the respective curve is zero. That
is, the first derivative of the function is zero at the point where the function has a local maximum or
minimum:
$$\frac{dy}{dx} = 0$$
Moreover, the direction of the change in slope indicates whether the respective point is a maximum or
minimum point. That is, the sign of the second derivative at the turning point indicates whether there is
a maximum or a minimum. If the slope decreases, i.e. the second derivative is negative, then the
function has a maximum at this point; if the slope increases, i.e. the second derivative is positive, then
there is a minimum.
So the location and type of the turning point can be determined by calculating the first derivative,
setting it equal to zero, and then checking for the sign of the second derivative.
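Local maxima and minima can be illustrated as follows:
[Figure, first panel: Maximum. The slope goes from positive, through zero, to negative (decreasing slope), so $\frac{d^2y}{dx^2} < 0$.]
[Figure, second panel: Minimum. The slope goes from negative, through zero, to positive (increasing slope), so $\frac{d^2y}{dx^2} > 0$.]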
Example
Find the turning points of the function $y = x^3 - 6x^2 + 9x + 6$.
$$\frac{dy}{dx} = 3x^2 - 12x + 9$$
For such a point, we must have:
$$0 = 3x^2 - 12x + 9$$
Solve the quadratic equation by using the appropriate formula (see Quantitative Techniques):
$$x = \frac{12 \pm \sqrt{12^2 - 4 \times 3 \times 9}}{2 \times 3} = \frac{12 \pm \sqrt{144 - 108}}{6} = \frac{12 \pm \sqrt{36}}{6} = \frac{12 \pm 6}{6}$$
There are two turning points: x = 1 and x = 3.
Differentiating $\frac{dy}{dx} = 3x^2 - 12x + 9$, i.e. calculating the second derivative, gives
$\frac{d^2y}{dx^2} = 6x - 12$
When x = 1, $\frac{d^2y}{dx^2} = 6 - 12 = -6 < 0$, and there is a local maximum at x = 1.
When x = 3, $\frac{d^2y}{dx^2} = 18 - 12 = +6 > 0$, and there is a local minimum at x = 3.
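The procedure used in this example (set the first derivative to zero, then check the sign of the second derivative) can be sketched in a few lines of Python. This is only an illustration for cubic functions; the function name `turning_points` and its argument order are assumptions made for this sketch, not part of the course material.

```python
import math

def turning_points(a, b, c, d):
    """Turning points of y = a*x**3 + b*x**2 + c*x + d (assumes a != 0)."""
    # First derivative: dy/dx = 3a*x^2 + 2b*x + c. Solve dy/dx = 0.
    qa, qb, qc = 3 * a, 2 * b, c
    disc = qb * qb - 4 * qa * qc
    if disc <= 0:
        return []  # no turning point (at most a stationary point of inflexion)
    roots = sorted((-qb + s * math.sqrt(disc)) / (2 * qa) for s in (-1, 1))
    result = []
    for x in roots:
        second = 6 * a * x + 2 * b  # second derivative d2y/dx2 at x
        kind = "maximum" if second < 0 else "minimum"
        y = a * x**3 + b * x**2 + c * x + d
        result.append((x, y, kind))
    return result

# The example from the text: y = x^3 - 6x^2 + 9x + 6
points = turning_points(1, -6, 9, 6)
for x, y, kind in points:
    print(f"x = {x:g}, y = {y:g}: local {kind}")
```

For the example function this reports a local maximum at x = 1 and a local minimum at x = 3, in line with the calculation by hand.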
Points of Inflexion
It is possible for a stationary point, i.e. a point at which the slope (and hence the first
derivative) is zero, to be neither a maximum nor a minimum.
In these cases, the (positive) slope decreases to zero and then increases again or (in the case of a
negative slope) it increases to zero and then decreases again.
Such points are called points of inflexion. At such a point, the first and the second derivative both take
the value zero.
[Figure: a point of inflexion. The slope decreases, is zero at the point of inflexion, and then increases again.]
Stock Control Example
There is a constant demand for an item of D units per month. It costs $H_c$ per unit per month to hold the
item in stock, and it costs $R_c$ in administration charges each time an order is placed for a new delivery
of the item. If there are no time delays between re-ordering and the delivery of new stock,
and shortages are not allowed, find the most economic order quantity.
If the item is ordered in batches of q items, then D/q orders will be placed each month, at a cost of
$R_c D/q$.
The average amount of stock will be q/2 and so the holding costs each month will be $H_c q/2$.
So the total costs per month are $T_c(q) = R_c D/q + H_c q/2$.
Differentiating and setting the derivative equal to zero gives
$\frac{dT_c(q)}{dq} = -R_c D q^{-2} + \frac{H_c}{2} = 0$
So there are stationary points when $R_c D q^{-2} = \frac{H_c}{2}$, that is $q = \pm\sqrt{\frac{2 R_c D}{H_c}}$.
(This can be found by reformulating the above equation in standard format and applying the formula
for solving quadratic equations.)
The second derivative is
$\frac{d^2 T_c(q)}{dq^2} = 2 R_c D q^{-3} > 0$ for q > 0, as $R_c$ and D are positive.
Clearly negative orders do not have a practical interpretation, and so the minimum of monthly total cost
is reached when $q = \sqrt{\frac{2 R_c D}{H_c}}$. This is called the Economic Order Quantity (EOQ).
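The EOQ formula can be tried out numerically. The following sketch uses hypothetical figures (D = 200 units per month, an ordering cost of 50 per order and a holding cost of 2 per unit per month); neither the figures nor the function names come from the course notes.

```python
import math

def total_cost(q, demand, order_cost, holding_cost):
    """Monthly cost T_c(q) = R_c*D/q + H_c*q/2."""
    return order_cost * demand / q + holding_cost * q / 2

def eoq(demand, order_cost, holding_cost):
    """Economic order quantity q = sqrt(2*R_c*D/H_c)."""
    return math.sqrt(2 * order_cost * demand / holding_cost)

# Hypothetical data, chosen only for illustration.
D, R_c, H_c = 200, 50, 2
q_star = eoq(D, R_c, H_c)
print("EOQ:", q_star, "monthly cost:", total_cost(q_star, D, R_c, H_c))

# Ordering slightly less or slightly more than the EOQ costs more per month.
print(total_cost(90, D, R_c, H_c), total_cost(110, D, R_c, H_c))
```

With these figures the order quantity is 100 units. At that quantity the monthly ordering cost and the monthly holding cost are equal, which is a general property of the EOQ.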
Partial derivatives
When a function varies with more than one variable, the derivative of the function with respect to one
variable can be found holding the other variable(s) constant; this is known as the partial derivative.
Functions in more than one variable occur, e.g., when modelling the profit or the production of multi-
product firms. In these cases, we need to introduce one independent variable for the quantity of each
of the products.
Example
$f(x,y) = 2x + 3xy - y^2$
When differentiating with respect to x we hold y constant, so
$f(x,y) = 2x + Ax + B$, where $A = 3y$ and $B = -y^2$.
Using the rules of differentiation gives
$\frac{\partial f(x,y)}{\partial x} = 2 + 3y$
Note that a $\partial$ (curly d) is used instead of a d to indicate that a partial derivative has been found.
Similarly, holding x constant and differentiating with respect to y gives
$\frac{\partial f(x,y)}{\partial y} = 3x - 2y$.
Partial derivatives can be used to find the stationary points of functions of more than one variable
(the candidates for maxima and minima), by setting the partial derivatives with respect to each of the
variables equal to zero.

However, the rules for identifying the stationary points for functions of more than one variable are
more complicated than for a single variable and are beyond the scope of this course.
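The two partial derivatives of the example function can be checked numerically, and the point at which both are zero can be computed directly. The sketch below is only an illustration: the helper `central_difference`, the step size h and the test point are assumptions made here, and no attempt is made to decide what type of stationary point is found.

```python
def f(x, y):
    return 2 * x + 3 * x * y - y ** 2

def df_dx(x, y):
    return 2 + 3 * y        # partial derivative with respect to x (y held constant)

def df_dy(x, y):
    return 3 * x - 2 * y    # partial derivative with respect to y (x held constant)

def central_difference(func, x, y, wrt, h=1e-6):
    """Numerical estimate of a partial derivative; h is an arbitrary small step."""
    if wrt == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

# Compare the formulas with numerical estimates at an arbitrary test point.
x0, y0 = 1.5, -2.0
print(df_dx(x0, y0), central_difference(f, x0, y0, "x"))
print(df_dy(x0, y0), central_difference(f, x0, y0, "y"))

# Stationary point: 2 + 3y = 0 gives y = -2/3, then 3x - 2y = 0 gives x = 2y/3.
y_s = -2 / 3
x_s = 2 * y_s / 3
print(x_s, y_s, df_dx(x_s, y_s), df_dy(x_s, y_s))
```

Both partial derivatives vanish (up to rounding) at x = -4/9, y = -2/3, so this is the only stationary point of the example function.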
The Transportation Problem
When a company can supply its customers from a number of warehouses (or factories) it may wish to
find a method of doing so which minimises transportation costs, while not exceeding the capacities of
the individual warehouses. We will make the following simplifying assumptions:
1. Items are shipped individually, that is there are no savings (or extra costs) per item when several
items are shipped from the same supplier to the same destination.
2. The items are identical, that is the customer has no preference as to which warehouse(s) the items are
shipped from.
3. No shipping from warehouse to warehouse, or to a customer via another customer.
4. Multiple deliveries are acceptable, i.e. a customer can receive the items he wants from more than
one warehouse.
5. The scheduling of deliveries will not impose any further constraints.
Some Definitions
Feasible Solution: This is a solution in which all the constraints are met. That is, all the demands are
exactly met and all the supplies are used.
Basic Solution: This is a solution in which exactly one less route is used than the number of suppliers,
n, plus the number of destinations, m, that is (n + m - 1) routes. The optimum solution will never
require more routes.
Basic Feasible Solution: This is a solution that is both basic and feasible.
Degeneracy: This is when we have a solution that involves fewer routes than the basic n + m - 1. It is
normally represented by using n + m - 1 routes but carrying zero units on some of them.
Solution Method
1. Balance supply and demand using a dummy destination.
2. Find a basic feasible solution (e.g. Least Cost First).
3. Check if the solution can be improved (e.g. Stepping Stone Algorithm).
4. If an improvement is possible, make it and repeat from step 3.
Example
Duff Machine Tools manufacture machine tools at three factories. The Leeds factory can produce 15
machines each month, the Manchester factory 20 machines and the Nuneaton factory 10 machines.
In a particular month, there are no machines in stock. Customers in Aston must be supplied with 7
machines, and customers in Bradford and Cardiff must be supplied with 17 machines each.
The transportation costs (in £) for each machine are given below:
              Aston  Bradford  Cardiff
Leeds           10       1       20
Manchester      12       7        9
Nuneaton         2      14       16
1. Balance supply and demand using a dummy destination.
The total number demanded by all customers is 41 machines and there is a supply of 45 machines so
we introduce a dummy customer with a demand of 4 machines and zero transportation cost to represent
the unused supply.
The problem then can be represented in a special tableau form which is shown below. The factories or
warehouses are put into the rows, the customers are represented by the columns of the tableau. The
amounts available or required are written next to the respective row or column. The costs for
transporting an item (here: one machine) from an origin to a destination can be found in the upper right
corner of each cell.

Tableau (suppliers in rows, destinations in columns; each cell shows the unit transportation cost):
Supplier      Aston  Bradford  Cardiff  Dummy  Available
Leeds           10       1       20       0       15
Manchester      12       7        9       0       20
Nuneaton         2      14       16       0       10
Required         7      17       17       4
2. Find a basic feasible solution (using Least Cost First)
1 Assign as much as possible to the real route with smallest unit cost. (With ties pick one at random.)
2 Cross out the row or column which is now satisfied. (If both are satisfied only cross out one of them.)
3 Recalculate the supply and demand for the remaining rows and columns.
4 Repeat from step 1 until only the dummy column remains.
5 Assign the remaining variables with the appropriate amounts.
Note that the dummy column is only used in this process after the other columns are already satisfied!

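Starting solution found by Least Cost First (each cell shows the unit cost, followed in brackets by
the number of machines allocated to that route):
Supplier      Aston     Bradford   Cardiff    Dummy     Available
Leeds         10        1 (15)     20         0         15
Manchester    12        7 (2)      9 (17)     0 (1)     20
Nuneaton      2 (7)     14         16         0 (3)     10
Required      7         17         17         4
The allocations are made in the order Leeds-Bradford 15, Nuneaton-Aston 7, Manchester-Bradford 2 and
Manchester-Cardiff 17. The unused supply (1 machine at Manchester, 3 at Nuneaton) is then assigned to
the dummy column.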
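The Least Cost First steps can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name, the `dummy_col` parameter and the rule for breaking ties (lowest row and column index instead of a random pick) are assumptions made for this sketch.

```python
def least_cost_first(costs, supply, demand, dummy_col=None):
    """Greedy starting allocation; returns {(row, col): units}."""
    supply, demand = list(supply), list(demand)
    open_rows = set(range(len(supply)))
    open_cols = set(range(len(demand)))
    allocation = {}
    while open_rows and open_cols:
        # Real routes first; the dummy column is used only when no real column is open.
        real = [(r, c) for r in open_rows for c in open_cols if c != dummy_col]
        candidates = real or [(r, dummy_col) for r in open_rows]
        r, c = min(candidates, key=lambda rc: (costs[rc[0]][rc[1]], rc))
        units = min(supply[r], demand[c])
        allocation[(r, c)] = units
        supply[r] -= units
        demand[c] -= units
        # Cross out exactly one line: the satisfied column, otherwise the exhausted row.
        if demand[c] == 0:
            open_cols.discard(c)
        else:
            open_rows.discard(r)
    return allocation

costs = [[10, 1, 20, 0],   # Leeds
         [12, 7, 9, 0],    # Manchester
         [2, 14, 16, 0]]   # Nuneaton; columns: Aston, Bradford, Cardiff, Dummy
start = least_cost_first(costs, [15, 20, 10], [7, 17, 17, 4], dummy_col=3)
total = sum(costs[r][c] * units for (r, c), units in start.items())
print(start)
print("total cost:", total)
```

For the Duff Machine Tools data the sketch uses six routes (n + m - 1 = 3 + 4 - 1), so the starting solution is basic as well as feasible.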
3. Check if the solution can be improved (Stepping Stone Algorithm)
The idea of the Stepping Stone Procedure is to exchange a route which is currently used for a route
which is currently not used. The exchange is made only if taking up the new route gives a saving in
cost. Therefore, it first needs to be checked for each unused route whether (in terms of cost) it is
desirable to use it. In order to keep the solution feasible, the units which are to be transported through the
previously unused route need to be reallocated such that each customer still gets what he wants and
each warehouse still delivers the amount it has.
Stepping Stone Procedure:
Starting with any basic feasible solution, consider each unused route in turn:
I Assign one item to this route. Indicate this with a +
II Compensate for this item by adjusting the numbers assigned to other used routes to ensure demand
is met and supply is not exceeded. Note the increases with + and the decreases with -
III Calculate the net change in cost, by adding the cost of the routes marked with a + and subtracting
the cost of the routes marked with a -.
If no unused route gives a negative total then the current solution is optimal.
Otherwise: allocate as many units as possible to the route giving the greatest saving, making the
compensations indicated by the +s and -s.
Then repeat the stepping stone method (i.e. steps I to III) until the optimal routes are found.
Note that you need to find a circle of routes (tableau cells) which starts at the unused route (empty
cell) you are considering, and otherwise consists only of used routes, i.e. of tableau cells which
currently have an entry. That is, you go from the empty cell to a cell with an entry in the same row, then to
a cell with an entry in the same column as the cell you are coming from, and so on, until you are back at the
empty cell you started with. The circles for the different unused cells should be marked with
different letters, i.e. A for the first unused cell, B for the second, etc.
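Finding the circle for an unused cell and adding up the costs along it can be sketched as follows. The sketch is an assumption-laden illustration: `find_loop` and `net_change` are hypothetical helper names, the search is a simple depth-first search, and the data are the costs and the Least Cost First starting solution of the Duff Machine Tools example (rows Leeds, Manchester, Nuneaton; columns Aston, Bradford, Cardiff, Dummy).

```python
def find_loop(used, start):
    """Closed loop from the unused cell `start` through used cells,
    alternating row moves and column moves; None if no loop exists."""
    def extend(path, along_row):
        r, c = path[-1]
        if not along_row and len(path) >= 4 and c == start[1]:
            return path  # a column move leads back to `start`
        for cell in sorted(used):
            if cell in path:
                continue
            in_line = cell[0] == r if along_row else cell[1] == c
            if in_line:
                loop = extend(path + [cell], not along_row)
                if loop:
                    return loop
        return None
    return extend([start], True)

def net_change(costs, loop):
    """Add the costs of the + cells (even positions), subtract those of the - cells."""
    return sum(costs[r][c] * (1 if i % 2 == 0 else -1)
               for i, (r, c) in enumerate(loop))

costs = [[10, 1, 20, 0],
         [12, 7, 9, 0],
         [2, 14, 16, 0]]
used = {(0, 1), (1, 1), (1, 2), (1, 3), (2, 0), (2, 3)}  # starting solution

changes = {}
for r in range(3):
    for c in range(4):
        if (r, c) not in used:
            changes[(r, c)] = net_change(costs, find_loop(used, (r, c)))
print(changes)
optimal = all(change >= 0 for change in changes.values())
print("optimal:", optimal)
```

No unused route gives a negative net change for this starting solution, so it is already optimal. For a degenerate solution the search may find no circle and return None, which this sketch does not handle.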

Stepping stone check of the starting solution (each cell shows the unit cost; used routes show the
number of machines in brackets, unused routes are labelled A to F):
Supplier      Aston      Bradford   Cardiff    Dummy     Available
Leeds         10 [A]     1 (15)     20 [B]     0 [C]     15
Manchester    12 [D]     7 (2)      9 (17)     0 (1)     20
Nuneaton      2 (7)      14 [E]     16 [F]     0 (3)     10
Required      7          17         17         4
Net change in cost for each unused route:
A (Leeds-Aston): 10 - 1 + 7 - 0 + 0 - 2 = +14
B (Leeds-Cardiff): 20 - 9 + 7 - 1 = +17
C (Leeds-Dummy): 0 - 0 + 7 - 1 = +6
D (Manchester-Aston): 12 - 2 + 0 - 0 = +10
E (Nuneaton-Bradford): 14 - 0 + 0 - 7 = +7
F (Nuneaton-Cardiff): 16 - 0 + 0 - 9 = +7
Hence these routes are optimal as no costs can be saved.
The cheapest solution, costing £196, is to supply all 7 to Aston from Nuneaton, to supply Bradford with
15 from Leeds and 2 from Manchester, and supply all 17 to Cardiff from Manchester. This will leave 1
machine in Manchester and 3 in Nuneaton.
Stepping Stones: A worked example including improvement step
Task
A car manufacturer has stocks of new cars (one particular model) available as follows: 20 at Liverpool,
25 at Birmingham and 15 at Oxford. He has orders from dealers for 12 cars in Bristol, 25 in London, 13
in Manchester and 10 in Birmingham. How should he arrange the deliveries (each car is driven
separately) if the cost of a delivery from each factory to each dealer is as follows:
              Dealer
Factory       Bristol  London  Manchester  Birmingham
Liverpool       27       31       14          20
Birmingham      19       22       19          10
Oxford          19       16       25          16
Solution
First, a feasible starting solution has to be found by using the Least Cost First Method. This solution is
given in the tableau below.
Then, choose the first route that is not currently used and put a plus sign in that square.
In this case it is the route from Liverpool to Bristol, so we put A+ in that square (the A just means that
we do not have to redraw our table when we consider the second route).

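Tableau (each cell shows the unit cost, then in round brackets the cars currently sent and in square
brackets any stepping-stone marker):
Supplier      Bristol      London     Manchester  Birmingham  Available
Liverpool     27 [A+]      31 (7)     14 (13)     20          20
Birmingham    19 (12)      22 (3)     19          10 (10)     25
Oxford        19           16 (15)    25          16          15
Required      12           25         13          10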
This plus signifies that we are intending to increase the number of items transported on this route. In
order to be able to do this, two things must be true:
i) Less must go to the respective customer from somewhere else, i.e., the customers in Bristol
still should get the 12 they want in total, and not more.
ii) Less must go from this supplier, i.e. Liverpool, to another destination so that we actually have
at least one car to send to Bristol. (In this case 7 cars were going to London and 13 to
Manchester accounting for all 20 cars available at Liverpool.)
Taking the first of these we note that Bristol must receive less from Birmingham as we are currently
supplying all 12 from there. So we put a minus sign here to signify for every car we send from
Liverpool to Bristol we send one less from Birmingham to Bristol.

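Tableau (each cell shows the unit cost, then in round brackets the cars currently sent and in square
brackets any stepping-stone marker):
Supplier      Bristol        London     Manchester  Birmingham  Available
Liverpool     27 [A+]        31 (7)     14 (13)     20          20
Birmingham    19 (12) [A-]   22 (3)     19          10 (10)     25
Oxford        19             16 (15)    25          16          15
Required      12             25         13          10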
Now, because supply and demand exactly match, we know that every car we do not send from
Birmingham to Bristol must be going to another customer. Because we only consider introducing one
new route at a time, it must go to either the customer in London or the customer in Birmingham as
these are the only other customers supplied from Birmingham (at the moment, 3 cars go to London and
10 go to Birmingham respectively). Now if we sent it to Birmingham the customers in Birmingham
would receive more than the 10 they want, and because they get them all from Birmingham we can not
reduce the number being supplied from elsewhere to compensate. So, the cars which were going from
Birmingham to Bristol must now go to London, hence we put a plus sign on the Birmingham to
London route.

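Tableau (each cell shows the unit cost, then in round brackets the cars currently sent and in square
brackets any stepping-stone marker):
Supplier      Bristol        London        Manchester  Birmingham  Available
Liverpool     27 [A+]        31 (7)        14 (13)     20          20
Birmingham    19 (12) [A-]   22 (3) [A+]   19          10 (10)     25
Oxford        19             16 (15)       25          16          15
Required      12             25            13          10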
Now note that, both for London to get exactly the 25 cars they want and for us not to send more than 20
from Liverpool, we must put a minus on the route Liverpool to London.
We have now found that any cars we send from Liverpool to Bristol are ones that previously we would
have sent from Liverpool to London.

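Tableau (each cell shows the unit cost, then in round brackets the cars currently sent and in square
brackets any stepping-stone marker):
Supplier      Bristol        London        Manchester  Birmingham  Available
Liverpool     27 [A+]        31 (7) [A-]   14 (13)     20          20
Birmingham    19 (12) [A-]   22 (3) [A+]   19          10 (10)     25
Oxford        19             16 (15)       25          16          15
Required      12             25            13          10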
Now we have come back to where we started and found the only possible set of changes which
introduces only the route from Liverpool to Bristol and which ensures that each customer gets exactly
the amount he wants without sending more than we have from any supplier. We now calculate how
much the transportation cost will change for each car sent on this route.
It costs £27 for each car we send from Liverpool to Bristol, but we will save £19 for each car not sent
from Birmingham to Bristol; it will cost £22 each to send a car from Birmingham to London and we will
save £31 for each that we do not send from Liverpool to London. This gives a net change of
27 - 19 + 22 - 31 = -1, that is, we save £1 for each car we send on the new route. (Just add the costs
of the routes with plus signs and subtract the costs of the routes with minus signs.)
As this route saves us money we should send on it as many cars as we can. (Before we actually do that
we have to check that currently there is no other route on which we could save more money per car.
This can be checked by applying the very same procedure to the other five unused routes. If you do
that, you should end up with positive net changes for all other routes, so the only possible improvement
is to introduce the route Liverpool-Bristol into the transportation plan!) As for every item we send on
this route we send one less on the routes with minus signs, the most that can be sent is determined by
the smaller of the numbers using these routes (in this case the smaller of 7 and 12). This is because you
cannot send a negative number of items on a route!
So our improved set of routes is determined by increasing the numbers going on routes marked with
plus signs by 7 and reducing the numbers going on routes with minus signs by 7. At this point it is
worth drawing a new table (but not before):

Improved tableau (each cell shows the unit cost, then in brackets the number of cars sent):
Supplier      Bristol    London     Manchester  Birmingham  Available
Liverpool     27 (7)     31         14 (13)     20          20
Birmingham    19 (5)     22 (10)    19          10 (10)     25
Oxford        19         16 (15)    25          16          15
Required      12         25         13          10
You now apply the same process all over again for each unused route, checking if further
improvements can be made. In this case you will find that no further improvements can be found, as all
net costs are positive, see below.

Tableau (each cell shows the unit cost; used routes show the number of cars in brackets, unused routes
are labelled A to F):
Supplier      Bristol    London     Manchester  Birmingham  Available
Liverpool     27 (7)     31 [A]     14 (13)     20 [B]      20
Birmingham    19 (5)     22 (10)    19 [C]      10 (10)     25
Oxford        19 [D]     16 (15)    25 [E]      16 [F]      15
Required      12         25         13          10
Route                      Net change in cost
A  Liverpool-London        31 - 27 + 19 - 22 = +1              More, so do not use it
B  Liverpool-Birmingham    20 - 27 + 19 - 10 = +2              More, so do not use it
C  Birmingham-Manchester   19 - 14 + 27 - 19 = +13             More, so do not use it
D  Oxford-Bristol          19 - 19 + 22 - 16 = +6              More, so do not use it
E  Oxford-Manchester       25 - 14 + 27 - 19 + 22 - 16 = +25   More, so do not use it
F  Oxford-Birmingham       16 - 10 + 22 - 16 = +12             More, so do not use it
Introduction of any of the unused routes will increase costs, so the last tableau must represent the
cheapest way of transportation: Liverpool to Bristol 7 cars, Liverpool to Manchester 13, Birmingham
to Bristol 5, Birmingham to London 10, Birmingham to Birmingham 10, Oxford to London 15 cars.
The total cost is £1,026.
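The improvement step of this worked example can be retraced in code. The sketch below takes the circle for the Liverpool-Bristol route from the text rather than searching for it; `total_cost` and `shift_along_loop` are hypothetical helper names, and routes left with zero cars are simply dropped (a degenerate solution would need more care).

```python
costs = {
    ("Liverpool", "Bristol"): 27, ("Liverpool", "London"): 31,
    ("Liverpool", "Manchester"): 14, ("Liverpool", "Birmingham"): 20,
    ("Birmingham", "Bristol"): 19, ("Birmingham", "London"): 22,
    ("Birmingham", "Manchester"): 19, ("Birmingham", "Birmingham"): 10,
    ("Oxford", "Bristol"): 19, ("Oxford", "London"): 16,
    ("Oxford", "Manchester"): 25, ("Oxford", "Birmingham"): 16,
}

def total_cost(plan):
    return sum(costs[route] * cars for route, cars in plan.items())

def shift_along_loop(plan, loop):
    """`loop` lists the routes of a circle in order, new route first: +, -, +, - ..."""
    minus_routes = loop[1::2]
    cars = min(plan[route] for route in minus_routes)  # no route may go below zero
    new_plan = dict(plan)
    for i, route in enumerate(loop):
        new_plan[route] = new_plan.get(route, 0) + (cars if i % 2 == 0 else -cars)
    return {route: n for route, n in new_plan.items() if n > 0}, cars

# Starting solution found by Least Cost First.
start = {
    ("Liverpool", "London"): 7, ("Liverpool", "Manchester"): 13,
    ("Birmingham", "Bristol"): 12, ("Birmingham", "London"): 3,
    ("Birmingham", "Birmingham"): 10, ("Oxford", "London"): 15,
}
loop_a = [("Liverpool", "Bristol"), ("Birmingham", "Bristol"),
          ("Birmingham", "London"), ("Liverpool", "London")]

improved, moved = shift_along_loop(start, loop_a)
print(total_cost(start), moved, total_cost(improved))
```

The starting solution costs £1,033. Moving 7 cars round the circle saves £1 per car and gives the final cost of £1,026.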
Linear Programming
Linear Programming (LP) is concerned with looking for the best solutions to problems given certain
constraints. The decision maker has control over a number of factors which are represented by
continuous variables, and the objective is to maximise (or minimise) a function of these decision
variables, without exceeding a set of bounds defined by linear constraints.
Formulate a problem as a LP model
1. Define decision variables. 2. Define the objective function. 3. Specify the constraints.
Find the solution
If there are only 2 variables a graphical method can be used, otherwise a specialised computer package
such as LINDO can be used (or the solver within Excel).
Example 1: The farmer's problem
A farmer has 100 spare hectares of land, which can be used to plant either wheat or potatoes (or
neither). Wheat gives a profit of £90 per hectare and potatoes give a profit of £60 per hectare.
EU regulations limit the amount of potatoes planted to 65 hectares at most.
There will be only 480 person-hours available to harvest the crop. A hectare of wheat takes 6
person-hours to harvest, while a hectare of potatoes takes only 3 person-hours. What should the farmer do?
1. Define decision variables
The farmer has control over the number of hectares of wheat and/or potatoes that are planted.
Hence the decision variables are:
W the number of hectares of wheat planted,
P the number of hectares of potatoes planted.
2. Define the objective function
Objectives in Linear Programming can be either to maximise a linear function of the decision variables
or to minimise a linear function of the decision variables.
Wheat generates a profit of £90 per hectare and potatoes £60 per hectare. So the profit from planting W
hectares of wheat is 90W and the profit from planting P hectares of potatoes is 60P.
Hence, the total profit is given by the following linear function of W and P: Profit = 90 W + 60 P.
So the objective is to maximise profit (in £s):
Maximise Profit = 90 W + 60 P
3. Specify the constraints
Constraints in LP are linear equalities or inequalities. This means, a linear function of the variables is
either equal to, less than or equal to, or greater than or equal to some value.
The farmer has 100 spare hectares of land. The total amount of land planted must be less than or equal
to the available space, so: W + P <= 100
EU regulations limit the amount of potatoes planted to at most 65 hectares, so: P <= 65
There will be only 480 person-hours available to harvest the crop. A hectare of wheat takes 6 hours to
harvest, while a hectare of potatoes takes only 3 hours. So the time needed to harvest W hectares of
wheat is 6W and the time needed to harvest P hectares of potatoes is 3P. Therefore: 6W + 3P <= 480.
Finally, there are the non-negativity constraints. It is not possible to plant a negative amount, so
P >= 0 and W >= 0.
The Linear Programming model of the farmer's problem (completed)
Let W be the number of hectares of wheat planted and P be the number of hectares of potatoes planted.
Maximise Profit = 90 W + 60 P
Subject to the following constraints:
Land Constraint            W + P <= 100
EU Constraint              P <= 65
Harvest Time Constraint    6W + 3P <= 480
Non-negativity             P >= 0 and W >= 0
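Because the optimum of an LP model with two variables lies at a corner of the feasible region, the completed model can be checked by listing the corners and comparing the profit at each of them. The following Python sketch does this for the farmer's problem. It is an illustration only: the way the constraints are encoded, the function names and the tolerance are assumptions made here.

```python
from itertools import combinations

# Each constraint is a*W + b*P <= rhs; non-negativity is written as -W <= 0 and -P <= 0.
constraints = [
    (1, 1, 100),   # land
    (0, 1, 65),    # EU limit on potatoes
    (6, 3, 480),   # harvest time
    (-1, 0, 0),    # W >= 0
    (0, -1, 0),    # P >= 0
]

def profit(w, p):
    return 90 * w + 60 * p

def corner_points(cons, tol=1e-9):
    """Intersections of pairs of constraint lines that satisfy every constraint."""
    corners = []
    for (a1, b1, r1), (a2, b2, r2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines never cross
        w = (r1 * b2 - r2 * b1) / det  # Cramer's rule
        p = (a1 * r2 - a2 * r1) / det
        if all(a * w + b * p <= r + tol for a, b, r in cons):
            corners.append((w, p))
    return corners

corners = corner_points(constraints)
best = max(corners, key=lambda wp: profit(*wp))
for w, p in corners:
    print(f"W = {w:g}, P = {p:g}, profit = {profit(w, p):g}")
print("best corner:", best)
```

The feasible region has five corners, and the largest profit (7800) is found where the land and harvest time constraints cross, at W = 60 and P = 40.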
Graphical Solution
Plot the constraints on a graph:
Land Constraint            W + P <= 100
EU Constraint              P <= 65
Harvest Time Constraint    6W + 3P <= 480
Non-negativity             P >= 0 and W >= 0
[Figure: graph with axes W and P. The land constraint line runs from W = 100 to P = 100, the harvest
time constraint line runs from W = 80 to P = 160, and the EU constraint is the line P = 65. The
feasible region is the area that satisfies all constraints.]
Plot a trial objective function on the graph,
e.g. Profit = 5400 = 90 W + 60 P
     Profit = 7200 = 90 W + 60 P
The maximum profit is realised where a line parallel to the trial objective function just touches the
feasible region.
Note that the solution to an LP model will always be on a corner (or edge) of the feasible region.
Hence, an alternative solution method is to find the profit at each corner of the feasible region
and to compare these profits.
So the maximum profit will be where the land constraint crosses the harvest time constraint, that is
where W = 60 and P = 40. This gives a profit of £7800.
The land and harvest time constraints are called binding constraints, because there is no slack or
surplus; we have used all the land and all the available labour.
Using LINDO
Syntax for an LP model to be solved by the LINDO computer package.
1. Start with the objective function. Write MAX (for maximise) or MIN (for minimise) followed by a
linear function.
Max 90W + 60P
2. The end of the objective function and the beginning of the constraints is signified with any of the
following: SUBJECT TO, SUCH THAT, S.T. or ST
3. This is followed by the constraints. You may, optionally, name constraints in a model. Constraint
names make many of LINDO's output reports easier to interpret. Constraint names must follow the
same conventions as variable names. To name a constraint you must start the constraint with its name
terminated with a right parenthesis. After the right parenthesis, you enter the constraint as a linear
function (without any constant terms) followed by either = or >= or <= then a number.
Land) W + P <= 100
EC) P <= 65
Harvest) 6W + 3P <= 480
LINDO will accept < or > in constraints but they will be interpreted as <= and >=.
The end of the constraints is signified with the word END.
4. LINDO has a limit of eight characters in a variable name. Names must begin with an alphabetic
character (A to Z), which may then be followed by up to seven additional characters. These additional
characters may include anything with the exception of the following: ! ) + - = < >. So, as an example,
the following names would be considered valid: XYZ, My_Var, A12, SHIP$LA,
whereas the following would not: THISONESTOOLONG, A-HYPHEN, 1INFRONT
5. LINDO will not accept parentheses as indicators of a preferred order of precedence.
All operations in LINDO are ordered from left to right. Only +, -, and inequalities are allowed.
6. Comments may be placed anywhere in a model. A comment is denoted by an exclamation mark (!).
Anything following the exclamation mark on the current line will be considered a comment.
MAX Profit) 90W + 60P ! Maximise profit (in £)
SUBJECT TO
Land) W + P <= 100 ! There is only a limited area of land (units are Hectares)
EC) P <= 65 ! The Maximum area of Potatoes due to EC regulations (in Hectares)
Harvest) 6W +3P <= 480 ! There is a limited amount of labour at harvest time (in person hours)
End
Output from LINDO

LP OPTIMUM FOUND AT STEP 1
OBJECTIVE FUNCTION VALUE
PROFIT) 7800.000
VARIABLE VALUE REDUCED COST
W 60.000000 0.000000
P 40.000000 0.000000
ROW SLACK OR SURPLUS DUAL PRICES
LAND) 0.000000 30.000000
EC) 25.000000 0.000000
HARVEST) 0.000000 10.000000
NO. ITERATIONS= 1
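The SLACK OR SURPLUS and DUAL PRICES columns can be reproduced by hand. A dual price is the change in the optimal objective value per unit increase in the right-hand side of a constraint. The sketch below assumes that the land and harvest time constraints remain the binding ones, which holds for small changes around the reported optimum; the helper `optimum` is a hypothetical name introduced for this illustration.

```python
def optimum(land=100, harvest=480):
    """Optimal plan while the land and harvest time constraints are binding.

    Solves W + P = land and 6W + 3P = harvest by substitution.
    """
    w = (harvest - 3 * land) / 3
    p = land - w
    return w, p, 90 * w + 60 * p

w, p, profit = optimum()
ec_slack = 65 - p                                  # unused part of the potato limit
land_dual = optimum(land=101)[2] - profit          # extra profit from one more hectare
harvest_dual = optimum(harvest=481)[2] - profit    # extra profit from one more person-hour
print(w, p, profit)
print(ec_slack, land_dual, harvest_dual)
```

Solving the two binding constraints in general gives Profit = 30 x land + 10 x harvest hours, which is where the dual prices of 30 and 10 come from.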
The output is interpreted as follows.
OBJECTIVE FUNCTION VALUE: The optimum value of the objective function. In this case the maximum
profit is £7800.
VALUE: The optimum is achieved with these values of the decision variables: W = 60 hectares and
P = 40 hectares.
REDUCED COST: The reduced costs of an objective function coefficient give the amount by which the
coefficient has to change to let the respective variable take a positive value in the optimal solution.
I.e., the reduced costs express by how much the profit per hectare needs to increase to make production
of a certain sort of crop worthwhile. Variables with positive values have reduced costs of zero, as
production of the respective crop is worthwhile already.
SLACK OR SURPLUS: All the land is used (slack is zero). We have planted only 40 hectares of potatoes,
so another 25 could be planted before the EC directive would be a problem. All the available hours of
labour at harvest time have been used (slack is zero).
DUAL PRICES: The dual price gives the marginal cost / benefit of a change in the right-hand side of the
respective constraint. So if the farmer lost a hectare of land because of a road widening this would
reduce the profit by £30. If, on the other hand, the farmer had a hectare more on which to plant
potatoes or wheat, he would be able to make an extra profit of £30. Therefore, he should be ready to
pay a maximum rent of £30 for an additional hectare of land. If the farmer was able to buy extra
harvest time, he should be ready to pay a maximum amount of £10 per hour (as he can make an extra
profit of £10 per hour).
Example 2: Efficient stock portfolio
A fund manager is planning the investment of up to £600,000 so as to maximise its yield. The
following stocks are considered:
Stock               A    B    C    D    E
Estimated % yield   15   12   7    6    8
Stocks A and B are high risk and can compose at most 40% of the total investment. Also neither of the
investments in A or B can be greater than half of the total invested in stocks C, D and E. Also, the
investment in each of low yield stocks C, D, and E must not exceed the total invested in the other two
low yield stocks. Formulate the problem as an LP model.
1. Define decision variables
Let the amount invested in stock A be A thousand pounds.
Let the amount invested in stock B be B thousand pounds.
Let the amount invested in stock C be C thousand pounds.
Let the amount invested in stock D be D thousand pounds.
Let the amount invested in stock E be E thousand pounds.
2. Define the objective function
Maximise the overall percentage yield, Y = (15A + 12B + 7C + 6D + 8E)/600.
Max Y = 0.025000A + 0.020000B + 0.011667C + 0.010000D + 0.013333E
3. Specify the constraints
Total investment is less than or equal to available capital
A + B + C + D + E <= 600
Stocks A and B are high risk and can compose at most 40% of the total investment
A + B <= 0.40(A+ B+ C+ D+ E)
so 0.60A + 0.60B - 0.40C - 0.40D - 0.40E <= 0
Neither of the investments in A or B can be greater than half of the total invested in stocks C, D and E.
A <= 0.5(C + D + E) and B <= 0.5(C + D + E)
A - 0.5C - 0.5D - 0.5E <= 0 and B - 0.5C - 0.5D - 0.5E <= 0
Also, the investment in each of low yield stocks C, D, and E must not exceed the total invested in the
other two low yield stocks.
C <= D + E and D <= C + E and E <= C + D
so, C - D - E <=0 and D - C - E <=0 and E - C - D <=0
Non negativity constraints
A >= 0, B >= 0, C >= 0, D >= 0, E >= 0.
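A formulation like this can be sanity-checked by testing candidate portfolios against the constraints. The sketch below does not solve the LP. It only reports which constraints a given portfolio breaks and what yield it gives. The two portfolios are hypothetical illustrations (neither is claimed to be optimal), and the function names are assumptions made here.

```python
YIELDS = {"A": 15, "B": 12, "C": 7, "D": 6, "E": 8}  # estimated % yield

def violated_constraints(x, capital=600):
    """Names of the constraints that portfolio x (in thousand pounds) breaks."""
    low = x["C"] + x["D"] + x["E"]
    total = x["A"] + x["B"] + low
    checks = {
        "capital": total <= capital,
        "high risk <= 40%": x["A"] + x["B"] <= 0.40 * total,
        "A <= half of low yield": x["A"] <= 0.5 * low,
        "B <= half of low yield": x["B"] <= 0.5 * low,
        "C <= D + E": x["C"] <= x["D"] + x["E"],
        "D <= C + E": x["D"] <= x["C"] + x["E"],
        "E <= C + D": x["E"] <= x["C"] + x["D"],
        "non-negativity": all(v >= 0 for v in x.values()),
    }
    return [name for name, ok in checks.items() if not ok]

def overall_yield(x, capital=600):
    """Objective Y = (15A + 12B + 7C + 6D + 8E) / 600."""
    return sum(YIELDS[s] * x[s] for s in YIELDS) / capital

candidate = {"A": 100, "B": 80, "C": 140, "D": 140, "E": 140}   # hypothetical
too_risky = {"A": 300, "B": 0, "C": 100, "D": 100, "E": 100}    # hypothetical

print(violated_constraints(candidate), overall_yield(candidate))
print(violated_constraints(too_risky))
```

The first portfolio satisfies every constraint, while the second breaks both the 40% limit on high-risk stocks and the limit on stock A.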
Some additional notes about Linear Programming
The feasible region is always convex
That is, it is possible to join any two points in the feasible region with a straight line that is
entirely within the feasible region. This is a direct consequence of the constraints being linear
inequalities.
[Figure: an example of a convex region and an example of a non-convex region.]
Fractional values of decision variables should have meaning (Divisibility)
For example, it is possible to plant a fraction of a hectare so the decision variables in the farmer's
problem can take fractional values. With lots of business examples this may not be the case, for
example it is not possible to make and sell a fraction of a car (although it may be possible to interpret
such figures as averages).
No diminishing (or increasing) returns to scale (Proportionality)
In the farmer's problem, returns from 10 hectares of wheat must be 10 times the return from one
hectare. This is because the objective function is a linear function.
Additivity
The total contribution of all variables in the objective function and their requirements in the constraints
are the direct sum of the individual contribution or requirement of each variable. This means, e.g., that
the profit contributions of wheat and potatoes are added up in the objective function; the profit
resulting from wheat is not influenced by the profit which results from the planting of potatoes.
