
Modelling and Analysis for Management

Exam Compendium December 2009

Yoav Gordon

1|Page

MAM Compendium

General comments and consideration for the exam

Show workings for mathematical elements; some marks will be awarded for starting in the
right direction

Use diagrams and charts where possible: normal distribution curve, forecasting trend lines, etc

Read the question carefully; they often pose tricky ones to confuse (bastards)

KISS: Keep It Simple, Stupid

Questions for consideration before accepting a model of any kind

What are the objectives of the model?

Who developed the model, and for whom?

Has an appropriate methodology been used (eg the modelling cycle)?

What are the underlying techniques used in the model? (eg simulation, regression, LP, etc)

What software has been used to develop the model?

What verification and validation of the model has taken place?

How were the data obtained for the model?

What experimentation has taken place with the model?

What alternatives have been considered during experimentation?


The Modelling Process Lesson 7 Mindmap

Conceptual modelling

Model coding

Experimentation

Implementation

Factors that can make a model a success

A significant problem to be solved: the existence of the business


A clear objective: which routes to fly?
Strong management support for the modelling.
Availability of some suitable data.
Reasonable and clear assumptions where data were not available.
A simple model.
Strengths and limitations of the model are understood
Rapid turnaround of modelling effort.
Wide range of modelling approaches used
Do we have confidence in the model through validation?


Conceptual modelling
why conceptual modelling is useful in the modelling process
Below is the model answer to this question (from exam solution to specimen exam in front of module)

Influence diagrams
Uses / advantages

help understand / appreciate key influences on a business


Can be used as a basis for debate amongst senior managers

General

There can be more than one way of drawing the loop: the key thing is that you give a good explanation of
how the loop functions

If asked to say whether the main loop is positive or negative: comment only on the main loop

There may be no single main loop: the diagram could be composed of several mini-loops, as shown below (L7,
p.16):


Signs

- (minus) means negative relationship, + (plus) means positive relationship

Calculating the overall sign of a loop (Pidd, p.179): simply count the - (minus) signs
odd number = negative feedback loop
even number or 0 = positive feedback loop

Show the sign of a loop as follows (L7, p.22):

If loop has several sub-loops, then may need to show each overall sub-loop sign

Delays / lead times

Delays / lead times are denoted by a D, as shown below (L7, p.16):


A delay signals that a considerable delay occurs before a change in the tail variable causes a
corresponding change in the head variable (L7, p.24).

Delays usually occur between something's production rate being increased/decreased and the
resultant increase/decrease in output. See diagram above & L7, p.17.

Commenting on influences and loops


If positive feedback loop: this is a vicious (or virtuous) circle, ie as time progresses the level
of the element keeps growing or declining. If negative feedback loop: this is a balancing,
growth-limiting loop, ie the level of the element is pulled back towards a target. (Note
that in the diagram below, the horizontal line is only shown if a target level is
mentioned).

If negative feedback loop with delay: then the element under consideration will
oscillate over time.

If asked about what is missing in the diagram (using diagram from L7, p.17 as an example):
Mention influences on the outside element; in the L7 example, this is traffic volume ie
what could the influences on traffic volume be?
Also mention major things which are missing from the main loop; in the L7 example, this
is influence of public transport and investment in public transport.


The Normal Distribution and Sampling Lesson 3

Theoretical elements

Key elements that must be satisfied to be able to make inferences about a population using a sample:

Sample must be reasonably large
Sample must be taken at random
Population must be effectively infinite, so that removing one item will have little impact
Sample must be a properly representative subset of the statistical population

The normal distribution is a theoretical probability distribution and is represented by the bell curve. To
define a normal distribution two parameters are required:

Mean, and Standard Deviation (a measure of variation): as we move away from the mean, it is the SD
that describes how far away a particular value lies
The area under the curve represents probability, with the total area summing to 1 (100%)
68% of the data will be within 1 SD of the mean
95% of the data will be within 2 SD of the mean
99.7% of the data will be within 3 SD of the mean


Determining the Normal Distribution

Distribution tables are used to calculate the proportion of the distribution that falls within any part of the
curve, ie the probability of an event occurring; the standardised value used to look this up is commonly known as the Z SCORE

Z score determined by: z = (x - μ)/σ. This gives the number of SDs the value lies
from the mean; the distribution table then converts this into a probability
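The z-score calculation above can be checked with a quick Python sketch (Python is not part of the module, just a checking tool; the figures 120, 100 and 10 are made-up illustrations, not from the notes):

```python
# Z score: how many standard deviations a value x lies from the mean.
def z_score(x, mean, sd):
    return (x - mean) / sd

# Illustrative figures only: value 120 against a distribution
# with mean 100 and SD 10.
z = z_score(120, 100, 10)
print(z)  # 2.0 -> the value lies 2 SDs above the mean
```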

NOTE: some exam questions will ask you to find other elements of the Z score formula:

To find x: rearrange to x = zσ + μ (Ford example)

To find σ: rearrange z = (x - μ)/σ to σ = (x - μ)/z, where z is the value for the percentage required (washing machine
example)

Populations and Samples

What is the reliability of a sample against a statistical population?


Further use of the normal distribution where complete information on all the data is not available. A
critical part of inferential statistics involves determining how far sample statistics are likely to vary
from each other and from the population

Sampling Distribution of Percentages: Frequency distribution of the percentages of all possible samples
of size n that can be selected from a given population.

Mean of distribution = P. For example, if the population size is 200 and the sample is 60, then P = 60/200 × 100 = 30%

SD is determined by SE = SQR(P(100 - P)/n): the standard error (SE) of percentages, a % figure describing how sample values
differ from the mean.

Confidence Intervals: used to evaluate sample results in the context of the population.

We determine this by returning to Z scores and looking for the value that leaves 2.5% in each tail (as we are looking for
95%, with the remaining 5% shared across both tails of the curve), which gives z = 1.96

95% Confidence interval determined by P+/- (SE*1.96)
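The interval can be sketched in Python using the illustrative figures above (P = 30%, n = 60; the resulting bounds are computed here, not taken from the notes):

```python
import math

# 95% confidence interval for a sample percentage.
def percentage_ci(p, n, z=1.96):
    se = math.sqrt(p * (100 - p) / n)   # standard error of percentages
    return p - z * se, p + z * se

# Illustrative figures from the notes: sample n = 60, sample percentage P = 30%.
low, high = percentage_ci(30, 60)
print(round(low, 1), round(high, 1))  # roughly 18.4 to 41.6
```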

Sampling Distribution of Means


The distribution we would get if we took all possible random samples from a given population and
calculated their means. As long as samples are of an appropriate size the distribution will be normal.
This also allows an analysis of how we determine appropriate sample size
Will need two key variables:
Mean: equal to the overall population mean

Standard Deviation: the standard error of the mean

Confidence interval can be used in similar way to above with Z score of 1.96

However! What if we wanted to know the mean to within X cents? This can be done by increasing the sample
size, but by how much?
Sample size n is unknown
The confidence interval must be 1.96 standard errors, so n can be determined from:
SQR(n) = zσ/x, where x is the margin of error we have been asked to achieve
We then square the answer to give the sample size needed


Tricky test questions


2008 Paper on cookies
How many cookies should be baked to be 98% sure they dont run out?
As this is focussed on one end of the curve, your z value is the one that leaves 2% in the single upper tail (rather than 1% in each of two tails):
x = μ + z(σ)

z = 2.054, so x = 313 + 2.054(57) = 430 cookies
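The cookie figure is easy to check with a one-line Python calculation (same numbers as above):

```python
# 98% sure of not running out: one-tailed, so use the z value that
# leaves 2% in the upper tail (z = 2.054 from tables).
mean, sd, z = 313, 57, 2.054
cookies = mean + z * sd
print(round(cookies))  # 430
```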

Jar Sample Size Commonly used data and same answer!


You wish to take a random sample of jars of coffee in order to determine the mean weight to 0.1 grams
at the 95% level of confidence. You know that the standard deviation of the weights is 0.5 grams.
Answer: SEM = σ/SQR(n), or 0.5/SQR(n)
The margin of error must satisfy 1.96 × 0.5/SQR(n) = 0.1, so SQR(n) = 1.96 (confidence level) × 0.5 (σ) / 0.1 (x) = 9.8, and n = 9.8² = 96.04, ie sample 97 jars
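As a sanity check, the sample-size arithmetic in Python (note the squaring step, which is easy to forget):

```python
# Jar sample size: mean weight to within 0.1 g at 95% confidence, sigma = 0.5 g.
# n = (z * sigma / e)^2, where e is the required margin of error.
z, sigma, e = 1.96, 0.5, 0.1
n = (z * sigma / e) ** 2
print(round(n, 2))  # 96.04 -> sample 97 jars
```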
Post Office Queuing
A post office measures the length of time people wait in a queue to be served. A random sample of 1000
people yields a mean waiting time of 2:45 (mins:secs) with a standard deviation of 1:45.
(a) Calculate a 95% confidence interval for the true mean waiting time in minutes and seconds
Answer
Firstly need to decimalise the mean and sd so 2:45 becomes 2.75 and 1:45 becomes 1.75
To build 95% confidence interval the formula is:
Mean +/- z*SEM (standard error of mean) or 2.75+/- 1.96*(1.75/SQR1000) = 2.64 2.86 minutes

Then need to turn decimal figures back into minutes and seconds
Note: you multiply the decimalised parts by 60, ie .64 and .86 by 60, to give: 2 mins 38 secs to 2
mins 52 secs
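The whole calculation, including the minutes-and-seconds conversion, as a Python sketch:

```python
import math

# Post office queue: 95% CI for the mean waiting time.
# 2:45 -> 2.75 minutes, 1:45 -> 1.75 minutes (decimalised).
mean, sd, n = 2.75, 1.75, 1000
sem = sd / math.sqrt(n)                  # standard error of the mean
low, high = mean - 1.96 * sem, mean + 1.96 * sem

def to_min_sec(x):
    minutes = int(x)
    seconds = round((x - minutes) * 60)  # decimal part back to seconds
    return f"{minutes} mins {seconds} secs"

print(to_min_sec(low), "to", to_min_sec(high))  # 2 mins 38 secs to 2 mins 52 secs
```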

Internet usage
Part of a pilot study conducted by a marketing department indicates that about 70% of homes in a
particular area have access to the internet from home.
(a) How many homes should be randomly sampled in a proper survey if it is desired to get an estimate of
the true percentage of homes with access to the internet that is 95% certain to be within 5% of the true
value for the population of the area?

Answer

This is about determining sample size, but it will need algebraic analysis as much of the data is
unavailable. This is determined by n = z²P(100 - P)/e², where P = 70 (the pilot estimate) and e = 5 (the required margin).


Note: easy to mess up calculations here so make sure you square relevant variables first then
build into the equation. Look at the answer and make sure decimal point is moved appropriately
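A Python sketch of the standard sample-size formula for a percentage (assumption: the module uses n = z²P(100 - P)/e² in percentage form; the model answer itself is not reproduced in these notes):

```python
import math

# Internet usage: estimate a percentage (pilot P = 70%) to within
# 5 percentage points at 95% confidence.
z, p, e = 1.96, 70, 5
n = (z / e) ** 2 * p * (100 - p)
print(math.ceil(n))  # 323 homes
```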


Forecasting

Forecasting is used to address the increasing complexity of an organisation's environment and

the changing demands and expectations of customers

Forecasting plays a critical role in the strategy and decision making of most organisations,
although no single, ideal approach exists.

Time-Series Decomposition
The Trend: the long-term underlying movement in the variable; it can be upward or downward moving
and is an indication of the long-term direction of the variable
Seasonal Components: variability occurring within the series during a time period, and repeated, eg
electricity use during the day
The Random Element: the difference between the actual series and the expected series based on trend and seasonal
component. Relates to random and unpredictable events, ie any shock event.

Time Series Decomposition: determined by: series D=T+S+R


D = data series
T = trend
S = seasonal components
R = random element
Forecast Errors: we need to determine the overall accuracy of a forecasting method. Error is determined by
the difference between the actual and forecast values; 2 common calculations are used:
Mean Absolute Deviation: takes the absolute value of each error and averages it over the entire period, ie MAD
= total of absolute errors divided by number of values (eg 5.27 units away from the actual value on average). Note: a
longer averaging period (eg 4 weeks) will give a more stable MAD
Mean Squared Error: in some cases the forecast will produce significant errors which will be of cost to
the organisation; the MSE therefore takes each error and squares it before averaging
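Both error measures can be sketched in a few lines of Python (the actual/forecast figures here are made up for illustration; the 5.27 in the notes refers to the module's own example, not these numbers):

```python
# MAD and MSE for a forecast, using made-up actual/forecast figures.
actuals   = [100, 110, 105, 120, 115]
forecasts = [ 98, 112, 100, 118, 120]

errors = [a - f for a, f in zip(actuals, forecasts)]
mad = sum(abs(e) for e in errors) / len(errors)   # mean absolute deviation
mse = sum(e ** 2 for e in errors) / len(errors)   # mean squared error
print(mad, mse)  # 3.2 12.4
```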
Concerns about time-series

If the series covers twelve months, it is likely there is seasonality

With only 12 months, you cannot really identify the trend
The data may be only a very small fragment of a longer series that exists, showing neither trend nor seasonality
Look for shocks and address them with moving averages

Exam Issues
Trend: draw the trend line and range if applicable (see below)
Seasonality
Any other factors noted in the introductory text, eg product type, country etc, and any factor which
might influence it, eg drinks at Xmas
Oscillations


What does the forecast look like? Is the trend downward?

Indicate any problems with the data, eg length of time and seasonality issues; also consider the impact of
random factors

(a) Comment on the structure of the data


Seasonality of the monthly data and a clear annual cycle

Is there a trend? Has there been a change in the last two years, as opposed to the upward
movement from 1992-1996?

Why has there been a change? What are the influences: government legislation
through price controls or taxation? Curbing of demand by wine producers to push up
vintage values, etc?

(b) You try to decide whether the information would be useful for forecasting the likely demand for
champagne for 1999 and 2000. What do you decide, and what is your reasoning?
Yes: we would hope that seven years' worth of monthly data showing a clear seasonality cycle would be useful
The last two years are an area of concern, as they only show two cycles, which isn't enough to
work with
Consider the planning horizon of 2 years and the problem of uncertainty as forecasts move beyond
the most recent horizon
What is the impact of the millennium celebration? The end of 1999 is likely to be atypical, with
extra demand for champagne


Linear Regression

Regression Analysis is used to determine and quantify the relationship between two variables. It
concerns itself with how changes in one variable will impact on another specific variable
(simple linear regression)

The use of regression is not limited to forecasting or prediction; it can also be applied to
the measurement or quantification of the impact that a change in one variable will have on
another

Dependent and Independent Variables - ANOVA Tables

Regression output can be read from the Excel table called ANOVA, where the independent and dependent
variables are shown: X independent and Y dependent
Regression formula: Yp = A + BX

A = Intercept
B = X Variable coefficient (the slope)
IMPORTANT: depending on the
relationship, B will be +/-

The relationship of miles per gallon to weight is negative and can be written as:

Y = 18.90 - 0.138X, ie when the weight is 0, MPG is 18.90
What is the fuel consumption if the weight carried was 30,900 kg (X in thousands of kg)?
MPG = 18.90 - 0.138 × 30.9 = 14.64 mpg
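The fitted line can be wrapped in a small Python helper to make predictions (the units assumption, X in thousands of kg, is mine; it is what makes the 30,900 kg example come out at 14.64):

```python
# Predicted MPG from the fitted line Yp = 18.90 - 0.138X,
# where X is weight carried in thousands of kg (assumption).
def predict_mpg(weight_thousands_kg):
    return 18.90 - 0.138 * weight_thousands_kg

print(round(predict_mpg(30.9), 2))  # 14.64
```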

ANOVA Table in Detail

Note: remember to be clear on which is the dependent and which the independent variable, and watch for
decimals!!

Correlation Coefficient/Multiple R: a measure of the correlation between the values; in this case a strong negative
correlation

Coefficient of Determination/R Square: shows in this case that 75% of the variation in fuel consumption is
explained by weight
Equivalent of Multiple R squared
Coefficient of Determination is measured by Regression Sum of Squares / Total Sum of Squares

Total Sum of Squares: total variation in the dependent variable


Residual Sum of Squares: sum of squared deviations between actual Y and those predicted by the regression (SHOULD BE LOW)

Regression Sum of Squares: amount of variation in Y attributed to the regression line (SHOULD BE HIGH)

Standard Error: indicates the typical error in the equation used to predict fuel
consumption, here 0.356 mpg
The error as a percentage can be determined by summing the 35 data values to give a
mean of 14.63:
Standard Error % = 0.356 × 100 / 14.63 = 2.43%

T Stat: value of a test upon the slope or
constant to determine whether the true value of
these is 0 or not.
Any value outside the range -2 to +2
indicates that it is unlikely the true
value is 0

P Value: probability of the true
population value of a coefficient being
0 (SMALL IS GOOD)

Lower and Upper 95%: the range of
values within which the population values
of the coefficients lie (IF THE RANGE INCLUDES 0, THE MODEL IS
UNLIKELY TO BE ACCURATE)

General Comments for a Linear Model


1. How do we interpret R²?
2. How do we account for remaining variation?
3. What are the T Stat and 95% values?
4. Create a scatter plot to confirm linearity
5. Look at the practical application of the model
6. Consider the number of overall observations for significance
7. What are the impacts of variation and randomness?
8. Are there any issues of data quality?


Simulation

Simulation models are not concerned with finding optimum solutions to given problems; they are
primarily descriptive rather than prescriptive.

They are concerned with describing the business problem in modelling terms, and generally provide
additional information about a problem.

A simulation imitates the behaviour of a real-world system

It allows managers to experiment with alternative courses of action and predict likely outcomes

Where are Simulations Useful?

Variability: where this is predictable or not

Interaction: where components within a system interact with each other i.e. customers and
bank tellers

Complexity: where there is interaction between multiple actors in the real situation creating
unpredictable behaviour


Exam Application
A retail store has observed the following daily demand for a particular product that was introduced a
few months ago:

Demand (units)      0   1   2   3   4   5   6
Frequency (days)    4  19  25  18  16  12   6

Tasks:
1. What level of daily demand would you expect to see for this product on average?
2. Simulate daily demand over a 10 day period. Calculate the average daily demand for the product.
How does this figure compare to your answer to question 1?

Step 1 - Need to calculate average daily demand per day:

Demand (units)      0   1   2   3   4   5   6
Frequency (days)    4  19  25  18  16  12   6   (Total = 100)
Total demand        0  19  50  54  64  60  36   (Total = 283)

Average daily demand = Total demand / total days = 283 / 100 = 2.83 units

Therefore, you would expect to see a daily demand of 2.83 units for this product on average
Step 2 Using Random Numbers:

Probability = Freq / SUMfreq × 100 (the % chance that this demand occurs)
Cumulative probabilities (ie 4 + 19 = 23, etc) must always be found
Random number ranges are based on the cumulative probabilities, ie for a cumulative 48% the random range is 23-47

Demand   Frequency   Prob   Cum Prob   Rand No
0            4         4%       4%      0-3
1           19        19%      23%      4-22
2           25        25%      48%     23-47
3           18        18%      66%     48-65
4           16        16%      82%     66-81
5           12        12%      94%     82-93
6            6         6%     100%     94-99

Step 3 Working out a 10 Day Simulated Demand:

Day              Random Number   Daily Demand
1  Monday              46             2
2  Tuesday             78             4
3  Wednesday           70             4
4  Thursday            13             1
5  Friday              92             5
6  Saturday             6             1
7  Sunday              91             5
8  Monday              40             2
9  Tuesday             55             3
10 Wednesday           80             4

A row or column of random numbers is taken from a given list to create a 10-day series.
Daily demand figures for the simulation are taken by considering each random number and
where it falls in the random number ranges above. For example, if the random number is 91
then it falls in the 82-93 range, giving us a demand of 5 units.
Average daily demand from the simulation = total demand / number of days = 31/10 = 3.1

Comment: compared with the average demand of 2.83 worked out initially, this is very close to the actual mean;
using more iterations will bring it closer.
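The whole 10-day simulation can be reproduced in Python, mapping each random number onto a demand level via the cumulative ranges (same frequencies and random numbers as above):

```python
import bisect

# Demand -> observed days, from the frequency table above.
frequencies = {0: 4, 1: 19, 2: 25, 3: 18, 4: 16, 5: 12, 6: 6}

# Build inclusive upper bounds of each random-number range:
# 0-3 -> demand 0, 4-22 -> demand 1, 23-47 -> demand 2, ...
demands, uppers, running = [], [], 0
for demand, freq in frequencies.items():
    running += freq
    demands.append(demand)
    uppers.append(running - 1)

def simulate_day(rand_no):
    # Find the first range whose upper bound covers rand_no.
    return demands[bisect.bisect_left(uppers, rand_no)]

random_numbers = [46, 78, 70, 13, 92, 6, 91, 40, 55, 80]
daily = [simulate_day(r) for r in random_numbers]
print(daily, sum(daily) / len(daily))  # [2, 4, 4, 1, 5, 1, 5, 2, 3, 4] 3.1
```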

Why Simulate?

Risk reduction

Greater understanding

Reduction in operating cost

Reduction in lead times

Faster plant changes


Improved customer service

Simulation vs. Real Life Experimentation

Cost: real-life experimentation can be hugely expensive; simulation removes this

Repeatability: the use of random numbers allows alternative situations to be tested under the same
conditions, which is impossible in real life

Control of Time Base: the simulation will take less time to achieve results

Legality and Safety

Simulation vs. Mathematical Modelling

Non-Standard Distributions: simulations can use both theoretical and historical distribution

Dynamic and Transient Effects: simulations allow for a better understanding of how a system
operates under more extreme conditions

Interaction of Random Events: simulation allows for an understanding of the impact from
random factors such as machine breakdowns and knock on effects

Endorsement by Management

The ability of management to view and understand the model is likely to result in greater acceptance of
the results:

Simulation Promotes Total Solutions: VIS provides big-picture solutions relevant to an entire
organisation

Simulation Fosters Creativity: 'what if' questions and engagement from all involved

Simulation Makes People Think: both the building and development of the simulation will
encourage people to consider the issue in greater depth

Effective Communication of Ideas


Key Problems with Simulation

Accuracy: the results of a single simulation should not be taken as conclusive evidence

Starting state: what is the typical starting state of the simulation?

Relevance: does the simulation accurately reflect the real-world situation?

Experimentation/Replication with Simulation

The accuracy of simulation model results can be improved in two ways:

running the model for longer

performing multiple replications

A replication is a run of a simulation model that utilises specific streams of random numbers. By
changing the streams of random numbers another replication is performed, from which a slightly
different result will be obtained. In statistical terms this is similar to taking multiple samples in order to
improve the accuracy of the estimate. An experiment is a run of a simulation model where the random
numbers don't change but the inputs do.

Replication Example

Replication   Average time   Mean   Deviation   Deviation²
1                 26         28.7     -2.7        7.29
2                 30.2       28.7      1.5        2.25
3                 29.2       28.7      0.5        0.25
4                 27.1       28.7     -1.6        2.56
5                 31         28.7      2.3        5.29
Totals           143.5                           17.64

Mean = Sum of results / n, ie Mean Service Time = 143.5 / 5 = 28.7 mins

Standard deviation = SQR( Sum(Result - Mean)² / (n - 1) ) = SQR(17.64 / 4) = 2.1 mins

Confidence interval = Mean ± t(n-1, α/2) × standard deviation / SQR(n)
n = number of replications
t(n-1, α/2) = value from the t-distribution for n - 1 degrees of freedom
and the significance level α/2
(t is like z for normal distributions; use t in this case since n is small and the population std dev is
unknown)

n = 5
α = 5% or 0.05, so α/2 = 0.025
t(4, 0.025) = 2.776

Confidence interval = 28.7 ± 2.776 × 2.1/SQR(5) = 28.7 ± 2.61 mins
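The replication calculations above can be reproduced in Python (the t value 2.776 is read from t tables for 4 degrees of freedom at α/2 = 0.025, as in the notes):

```python
import math

# Replication example: mean, standard deviation and 95% CI from 5 runs.
results = [26, 30.2, 29.2, 27.1, 31]
n = len(results)
mean = sum(results) / n
sd = math.sqrt(sum((r - mean) ** 2 for r in results) / (n - 1))
t = 2.776                                # t(4, 0.025) from tables
half_width = t * sd / math.sqrt(n)
print(round(mean, 1), round(sd, 1), round(half_width, 2))  # 28.7 2.1 2.61
```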


Linear Programming
Overview

LP is a method of mathematical analysis that seeks to optimise the allocation of limited
resources to competing activities. Examples include the allocation of aircraft and crews, and plant
scheduling.

Three key characteristics must apply to a particular problem:

1. An identifiable decision needs to be made
2. A set of constraints or limited resources can be identified
3. An objective for optimisation can be identified

Key stages of modelling process

Formulation

Solution

Sensitivity Analysis

Key Assumptions

Linearity: all relationships are assumed by the model to be linear in form (the assumption of
proportionality)

Divisibility: all variables are continuous, rather than discrete

Certainty: we know for certain the numerical values of the linear relationships specified

Case Study
ClearGlass Windows and Conservatories is planning a local radio and television advertising campaign. They have
employed a marketing services firm to give advice on their advertising strategy. Their recommendations are as
follows:

use at least 60 radio and TV advertisements

of these, do not use more than 50 for radio advertisements

use at least as many radio advertisements as TV advertisements.

ClearGlass has an advertising budget of 34,000. The local radio station has quoted 200 per advertisement,
while TV advertisements cost 800 each. The marketing services firm has rated the two
advertising media in terms of audience coverage and recall power, giving TV advertisements a rating of 600 and
radio advertisements a rating of 200.


How many TV and radio advertisements should ClearGlass use in order to maximise the overall rating of the
advertising campaign?

Formulation

STEP 1 - explicitly identify the set of decisions to be made: the decision variables

The case example has two variables: how many radio and how many TV advertisements to make

Constraints

STEP 2 - what are the restrictions or constraints on the decision variables?

Case example constraints:

Use at least 60 radio and TV adverts: R + T ≥ 60
Do not use more than 50 radio adverts: R ≤ 50
Use at least as many radio adverts as TV adverts: R ≥ T, or R - T ≥ 0
Advertising budget of 34,000, with radio costing 200 and TV costing 800: 200R + 800T ≤ 34,000
Non-negativity constraints: R ≥ 0, T ≥ 0


Objective Function

A statement of what we are trying to maximise or minimise:

Maximise 200R + 600T
The final formulation should then read as follows:
Maximise 200R + 600T such that:
R + T ≥ 60
R ≤ 50
R - T ≥ 0
200R + 800T ≤ 34,000
R ≥ 0
T ≥ 0
Where R = number of radio advertisements and T = number of TV advertisements
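The notes solve such models in Excel Solver; as a sanity check, this small integer problem can also be solved by brute-force enumeration in Python (the optimum printed here is computed by the code, not quoted from the notes):

```python
# ClearGlass: maximise the campaign rating 200R + 600T subject to the
# constraints in the formulation above, by enumerating integer plans.
best = None
for r in range(0, 51):                       # R <= 50
    for t in range(0, 171):                  # generous upper bound on T
        feasible = (r + t >= 60              # at least 60 adverts in total
                    and r >= t               # at least as many radio as TV
                    and 200 * r + 800 * t <= 34000)  # budget
        if feasible:
            rating = 200 * r + 600 * t
            if best is None or rating > best[2]:
                best = (r, t, rating)

print(best)  # optimal (R, T, rating)
```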

To Note for Exam

Ensure the right number of decision variables: there may be more than one or two

The difference between these cells and their counterparts in the Limit column is what resources were not used, ie slack/unused resource


This cell should specify what you are trying to maximise or minimise (the actual final figure is displayed underneath; in this case, 600)

Sample Questions

Answer
E = Executive Home, F = Family Home, S = Starter Home
Maximise 170E + 100F + 50S
Subject to constraints:
Land constraint: 100E + 60F + 180S ≤ 110,000
Cost constraint: 100E + 60F + 30S ≤ 10,000
Time constraint: 800E + 650F + 400S ≤ 100,000
Planning constraint: S ≥ 0.2(E + F + S)
Non-negativity constraints: E, F, S ≥ 0


(B) This indicates that currently the family homes are not creating value and are
therefore not profitable; for them to add value the
price must rise by 2.09, ie 2,090

(A) The allowable increase of the constraint is 1944.44, yielding
through the shadow price an additional 1.7 for every pound
invested, or 1944.44 × 1.70 = 3,305,555


Answer
LN = Local News, NN = National News, W = Weather, S = Sport
Minimise 150LN + 250NN + 75W + 100S
Subject to
0.15LN ≥ 25
0.10W ≥ 25
2NN ≥ LN, rearranged as 2NN - LN ≥ 0
S ≥ W, rearranged as S - W ≥ 0
S ≥ LN + NN, rearranged as S - LN - NN ≥ 0
Cost coefficients (per unit):
LN = 150
NN = 250
W = 75
S = 100
Non-negativity constraints: LN, NN, W, S ≥ 0


Important rule: the formula must always end with a 0 on the right-hand side, so rearrange!!


The allowable increase has now been reached and there is no slack

The shadow price suggests that each new unit will create a loss for the company
The allowable increase is 12.857, so penalise additional tables up to this amount

Question A Answer

An additional table would therefore impact on overall profit (3900) as follows: Profit - shadow price, or
3900 - 30 = 3870, so an unadvisable action

9: SENSITIVITY ANALYSIS

Precursor: uses of sensitivity analysis


Wis, p.410: provides answers to what if questions
Wis, p.410: Sensitivity analysis allows us to assess readily a change in one part of the
problem without the need for entire recalculation; means do not need to recalculate
the entire problem > this can be time-consuming and tedious.
Wis, p.410:This sensitivity can be undertaken on the constraints and on the objective
function


Pidd gives two reasons for the importance of sensitivity analysis, p.213:

Precursor: in order to answer an Excel Solver question, you sometimes need to go beyond simply
thinking about the information in Excel Solver

1st: the world is dynamic and, therefore, things are constantly changing; thus it is
important to know what effect changes in the coefficients may have on the optimum
solution
2nd reason: see Pidd, p.213. Don't fully comprehend this.

see Q10b of L9 SAA


Question asks: If Bob were to consider changing the price of the College bat, briefly
describe one extra piece of information that he could use in conjunction with the
information from the Sensitivity Report.
Answer brings in the outside concept of demand:

Precursor: looking at Excel Solver situations where we are asked to comment on something
without re-solving the LP model


See Q13b, Lesson 9


For this question, some new information is given

Are then asked the following question:

Answer is as follows:


Interpreting the Excel Solver answer report Top Part


final values for the objective function (target cell) and the decision variables (adjustable cells),
The original value simply refers to the values for these prior to the calculation of the solution,
which are zero

Decision variables


Initial values of the decision variables (always zero)

Values that the decision variables take in the optimum solution

Interpreting the Excel Solver answer report Bottom Part

Below is an abridged version of the answer report found on p.4 of Lesson 9 notes: the graphic
does not show the upper parts of the answer report (target cell and adjustable cells)
If a constraint is binding, there is no slack, since as much (for ≤ constraints) or as little (for ≥ constraints) as
possible of the resource is used in the proposed solution. Means that if we want to improve the solution, the constraint
must be increased or decreased
>Wisniewski definition, p.409: binding constraints: at the optimal solution they 'limit (or bind) the objective
function from taking an improved value'

**This bottom part of the Answer report shows the usage of each constraint**

If a constraint is non-binding, there is some slack. In other words, not all of the resource is used (for ≤
constraints) or more than the minimum allowable is used (for ≥ constraints). Note that if the slack refers to the
additional quantity of the resource that has been used above the minimum allowable, you have
leeway/scope to reduce the resource down to the minimum level required
>Wisniewski definition, p.409: non-binding: at the current optimal solution they 'do not prevent the objective
function from taking an improved value'

If asked to state which constraints are binding and which are non-binding, and what
this tells us:
Answer would be something like this (answer to Q3a, SAA9):


It is important to put this line in, and also down here

How to comment on whether it is worth making extra resource available / buying extra resource:
Firstly, look at the Answer sheet: if the constraint is non-binding, then we
already have slack resource available to us. If the
optimal production schedule is followed there would be slack, so there is no need to
acquire/buy extra resource.
Secondly, check by looking at the shadow price of the constraint. It will be
zero, meaning adding or reducing the constraint by 1 will have no effect on the
objective function.


Pricing Analysis at Merrill Lynch

Application of the Modelling Process to ML

4 key stages in the modelling process: Conceptual modelling, model coding, experimentation
and implementation.

Conceptual Modelling
Understanding the problem

Application

Determining the modelling objective


Determine the modelling approach

Model Definition


The impact of electronic trading and the commoditization of trading
threatened ML's value proposition: providing advice and guidance
through a financial advisor.
Management decided to offer investors more choices for doing
business with ML by developing an online investment site. The firm had
to balance carefully the needs of its brokers and its clients, while
planning for considerable financial risk.
Doing so would require operations research expertise in data mining
and strategic planning.
The group evaluated alternative product and service structures and pricing,
and constructed models to assess individual client behaviour. The
models showed that revenue at risk to ML ranged from $200 million to
$1 billion.
1. Determine the total revenue at risk if the only clients choosing the new
pricing options were those who would pay less to ML.
2. Determine a more realistic revenue impact based on a client's
likelihood of adopting one of the new services.
3. Assess the revenue impact for various pricing schedules, minimum
fee levels, product combinations, and product features.
4. Assess the impact on each financial consultant and identify those who
would be most affected.

Pg 8 case study: the task force was asked to recommend new product structures
and pricing options. It focused on two main options: the first was an asset-based pricing option; the second was a direct online pricing option.
The Management Science Group assessed more than 40 combinations of offer
architectures and pricing. The model would need to be flexible, able to analyze
new scenarios with a new set of offerings quickly enough to meet self-imposed
deadlines.
The basic model approach was to simulate client choice behaviour: from an
initial set of system data, the resulting system output measures of
interest form a baseline.
Change conditions are then introduced to the data and business rules applied
to the clients to determine their reactions to the changes and calculate
the revised output measures.
The team had several different ideas about how to model these rules
and, as a result, developed three different models:
1. A rational-economic-behaviour (REB) model (total revenue at risk)
2. A financial-advisor-affinity model based on Monte Carlo simulation
(assess clients' rational behaviour)

3. A financial-advisors-affinity model based on business rules. (clients with


Collect and analyze the data
required to develop the model

a high affinity to their FAs would have a high level of price or economic
indifference)
Evaluation of an extensive volume of client-level data. A database was
constructed using data from 1998, which was the most recent full-year
data available and which provided a profile of client assets and trading
activity.
Information on five million clients, 10 million accounts, 100million
trade records, 250 million ledger records and 16,000 FAs.
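The simulate-then-perturb approach above can be illustrated with a minimal Monte Carlo sketch: draw each client's adoption decision at random and compare revenue against a no-change baseline. The adoption probability and fees are invented for illustration; the actual ML models were far richer.

```python
import random

def simulate_revenue(clients, adopt_prob, n_runs=1000, seed=42):
    """Monte Carlo sketch: in each run, every client adopts the new
    option with probability adopt_prob; return mean total revenue
    over all runs. All inputs are illustrative, not from the case."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        total = 0.0
        for c in clients:
            if rng.random() < adopt_prob:
                total += c["new_fee"]      # client switches
            else:
                total += c["current_fee"]  # client stays put
        totals.append(total)
    return sum(totals) / n_runs

clients = [{"current_fee": 8000, "new_fee": 5000},
           {"current_fee": 1200, "new_fee": 1500}]

baseline = simulate_revenue(clients, adopt_prob=0.0)  # nobody switches
shocked = simulate_revenue(clients, adopt_prob=0.5)   # half switch on average
print(baseline, shocked)
```

Running the same simulation under different change conditions (here, different adoption probabilities) and comparing against the baseline is exactly the experiment structure the notes describe.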

Strengths of the model: the group assessed more than 40 combinations of pricing offerings and architectures. The turnaround time for analyzing new scenarios with a new set of offerings was only a few hours (pg 11).
Given the complexity of the problem, they were able to reduce it to three key models.

Key Weaknesses/Criticisms

Whether feedback loops were incorporated in the models is not clear. Feedback loops help answer questions such as: are the forecasts accurate? Comparing actual revenue with forecast revenue; simulation output and confidence intervals.

FA-affinity business rules model: limited explanation/clarification of why a 30% zone of indifference was used for clients with high FA affinity and 10% for clients with low FA affinity, based only on management judgment. What is the impact on output (ie forecast revenue) if these assumptions are changed in the model?

What is the impact on output of extreme conditions? The model seems to be built for a stable / average environment and doesn't appear to account for the impact of macro-economic shocks, such as an economic downturn or increased competition, on potential client responses to ML product offerings, firm revenue etc. These external factors will influence which of the products are chosen. Is one year of data statistically significant?

Very little verification/validation is discussed in the case; the only mention is "using these results, executive management specified new pricing alternatives and arrangements for testing" (p. 15).

The assumption that the greedy approach is accurate, as well as the simplicity of the other behavioural approaches.

Potential Improvements to the Model

Sensitivity analysis, ie the impact on revenues if different external factors exist, such as an economic downturn or external crisis. SA tests the robustness of the solution by changing some of the model's inputs and observing the effect on output.
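Sensitivity analysis of the indifference-zone assumption can be sketched as below: vary the zone and watch the forecast revenue at risk respond. The 10%/30% zones come from the case; the client fees are invented.

```python
def at_risk_given_zone(client_fees, zone):
    """A client is assumed to switch only if the saving from the new
    option exceeds `zone` (a fraction of the current fee), i.e. the
    saving is large enough to overcome indifference. Figures invented."""
    at_risk = 0.0
    for current_fee, new_fee in client_fees:
        saving = current_fee - new_fee
        if saving > zone * current_fee:
            at_risk += saving
    return at_risk

# Hypothetical clients as (current_fee, cheapest_new_fee) pairs.
client_fees = [(8000, 6800), (10000, 6500), (2000, 1900)]

# Vary the indifference zone (the model used 10% and 30%) and
# observe how the forecast revenue at risk moves.
for zone in (0.10, 0.30, 0.50):
    print(zone, at_risk_given_zone(client_fees, zone))
```

If the output changes sharply between zone values, the model's conclusions depend heavily on a management-judgment assumption, which is exactly the criticism raised above.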


Influence diagram: useful for determining cause and effect, and the important variables, eg economic variables and client choice.
Solution validation: are the outcomes, in terms of actual revenue (profitability), as predicted by the model? [The rate of revenue expected and the impact on FA remuneration predicted by the model should be compared with actual results after implementation of the new products.]
Data validation: the data are probably reliable, but some effort should go into determining the source of the data and checking whether the data are correct.

Alternative modelling approaches

Linear programming could have been used to find an optimum number of clients using the
integrated choice approach. This model could also take into account the number of FAs and
related costs and revenue from each customer type. Potentially there are limitations and
boundary conditions which would determine certain capacity limits and business cases. For
example:

Capital investment into the software and infrastructure for the Integrated Choice products would require a certain number of clients to break even. If this level is not reached, then it would not be in the best interest of the company to invest.
Increasing or decreasing the number of FAs to cope with the changes in demand could be a
limiting factor
External influences including inflation, base interest rates and unemployment could be
incorporated to the model.
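The break-even condition mentioned above is simple to state: the investment divided by the contribution per client gives the minimum client count. This is a hypothetical sketch, not from the case; all cost and revenue figures are invented (only the 16,000 FA count appears in the case).

```python
import math

def break_even_clients(capital_investment, revenue_per_client,
                       variable_cost_per_client):
    """Smallest number of clients at which a new online offering
    covers its up-front investment. All figures are hypothetical."""
    contribution = revenue_per_client - variable_cost_per_client
    if contribution <= 0:
        raise ValueError("each extra client loses money; no break-even")
    return math.ceil(capital_investment / contribution)

# Hypothetical numbers: $25m platform cost, $1,500 revenue per
# client per year, $250 variable cost per client.
n = break_even_clients(25_000_000, 1_500, 250)
print(n)  # 20000 clients needed to break even

# FA capacity as a limiting factor: with 16,000 FAs each able to
# serve, say, at most 400 clients, capacity is 6.4m clients, well
# above the break-even level, so that constraint would not bind here.
capacity = 16_000 * 400
print(n <= capacity)  # True
```

A full LP formulation would treat client numbers per segment as decision variables with FA capacity and break-even as constraints; the sketch above only checks the two boundary conditions named in the text.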

Verification & Validation

Conceptual model validation: determining that the scope and level of detail of the proposed model are sufficient for the purpose. Are the assumptions correct?
Applied to Merrill Lynch: the conceptual model was right, but there was an impact on the required level of detail. For example, undertaking the analysis on all 5 million accounts: could a sample have been used to speed up the whole project? Given the levels of revenue concerned, it is most likely that senior management wanted the best possible, most accurate level of analysis, hence all accounts were included.
Only taking one year into account could have missed certain trends in the data. This was only a snapshot and could have significant implications for the model and its ability to forecast the future.
Rating financial-advisor affinity with clients could be open to interpretation and human error.

Data validation: data are extracted from the real world for input to the conceptual model and the computer model, and are also used for model validation and experimentation.
Applied to Merrill Lynch: ML did validate the data (pg 15): "before analyzing a single record of data, the group clarified the purpose of the analysis, the key assumptions, the important deadlines, and the ultimate objectives." As the source of the data was ML's own client records, it was deemed to be accurate. There is no mention of checks being made on the data in the case study.

White-box validation: performed continuously throughout the model coding; it ensures that the content of the model is true to the real world.
Applied to Merrill Lynch: there is some evidence that white-box validation took place (pg 15): "Management science also worked closely with market research and provided support for its conjoint-analysis survey."

Black-box validation: considering whether the overall behaviour of the model is representative of the real world.
Applied to Merrill Lynch: over-simplistic approach. The model did not take into account external forces which would affect the client's decision to invest or not, for example the economic situation, employment levels, and access to the internet (computer knowledge).

Experimentation validation: the model is used to develop a better understanding of the real world, whether through seeking solutions to specific problems or through general enquiry. Sensitivity analysis is used to validate the experimentation by checking the robustness of the solution.
Applied to Merrill Lynch: sensitivity analysis was carried out (pg 14). The zones of price indifference were based on management judgment; the group "also conducted a sensitivity analysis of revenue at risk, using various values for the indifference zone. They varied the indifference zones to identify the breakeven points." In terms of finding the pricing sweet spot (pg 16): "Without the models and analysis, we would not have found the right price. The one percent fee was the sweet spot because it balanced several factors: the clients' price elasticity, our revenue at risk and profitability, competitive offerings, and possible defections among our top FAs."

Solution validation: once implemented, it should be possible to validate the implementation against the model results.
Applied to Merrill Lynch: limited ability to provide feedback on future events, as the revenue analysis was only validated on 1998 data. However, the outcome of implementing the model for ML was significant (pg 15): "Integrated Choice had a profound impact on ML's competitive position." Whilst this could be due to the implementation of the model, external factors which were not included in the model could also play a significant role.
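Solution validation, as described above, amounts to comparing the model's forecast with what actually happened after implementation. A sketch with invented figures (the case does not publish a forecast-versus-actual comparison):

```python
def forecast_error(forecast, actual):
    """Percentage error of a revenue forecast against the actual
    outcome: a basic solution-validation check. Data invented."""
    return abs(forecast - actual) / actual * 100

# Hypothetical: the model forecast $830m revenue under the new
# pricing; actual first-year revenue came in at $800m.
err = forecast_error(830_000_000, 800_000_000)
print(round(err, 2))  # 3.75 percent error
```

A small error would lend confidence to the model; a large one would prompt revisiting the behavioural assumptions or the external factors excluded from it.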

How transferable are the models to other organizations?

The case shows how the internet and other new technologies can give a firm the ability to conduct business in new and exciting ways, and explores service pricing: what new services should be offered, and at what price?
It is never a good idea to transfer models without assessing what the requirements/objectives of the new system are and whether they fit.
NB. If it were decided to transfer this model to another situation, it would need to be tested to ensure that it fits the new environment, and relevant changes would need to be made to the different sub-models within the overall model.

Product/Service characteristics

The application of the model is not appropriate for all industries. The following characteristics can be used to determine whether the model is appropriate for a particular product or service.

Product/service is perishable. At some point the product becomes worthless because it can no longer be sold, and it cannot be held in inventory for future demand, generally because the time of its availability has passed, eg FA knowledge, share prices, communications bandwidth, seats on a bus.


Market segmentation. The nature of the product is such that it can be priced to appeal to and target different market segments (price indifference / price elasticity: customers' willingness to pay for different levels of service).

The product or service can be sold in advance. Often this is done through reservation
systems which, when combined with other technologies, enable a system of forecasting
and control to manipulate demand and pricing.

The variable costs of the product are low. An incremental sale does not cost the
vendor much, but enables the product to be sold at a wide range of prices rather than
letting it spoil and go unsold. For example, a hotel operation has high fixed costs and
relatively low variable costs (Informs). High fixed costs - FAs

Application to other industries

Airline industry: to determine revenues from different product offerings, full service (business/first class) and limited service (economy). Like the ML offerings, price elasticity and price indifference are key factors (price changes and their effect on demand).
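The elasticity relationship mentioned here can be made concrete: with a constant point elasticity E, a price change of p% shifts demand by roughly E x p%. The elasticity value and demand figures below are invented for illustration.

```python
def demand_after_price_change(demand, elasticity, price_change_pct):
    """Approximate new demand after a price change, using a constant
    point elasticity. Negative elasticity: price up -> demand down."""
    return demand * (1 + elasticity * price_change_pct / 100)

# Hypothetical: elasticity -1.5, so a 10% fare increase cuts
# demand by about 15%.
print(demand_after_price_change(1000, -1.5, 10))  # 850.0
```

The same arithmetic underlies the ML pricing question: whether the revenue gained per remaining client outweighs the clients lost to the price change.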

Transportation industry: eg a bus company trying to decide how many buses to provide under a new service, and the fare levels to charge (see LBC bus example, p. 91).

Hotel industry: the Ritz hotel in Lisbon using models to determine room rates.

Telecommunications, for connection-oriented services: a similar simulation model can be used to calculate optimal price discounts, which are used to shift demand from congested to uncongested periods in a telecommunications system and its use of bandwidth.

As with the ML case, both assess demand for services and customers' willingness to pay, which are used to develop a model of user behaviour so as to predict the proportion of users who will accept a price discount and delay their use of the service. See "A New Pricing Model for Competitive Telecommunications Services Using Congestion Discounts", Informs, vol 17, no. 2, Spring 200

Temporal application: how would this problem be affected by 21st-century technology? Access to financial data, different markets and customer groups, smaller asset allocations?

