III. Model Fitting
MATHEMATICS DIVISION
IMSP - UPLB
Introduction

Three possible tasks arise when analyzing a collection of data points:

1. Fitting a model of a selected type (or types) to the data.
2. Choosing the most appropriate model from competing types that have been fitted (e.g. a best-fitting exponential model versus a best-fitting polynomial model).
3. Making predictions from the collected data.

The first two tasks fall under the general heading of model fitting, where a model (or competing models) exists that seems to explain the observed behavior.
In the third case, a model does not exist to explain the observed behavior. Instead, we wish to construct an empirical model based on the collected data (e.g. using interpolation).
Relationship Between Model Fitting
and Interpolation
What must be done in each case?
In Task 1 the precise meaning of the best model
must be identified.
In Task 2 a criterion is needed for comparing
models of different types.
In Task 3 a criterion must be established for
determining how to make predictions in between
the observed data points.
Modeler's attitude?
Model fitting
- the modeler strongly suspects a relationship of a
particular type
- the modeler is willing to accept some deviation
(errors) between the model and the collected data
points
Interpolation.
- the modeler is strongly guided by the data that have been carefully collected and analyzed when seeking the curve that captures the data points
- the modeler generally attaches little explicative
significance to the interpolating curves and expects
no error between the curve and the data points.
Similarity? In either case, the modeler may ultimately want to make predictions from the model.

Note: explicative models are theory driven; predictive models are data driven.
Example

Suppose we are attempting to relate two variables x and y, and have gathered the data plotted in the following figure.

Figure 1
One way to make predictions is to use a technique such as spline interpolation, with the result shown in Figure 2.

Figure 2
Another way is to fit a curve that the modeler suspects, such as the parabola

$y = C_1 x^2 + C_2 x + C_3$.

In this case, the next concern is to determine the arbitrary constants $C_1$, $C_2$, and $C_3$ that select the best parabola.
Figure 3

Another example of interpolation (figure).
Remarks:

- Sometimes you need to fit a model and also interpolate in the same problem.
- Subsequent analysis involving operations such as integration or differentiation should be considered in selecting a model.
- The model may be replaced with an interpolating curve (such as a polynomial) that is more readily differentiated or integrated (e.g. a step function used to model a square wave might be replaced by a trigonometric function).
In these instances, the modeler desires the
interpolating curve to approximate closely the
essential characteristics of the function it replaces.
This type of interpolation is usually called
approximation.
Sources of Error in the Modeling Process
Sources of errors should be considered in modeling.
If error considerations are neglected, undue confidence may
be placed in intermediate results, causing faulty decisions in
subsequent steps.
You need to consider the effects of cumulative errors that
exist from previous steps.
Classification of errors:
1. Formulation errors - result from the assumption that
certain variables are negligible or from simplifications in
describing interrelationships among the variables in the
various sub-models.
2. Truncation errors - are attributable to the numerical method used to solve a mathematical problem.

For example, sin x can be represented by the power series

$\sin x = x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \cdots$

A truncation error is introduced whenever the series is cut off after finitely many terms (a small numerical illustration follows this list).
3. Round-off errors - are caused by using a finite-digit machine, such as a calculator or computer, for computation. The accumulated effect of round-off can significantly affect the final answer.

4. Measurement errors - are caused by imprecision in the data collection.
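As a small numerical illustration of truncation error (item 2 above), the following sketch compares partial sums of the sine series against math.sin. It is not part of the slides; the evaluation point and the numbers of retained terms are arbitrary choices.

```python
import math

def sin_partial_sum(x: float, terms: int) -> float:
    """Approximate sin(x) using the first `terms` terms of its power series."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

x = 1.2  # arbitrary evaluation point
for terms in (1, 2, 3, 5):
    approx = sin_partial_sum(x, terms)
    # The truncation error is exactly the tail of the series that was dropped.
    print(f"{terms} term(s): approx = {approx:.8f}, "
          f"truncation error = {math.sin(x) - approx:.2e}")
```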
Fitting Models to Data Graphically

The model generally contains one or more parameters, and sufficient data must be gathered to determine them. This is the problem of data collection.

Some considerations in data collection:

- The determination of how many data points to collect involves a trade-off between the cost of obtaining them and the accuracy required of the model.
- As a minimum, the modeler needs as many data points as there are arbitrary constants in the model curve.
- The range over which the model is to be used determines the endpoints of the interval for the independent variable(s).
- The spacing of the data points within the interval is also important: any part of the interval over which the model must fit particularly well can be weighted by using unequal spacing.
- Each data point should also be thought of as an interval of confidence whose width is commensurate with the appraisal of the errors.

Figure 4. Data points viewed as intervals of confidence.
Visual Model Fitting with the Original Data

Suppose we want to fit the model y = ax + b to the data shown in Figure 4. How might we choose the constants a and b to determine the line that best fits the data?

Generally, when more than two data points exist, not all of them can be expected to lie exactly along a single straight line. Ordinarily, there will be some vertical discrepancies between a few of the data points and the line; these discrepancies are usually called absolute deviations.

For the best-fitting line, we might try to minimize the sum of the absolute deviations, or to minimize the largest absolute deviation from the fitted line.

Figure 5   Figure 6
Remarks:

Although these visual methods for fitting a line to the data points are not analytical and may appear imprecise, they are often quite compatible with the accuracy of the modeling process. Furthermore, these techniques immediately give an impression of how good the fit is and where it appears to fit well.
Transforming the Data

Most of us are limited visually to fitting only lines. So how can we graphically fit curves as models?

Suppose, for example, that a relationship of the form $y = C e^x$ is suspected for some sub-model, and the data shown below have been collected.

The model states that y is proportional to $e^x$. Thus, if we plot y versus $e^x$, we should obtain approximately a straight line with slope C.

Figure 7
From the figure, the slope of the line is approximated as

$C \approx \dfrac{165 - 60.1}{54.6 - 20.1} \approx 3.0$
Alternate technique: using a transformation.

Take the logarithm of each side of the equation $y = C e^x$ to obtain

$\ln y = \ln C + x$.

Note that this is a linear equation in $\ln y$ and $x$, with y-intercept $\ln C$.

Figure 8

From the figure, we can determine that $\ln C$ is approximately 1.1, giving $C = e^{1.1} \approx 3.0$ as before.
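A minimal sketch of this transformation in Python; the data values are invented to roughly follow $y = C e^x$ (the actual values belong to the slides' figure, which is not reproduced here). Fitting a line to $(x, \ln y)$ should return a slope near 1 and an intercept near $\ln C$.

```python
import numpy as np

# Hypothetical data assumed to roughly follow y = C * exp(x); illustrative only.
x = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
rng = np.random.default_rng(0)
y = 3.0 * np.exp(x) * rng.normal(1.0, 0.02, size=x.size)  # C = 3 plus small noise

# Transformed model: ln y = ln C + x, a straight line in (x, ln y).
slope, intercept = np.polyfit(x, np.log(y), 1)
print(f"fitted slope = {slope:.3f}  (the model predicts 1)")
print(f"ln C = {intercept:.3f},  so C = {np.exp(intercept):.3f}")
```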
Similar transformations can be made on a variety of other curves to produce linear relationships (for example, taking logarithms of both sides of a power curve $y = C x^n$).

Limitation: the transformation distorts distances, so deviations measured on the transformed plot are not the deviations in the original variables.

Solution: validate the model by comparing it to the graph of the original data, not the transformed data.
Example: we want to fit a model of the form $y = C e^{1/x}$ to the plotted data below.

Logarithmic transformation:

$\ln y = \dfrac{1}{x} + \ln C$

Remarks:

1. The points of the graph of the transformed data are squeezed together.
2. Consequently, if a line is made to fit the transformed data, the absolute deviations appear small, even for a relatively poor fitted model of the form $y = C e^{1/x}$.
Remember:

1. Be careful when using transformations; you might end up selecting a relatively poor model.
2. Comparisons should be made with the original data.
3. Be aware that many computer codes fit models by first making a transformation.
4. If you intend to use goodness-of-fit indicators to decide on the best model, first ascertain how those indicators were computed.
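To see the distortion numerically, the hedged sketch below estimates $\ln C$ for the model $\ln y = 1/x + \ln C$ on invented data and then compares the residuals in the transformed (logarithmic) variables with the residuals in the original variables. The data are illustrative assumptions, not the values behind the slides' plot.

```python
import numpy as np

# Hypothetical data (illustrative only) for a model of the form y = C * exp(1/x).
x = np.array([0.2, 0.4, 0.6, 0.8, 1.0, 1.5])
y = np.array([450.0, 36.0, 16.0, 10.5, 8.0, 6.0])

# Transformed model: ln y = 1/x + ln C, so estimate ln C by averaging ln y - 1/x.
ln_C = np.mean(np.log(y) - 1.0 / x)
C = np.exp(ln_C)

log_residuals = np.log(y) - (1.0 / x + ln_C)   # deviations on the transformed plot
orig_residuals = y - C * np.exp(1.0 / x)       # deviations in the original variables

print(f"C = {C:.3f}")
print(f"max |residual| in log space: {np.max(np.abs(log_residuals)):.3f}")
print(f"max |residual| in y space  : {np.max(np.abs(orig_residuals)):.3f}")
```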
Analytic Method of Model Fitting: Chebyshev Approximation Criterion

Given: a collection of m data points $(x_i, y_i)$, $i = 1, 2, \ldots, m$.

Goal: find the best-fitting curve $y = f(x)$.

Criterion: minimize the largest absolute deviation $|y_i - f(x_i)|$ over the entire collection of data points.

This important criterion is often called the Chebyshev approximation criterion.
In more compact notation:

Minimize $\max \{\, |y_i - f(x_i)| \;:\; i = 1, 2, \ldots, m \,\}$
Disadvantages:

- It is often difficult to apply in practice, at least using only elementary calculus.
- The optimization problems that result from applying the criterion may require advanced mathematical procedures or numerical algorithms that require the use of computers.
Example:

Suppose we want to measure the line segments AB, BC, and AC represented below. Suppose your measurements yield the following estimates:

$AB = 13$,  $BC = 7$,  $AC = 19$.

Note: there is a discrepancy in these results, since the measurements should satisfy $AB + BC = AC$.
Let's resolve the discrepancy using the Chebyshev criterion. That is, we will assign values to the three line segments in such a way that the largest absolute deviation between any corresponding pair of assigned and observed values is minimized.
Assume the same degree of confidence in each measurement, so that each measurement has equal weight. In that case, the discrepancy should be distributed equally across each segment, resulting in the predictions

$|AB| = 12\tfrac{2}{3}$,  $|BC| = 6\tfrac{2}{3}$,  $|AC| = 19\tfrac{1}{3}$.
Thus, each absolute deviation is $\tfrac{1}{3}$. Convince yourself that reducing any one of these deviations causes one of the other deviations to increase; remember that $|AB| + |BC| = |AC|$ must hold.
Now, let's formulate the problem symbolically. Let $x_1$, $x_2$, and $x_3$ be the true lengths of the line segments AB, BC, and AC, respectively, and let $r_1$, $r_2$, and $r_3$ be the discrepancies between the true and measured values, as follows:

$r_1 = x_1 - 13$  (AB)
$r_2 = x_2 - 7$  (BC)
$r_3 = x_3 - 19$  (AC)

The numbers $r_1$, $r_2$, and $r_3$ are called residuals.
Applying the Chebyshev criterion, values would be assigned to $x_1$, $x_2$, and $x_3$ so as to minimize the largest of the numbers $|r_1|$, $|r_2|$, $|r_3|$. Letting this largest number be $r$, we want to

Minimize $r$

subject to the conditions

$|r_1| \le r$, i.e. $r - r_1 \ge 0$ and $r + r_1 \ge 0$,
$|r_2| \le r$, i.e. $r - r_2 \ge 0$ and $r + r_2 \ge 0$,
$|r_3| \le r$, i.e. $r - r_3 \ge 0$ and $r + r_3 \ge 0$,

together with the geometric constraint $x_1 + x_2 = x_3$.
Thus, we want to

Minimize $r$

subject to:

$r - (x_1 - 13) \ge 0$ and $r + (x_1 - 13) \ge 0$
$r - (x_2 - 7) \ge 0$ and $r + (x_2 - 7) \ge 0$
$r - (x_3 - 19) \ge 0$ and $r + (x_3 - 19) \ge 0$
$x_1 + x_2 = x_3$
This problem is called a linear program and can be solved by a computer implementation (or manually) of an algorithm known as the Simplex Method.

Using the Simplex Method, this linear program yields a minimum value of $r = \tfrac{1}{3}$ with

$x_1 = 12\tfrac{2}{3}$,  $x_2 = 6\tfrac{2}{3}$,  $x_3 = 19\tfrac{1}{3}$.
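The slides solve this linear program with the Simplex Method; as a hedged cross-check (not part of the slides), the sketch below feeds the same program to scipy.optimize.linprog with decision variables ordered $(r, x_1, x_2, x_3)$.

```python
from scipy.optimize import linprog

# Minimize r subject to r - (x_i - m_i) >= 0, r + (x_i - m_i) >= 0 for each
# measurement m_i, together with x1 + x2 = x3.  Variables: (r, x1, x2, x3).
c = [1, 0, 0, 0]                 # objective: minimize r
measurements = [13, 7, 19]

A_ub, b_ub = [], []
for i, m in enumerate(measurements):
    coeff = [0, 0, 0]
    coeff[i] = 1
    A_ub.append([-1] + coeff)                     # -r + x_i <= m_i
    b_ub.append(m)
    A_ub.append([-1] + [-v for v in coeff])       # -r - x_i <= -m_i
    b_ub.append(-m)

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              A_eq=[[0, 1, 1, -1]], b_eq=[0], method="highs")
r, x1, x2, x3 = res.x
print(f"r = {r:.4f}, x1 = {x1:.4f}, x2 = {x2:.4f}, x3 = {x3:.4f}")
# Expected from the slides: r = 1/3, x1 = 12 2/3, x2 = 6 2/3, x3 = 19 1/3.
```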
General procedure:

If $r$ represents the largest absolute value of the residuals $r_i = y_i - f(x_i)$, then the problem is to

Minimize $r$

subject to

$r - \left( y_i - f(x_i) \right) \ge 0$
$r + \left( y_i - f(x_i) \right) \ge 0$   for $i = 1, 2, \ldots, m$.

NOTE: This mathematical program is not necessarily linear; it is linear when $f(x)$ depends linearly on the parameters being estimated.
Minimizing the Sum of the Absolute Deviations

Goal: find the best-fitting curve $y = f(x)$ over a collection of data points $(x_i, y_i)$, $i = 1, 2, \ldots, m$.

Criterion: minimize the sum of the absolute deviations $|y_i - f(x_i)|$. That is, determine the parameters of the function $y = f(x)$ that minimize

$\sum_{i=1}^{m} \left| y_i - f(x_i) \right|$
Remarks:

Although we can apply this criterion easily in a geometric sense, the general criterion presents a severe problem analytically. To solve the optimization problem using calculus, we would need to differentiate the sum of the absolute deviations with respect to the parameters of $f(x)$ to find the critical points. However, the derivatives of the sum fail to be continuous because of the presence of the absolute values.
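One practical workaround, not discussed in the slides, is to minimize the sum of absolute deviations numerically with a derivative-free method. The sketch below uses scipy's Nelder-Mead search to fit $y = ax + b$ under this criterion; the data are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

def sum_abs_dev(params):
    a, b = params
    # Not differentiable wherever a residual passes through zero.
    return np.sum(np.abs(y - (a * x + b)))

# Nelder-Mead needs no derivatives, so the kinks in the objective are harmless.
result = minimize(sum_abs_dev, x0=[1.0, 0.0], method="Nelder-Mead")
a, b = result.x
print(f"a = {a:.3f}, b = {b:.3f}, sum of |deviations| = {result.fun:.3f}")
```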
Least-Squares Criterion

This is the most frequently used curve-fitting criterion.

Goal: determine the parameters of a function of the type $y = f(x)$ that minimize the sum

$\sum_{i=1}^{m} \left( y_i - f(x_i) \right)^2$
Advantages:

1. Ease of solving the resulting optimization problem using the calculus of several variables.
2. The criterion is lenient toward relatively small deviations but discriminates against relatively large deviations.
Geometric interpretation:

Consider the case of three data points and let $R_i = |y_i - f(x_i)|$ denote the deviation between the observed and predicted values for $i = 1, 2, 3$.

Now, think of the $R_i$ as the scalar components of a deviation vector, as depicted below:

$\mathbf{R} = R_1 \mathbf{i} + R_2 \mathbf{j} + R_3 \mathbf{k}$
Thus, applying the least-squares criterion is like minimizing the
magnitude of a vector whose components are the residuals.
Relating the Three Criteria

Minimizing the sum of the absolute deviations tends to treat each data point with equal weight and to average the deviations.

The Chebyshev criterion gives more weight to a single point potentially having the largest deviation.

The least-squares criterion is somewhere in between as far as weighting individual points with significant deviations is concerned.
Applying the Least-Squares Criterion

Fitting a Straight Line

Goal: find the best-fitting line $y = f(x) = ax + b$ over the data points $(x_i, y_i)$, $i = 1, 2, \ldots, m$.

Applying the least-squares criterion to this situation requires minimization of

$S = \sum_{i=1}^{m} \left[ y_i - f(x_i) \right]^2 = \sum_{i=1}^{m} \left( y_i - a x_i - b \right)^2$
A necessary condition for optimality is that the two partial derivatives $\partial S / \partial a$ and $\partial S / \partial b$ equal zero, yielding the equations

$\dfrac{\partial S}{\partial a} = -2 \sum_{i=1}^{m} x_i \left( y_i - a x_i - b \right) = 0$

$\dfrac{\partial S}{\partial b} = -2 \sum_{i=1}^{m} \left( y_i - a x_i - b \right) = 0$
These equations can be rewritten to give the normal equations

$a \sum_{i=1}^{m} x_i^2 + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} x_i y_i$

$a \sum_{i=1}^{m} x_i + m b = \sum_{i=1}^{m} y_i$

The preceding equations can be solved for $a$ and $b$ once all the values of $x_i$ and $y_i$ are substituted into them.
The solutions for the parameters $a$ and $b$ are easily obtained by elimination from the previous equations and are found to be

$a = \dfrac{m \sum x_i y_i - \sum x_i \sum y_i}{m \sum x_i^2 - \left( \sum x_i \right)^2}$   (slope)

$b = \dfrac{\sum x_i^2 \sum y_i - \sum x_i y_i \sum x_i}{m \sum x_i^2 - \left( \sum x_i \right)^2}$   (y-intercept)
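A short sketch of these closed-form formulas, cross-checked against numpy.polyfit; it is not from the slides, and the data values are invented for illustration.

```python
import numpy as np

# Hypothetical data (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
m = len(x)

# Slope and intercept from the closed-form least-squares formulas above.
denom = m * np.sum(x**2) - np.sum(x)**2
a = (m * np.sum(x * y) - np.sum(x) * np.sum(y)) / denom
b = (np.sum(x**2) * np.sum(y) - np.sum(x * y) * np.sum(x)) / denom

a_np, b_np = np.polyfit(x, y, 1)   # should agree with the formulas
print(f"formulas: a = {a:.5f}, b = {b:.5f}")
print(f"polyfit : a = {a_np:.5f}, b = {b_np:.5f}")
```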
Fitting a Power Curve

Model form: $y = a x^n$, where the power $n$ is fixed.

Application of the criterion then requires the minimization of

$S = \sum_{i=1}^{m} \left[ y_i - f(x_i) \right]^2 = \sum_{i=1}^{m} \left[ y_i - a x_i^n \right]^2$
A necessary condition for optimality is that the derivative $dS/da$ equals zero, giving the equation

$\dfrac{dS}{da} = -2 \sum_{i=1}^{m} x_i^n \left( y_i - a x_i^n \right) = 0$

Solving the equation for $a$ yields

$a = \dfrac{\sum x_i^n y_i}{\sum x_i^{2n}}$
Example: fitting the power curve $y = a x^2$ to a collection of data points (worked on the slides with accompanying figures).
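A hedged sketch of this closed-form fit for $n = 2$. The five data points below are an assumption: the slides' own data table is not in the extracted text, so the values of the standard textbook example this course appears to follow are used instead; treat them as illustrative.

```python
import numpy as np

# Assumed data for the worked example (textbook values; the slides' table is
# not reproduced in the extracted text, so treat these as an assumption).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0.7, 3.4, 7.2, 12.4, 20.1])

n = 2  # fixed power in the model y = a * x**n
a = np.sum(x**n * y) / np.sum(x**(2 * n))   # closed-form least-squares coefficient
print(f"least-squares fit: y = {a:.4f} * x^{n}")   # about 3.1869 for these values
```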
Transformed Least-Squares Fit

Consider fitting a curved model, such as the power curve $y = C x^n$ with both $C$ and $n$ to be determined, directly using the least-squares criterion.

Application of the criterion requires the minimization of

$S = \sum_{i=1}^{m} \left[ y_i - f(x_i) \right]^2$

Again, a necessary condition for optimality is that the partial derivatives of $S$ with respect to the model parameters equal zero.
Formulating these conditions, you can verify that the resulting nonlinear system would not be easy to solve. (Verify!)

Many of these simple models result in derivatives that are very complex or in systems of equations that are difficult to solve. For this reason, we use transformations that allow us to approximate the least-squares fit.
Suppose we wish to fit the power curve $y = C x^n$ to a collection of data points.

Taking the logarithm of both sides of $y = C x^n$ yields

$\ln y = \ln C + n \ln x$
Note that plotting $\ln y$ versus $\ln x$, this equation yields a straight line with slope $n$ and intercept $\ln C$.

Using the formulas for the least-squares line with the transformed variables and $m = 5$ data points, we have

$n = \dfrac{5 \sum (\ln x_i)(\ln y_i) - \left( \sum \ln x_i \right)\left( \sum \ln y_i \right)}{5 \sum (\ln x_i)^2 - \left( \sum \ln x_i \right)^2}$

$\ln C = \dfrac{\sum (\ln x_i)^2 \sum \ln y_i - \left( \sum \ln x_i \right) \sum (\ln x_i)(\ln y_i)}{5 \sum (\ln x_i)^2 - \left( \sum \ln x_i \right)^2}$
For the data in the previous example, we get

$\sum \ln x_i = 1.3217558$
$\sum \ln y_i = 8.359597801$
$\sum (\ln x_i)^2 = 1.9648967$
$\sum (\ln x_i)(\ln y_i) = 5.542315175$

yielding

$n = 2.062809314$ and $\ln C = 1.126613508$, or $C = 3.085190815$.
Thus, our transformed least-squares best fit is

$y = 3.0852\, x^{2.0628}$

This model predicts $y = 16.4348$ when $x = 2.25$.

Note: this model fails to be quadratic, unlike the one we fit previously.
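The sketch below reproduces this calculation by fitting a least-squares line in the $(\ln x, \ln y)$ variables, again using the assumed data points from the earlier power-curve sketch (an assumption, since the slides' data table is not in the extracted text).

```python
import numpy as np

# Assumed data (see the earlier power-curve sketch); treat as illustrative.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0.7, 3.4, 7.2, 12.4, 20.1])

# Least-squares line in the transformed variables: ln y = n * ln x + ln C.
n, lnC = np.polyfit(np.log(x), np.log(y), 1)
C = np.exp(lnC)
print(f"n = {n:.6f}, ln C = {lnC:.6f}, C = {C:.6f}")     # compare with 2.0628 and 3.0852
print(f"prediction at x = 2.25: y = {C * 2.25**n:.4f}")  # compare with 16.4348
```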
Fitting y = ax^2 to the Previous Example Using the Chebyshev Approximation Criterion

The problem is to

Minimize $r$

subject to

$r - \left( y_i - a x_i^2 \right) \ge 0$ and $r + \left( y_i - a x_i^2 \right) \ge 0$   for each data point $(x_i, y_i)$.

The solution to the preceding linear program yields $r = 0.28293$ and $a = 3.17073$. Thus, the minimum largest deviation is 0.28293 and the model is $y = 3.17073\, x^2$.

We have now determined several estimates of the parameter $a$ for the model type $y = a x^2$. Which estimate is the best?
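A hedged sketch of this linear program with scipy.optimize.linprog, using the same assumed data points as the earlier sketches; the decision variables are $(r, a)$.

```python
import numpy as np
from scipy.optimize import linprog

# Assumed data (illustrative; see the earlier power-curve sketch).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0.7, 3.4, 7.2, 12.4, 20.1])

# Minimize r subject to r - (y_i - a*x_i^2) >= 0 and r + (y_i - a*x_i^2) >= 0,
# rewritten in the A_ub @ z <= b_ub form expected by linprog, with z = (r, a):
#   -r - a*x_i^2 <= -y_i    and    -r + a*x_i^2 <= y_i
c = [1, 0]
A_ub = np.vstack([np.column_stack([-np.ones_like(x), -x**2]),
                  np.column_stack([-np.ones_like(x),  x**2])])
b_ub = np.concatenate([-y, y])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
r, a = res.x
print(f"largest deviation r = {r:.5f}, a = {a:.5f}")  # slides report 0.28293 and 3.17073
```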
Choosing the Best Model

Summary of the results of the three models (table of fitted parameters and deviations).

Note the increase in the sum of the squares of the deviations in the transformed least-squares model.
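To recreate such a summary, the sketch below computes the largest absolute deviation, the sum of absolute deviations, and the sum of squared deviations for the three candidate fits discussed in this section, again on the assumed data set, so the exact numbers are illustrative.

```python
import numpy as np

# Assumed data (illustrative; see the earlier sketches).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0.7, 3.4, 7.2, 12.4, 20.1])

a_ls = np.sum(x**2 * y) / np.sum(x**4)   # direct least-squares fit of y = a*x^2
models = {
    "least squares  y = a*x^2          ": a_ls * x**2,
    "Chebyshev      y = 3.17073*x^2    ": 3.17073 * x**2,
    "transformed LS y = 3.0852*x^2.0628": 3.0852 * x**2.0628,
}

for name, y_hat in models.items():
    d = y - y_hat
    print(f"{name}: max|d| = {np.max(np.abs(d)):.4f}, "
          f"sum|d| = {np.sum(np.abs(d)):.4f}, sum d^2 = {np.sum(d**2):.4f}")
```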
It is tempting to apply a simple rule, such as choosing the model with the smallest absolute deviations, but other statistical goodness-of-fit indicators exist as well.

These indicators are useful in eliminating obviously poor models, but there is no easy answer to the question "Which is the best model?"

The model with the smallest absolute deviations or smallest sum of squares may still fit very poorly over the range where you intend to use it.
In each of these graphs, the fitted model has the same sum of squared deviations.
Furthermore, as you may already know, interpolation can easily construct models that pass through every data point, thereby yielding a zero sum of squares and a zero maximum deviation.

So answering the question of which model is best must be done on a case-by-case basis, taking into account such things as:

- the purpose of the model,
- the precision demanded by the scenario,
- the accuracy of the data,
- the range of values for the independent variable(s).
