III. Model Fitting
MATHEMATICS DIVISION
IMSP - UPLB
Introduction

Three possible tasks arise when analyzing a collection of data points:

1. Fitting a model of a selected type (or types) to the data.
2. Choosing the most appropriate model from competing types that have been fitted (e.g. a best-fitting exponential model versus a best-fitting polynomial model).
3. Making predictions from the collected data.

The first two tasks fall under the general heading of model fitting, where a model (or competing models) exists that seems to explain the observed behavior.
In the third case, a model does not exist to explain the observed behavior. Instead, we wish to construct an empirical model based on the collected data (e.g. using interpolation).
Relationship Between Model Fitting
and Interpolation
What must be done in each case?
In Task 1 the precise meaning of the best model
must be identified.
In Task 2 a criterion is needed for comparing
models of different types.
In Task 3 a criterion must be established for
determining how to make predictions in between
the observed data points.
Modeler's attitude?
Model fitting
- the modeler strongly suspects a relationship of a
particular type
- the modeler is willing to accept some deviation
(errors) between the model and the collected data
points
Interpolation.
- the modeler is strongly guided by the data that have been carefully collected and analyzed when seeking the curve that captures the data points
- the modeler generally attaches little explicative
significance to the interpolating curves and expects
no error between the curve and the data points.
Similarity? In either case, the modeler may ultimately want to make predictions from the model.

Note: explicative models are theory driven; predictive models are data driven.
Example

Suppose we are attempting to relate two variables x and y, and have gathered the data plotted in the following figure.

Figure 1
One way to make predictions is to use a technique such as spline interpolation, with the result shown in Figure 2.

Figure 2
Another way is to fit a curve that the modeler suspects, such as the parabola

$y = C_1 x^2 + C_2 x + C_3$.

In this case, the next concern is to determine the arbitrary constants $C_1$, $C_2$, and $C_3$ that select the best parabola.
Figure 3

Another example of interpolation (figure).
Remarks:

- Sometimes you need to fit a model and also interpolate in the same problem.
- Subsequent analysis involving operations such as integration or differentiation should be considered in selecting a model.
- The model may be replaced with an interpolating curve (such as a polynomial) that is more readily differentiated or integrated (e.g. a step function used to model a square wave might be replaced by a trigonometric function).
In these instances, the modeler desires the
interpolating curve to approximate closely the
essential characteristics of the function it replaces.
This type of interpolation is usually called
approximation.
Sources of Error in the Modeling Process
Sources of errors should be considered in modeling.
If error considerations are neglected, undue confidence may
be placed in intermediate results, causing faulty decisions in
subsequent steps.
You need to consider the effects of cumulative errors that
exist from previous steps.
Classification of errors:
1. Formulation errors - result from the assumption that
certain variables are negligible or from simplifications in
describing interrelationships among the variables in the
various sub-models.
2. Truncation errors - are attributable to the numerical method used to solve a mathematical problem.

For example, sin x can be represented by the power series

$\sin x = x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \cdots$

A truncation error is introduced whenever the series is cut off after finitely many terms (a small numerical illustration follows this list).
3. Round-off errors - are caused by using a finite-digit machine, such as a calculator or computer, for computation. The accumulated effect of round-off can significantly affect the final answer.

4. Measurement errors - are caused by imprecision in the data collection.
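As a small numerical illustration of truncation error (item 2 above), the following sketch compares partial sums of the sine series against math.sin. It is not part of the slides; the evaluation point and the numbers of retained terms are arbitrary choices.

```python
import math

def sin_partial_sum(x: float, terms: int) -> float:
    """Approximate sin(x) using the first `terms` terms of its power series."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

x = 1.2  # arbitrary evaluation point
for terms in (1, 2, 3, 5):
    approx = sin_partial_sum(x, terms)
    # The truncation error is exactly the tail of the series that was dropped.
    print(f"{terms} term(s): approx = {approx:.8f}, "
          f"truncation error = {math.sin(x) - approx:.2e}")
```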
Fitting Models to Data Graphically

The model generally contains one or more parameters, and sufficient data must be gathered to determine them. This is the problem of data collection.

Some considerations in data collection:

- The determination of how many data points to collect involves a trade-off between the cost of obtaining them and the accuracy required of the model.
- As a minimum, the modeler needs as many data points as there are arbitrary constants in the model curve.
- The range over which the model is to be used determines the endpoints of the interval for the independent variable(s).
- The spacing of the data points within the interval is also important: any part of the interval over which the model must fit particularly well can be weighted by using unequal spacing.
- Each data point should also be thought of as an interval of confidence whose width is commensurate with the appraisal of the errors.

Figure 4. Data points viewed as intervals of confidence.
Visual Model Fitting with the Original Data

Suppose we want to fit the model y = ax + b to the data shown in Figure 4. How might we choose the constants a and b to determine the line that best fits the data?

Generally, when more than two data points exist, not all of them can be expected to lie exactly along a single straight line. Ordinarily, there will be some vertical discrepancies between a few of the data points and the line; these discrepancies are usually called absolute deviations.

For the best-fitting line, we might try to minimize the sum of the absolute deviations, or to minimize the largest absolute deviation from the fitted line.

Figure 5   Figure 6
Remarks:

Although these visual methods for fitting a line to the data points are not analytical and may appear imprecise, they are often quite compatible with the accuracy of the modeling process. Furthermore, these techniques immediately give an impression of how good the fit is and where it appears to fit well.
Transforming the Data

Most of us are limited visually to fitting only lines. So how can we graphically fit curves as models?

Suppose, for example, that a relationship of the form $y = C e^x$ is suspected for some sub-model, and the data shown below have been collected.

The model states that y is proportional to $e^x$. Thus, if we plot y versus $e^x$, we should obtain approximately a straight line with slope C.

Figure 7
From the figure, the slope of the line is approximated as

$C \approx \dfrac{165 - 60.1}{54.6 - 20.1} \approx 3.0$
Alternate technique: using a transformation.

Take the logarithm of each side of the equation $y = C e^x$ to obtain

$\ln y = \ln C + x$.

Note that this is a linear equation in $\ln y$ and $x$, with y-intercept $\ln C$.

Figure 8

From the figure, we can determine that $\ln C$ is approximately 1.1, giving $C = e^{1.1} \approx 3.0$ as before.
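A minimal sketch of this transformation in Python; the data values are invented to roughly follow $y = C e^x$ (the actual values belong to the slides' figure, which is not reproduced here). Fitting a line to $(x, \ln y)$ should return a slope near 1 and an intercept near $\ln C$.

```python
import numpy as np

# Hypothetical data assumed to roughly follow y = C * exp(x); illustrative only.
x = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
rng = np.random.default_rng(0)
y = 3.0 * np.exp(x) * rng.normal(1.0, 0.02, size=x.size)  # C = 3 plus small noise

# Transformed model: ln y = ln C + x, a straight line in (x, ln y).
slope, intercept = np.polyfit(x, np.log(y), 1)
print(f"fitted slope = {slope:.3f}  (the model predicts 1)")
print(f"ln C = {intercept:.3f},  so C = {np.exp(intercept):.3f}")
```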
Similar transformations can be made on a variety of other curves to produce linear relationships (for example, taking logarithms of both sides of a power curve $y = C x^n$).

Limitation: the transformation distorts distances, so deviations measured on the transformed plot are not the deviations in the original variables.

Solution: validate the model by comparing it to the graph of the original data, not the transformed data.
Example: we want to fit a model of the form $y = C e^{1/x}$ to the plotted data below.

Logarithmic transformation:

$\ln y = \dfrac{1}{x} + \ln C$

Remarks:

1. The points of the graph of the transformed data are squeezed together.
2. Consequently, if a line is made to fit the transformed data, the absolute deviations appear small, even for a relatively poor fitted model of the form $y = C e^{1/x}$.
Remember:

1. Be careful when using transformations; you might end up selecting a relatively poor model.
2. Comparisons should be made with the original data.
3. Be aware that many computer codes fit models by first making a transformation.
4. If you intend to use goodness-of-fit indicators to decide on the best model, first ascertain how those indicators were computed.
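To see the distortion numerically, the hedged sketch below estimates $\ln C$ for the model $\ln y = 1/x + \ln C$ on invented data and then compares the residuals in the transformed (logarithmic) variables with the residuals in the original variables. The data are illustrative assumptions, not the values behind the slides' plot.

```python
import numpy as np

# Hypothetical data (illustrative only) for a model of the form y = C * exp(1/x).
x = np.array([0.2, 0.4, 0.6, 0.8, 1.0, 1.5])
y = np.array([450.0, 36.0, 16.0, 10.5, 8.0, 6.0])

# Transformed model: ln y = 1/x + ln C, so estimate ln C by averaging ln y - 1/x.
ln_C = np.mean(np.log(y) - 1.0 / x)
C = np.exp(ln_C)

log_residuals = np.log(y) - (1.0 / x + ln_C)   # deviations on the transformed plot
orig_residuals = y - C * np.exp(1.0 / x)       # deviations in the original variables

print(f"C = {C:.3f}")
print(f"max |residual| in log space: {np.max(np.abs(log_residuals)):.3f}")
print(f"max |residual| in y space  : {np.max(np.abs(orig_residuals)):.3f}")
```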
Analytic Method of Model Fitting: Chebyshev Approximation Criterion

Given: a collection of m data points $(x_i, y_i)$, $i = 1, 2, \ldots, m$.

Goal: find the best-fitting curve $y = f(x)$.

Criterion: minimize the largest absolute deviation $|y_i - f(x_i)|$ over the entire collection of data points.

This important criterion is often called the Chebyshev approximation criterion.
In more compact notation:

Minimize $\max \{\, |y_i - f(x_i)| \;:\; i = 1, 2, \ldots, m \,\}$
Disadvantages:

- It is often difficult to apply in practice, at least using only elementary calculus.
- The optimization problems that result from applying the criterion may require advanced mathematical procedures or numerical algorithms that require the use of computers.
Example:

Suppose we want to measure the line segments AB, BC, and AC represented below. Suppose your measurements yield the following estimates:

$AB = 13$,  $BC = 7$,  $AC = 19$.

Note: there is a discrepancy in these results, since the measurements should satisfy $AB + BC = AC$.
Let's resolve the discrepancy using the Chebyshev criterion. That is, we will assign values to the three line segments in such a way that the largest absolute deviation between any corresponding pair of assigned and observed values is minimized.
Assume the same degree of confidence in each measurement, so that each measurement has equal weight. In that case, the discrepancy should be distributed equally across each segment, resulting in the predictions

$|AB| = 12\tfrac{2}{3}$,  $|BC| = 6\tfrac{2}{3}$,  $|AC| = 19\tfrac{1}{3}$.
Thus, each absolute deviation is $\tfrac{1}{3}$. Convince yourself that reducing any one of these deviations causes one of the other deviations to increase; remember that $|AB| + |BC| = |AC|$ must hold.
Now, let's formulate the problem symbolically. Let $x_1$, $x_2$, and $x_3$ be the true lengths of the line segments AB, BC, and AC, respectively, and let $r_1$, $r_2$, and $r_3$ be the discrepancies between the true and measured values, as follows:

$r_1 = x_1 - 13$  (AB)
$r_2 = x_2 - 7$  (BC)
$r_3 = x_3 - 19$  (AC)

The numbers $r_1$, $r_2$, and $r_3$ are called residuals.
Applying the Chebyshev criterion, values would be assigned to $x_1$, $x_2$, and $x_3$ so as to minimize the largest of the numbers $|r_1|$, $|r_2|$, $|r_3|$. Letting this largest number be $r$, we want to

Minimize $r$

subject to the conditions

$|r_1| \le r$, i.e. $r - r_1 \ge 0$ and $r + r_1 \ge 0$,
$|r_2| \le r$, i.e. $r - r_2 \ge 0$ and $r + r_2 \ge 0$,
$|r_3| \le r$, i.e. $r - r_3 \ge 0$ and $r + r_3 \ge 0$,

together with the geometric constraint $x_1 + x_2 = x_3$.
Thus, we want to

Minimize $r$

subject to:

$r - (x_1 - 13) \ge 0$ and $r + (x_1 - 13) \ge 0$
$r - (x_2 - 7) \ge 0$ and $r + (x_2 - 7) \ge 0$
$r - (x_3 - 19) \ge 0$ and $r + (x_3 - 19) \ge 0$
$x_1 + x_2 = x_3$
This problem is called a linear program and can be solved by a computer implementation (or manually) of an algorithm known as the Simplex Method.

Using the Simplex Method, this linear program yields a minimum value of $r = \tfrac{1}{3}$ with

$x_1 = 12\tfrac{2}{3}$,  $x_2 = 6\tfrac{2}{3}$,  $x_3 = 19\tfrac{1}{3}$.
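The slides solve this linear program with the Simplex Method; as a hedged cross-check (not part of the slides), the sketch below feeds the same program to scipy.optimize.linprog with decision variables ordered $(r, x_1, x_2, x_3)$.

```python
from scipy.optimize import linprog

# Minimize r subject to r - (x_i - m_i) >= 0, r + (x_i - m_i) >= 0 for each
# measurement m_i, together with x1 + x2 = x3.  Variables: (r, x1, x2, x3).
c = [1, 0, 0, 0]                 # objective: minimize r
measurements = [13, 7, 19]

A_ub, b_ub = [], []
for i, m in enumerate(measurements):
    coeff = [0, 0, 0]
    coeff[i] = 1
    A_ub.append([-1] + coeff)                     # -r + x_i <= m_i
    b_ub.append(m)
    A_ub.append([-1] + [-v for v in coeff])       # -r - x_i <= -m_i
    b_ub.append(-m)

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              A_eq=[[0, 1, 1, -1]], b_eq=[0], method="highs")
r, x1, x2, x3 = res.x
print(f"r = {r:.4f}, x1 = {x1:.4f}, x2 = {x2:.4f}, x3 = {x3:.4f}")
# Expected from the slides: r = 1/3, x1 = 12 2/3, x2 = 6 2/3, x3 = 19 1/3.
```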
General procedure:

If $r$ represents the largest absolute value of the residuals $r_i = y_i - f(x_i)$, then the problem is to

Minimize $r$

subject to

$r - \left( y_i - f(x_i) \right) \ge 0$
$r + \left( y_i - f(x_i) \right) \ge 0$   for $i = 1, 2, \ldots, m$.

NOTE: This mathematical program is not necessarily linear; it is linear when $f(x)$ depends linearly on the parameters being estimated.
Minimizing the Sum of the Absolute Deviations

Goal: find the best-fitting curve $y = f(x)$ over a collection of data points $(x_i, y_i)$, $i = 1, 2, \ldots, m$.

Criterion: minimize the sum of the absolute deviations $|y_i - f(x_i)|$. That is, determine the parameters of the function $y = f(x)$ that minimize

$\sum_{i=1}^{m} \left| y_i - f(x_i) \right|$
Remarks:

Although we can apply this criterion easily in a geometric sense, the general criterion presents a severe problem analytically. To solve the optimization problem using calculus, we would need to differentiate the sum of the absolute deviations with respect to the parameters of $f(x)$ to find the critical points. However, the derivatives of the sum fail to be continuous because of the presence of the absolute values.
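One practical workaround, not discussed in the slides, is to minimize the sum of absolute deviations numerically with a derivative-free method. The sketch below uses scipy's Nelder-Mead search to fit $y = ax + b$ under this criterion; the data are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

def sum_abs_dev(params):
    a, b = params
    # Not differentiable wherever a residual passes through zero.
    return np.sum(np.abs(y - (a * x + b)))

# Nelder-Mead needs no derivatives, so the kinks in the objective are harmless.
result = minimize(sum_abs_dev, x0=[1.0, 0.0], method="Nelder-Mead")
a, b = result.x
print(f"a = {a:.3f}, b = {b:.3f}, sum of |deviations| = {result.fun:.3f}")
```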
Least-Squares Criterion

This is the most frequently used curve-fitting criterion.

Goal: determine the parameters of a function of the type $y = f(x)$ that minimize the sum

$\sum_{i=1}^{m} \left( y_i - f(x_i) \right)^2$
Advantages:

1. Ease of solving the resulting optimization problem using the calculus of several variables.
2. The criterion is lenient toward relatively small deviations but discriminates against relatively large deviations.
Geometric interpretation:

Consider the case of three data points and let $R_i = |y_i - f(x_i)|$ denote the deviation between the observed and predicted values for $i = 1, 2, 3$.

Now, think of the $R_i$ as the scalar components of a deviation vector, as depicted below:

$\mathbf{R} = R_1 \mathbf{i} + R_2 \mathbf{j} + R_3 \mathbf{k}$
Thus, applying the least-squares criterion is like minimizing the
magnitude of a vector whose components are the residuals.
Relating the Three Criteria

Minimizing the sum of the absolute deviations tends to treat each data point with equal weight and to average the deviations.

The Chebyshev criterion gives more weight to a single point potentially having the largest deviation.

The least-squares criterion is somewhere in between as far as weighting individual points with significant deviations is concerned.
Applying the Least-Squares Criterion

Fitting a Straight Line

Goal: find the best-fitting line $y = f(x) = ax + b$ over the data points $(x_i, y_i)$, $i = 1, 2, \ldots, m$.

Applying the least-squares criterion to this situation requires minimization of

$S = \sum_{i=1}^{m} \left[ y_i - f(x_i) \right]^2 = \sum_{i=1}^{m} \left( y_i - a x_i - b \right)^2$
A necessary condition for optimality is that the two partial derivatives $\partial S / \partial a$ and $\partial S / \partial b$ equal zero, yielding the equations

$\dfrac{\partial S}{\partial a} = -2 \sum_{i=1}^{m} x_i \left( y_i - a x_i - b \right) = 0$

$\dfrac{\partial S}{\partial b} = -2 \sum_{i=1}^{m} \left( y_i - a x_i - b \right) = 0$
These equations can be rewritten to give the normal equations

$a \sum_{i=1}^{m} x_i^2 + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} x_i y_i$

$a \sum_{i=1}^{m} x_i + m b = \sum_{i=1}^{m} y_i$

The preceding equations can be solved for $a$ and $b$ once all the values of $x_i$ and $y_i$ are substituted into them.
The solutions for the parameters $a$ and $b$ are easily obtained by elimination from the previous equations and are found to be

$a = \dfrac{m \sum x_i y_i - \sum x_i \sum y_i}{m \sum x_i^2 - \left( \sum x_i \right)^2}$   (slope)

$b = \dfrac{\sum x_i^2 \sum y_i - \sum x_i y_i \sum x_i}{m \sum x_i^2 - \left( \sum x_i \right)^2}$   (y-intercept)
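A short sketch of these closed-form formulas, cross-checked against numpy.polyfit; it is not from the slides, and the data values are invented for illustration.

```python
import numpy as np

# Hypothetical data (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
m = len(x)

# Slope and intercept from the closed-form least-squares formulas above.
denom = m * np.sum(x**2) - np.sum(x)**2
a = (m * np.sum(x * y) - np.sum(x) * np.sum(y)) / denom
b = (np.sum(x**2) * np.sum(y) - np.sum(x * y) * np.sum(x)) / denom

a_np, b_np = np.polyfit(x, y, 1)   # should agree with the formulas
print(f"formulas: a = {a:.5f}, b = {b:.5f}")
print(f"polyfit : a = {a_np:.5f}, b = {b_np:.5f}")
```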
Fitting a Power Curve

Model form: $y = a x^n$, where the power $n$ is fixed.

Application of the criterion then requires the minimization of

$S = \sum_{i=1}^{m} \left[ y_i - f(x_i) \right]^2 = \sum_{i=1}^{m} \left[ y_i - a x_i^n \right]^2$
A necessary condition for optimality is that the derivative $dS/da$ equals zero, giving the equation

$\dfrac{dS}{da} = -2 \sum_{i=1}^{m} x_i^n \left( y_i - a x_i^n \right) = 0$

Solving the equation for $a$ yields

$a = \dfrac{\sum x_i^n y_i}{\sum x_i^{2n}}$
Example: fitting the power curve $y = a x^2$ to a collection of data points (worked on the slides with accompanying figures).
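A hedged sketch of this closed-form fit for $n = 2$. The five data points below are an assumption: the slides' own data table is not in the extracted text, so the values of the standard textbook example this course appears to follow are used instead; treat them as illustrative.

```python
import numpy as np

# Assumed data for the worked example (textbook values; the slides' table is
# not reproduced in the extracted text, so treat these as an assumption).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0.7, 3.4, 7.2, 12.4, 20.1])

n = 2  # fixed power in the model y = a * x**n
a = np.sum(x**n * y) / np.sum(x**(2 * n))   # closed-form least-squares coefficient
print(f"least-squares fit: y = {a:.4f} * x^{n}")   # about 3.1869 for these values
```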
Transformed Least-Squares Fit

Consider fitting a curved model, such as the power curve $y = C x^n$ with both $C$ and $n$ to be determined, directly using the least-squares criterion.

Application of the criterion requires the minimization of

$S = \sum_{i=1}^{m} \left[ y_i - f(x_i) \right]^2$

Again, a necessary condition for optimality is that the partial derivatives of $S$ with respect to the model parameters equal zero.
Formulating these conditions, you can verify that the resulting nonlinear system would not be easy to solve. (Verify!)

Many of these simple models result in derivatives that are very complex or in systems of equations that are difficult to solve. For this reason, we use transformations that allow us to approximate the least-squares fit.
Suppose we wish to fit the power curve $y = C x^n$ to a collection of data points.

Taking the logarithm of both sides of $y = C x^n$ yields

$\ln y = \ln C + n \ln x$
Note that plotting $\ln y$ versus $\ln x$, this equation yields a straight line with slope $n$ and intercept $\ln C$.

Using the formulas for the least-squares line with the transformed variables and $m = 5$ data points, we have

$n = \dfrac{5 \sum (\ln x_i)(\ln y_i) - \left( \sum \ln x_i \right)\left( \sum \ln y_i \right)}{5 \sum (\ln x_i)^2 - \left( \sum \ln x_i \right)^2}$

$\ln C = \dfrac{\sum (\ln x_i)^2 \sum \ln y_i - \left( \sum \ln x_i \right) \sum (\ln x_i)(\ln y_i)}{5 \sum (\ln x_i)^2 - \left( \sum \ln x_i \right)^2}$
For the data in the previous example, we get

$\sum \ln x_i = 1.3217558$
$\sum \ln y_i = 8.359597801$
$\sum (\ln x_i)^2 = 1.9648967$
$\sum (\ln x_i)(\ln y_i) = 5.542315175$

yielding

$n = 2.062809314$ and $\ln C = 1.126613508$, or $C = 3.085190815$.
Thus, our transformed least-squares best fit is

$y = 3.0852\, x^{2.0628}$

This model predicts $y = 16.4348$ when $x = 2.25$.

Note: this model fails to be quadratic, unlike the one we fit previously.
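The sketch below reproduces this calculation by fitting a least-squares line in the $(\ln x, \ln y)$ variables, again using the assumed data points from the earlier power-curve sketch (an assumption, since the slides' data table is not in the extracted text).

```python
import numpy as np

# Assumed data (see the earlier power-curve sketch); treat as illustrative.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0.7, 3.4, 7.2, 12.4, 20.1])

# Least-squares line in the transformed variables: ln y = n * ln x + ln C.
n, lnC = np.polyfit(np.log(x), np.log(y), 1)
C = np.exp(lnC)
print(f"n = {n:.6f}, ln C = {lnC:.6f}, C = {C:.6f}")     # compare with 2.0628 and 3.0852
print(f"prediction at x = 2.25: y = {C * 2.25**n:.4f}")  # compare with 16.4348
```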
Fitting y = ax^2 to the Previous Example Using the Chebyshev Approximation Criterion

The problem is to

Minimize $r$

subject to

$r - \left( y_i - a x_i^2 \right) \ge 0$ and $r + \left( y_i - a x_i^2 \right) \ge 0$   for each data point $(x_i, y_i)$.

The solution to the preceding linear program yields $r = 0.28293$ and $a = 3.17073$. Thus, the minimum largest deviation is 0.28293 and the model is $y = 3.17073\, x^2$.

We have now determined several estimates of the parameter $a$ for the model type $y = a x^2$. Which estimate is the best?
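A hedged sketch of this linear program with scipy.optimize.linprog, using the same assumed data points as the earlier sketches; the decision variables are $(r, a)$.

```python
import numpy as np
from scipy.optimize import linprog

# Assumed data (illustrative; see the earlier power-curve sketch).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0.7, 3.4, 7.2, 12.4, 20.1])

# Minimize r subject to r - (y_i - a*x_i^2) >= 0 and r + (y_i - a*x_i^2) >= 0,
# rewritten in the A_ub @ z <= b_ub form expected by linprog, with z = (r, a):
#   -r - a*x_i^2 <= -y_i    and    -r + a*x_i^2 <= y_i
c = [1, 0]
A_ub = np.vstack([np.column_stack([-np.ones_like(x), -x**2]),
                  np.column_stack([-np.ones_like(x),  x**2])])
b_ub = np.concatenate([-y, y])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
r, a = res.x
print(f"largest deviation r = {r:.5f}, a = {a:.5f}")  # slides report 0.28293 and 3.17073
```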
Choosing the Best Model

Summary of the results of the three models (table of fitted parameters and deviations).

Note the increase in the sum of the squares of the deviations in the transformed least-squares model.
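To recreate such a summary, the sketch below computes the largest absolute deviation, the sum of absolute deviations, and the sum of squared deviations for the three candidate fits discussed in this section, again on the assumed data set, so the exact numbers are illustrative.

```python
import numpy as np

# Assumed data (illustrative; see the earlier sketches).
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([0.7, 3.4, 7.2, 12.4, 20.1])

a_ls = np.sum(x**2 * y) / np.sum(x**4)   # direct least-squares fit of y = a*x^2
models = {
    "least squares  y = a*x^2          ": a_ls * x**2,
    "Chebyshev      y = 3.17073*x^2    ": 3.17073 * x**2,
    "transformed LS y = 3.0852*x^2.0628": 3.0852 * x**2.0628,
}

for name, y_hat in models.items():
    d = y - y_hat
    print(f"{name}: max|d| = {np.max(np.abs(d)):.4f}, "
          f"sum|d| = {np.sum(np.abs(d)):.4f}, sum d^2 = {np.sum(d**2):.4f}")
```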
It is tempting to apply a simple rule, such as choosing the model with the smallest absolute deviations, but other statistical goodness-of-fit indicators exist as well.

These indicators are useful in eliminating obviously poor models, but there is no easy answer to the question "Which is the best model?"

The model with the smallest absolute deviations or smallest sum of squares may still fit very poorly over the range where you intend to use it.
In each of these graphs, the fitted model has the same sum of squared deviations.
Furthermore, as you may already know, interpolation can easily construct models that pass through every data point, thereby yielding a zero sum of squares and a zero maximum deviation.

So answering the question of which model is best must be done on a case-by-case basis, taking into account such things as:

- the purpose of the model,
- the precision demanded by the scenario,
- the accuracy of the data,
- the range of values for the independent variable(s).
