
GEM 803

Numerical Methods
CHAPTER 3
CURVE FITTING
Curve Fitting
Curves are fit to data primarily to obtain estimates at intermediate
points.
Curve fitting is also used to derive a simplified version of a
complicated function: the complicated function is evaluated at a number
of discrete values along the range of interest, and a simpler function
is then derived to fit these values.
Both applications are known as curve fitting.
Curve Fitting
Two general approaches for curve fitting are distinguished on the basis
of the amount of error associated with the data.
• The data exhibit a significant degree of error or “noise.”
With such data, the strategy is to derive a single curve that represents
the general trend of the data. The curve is designed to follow the
pattern of the points taken as a group; because any individual point may
be in error, no effort is made to pass through every point. One approach
of this nature is the least-squares regression method.
Curve Fitting

• The data are known to be very precise.
The basic approach here is to fit a curve, or a series of curves, that
passes directly through each point. Such data are usually obtained from
tables. The estimation of values between well-known discrete points is
called interpolation.
Least-Squares Regression
When the error associated with the data is substantial, interpolation is
inappropriate and may yield unsatisfactory results when predicting
intermediate values. One remedy is to visually inspect the plotted data
and sketch a “best” line through the points. A more systematic way is to
derive the curve that minimizes the discrepancy between the data points
and the curve. The technique best suited for this is called least-squares
regression.
Linear Regression
The simplest illustration of the least-squares approximation is fitting a
straight line to a set of paired observations
{(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}. The mathematical expression
for the straight line is
(1) y = a + bx + e
where a and b are coefficients representing the intercept and the slope,
respectively, and e is the error, or residual, between the model and the
observations.
Linear Regression
Rearranging the equation:
(2) e = y − a − bx
The error, or residual, is the discrepancy between the true value y and
the approximate value a + bx predicted by the linear equation.
Criteria for a “best” fit
• One strategy for fitting a “best” line through the data would be to
minimize the sum of the residual errors:
(3) \sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - a - b x_i)

where n is the number of data points. However, this is an inadequate
criterion. Consider fitting a straight line to two points: the best fit
is obviously the line connecting them, yet any line through the midpoint
between them (except a perfectly vertical one) also gives a sum of zero,
because positive and negative errors cancel.
Criteria for a “best” fit
• Another criterion would be to minimize the sum of the absolute values
of the discrepancies:
(4) \sum_{i=1}^{n} |e_i| = \sum_{i=1}^{n} |y_i - a - b x_i|
This criterion does not yield a unique best fit: more than one line can
minimize the sum of the absolute deviations for the same data set.
• A strategy that overcomes the disadvantages of the aforementioned
approaches is to minimize the sum of the squares of the residuals:
(5) S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2
Least-Squares Fit of a Straight Line
To determine values for a and b, equation (5) is differentiated with
respect to each coefficient:

(6) \frac{\partial S}{\partial a} = -2 \sum (y_i - a - b x_i)
    \frac{\partial S}{\partial b} = -2 \sum (y_i - a - b x_i) x_i

Setting these derivatives to zero results in a minimum S:

(7) 0 = \sum y_i - \sum a - \sum b x_i
    0 = \sum x_i y_i - \sum a x_i - \sum b x_i^2
Least-Squares Fit of a Straight Line
Since \sum a = na, the equations can be expressed as a set of two
simultaneous linear equations in two unknowns:
(8) n a + b \sum x_i = \sum y_i
    a \sum x_i + b \sum x_i^2 = \sum x_i y_i

These are called the normal equations. Solving them simultaneously
yields the following results:
n xi yi −  xi  yi
b=
n xi − ( xi )
2
(7) 2

a=
 y i
−b
 x i

n n
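
As an illustration, the closed-form solution (9) can be implemented
directly. The following is a minimal Python sketch; the function name
fit_line and its interface are illustrative choices, not part of the
original notes.

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x to paired data; returns (a, b).

    Implements equation (9), the closed-form solution of the normal
    equations (8).
    """
    n = len(xs)
    sx = sum(xs)                               # sum of x_i
    sy = sum(ys)                               # sum of y_i
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum of x_i * y_i
    sxx = sum(x * x for x in xs)               # sum of x_i^2
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n                    # a = y_bar - b * x_bar
    return a, b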
Example
A study was done to determine the effect of ambient temperature x on the
electric power y consumed by a chemical plant. Other factors were held
constant, and the data were collected from an experimental pilot plant.
y (BTU)   x (°F)    y (BTU)   x (°F)
250       27        265       31
285       45        298       60
320       75        267       34
295       58        321       74

(a) Estimate the regression line y = a + bx.
(b) Predict the power consumption for an ambient temperature of 65 °F.
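
A possible worked sketch for this example, reusing the fit_line function
from the earlier sketch; the coefficients are computed from the tabulated
data rather than quoted from a solution key.

x = [27, 45, 75, 58, 31, 60, 34, 74]          # ambient temperature, °F
y = [250, 285, 320, 295, 265, 298, 267, 321]  # power consumed, BTU
a, b = fit_line(x, y)
print(f"regression line: y = {a:.2f} + {b:.3f} x")
print(f"power at 65 °F: {a + b * 65:.1f} BTU")  # part (b)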
Linearization of Nonlinear Relationships
• In many cases the behavior exhibited by a given data set is nonlinear
or curvilinear. Nevertheless, a transformation may be employed to express
the data in a form that is compatible with linear regression.
• Examples of nonlinear models, each of which can be linearized by a
suitable transformation (see the sketches that follow):
exponential model: y = a e^{bx}
power equation: y = a x^b
saturation-growth-rate equation: y = \frac{a x}{b + x}
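
For instance, the exponential model is linearized by taking natural
logarithms: ln y = ln a + b x, a straight line in x. A minimal sketch,
assuming the fit_line function defined earlier and y > 0 throughout (the
power equation is handled analogously with a log-log transformation, and
the saturation-growth-rate equation with reciprocals, as sketched after
Example 2):

import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b*x) by linear regression on (x, ln y)."""
    intercept, slope = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(intercept), slope   # a = e^intercept, b = slope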
Example 1
The following data are the selling prices y, of certain model of
used cars x years old:
x 1 2 2 3 5 5
y 6350 5695 5790 5395 4985 4895
(a) Fit a curve of the form y = ab^x.
(b) Estimate the selling price of such a car when it is 4 years old.
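
The model y = a b^x linearizes the same way: ln y = ln a + x ln b. A
hedged sketch of parts (a) and (b), again reusing fit_line; the results
are computed, not quoted:

import math

x = [1, 2, 2, 3, 5, 5]
y = [6350, 5695, 5790, 5395, 4985, 4895]
intercept, slope = fit_line(x, [math.log(v) for v in y])
a, b = math.exp(intercept), math.exp(slope)   # back-transform
print(f"y = {a:.0f} * {b:.4f}**x")
print(f"estimated price at 4 years: {a * b**4:.0f}")  # part (b)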
Example 2
Fit a saturation-growth rate equation to the following data set.
x 0.75 2 2.5 4 6 8 8.5
y 0.8 1.3 1.2 1.6 1.7 1.8 1.7
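
Inverting the saturation-growth-rate equation gives
1/y = (b/a)(1/x) + 1/a, a straight line in 1/x. A minimal sketch under
that transformation, reusing fit_line (all x and y here are nonzero):

x = [0.75, 2, 2.5, 4, 6, 8, 8.5]
y = [0.8, 1.3, 1.2, 1.6, 1.7, 1.8, 1.7]
intercept, slope = fit_line([1 / v for v in x], [1 / v for v in y])
a = 1 / intercept   # intercept = 1/a
b = slope * a       # slope = b/a
print(f"y = {a:.3f} x / ({b:.3f} + x)")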
Nonlinear Regression
In some cases the approximating function to fit is nonlinear and
linearization is not possible. It is then necessary to establish the
regression coefficients by minimizing the deviation directly, as in the
linear-regression method. Consider, for example, a second-order
polynomial of the form

y = a_0 + a_1 x + a_2 x^2

The corresponding deviation function is given by

D = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i - a_2 x_i^2)^2
Nonlinear Regression
Differentiating with respect to a_0, a_1, and a_2 and then equating each
derivative to zero yields the following normal equations:

\sum_{i=1}^{n} y_i = a_0 n + a_1 \sum_{i=1}^{n} x_i + a_2 \sum_{i=1}^{n} x_i^2

\sum_{i=1}^{n} x_i y_i = a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 + a_2 \sum_{i=1}^{n} x_i^3

\sum_{i=1}^{n} x_i^2 y_i = a_0 \sum_{i=1}^{n} x_i^2 + a_1 \sum_{i=1}^{n} x_i^3 + a_2 \sum_{i=1}^{n} x_i^4
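
These three equations are linear in a_0, a_1, and a_2 and can be solved
with any linear-system routine. A minimal sketch using NumPy; the
function name fit_quadratic is an illustrative choice:

import numpy as np

def fit_quadratic(xs, ys):
    """Fit y = a0 + a1*x + a2*x^2 by solving the 3x3 normal equations."""
    x = np.asarray(xs, dtype=float)
    y = np.asarray(ys, dtype=float)
    n = len(x)
    # Coefficient matrix: sums of powers of x, as in the equations above.
    A = np.array([
        [n,             x.sum(),        (x**2).sum()],
        [x.sum(),       (x**2).sum(),   (x**3).sum()],
        [(x**2).sum(),  (x**3).sum(),   (x**4).sum()],
    ])
    # Right-hand side: sums of y, x*y, x^2*y.
    rhs = np.array([y.sum(), (x * y).sum(), (x**2 * y).sum()])
    return np.linalg.solve(A, rhs)   # [a0, a1, a2]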
Multiple Regression
The regression is called multiple regression when more than one
independent variable is involved. The approximating function for two
variables x and z is of the form

y = a_0 + a_1 x + a_2 z

The values of a_0, a_1, and a_2 are obtained from the deviation function
in the same manner and are found from the equations

\sum_{i=1}^{n} y_i = a_0 n + a_1 \sum_{i=1}^{n} x_i + a_2 \sum_{i=1}^{n} z_i

\sum_{i=1}^{n} x_i y_i = a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 + a_2 \sum_{i=1}^{n} x_i z_i

\sum_{i=1}^{n} y_i z_i = a_0 \sum_{i=1}^{n} z_i + a_1 \sum_{i=1}^{n} x_i z_i + a_2 \sum_{i=1}^{n} z_i^2
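
Again a 3×3 linear system. A minimal NumPy sketch along the same lines as
fit_quadratic; the name fit_plane is illustrative:

import numpy as np

def fit_plane(xs, zs, ys):
    """Fit y = a0 + a1*x + a2*z by solving the 3x3 normal equations."""
    x = np.asarray(xs, dtype=float)
    z = np.asarray(zs, dtype=float)
    y = np.asarray(ys, dtype=float)
    n = len(x)
    A = np.array([
        [n,        x.sum(),        z.sum()],
        [x.sum(),  (x**2).sum(),   (x * z).sum()],
        [z.sum(),  (x * z).sum(),  (z**2).sum()],
    ])
    rhs = np.array([y.sum(), (x * y).sum(), (y * z).sum()])
    return np.linalg.solve(A, rhs)   # [a0, a1, a2]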
Example 1
The following is a set of coded experimental data on the compressive
strength of a particular alloy at various values of the concentration of
some additive.
Concentration, x    Compressive strength, y (three measurements per level)
10.0                25.2   27.3   28.7
15.0                29.8   31.1   27.8
20.0                31.2   32.6   29.7
25.0                31.7   30.1   32.3
30.0                29.4   30.8   32.8
Estimate the quadratic regression curve of the form
y = \beta_0 + \beta_1 x + \beta_2 x^2
Example 2
A study was performed on a type of bearing to find the relationship of amount of wear y
to x1 = oil viscosity and x2 = load. The following data were obtained.
y x1 x2
193 1.6 851
172 22.0 1058
113 33.0 1357
230 15.5 816
91 43.0 1201
125 40.0 1115

(a) Estimate the unknown parameters of the multiple linear regression
equation y = \beta_0 + \beta_1 x_1 + \beta_2 x_2.
(b) Predict wear when oil viscosity is 20 and load is 1200.
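
A possible usage sketch with the fit_plane function above, treating x1 as
x and x2 as z; the prediction in part (b) is computed from the fitted
coefficients:

x1 = [1.6, 22.0, 33.0, 15.5, 43.0, 40.0]   # oil viscosity
x2 = [851, 1058, 1357, 816, 1201, 1115]    # load
y = [193, 172, 113, 230, 91, 125]          # wear
b0, b1, b2 = fit_plane(x1, x2, y)
print(f"y = {b0:.2f} + {b1:.3f} x1 + {b2:.4f} x2")
print(f"predicted wear at x1=20, x2=1200: {b0 + b1*20 + b2*1200:.1f}")  # part (b)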
