SURFACE FITTING? DO IT WITH SCILAB!


Author: Massimiliano Margonari

Keywords:

Data fitting, Interpolation, Inverse distance weighting, Kriging, Neural network

Abstract:

In this paper we show how different techniques for solving the data fitting problem can be implemented in Scilab.

Contacts:

m.margonari@openeering.com

Attachments:

surface_fitting.zip

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

EnginSoft SpA - Via della Stazione, 27 - 38123 Mattarello di Trento | P.I. e C.F. IT00599320223


1. Introduction
Very often in the engineering sciences, data have to be fitted in order to get a more general view of the problem at hand. These data usually come from a series of experiments, physical or virtual, and surface fitting is often the only way to extract relevant and general information from the system under examination. Many different expressions are used to identify the activity of building a mathematical surrogate that describes the behavior of a physical system starting from some known data. Some simply speak of regression, data fitting or data interpolation; others speak of modeling, metamodeling or response surface methodology. Different application fields, and different schools, use different words to describe the same activity: building a mathematical model starting from a set of known points. The model should be able to describe the phenomenon and should also give reliable answers when estimating values at unknown points. This problem is well known in the literature and a wide variety of solutions has been proposed. In this paper we present a series of techniques which engineers can implement in Scilab and use to solve data fitting problems easily and quickly. For the sake of simplicity, in this tutorial we decided to consider 30 pairs of values (x and y, belonging to the interval [0, 1]) to which a system response (z) is associated. The data fitting problem reduces, in this case, to finding a mathematical function of the kind z = f(x, y) able to reproduce z as well as possible (see Figure 1). Of course, all the techniques described hereafter can be extended to tackle multidimensional data fitting problems.
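The actual 30-point dataset used to produce the figures is provided with the attached scripts and is not listed here. As a purely illustrative stand-in, the following Scilab snippet generates a comparable synthetic dataset; the response function is an arbitrary assumption, not the one used in this paper. The vectors x, y, z created here are reused by the sketches reported in the next sections.

// Purely illustrative stand-in for the 30-point dataset used in this paper.
// NOTE: the response function below is an arbitrary assumption, not the one
// used to produce the figures.
rand("seed", 0);                              // reproducible random points
np = 30;                                      // number of training points
x  = rand(np, 1);                             // x coordinates in [0, 1]
y  = rand(np, 1);                             // y coordinates in [0, 1]
z  = sin(2*%pi*x) .* cos(%pi*y) + 0.5*x;      // assumed smooth system response

// Quick 3D look at the training points
param3d(x, y, z);
e = gce(); e.line_mode = "off"; e.mark_mode = "on"; e.mark_style = 9;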

Figure 1: 3D plot of the dataset used in this paper: 30 points.


2. Polynomial fitting
Probably one of the simplest and most used data fitting techniques is polynomial fitting. We imagine that the system response can be adequately modeled by a polynomial of degree d:

f(x, y) = \sum_{i+j \le d} a_{ij}\, x^i y^j

where the coefficients a_ij represent the free parameters of the model. Usually, the polynomial coefficients are determined by imposing the interpolation condition of f in each point (x_k, y_k, z_k) of the dataset, obtaining a system of linear equations:

\sum_{i+j \le d} a_{ij}\, x_k^i\, y_k^j = z_k, \qquad k = 1, \dots, 30

The number of points to be fitted is usually larger than the number of polynomial coefficients. This leads to an over-determined system (more equations than unknowns) which has to be solved in a least squares sense. The resulting polynomial generally does not interpolate the data exactly: it fits them instead. It is very easy to set up a Scilab script which implements this fitting technique: the main steps are to assemble the over-determined system, whose columns can be built in our case following a Pascal triangle of monomials, and to solve it. The resulting model can also be plotted. In Figure 2 the polynomial of degree 4 is drawn together with the 30 points of the dataset (a.k.a. training points). It can be seen that the surface does not pass through the training points.

One could be tempted to choose a very high polynomial degree in order to have a very flexible mathematical model, able in principle to capture complex behaviors. Unfortunately, this does not lead to the desired result: the problem is known as overfitting, or the Runge phenomenon in honor of the mathematician who described it (see Figure 3). The risk is that, when the polynomial degree is increased too much, the obtained model exhibits an oscillatory and unnatural behavior between the dataset points: the polynomial overfits the data, as is commonly said in these cases. This overfitting problem, universally associated with polynomial fitting, also afflicts other fitting techniques. Ockham's razor should be kept in mind when fitting data: the suggestion here is not to add complexity to the model when it is not strictly necessary. The file polynomial_fitting.sce contains a possible implementation of the polynomial fitting technique described above.
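A minimal sketch of the same idea is reported below; it is not the content of polynomial_fitting.sce, only an illustration assuming the x, y, z training vectors introduced in the introduction. The helper function monomials is a hypothetical name used here for convenience.

// Least-squares fit of a degree-d polynomial f(x,y) = sum of a_ij * x^i * y^j, i+j <= d
// (illustrative sketch, assuming the x, y, z training vectors defined above)

function A = monomials(x, y, d)
    // Matrix whose columns are the monomials x^i * y^j with i + j <= d
    A = [];
    for i = 0:d
        for j = 0:(d - i)
            A = [A, (x.^i) .* (y.^j)];
        end
    end
endfunction

d = 4;                         // polynomial degree
a = monomials(x, y, d) \ z;    // least-squares solution of the over-determined system

// Evaluate the fitted polynomial on a regular grid and plot it
ng = 40;
xg = linspace(0, 1, ng)';
yg = linspace(0, 1, ng)';
[Xg, Yg] = ndgrid(xg, yg);
zg = matrix(monomials(Xg(:), Yg(:), d) * a, ng, ng);
plot3d(xg, yg, zg);

With 30 training points and a degree-4 polynomial (15 coefficients), the system is over-determined and Scilab's backslash operator returns its least squares solution.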


Figure 2: The polynomial of degree 4 plotted together with the dataset points (black dots). It can be seen that the resulting surface, even if it approximates the data, does not pass through the points: the model is not interpolating.

Figure 3: The overfitting phenomenon. A degree-10 polynomial is used in this case: the resulting model exhibits large, unrealistic oscillations away from the dataset points.


3. The inverse distance weighting (IDW)


The inverse distance weighting technique, shortly known as IDW, is a relatively simple but effective data fitting technique. The mathematical frame is not particularly sophisticated, which allows a fast and straightforward implementation of this approach. The main hypothesis is that the system response, which can also be strongly non-linear, has to be sufficiently regular: neither jumps nor sudden variations are admitted. This requirement is not too restrictive in many engineering applications. The idea is to estimate the response in an unknown point x of the space as a weighted sum of the known responses in the n surrounding points x_i. Mathematically speaking we say that:

\hat{f}(x) = \sum_{i=1}^{n} w_i(x)\, f(x_i)

where the weights are computed from the distances d(x, x_i) and a power parameter p as:

w_i(x) = \frac{d(x, x_i)^{-p}}{\sum_{j=1}^{n} d(x, x_j)^{-p}}

As shown in Figure 4, the n closest points to x where the system response is known have to be found first. Then, the responses in these n points are summed up, taking into account that a closer point has more importance in determining the unknown response. In Figure 5 the resulting model (with n=10, p=2) is drawn together with the dataset points. It can be seen that a non-linear, non-polynomial response is produced and that the dataset points are interpolated exactly. The predicted response is however characterized by a non-smooth behavior, due to the relatively low n used in the computations. If we set n=30, the resulting IDW model is much smoother than before (see Figure 6). It also appears immediately that the IDW model tends to give a flat, constant response in the portions of the domain where there are no dataset points: far away from the data, all the points where a known response is available tend to have the same weight.
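A possible minimal implementation of the IDW predictor is sketched below; it is only an illustration, not the content of idw.sce, and again assumes the x, y, z training vectors introduced earlier. The function name idw_predict is a hypothetical helper.

// Inverse distance weighting prediction at a single point (xq, yq)
// using the n closest training points and exponent p (illustrative sketch).
function zq = idw_predict(xq, yq, x, y, z, n, p)
    d = sqrt((x - xq).^2 + (y - yq).^2);      // distances to all training points
    [ds, idx] = gsort(d, "g", "i");           // sort distances in increasing order
    d  = ds(1:n);                             // keep the n closest points
    zi = z(idx(1:n));
    if d(1) < 1e-12 then                      // query coincides with a training point
        zq = zi(1);
        return;
    end
    w  = d.^(-p);                             // inverse distance weights
    w  = w / sum(w);                          // normalize so that the weights sum to 1
    zq = sum(w .* zi);                        // weighted sum of the known responses
endfunction

// Example: predict the response in the center of the domain with n = 10, p = 2
zc = idw_predict(0.5, 0.5, x, y, z, 10, 2);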

Figure 4: The IDW metamodel allows predicting the response in a point x (green) considering the known responses in the first closest n points (red).


Figure 5: The IDW metamodel (n=3, p=2) is plotted together with the dataset points (black dots). We can notice that the metamodel exactly interpolates the points and that the response is characterized by sudden variations, due to the low n adopted.

Figure 6: The IDW metamodel (n=30, p=2). In this case the model is definitely smoother than before. In the portions of the domain where there are no points the predicted response tends to be constant, as expected.

The file idw.sce contains a possible implementation of the inverse distance weighting (IDW) technique described above.

4. The radial basis function (RBF)


The radial basis function technique is another interesting approach that can be used to build interpolating models. The idea is very similar to the one at the basis of the IDW presented in the previous chapter. However, in this case we imagine that the unknown response can be modeled as a weighted sum of some basis functions \phi(\lVert x - x_i \rVert; \sigma) centered in the dataset points:

\hat{f}(x) = \sum_{i=1}^{N} w_i\, \phi(\lVert x - x_i \rVert; \sigma)

The parameter \sigma is a free parameter which can be set to modify the shape of the basis function. In the literature a large variety of basis functions can be found: they are usually chosen to have a bell shape, centered in x_i and decreasing smoothly moving away from it. Probably one of the most used and simplest basis functions is the Gaussian one, which is:

\phi(r; \sigma) = e^{-(\sigma r)^2}

We can write the equations which enforce the interpolation condition of our model in the dataset points, obtaining a well-posed system of N linear equations in the N unknown weights:

\sum_{j=1}^{N} w_j\, \phi(\lVert x_i - x_j \rVert; \sigma) = z_i, \qquad i = 1, \dots, N

The system can be easily solved, obtaining the weights w_i. The basis functions contain the free parameter \sigma, which has to be fixed before filling the matrix reported above and which can obviously have a strong influence on the final result. The greater the \sigma, the sharper the basis functions and, therefore, the final metamodel in the proximity of the dataset points. If a good value of this parameter is not known, one can set up an automatic procedure to find the best \sigma. One possible strategy consists in minimizing the so-called mean leave-one-out error. Let us suppose to temporarily disregard the first point in the dataset and to build the metamodel using a given guess of \sigma: we can now compute the difference between the known and the predicted response in the point excluded from the dataset. This error measure is known as the leave-one-out error, a name that recalls the way it is computed. If we repeat this step for all the points in the dataset we can finally compute the mean error, that is the mean leave-one-out error (MLOOE), associated with the parameter \sigma. An optimization procedure, such as a gradient-based algorithm, can then be used to find the \sigma which minimizes the MLOOE. In Figure 7 the MLOOE is reported versus the iteration number, as given by the Scilab optimization function optim. It is very important to set this parameter correctly. If the basis functions are made too sharp, they tend to look like Dirac delta functions, leading to a useless metamodel which gives a non-zero answer only in the proximity of the dataset points. On the contrary, if they are made too flat, the metamodel tends to a constant-like function, filling the valleys and smoothing the peaks too much. In Figure 8 the metamodel built using the RBF approach is reported. It can be seen that the resulting surface interpolates the dataset, as expected. The MLOOE is in this case 0.0149502, obtained with \sigma = 0.0547619.
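The following sketch illustrates the construction of a Gaussian RBF interpolant and the computation of the MLOOE for a given trial \sigma; it assumes the x, y, z vectors introduced earlier, the helper name rbf_matrix is hypothetical, and the trial value of sigma is arbitrary. The resulting cost could then be passed to an optimization routine such as Scilab's optim in order to find the best \sigma.

// Gaussian RBF interpolation (illustrative sketch, assuming x, y, z training vectors)
// Basis: phi(r) = exp(-(sigma*r)^2), as in the formula above.

function P = rbf_matrix(x1, y1, x2, y2, sigma)
    // Matrix of basis values phi(||p1_i - p2_j||)
    n1 = size(x1, "*"); n2 = size(x2, "*");
    P = zeros(n1, n2);
    for i = 1:n1
        r = sqrt((x2 - x1(i)).^2 + (y2 - y1(i)).^2);
        P(i, :) = exp(-(sigma * r(:)').^2);
    end
endfunction

sigma = 5;                                      // arbitrary trial value of the shape parameter
A = rbf_matrix(x, y, x, y, sigma);              // square interpolation matrix
w = A \ z;                                      // weights enforcing interpolation

// Prediction at a query point (0.5, 0.5)
zq = rbf_matrix(0.5, 0.5, x, y, sigma) * w;

// Mean leave-one-out error for this sigma
np  = size(x, "*");
err = zeros(np, 1);
for k = 1:np
    keep = setdiff(1:np, k);                    // drop the k-th training point
    Ak = rbf_matrix(x(keep), y(keep), x(keep), y(keep), sigma);
    wk = Ak \ z(keep);
    zk = rbf_matrix(x(k), y(k), x(keep), y(keep), sigma) * wk;
    err(k) = abs(zk - z(k));
end
mloo = mean(err);                               // mean leave-one-out error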


Figure 7: The mean leave-one-out error (MLOOE) plotted versus the iteration number of the optimization procedure, driven by a gradient-based algorithm.

Figure 8: The RBF metamodel interpolates the data exactly (\sigma = 0.0547619, MLOOE = 0.0149502).


5. The Kriging approach


The Kriging approach is a well-known interpolation method, originally developed by the French mathematician Georges Matheron to solve geostatistics problems. The basic idea, as for the other techniques, is to compute the unknown response in a point x through a weighted linear combination of the known responses in the points x_i:

\hat{f}(x) = \sum_{i=1}^{n} \lambda_i\, f(x_i)

The peculiarity of Kriging is to consider f(x) as a realization of a random process F(x) and to use the following estimator to compute the unknown weights \lambda_i:

\hat{F}(x) = m(x) + \sum_{i=1}^{n} \lambda_i \left( F(x_i) - m(x_i) \right)

Simple Kriging, which is, as the name says, the easiest version of this approach, is based on the assumption that the mean m(x) is constant and known: without any loss of generality we assume m(x) = 0. The unknown weights can be computed by minimizing the prediction error (also called the Kriging variance), which reads:

\sigma_K^2(x) = E\left[ \left( \hat{F}(x) - F(x) \right)^2 \right]

E being the expectation operator. Setting the first derivatives with respect to \lambda_i to zero, a system of n linear equations is obtained:

\sum_{j=1}^{n} \lambda_j\, C(\lVert x_i - x_j \rVert) = C(\lVert x_i - x \rVert), \qquad i = 1, \dots, n

where C(h) = C(\lVert x_i - x_j \rVert) = C(0) - \gamma(h) is the covariance, which can be computed from a semivariogram \gamma(h), mathematically defined as:

\gamma(h) = \frac{1}{2}\, E\left[ \left( F(x + h) - F(x) \right)^2 \right]

h being the distance between two points in the space. Substantially, the semivariogram is a function which describes the spatial auto-correlation of a quantity, and it can be estimated starting from a series of experimental data. Scilab makes the Kriging technique available to its users through the DACE toolbox, which can be downloaded and installed through the ATOMS portal. A metamodel can be built very quickly by calling the dacefit function and then evaluated through the predictor function. We used this toolbox to fit our set of data: the results are reported in the following figures. Kriging has the valuable advantage of also providing an estimation of the error committed in the prediction: this can be used to obtain a sort of confidence interval around a prediction, which is often interesting information to have when dealing with engineering applications.
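A short sketch of how the toolbox could be called is reported below. It assumes that the Scilab ATOMS port mirrors the interface of the original DACE package (dacefit and predictor, with regression and correlation functions such as regpoly0 and corrgauss); the argument list, bounds and outputs shown here are assumptions, not a verified reproduction of dace.sce.

// Kriging with the DACE toolbox (illustrative sketch, assuming the ATOMS port
// mirrors the original DACE interface; x, y, z are the training vectors above).
S = [x, y];                                   // training sites (one row per point)
Y = z;                                        // observed responses

theta0 = [1, 1];                              // initial correlation parameters
lob = [1e-2, 1e-2]; upb = [20, 20];           // bounds for their optimization
dmodel = dacefit(S, Y, regpoly0, corrgauss, theta0, lob, upb);

// Prediction on a regular grid; the second output is assumed to be the
// estimated mean squared error, as in the original DACE package.
ng = 40;
xg = linspace(0, 1, ng); yg = linspace(0, 1, ng);
[Xg, Yg] = ndgrid(xg, yg);
[zp, mse] = predictor([Xg(:), Yg(:)], dmodel);
zg = matrix(zp, ng, ng);                      // reshape the predictions to the grid
plot3d(xg, yg, zg);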

Figure 9: The interpolating Kriging metamodel.

Figure 10: The estimated error of the Kriging model. It can be seen that the largest errors appear in the portions of the domain where few points are available.

The file dace.sce uses the DACE toolbox available through ATOMS to solve the data fitting problem.

6. The neural network


A neural network can also be used for data fitting: even if the expression "neural network" often evokes strange and sophisticated things, it hides a very simple mathematical framework, at least in its basic version. The idea is to mathematically reproduce the behavior of a simple brain, able to learn from data and then to reply correctly when queried. A neural net is composed of a series of elementary entities, the neurons, which, connected to one another, take some inputs and transform them into outputs using simple rules. The behavior of the neuron can be modeled in many different ways, but probably the simplest one leads to the so-called perceptron: a series of inputs is first weighted and then summed together; a bias is added and the result is passed, as input, to a sigmoid function which delivers the output. The following equations are the mathematical core of a perceptron:

s = \sum_{i} w_i x_i + b, \qquad y = \sigma(s) = \frac{1}{1 + e^{-s}}

In Figure 11 a schematic representation of the neuron is drawn; the weights (w) and the bias (b) can be seen as the free parameters of the neuron, which allow modifying the output it provides.

Figure 11: The neuron, the elementary entity of a neural network. Some inputs (xi) are weighted (wi) and summed together. A bias (b) is added and the result is passed to a sigmoid function which delivers the output (y).

The common way to build a network is to organize the neurons in layers (usually one or two are sufficient to tackle data fitting problems) and connect them. As a result, we obtain a mathematical model able to accept some inputs and transform them into outputs. The so-called training phase can now start: an optimization process has to find the best values for all the weights and biases in the network, in such a way that the difference between the output values contained in the dataset and those provided by the net is minimal. The network learns, in some sense, from the data, trying to minimize the fitting error. When the training is concluded, the network is able to compute new outputs corresponding to whatever input.

One could be tempted to use very large networks to fit data, in order to have the greatest flexibility of the model. This is extremely dangerous, for at least two reasons: the first is that the computational time could become prohibitive; the second is that overfitting, the phenomenon presented above for polynomial fitting, could appear. The risk is to end up with a huge and useless metamodel. The best practice is to use the lowest number of neurons which still allows to correctly capture the system behavior. Since this set-up is generally not known in advance, it is recommended to try different networks and finally use the one which provides the best fitting. One interesting feature of neural networks is that they also digest dirty datasets: duplicate and contradictory data are allowed. The result is a metamodel which generally does not interpolate the dataset exactly, but which can nevertheless be used to fit, for example, experimental results or data which have not been pre-processed.
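The figures in this section were obtained with a Scilab neural network toolbox; the sketch below is not that code, but a minimal, self-contained illustration of the same idea: a single hidden layer of sigmoid neurons, followed by a linear output neuron, trained with plain gradient descent on the squared fitting error. The x, y, z vectors, the layer size and the learning parameters are all assumptions chosen for illustration.

// Minimal one-hidden-layer neural network for the 2-input, 1-output fitting problem
// (self-contained illustration; not the toolbox-based code used for the figures).
function s = sigmoid(t)
    s = 1 ./ (1 + exp(-t));
endfunction

nh  = 6;                          // number of hidden neurons (arbitrary choice)
W1  = rand(nh, 2) - 0.5;          // input-to-hidden weights
b1  = rand(nh, 1) - 0.5;          // hidden biases
W2  = rand(1, nh) - 0.5;          // hidden-to-output weights
b2  = rand(1, 1)  - 0.5;          // output bias
eta = 0.2;                        // learning rate
X   = [x'; y'];                   // inputs as columns (2 x np)
np  = size(X, 2);

for epoch = 1:10000
    H  = sigmoid(W1*X + b1*ones(1, np));    // hidden layer outputs (nh x np)
    zp = W2*H + b2*ones(1, np);             // linear output layer   (1  x np)
    e  = zp - z';                           // fitting error on the dataset
    // Backpropagation of the mean squared error
    dW2 = (e * H') / np;          db2 = mean(e);
    dH  = (W2' * e) .* H .* (1 - H);
    dW1 = (dH * X') / np;         db1 = mean(dH, "c");
    W1 = W1 - eta*dW1;  b1 = b1 - eta*db1;
    W2 = W2 - eta*dW2;  b2 = b2 - eta*db2;
end

// Prediction at a query point (0.5, 0.5)
zq = W2*sigmoid(W1*[0.5; 0.5] + b1) + b2;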

Figure 12: A neural network can be built connecting some neurons together: usually one or two layers are sufficient to deal with data fitting problems. The inputs (xi) are transformed, passing through the neurons (N), into an output (y).

Figure 13: The fitting error plotted versus the iteration number, as computed during the network training.


Figure 14: The neural network used for data fitting: the model does not interpolate the dataset.

7. Conclusions
In this document we have discussed some methodologies for solving data fitting problems with Scilab. Starting from the simplest ones up to some more sophisticated approaches, we have introduced some of the mathematical details. The reader can download the Scilab code from the Openeering website and implement his/her own fitting applications.


