What is Nonlinear Regression?
Nonlinear regression is a form of regression analysis in which observational data are modeled by
a function that is a nonlinear combination of the model parameters and depends on one or
more independent variables. In the past, advanced modelers would work with nonlinear
functions, including exponential functions, logarithmic functions, trigonometric functions, power
functions, Gaussian functions, and Lorenz curves. Some of these functions, such as the
exponential or logarithmic functions, were then transformed so that they became linear.
Once transformed, standard linear regression could be performed, but this classical approach
has significant problems, especially if the modeler is working with larger datasets and/or if the
data include missing values, nonlinear relationships, local patterns, and interactions.
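The classical linearizing transformation described above can be sketched in a few lines. This is an illustrative example (not from the paper): an exponential model y = a * exp(b * x) is linearized by taking logs, log(y) = log(a) + b * x, and then fitted with ordinary least squares.

```python
import numpy as np

# Classical approach: linearize an exponential model y = a * exp(b * x)
# by taking logs, then fit ordinary least squares in log space.
# (Illustrative sketch only; values a = 2.0, b = 0.5 are made up.)
x = np.linspace(0.0, 5.0, 50)
y = 2.0 * np.exp(0.5 * x)            # noise-free exponential data

# polyfit returns [slope, intercept] for a degree-1 fit
b, log_a = np.polyfit(x, np.log(y), 1)
a = np.exp(log_a)
print(a, b)                          # recovers a close to 2.0, b close to 0.5
```

Note that with real noisy data this transformation also distorts the error structure, which is one reason the classical approach runs into trouble.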
This paper, and others, will cover improvements to conventional and logistic regression, and will
include a discussion of classical, regularized, and nonlinear regression, as well as modern
ensemble and data mining approaches. We will begin with Multivariate Adaptive Regression
Splines (MARS).
Nonlinear Regression Techniques:
Logistic Regression
Regularized Regression: GPS Generalized Path Seeker
Nonlinear Regression: MARS Regression Splines
Nonlinear Ensemble Approaches: TreeNet Gradient Boosting; Random Forests; Gradient
Boosting incorporating RF
Ensemble Post-Processing: ISLE; RuleLearner
This whitepaper will focus on MARS nonlinear regression and offer case study examples.
Linear regression models typically fit straight lines to data. MARS approaches model
construction more flexibly, allowing for bends, thresholds, and other departures from
straight-line methods. MARS builds its model by piecing together a series of straight-line
segments, each allowed its own slope. This permits MARS to trace out any pattern detected in the data.
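The line segments described above are built from "hinge" basis functions, max(0, x - t) and max(0, t - x) at a knot t, each given its own slope. The following sketch (an illustrative reconstruction, not Salford's implementation) fits a V-shaped pattern exactly with one knot:

```python
import numpy as np

# A pair of hinge basis functions at knot t, each fitted with its own
# slope, lets joined line segments trace a bend in the data.
def hinge_basis(x, t):
    return np.column_stack([
        np.ones_like(x),            # intercept
        np.maximum(0.0, x - t),     # right hinge: active for x > t
        np.maximum(0.0, t - x),     # left hinge: active for x < t
    ])

x = np.linspace(-3.0, 3.0, 61)
y = np.where(x < 0, -1.0 * x, 2.0 * x)   # V-shaped target with a bend at 0

B = hinge_basis(x, t=0.0)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
residual = np.max(np.abs(B @ coef - y))
print(coef, residual)               # slopes 2 and 1 on either side of the knot
```

MARS searches over candidate knots automatically; here the knot t = 0 is supplied by hand to keep the sketch short.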
Core Capabilities
MARS core capabilities include:
Automatic variable search: Large numbers of variables are examined using efficient
algorithms, and all promising variables are identified.
Automatic variable transformation: Every variable selected for entry into the model is
repeatedly checked for a non-linear response. Highly non-linear functions can be traced
with precision via what is essentially piecewise regression.
Automatic limited interaction searches: MARS repeatedly searches through the
interactions allowed by the analyst. Unlike recursive partitioning schemes, MARS
models may be constrained to forbid interactions of certain types, allowing some
variables to enter only as main effects while allowing other variables to enter into
interactions, but only with a specified subset of other variables.
Variable nesting: Certain variables are deemed meaningful (possibly non-missing) in
the model only if particular conditions are met (e.g., X has a meaningful non-missing
value only if categorical variable Y has a value in some range).
Built-in testing regimens: The analyst can choose to reserve a random subset of the data
for testing, or use v-fold cross-validation to tune the final model selection parameters.
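The v-fold cross-validation regimen mentioned above can be sketched as follows. This is hypothetical helper code in the spirit of the MARS testing regimen, not the Salford implementation: for each candidate model size (number of knots), average the held-out squared error over v folds and keep the best.

```python
import numpy as np

# Build hinge-function features for a given set of knots (see the MARS
# basis-function idea); knots are placed at interior quantiles of x.
def hinge_features(x, knots):
    cols = [np.ones_like(x)]
    for t in knots:
        cols.append(np.maximum(0.0, x - t))
        cols.append(np.maximum(0.0, t - x))
    return np.column_stack(cols)

# v-fold cross-validation error for a model with n_knots knots.
def cv_error(x, y, n_knots, v=5, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, v)
    knots = np.quantile(x, np.arange(1, n_knots + 1) / (n_knots + 1))
    errs = []
    for k in range(v):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(v) if j != k])
        coef, *_ = np.linalg.lstsq(hinge_features(x[train], knots),
                                   y[train], rcond=None)
        pred = hinge_features(x[test], knots) @ coef
        errs.append(np.mean((pred - y[test]) ** 2))
    return np.mean(errs)

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 200)
y = np.abs(x) + rng.normal(0, 0.1, 200)    # true bend at 0 plus noise
best = min(range(1, 6), key=lambda k: cv_error(x, y, k))
print(best)
```

The production software tunes more than the knot count (e.g., which variables and interactions enter), but the select-by-held-out-error principle is the same.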
Applications
Salford Systems 2013
This new, flexible regression modeling tool is applicable to a wide variety of data analyses,
particularly those in which variables may need transformation and interaction
effects are likely to be relevant. The software can help a data analyst rapidly search through
many plausible models and quickly identify important interactions, insights that can lead to
significant model improvements. Further, because the software can be used effectively via
intelligent default settings, for the first time analysts at all levels can easily access MARS innovations.
Visualization of Results
In addition to summary text reports, MARS results are also displayed in the Results dialog box.
The GUI output includes ANOVA decomposition, variable importance, and final model tables as
well as graphical plots. MARS automates both the selection of variables and the non-parametric
transformation of variables to achieve the best model fit. Variable transformation is
accomplished implicitly through the piecewise regression function used by MARS to trace
arbitrary non-linear functions. MARS communicates this non-parametric transformation
graphically, displaying the predicted response as a function of either one or two variables.
MARS automatically produces 2-D plots for main effects (response variable as a function of
each predictor) and 3-D surface plots for interactions, with options to spin and rotate. For
higher-order interactions, the user can choose slices of the function for display of 2-D and 3-D
subspaces. Examples of main effects and interaction plots are shown below.
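The surfaces behind the 3-D interaction plots arise because a MARS interaction basis function is a product of hinge functions in two variables. The sketch below (illustrative; the knot values 0.3 and 0.6 are hypothetical, and the real plots come from the MARS GUI) evaluates one such surface on a grid:

```python
import numpy as np

# Evaluate a two-variable MARS interaction basis function on a grid:
# the product of a hinge in x1 and a hinge in x2.
x1 = np.linspace(0.0, 1.0, 21)
x2 = np.linspace(0.0, 1.0, 21)
X1, X2 = np.meshgrid(x1, x2)

# Product of two hinge functions with hypothetical knots at 0.3 and 0.6;
# the surface is exactly zero wherever either hinge is inactive.
Z = np.maximum(0.0, X1 - 0.3) * np.maximum(0.0, 0.6 - X2)

print(Z.shape, Z.max())
```

Passing a grid like `Z` to any surface-plotting routine reproduces the characteristic flat-then-ramped shape of a MARS interaction plot.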