
Use Historical Data First

• Prior to conducting a DOE, learn whatever you can from existing data.
• Existing databases often contain several variables.
• Through sound regression techniques, we can analyze existing data to:

– Identify the variables that have the greatest impact on the output (Y).
– Identify variables to include in a DOE.
– Determine best operating levels from the resultant predictive equation.

NOTE: In other words, regression analysis can often identify and verify causes. Sometimes, this analysis alone can establish best operating levels.
Johnson Controls, Inc. © 02.20.02 1 W4ReviewofImprovePhase.ppt
Polynomial Models
If the scatter diagram shows a “curving” pattern, a polynomial model might be appropriate.

[Figure: scatter diagram of y versus x showing a curved pattern]

Model: $y = \beta_0 + \beta_1 x + \beta_2 x^2$ (quadratic)

Model: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$ (cubic)

For polynomial regression, the general practice is to use the lowest degree polynomial that works.
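
The "lowest degree that works" rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up (a quadratic trend plus small invented noise), and "works" is read here as "adjusted R-sq reaches an assumed cutoff of 0.99". The helper names `solve`, `fit_poly`, and `adjusted_r2` are hypothetical and not part of any statistics package.

```python
def solve(a, b):
    """Solve the square linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_poly(xs, ys, degree):
    """Least-squares coefficients [b0, b1, ..., b_degree] via the normal equations."""
    cols = [[x ** p for x in xs] for p in range(degree + 1)]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, ys)) for ci in cols]
    return solve(xtx, xty)


def adjusted_r2(xs, ys, beta):
    """Adjusted R-sq of the polynomial with coefficients beta."""
    n, k = len(ys), len(beta) - 1
    fitted = [sum(b * x ** p for p, b in enumerate(beta)) for x in xs]
    mean_y = sum(ys) / n
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))


# Hypothetical data: a quadratic trend plus small made-up noise.
xs = [float(x) for x in range(10)]
noise = [0.05, -0.08, 0.03, 0.06, -0.04, -0.07, 0.09, 0.02, -0.05, 0.01]
ys = [1 + 2 * x - 0.3 * x ** 2 + e for x, e in zip(xs, noise)]

THRESHOLD = 0.99  # assumed cutoff for "works"; choose to suit the application
chosen = None
for degree in (1, 2, 3):
    beta = fit_poly(xs, ys, degree)
    score = adjusted_r2(xs, ys, beta)
    print(f"degree {degree}: adjusted R-sq = {score:.4f}")
    if chosen is None and score >= THRESHOLD:
        chosen = degree
print("lowest degree meeting the threshold:", chosen)
```

On these made-up data the straight line fits poorly and the quadratic is the first degree to meet the cutoff, so the cubic term is not needed. A residual plot or a test on the highest-order coefficient would be an equally reasonable way to judge whether a degree "works".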
What is Multiple Regression?
If we have several variables suspected or known to be related to a response variable of interest, y, we can build a multiple regression model.

With two or more input variables, x1, x2, etc., the models can become much more complex, but they have the potential to produce more useful information and more precise predictions than single-variable models.

A major difficulty in building multiple regression models is the inability to picture the relationship between x1, x2, … and y.



Multiple Regression Example
• The data in JC3 is:
y = % impurities in a chemical solution
x1 = temperature (°C)
x2 = sterilizing time (minutes)
• Our goal is to build a regression model and then use it to predict the mean % of impurities when time is set at 15 minutes and temperature is set at 120.
• Strategy
– Propose a model.
– Run the regression program, including all model checking procedures.
– Use/interpret the model once it is validated.
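
As a sketch of the "propose a model" step, one natural first proposal is a first-order model in both inputs, which matches the form of the regression equation Minitab reports for this example. The symbols $\beta_0$, $\beta_1$, $\beta_2$ and $\varepsilon$ are notation assumed here for illustration and do not appear on the slides.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon
```

Here $x_1$ is temperature, $x_2$ is sterilizing time, and $\varepsilon$ is the random error. If the model checks later showed curvature, quadratic or interaction terms could be added to the proposal.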



Minitab Output – Multiple Regression
Regression Analysis: %imp versus time, temp

The regression equation is
%imp = 2.86 + 0.0433 time - 0.0146 temp

Predictor   Coef         SE Coef     T        P
Constant    2.8567       0.1734      16.48    0.000
time        0.043333     0.008061    5.38     0.000
temp        -0.0146000   0.0009873   -14.79   0.000

S = 0.06981   R-Sq = 96.5%   R-Sq(adj) = 95.7%

• Each variable is significant, given that the other variable is included in the model.
• The model accounts for 96.5% of the variability (R-Sq), or 95.7% after adjusting for the number of predictors (R-Sq(adj)).



Minitab Output (continued)
Analysis of Variance

Source           DF   SS        MS        F        P
Regression        2   1.20663   0.60332   123.78   0.000
Residual Error    9   0.04387   0.00487
  Lack of Fit     3   0.01367   0.00456   0.91     0.492
  Pure Error      6   0.03020   0.00503
Total            11   1.25050

Source   DF   Seq SS
time      1   0.14083
temp      1   1.06580

• The fit (R-Sq(adj) = 95.7%) is statistically significant (F = 123.78, P = 0.000).
• The lack of fit test does not reject the model (P = 0.492).
• Of the SST = 1.25050, 1.06580 is accounted for by temperature. An additional 0.14083 is accounted for by time.
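
The summary statistics can be recomputed from the sums of squares in the Analysis of Variance table, which is a useful check on how the pieces fit together. The sketch below uses only the rounded values printed in the output, so results agree with Minitab to within rounding. The variable names are illustrative.

```python
import math

# Values copied from the Analysis of Variance output (rounded as printed).
ss_regression, df_regression = 1.20663, 2
ss_error, df_error = 0.04387, 9
ss_lack_of_fit, df_lack_of_fit = 0.01367, 3
ss_pure_error, df_pure_error = 0.03020, 6
ss_total, df_total = 1.25050, 11

# R-Sq: share of total variation explained by the regression.
r_sq = ss_regression / ss_total
# R-Sq(adj): the same idea using mean squares, which penalizes extra predictors.
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / df_total)
# Overall F: regression mean square over residual mean square.
f_regression = (ss_regression / df_regression) / (ss_error / df_error)
# Lack-of-fit F: lack-of-fit mean square over pure-error mean square.
f_lack_of_fit = (ss_lack_of_fit / df_lack_of_fit) / (ss_pure_error / df_pure_error)
# S: square root of the residual mean square.
s = math.sqrt(ss_error / df_error)
# The sequential sums of squares add up to the regression sum of squares.
seq_total = 0.14083 + 1.06580

print(f"R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}")
print(f"F (regression) = {f_regression:.2f}, F (lack of fit) = {f_lack_of_fit:.2f}")
print(f"S = {s:.5f}, Seq SS total = {seq_total:.5f}")
```

Small differences in the last digit (for example in the overall F) come from working with the rounded sums of squares rather than Minitab's unrounded values.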
Predicted Values for New Observations

New Obs   Fit      SE Fit   95.0% CI           95.0% PI
1         1.7547   0.0347   (1.6762, 1.8331)   (1.5783, 1.9310)

Values of Predictors for New Observations

New Obs   time   temp
1         15.0   120

Mean % impurities is predicted to be 1.7547 when temperature is 120 and time is 15 minutes.



Residual Analysis
Three residual plots were examined (next three slides):
• Residuals versus Time
• Residuals versus Temperature
• Residuals versus Fitted Values

The residual plots show no model problems.



Residuals versus Temperature
[Figure: Residuals Versus temp (response is %imp). Vertical axis: Residual, -0.1 to 0.1. Horizontal axis: temp, 75 to 125.]



Residuals versus Time
[Figure: Residuals Versus time (response is %imp). Vertical axis: Residual, -0.1 to 0.1. Horizontal axis: time, 15 to 20.]



Residuals versus the Fitted Values
[Figure: Residuals Versus the Fitted Values (response is %imp). Vertical axis: Residual, -0.1 to 0.1. Horizontal axis: Fitted Value, 1.6 to 2.6.]



Test for Normality

The normal plot and test show no problem with the normality assumption. Our linear model is validated.

[Figure: normal probability plot of the residuals. Vertical axis: Probability, .001 to .999. Horizontal axis: RESI1, -0.1 to 0.1.]

Anderson-Darling Normality Test
Average: -0.0000000   StDev: 0.0631497   N: 12
A-Squared: 0.220   P-Value: 0.786



Correlation Matrix
Correlations: diam, dtemp, rate, mtemp

         diam    dtemp   rate
dtemp    0.824
         0.012
rate     0.543   0.000
         0.164   1.000
mtemp    0.689   0.965   -0.214
         0.059   0.000   0.610

Cell Contents: Pearson correlation
               P-Value

• Notice that the correlation between dtemp and mtemp is very high (0.965). This means that the two variables supply almost the same information relative to y = diameter.
• High correlation between x-variables can cause problems: the coefficient estimates become unstable and their standard errors grow.
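
One way to put a number on the problem is the variance inflation factor, offered here as an illustrative aside rather than part of the original analysis. In a model that contains only two predictors with correlation r, the sampling variance of each coefficient is multiplied by 1 / (1 - r^2) compared with uncorrelated predictors. The sketch below applies that formula to the dtemp and mtemp correlation from the matrix. With a third predictor (rate) in the model the exact factor would differ, so treat the result as a rough indication.

```python
# Correlation between dtemp and mtemp, copied from the correlation matrix.
r = 0.965

# Variance inflation for a two-predictor model: 1 / (1 - r^2).
variance_inflation = 1.0 / (1.0 - r ** 2)

# Standard errors grow by the square root of the variance inflation.
se_inflation = variance_inflation ** 0.5

print(f"variance inflation for r = {r}: {variance_inflation:.1f}")
print(f"standard error inflation: {se_inflation:.1f}")
```

A factor this large means the individual coefficients of dtemp and mtemp would be estimated much less precisely than if the two variables were unrelated, even though the model as a whole may still predict well.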



Approach to Model Building

Because of the complexity of model building with multiple, possibly highly correlated, variables, we propose a simplified approach.

1.) Center all quantitative variables: $x' = x - \bar{x}$, or standardize the variables. This reduces correlations between the $x$, $x^2$, $x^3$, etc., terms.
2.) Enter complete quadratic models into the Minitab Stepwise Regression Procedure. This is a procedure that takes a set of input variables and, based on statistical tests, produces a model with maximal (or close to it) R-Sq(adj).
3.) Check the model adequacy for the stepwise selection.
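
The effect of centering in step 1 is easy to see numerically. The sketch below compares the correlation between a variable and its square before and after subtracting the mean. The temperature values are hypothetical, chosen only to resemble a typical operating range, and `pearson` is a small helper written for this example.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


# Hypothetical temperature settings (not the data from the slides).
temps = [75.0, 85.0, 95.0, 105.0, 115.0, 125.0]

# Center: subtract the mean from every value.
mean_t = sum(temps) / len(temps)
centered = [t - mean_t for t in temps]

r_raw = pearson(temps, [t ** 2 for t in temps])
r_centered = pearson(centered, [c ** 2 for c in centered])

print(f"corr(x, x^2) before centering: {r_raw:.4f}")
print(f"corr(x, x^2) after centering:  {r_centered:.4f}")
```

Before centering, x and x^2 are almost perfectly correlated. After centering, the correlation drops to essentially zero here because these made-up settings are symmetric about their mean. With asymmetric data the correlation is reduced but usually not eliminated.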



What is Stepwise Regression?
Stepwise Regression is an approach whereby variables and other terms (quadratic, interactions, etc.) are:

• Added to the model one at a time, and
• Removed from the model one at a time.

Minitab will run this routine to find the best model.

NOTE: Stepwise regression takes care of the problem of correlated variables. However, you should still center the variables to reduce the correlation between a given variable and its higher order terms.
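
The add-one, remove-one idea can be sketched as follows. This is a minimal illustration, not Minitab's implementation: it uses partial F statistics with an assumed threshold of 4.0 for both entering and removing terms, made-up data from a small designed experiment, and hypothetical helper names (`solve`, `sse`, `stepwise`).

```python
def solve(a, b):
    """Solve the square linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def sse(columns, y):
    """Residual sum of squares for y regressed on an intercept plus the given columns."""
    cols = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def stepwise(terms, y, f_enter=4.0, f_remove=4.0):
    """Return the selected term names; thresholds are assumed values."""
    n, model = len(y), []
    for _ in range(2 * len(terms) + 1):  # hard cap guards against cycling
        changed = False
        current = sse([terms[t] for t in model], y)
        # Forward step: add the candidate with the largest partial F above f_enter.
        best, best_f = None, f_enter
        for name in terms:
            if name in model:
                continue
            new = sse([terms[t] for t in model] + [terms[name]], y)
            f = (current - new) / (new / (n - len(model) - 2))
            if f > best_f:
                best, best_f = name, f
        if best is not None:
            model.append(best)
            current = sse([terms[t] for t in model], y)
            changed = True
        # Backward step: drop the term with the smallest partial F below f_remove.
        worst, worst_f = None, f_remove
        for name in model:
            reduced = sse([terms[t] for t in model if t != name], y)
            f = (reduced - current) / (current / (n - len(model) - 1))
            if f < worst_f:
                worst, worst_f = name, f
        if worst is not None:
            model.remove(worst)
            changed = True
        if not changed:
            break
    return model


# Hypothetical centered design: 3 x 3 factorial plus three center points.
levels = [-1.0, 0.0, 1.0]
x1 = [a for a in levels for _ in levels] + [0.0] * 3
x2 = [b for _ in levels for b in levels] + [0.0] * 3
noise = [0.03, -0.05, 0.02, -0.01, 0.04, -0.03, 0.05, -0.02, 0.01, -0.04, 0.02, -0.02]
y = [2.0 + 0.5 * a - 1.0 * b + e for a, b, e in zip(x1, x2, noise)]

# Complete quadratic model offered to the procedure.
terms = {
    "x1": x1,
    "x2": x2,
    "x1^2": [a * a for a in x1],
    "x2^2": [b * b for b in x2],
    "x1*x2": [a * b for a, b in zip(x1, x2)],
}
selected = stepwise(terms, y)
print("selected terms:", selected)
```

The made-up response depends only on x1 and x2, so those two terms enter first with large F values. Whether a quadratic or interaction term also enters depends on the noise, which is why the selected model still needs the adequacy checks in step 3 of the approach.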

