Regression Analysis
Regression analysis is a collection of statistical tools used to model and
explore relationships between variables that are related in a
nondeterministic manner.
y is the purity of oxygen produced in a chemical distillation process.

x is the percentage of hydrocarbons present in the main condenser of the
distillation unit.

[Figure: scatter plot of y versus x]
Inspection of this scatter diagram gives a strong indication that the
points lie scattered randomly around a straight line.
It is probably reasonable to assume that the mean of the random variable Y is
related to x by the following straight-line relationship:

$E(Y \mid x) = \mu_{Y \mid x} = \beta_0 + \beta_1 x$

The intercept $\beta_0$ and slope $\beta_1$ of the line are called regression coefficients.

The actual value of Y is determined by the mean value function (the linear model) plus a
random error term, say $\epsilon$:

$Y = \beta_0 + \beta_1 x + \epsilon$

This expression relating y and x is called a simple linear regression model because it has only one
independent variable, or regressor.
Suppose that we can fix the value of x and observe the
value of the random variable Y.

Now if x is fixed, the random component $\epsilon$ determines the
properties of Y.

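Suppose that the mean and variance of $\epsilon$ are 0 and $\sigma^2$,
respectively. Then,

$E(Y \mid x) = E(\beta_0 + \beta_1 x + \epsilon) = \beta_0 + \beta_1 x + E(\epsilon) = \beta_0 + \beta_1 x$

The variance of Y given x is

$V(Y \mid x) = V(\beta_0 + \beta_1 x + \epsilon) = V(\epsilon) = \sigma^2$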
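
The mean and variance of Y at a fixed x can be checked with a small simulation. The sketch below is only illustrative: the parameter values BETA0, BETA1, and SIGMA are hypothetical, and the errors are assumed to be normally distributed, which the text above does not require (it only needs mean 0 and variance $\sigma^2$).

```python
import random
import statistics

# Hypothetical parameter values, chosen only for illustration.
BETA0 = 75.0   # intercept
BETA1 = 15.0   # slope
SIGMA = 1.0    # standard deviation of the error term


def simulate_y(x, n, seed=0):
    """Draw n observations of Y = BETA0 + BETA1 * x + eps at a fixed x.

    The error eps is assumed normal with mean 0 and variance SIGMA**2.
    """
    rng = random.Random(seed)
    return [BETA0 + BETA1 * x + rng.gauss(0.0, SIGMA) for _ in range(n)]


x_fixed = 1.2
ys = simulate_y(x_fixed, n=100_000)

mean_y = statistics.fmean(ys)      # should be close to BETA0 + BETA1 * x_fixed
var_y = statistics.pvariance(ys)   # should be close to SIGMA ** 2

print(f"theoretical mean {BETA0 + BETA1 * x_fixed:.3f}, sample mean {mean_y:.3f}")
print(f"theoretical variance {SIGMA ** 2:.3f}, sample variance {var_y:.3f}")
```

The sample mean depends on x through the line, while the sample variance does not depend on x at all, which is what the two expressions above state.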




The straight line should pass through the scatter plot with the minimum difference between the
observations and the predictions. The question is how to achieve this minimum difference.
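
One standard answer is the method of least squares, which chooses the intercept and slope that minimize the sum of squared differences between the observed and predicted values. The sketch below assumes that this is the criterion intended here; the function name and the five data points are hypothetical and are not the oxygen purity data.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) minimizing sum((y - (b0 + b1 * x)) ** 2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Corrected sum of squares of x and corrected sum of cross-products.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx          # slope estimate
    b0 = y_bar - b1 * x_bar   # intercept estimate
    return b0, b1


def sum_squared_errors(xs, ys, b0, b1):
    """Sum of squared differences between observations and predictions."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_line(xs, ys)
best = sum_squared_errors(xs, ys, b0, b1)

# Any other line gives a larger sum of squared errors.
worse = sum_squared_errors(xs, ys, b0 + 0.1, b1)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSE = {best:.4f}, shifted SSE = {worse:.4f}")
```

Shifting the intercept (or the slope) away from the fitted values always increases the sum of squared errors, which is the sense in which the fitted line is the one with minimum difference.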
Actual Y, Fitted Y and ERROR
The error sum of squares (residuals) can also be calculated as follows:

$SS_E = 21.25$

$\hat{\sigma}^2 = \frac{SS_E}{n - 2} = \frac{21.25}{18} = 1.18$

Estimated $\sigma$: $\hat{\sigma} = \sqrt{1.18} \approx 1.09$
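
The arithmetic above can be reproduced in a few lines. The sketch takes SS_E = 21.25 and n - 2 = 18 from the text, so n = 20 is inferred rather than stated; the variable names are illustrative.

```python
import math

ss_e = 21.25   # error sum of squares, from the text
n = 20         # inferred from n - 2 = 18

# Estimate of the error variance: SS_E divided by its degrees of freedom.
sigma2_hat = ss_e / (n - 2)

# Estimate of the error standard deviation.
sigma_hat = math.sqrt(sigma2_hat)

print(f"sigma^2 estimate: {sigma2_hat:.2f}")
print(f"sigma estimate:   {sigma_hat:.2f}")
```

The divisor is n - 2 rather than n because two parameters, the intercept and the slope, are estimated from the data.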
$SS_T = SS_R + SS_E$

where $SS_T$ is the total variability in the response variable, $SS_R$ is the regression sum of squares, and $SS_E$ is the error (residual) sum of squares.

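It can be shown that the total sum of squares

$\sum_{i=1}^{n} (y_i - \bar{y})^2$

is equal to

$\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$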
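
The decomposition of the total sum of squares into a regression part and an error part can be checked numerically. The sketch below assumes a least squares fit and uses five made-up data points, not the oxygen purity data; all names are illustrative.

```python
import math

# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares estimates of slope and intercept.
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Fitted values on the estimated line.
y_hat = [b0 + b1 * x for x in xs]

ss_t = sum((y - y_bar) ** 2 for y in ys)                # total
ss_r = sum((yh - y_bar) ** 2 for yh in y_hat)           # regression
ss_e = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # error (residual)

print(f"SS_T = {ss_t:.4f}, SS_R = {ss_r:.4f}, SS_E = {ss_e:.4f}")
print("identity holds:", math.isclose(ss_t, ss_r + ss_e))
```

The identity relies on the line being the least squares fit; for an arbitrary line the cross-product term does not vanish and the two sides generally differ.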
