
Theoretical considerations

Multiple linear regression analysis allows us to establish the relationship between a dependent variable Y and a set of independent variables (X1, X2, ..., Xk). Unlike simple regression, multiple linear regression comes closer to real analysis situations, because phenomena, events, and social processes are by definition complex and must therefore be explained, as far as possible, by the full set of variables that are directly and indirectly involved in their occurrence.
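
As a concrete illustration of the model just described, here is a minimal sketch that fits a multiple regression by least squares with NumPy; the data, coefficient values, and variable names are made up purely for this example.

```python
# Minimal sketch: fitting Y on X1..Xk by least squares with NumPy.
# All data below are synthetic, purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3                              # n observations, k predictors
X = rng.normal(size=(n, k))               # independent variables X1..Xk
y = 2.0 + X @ np.array([1.5, -0.7, 0.3]) + rng.normal(scale=0.5, size=n)

# Design matrix with an intercept column, so b has k + 1 entries.
X_design = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print("estimated coefficients b:", np.round(b, 3))
```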

The most frequent application of multiple regression analysis is one in which both the dependent variable and the independent variables are continuous measurements on an interval or ratio scale. However, there are other possibilities: (1) the analysis can also be applied when a continuous dependent variable is related to a set of categorical variables; (2) multiple linear regression can likewise be applied when a nominal dependent variable is related to a set of continuous variables.

The location of the points in x-space is potentially important in determining the properties of the regression model. In particular, remote or outlying points can have a disproportionate impact on the parameter estimates, the standard errors, the predicted values, and the summary statistics of the model.

The hat matrix

The hat matrix plays an important role in identifying influential observations. H determines the covariance of the fitted values and of the residuals, because it can be shown that Var(ŷ) = σ²H and Var(e) = σ²(I − H), where H = X(X'X)^(-1)X'.

Attention is directed to the diagonal elements hii of the hat matrix H. The diagonal of the hat matrix is a standardized measure of the distance of the i-th observation from the center (or centroid) of the x-space. Thus, large diagonal elements indicate observations that are potentially influential because they lie far from the rest of the sample in x-space.
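
Continuing the fitting sketch above (X_design is an assumed name from that sketch), the hat matrix and its diagonal can be computed directly:

```python
# Minimal sketch: the hat matrix H = X (X'X)^(-1) X' and its diagonal h_ii,
# reusing X_design from the fitting sketch above.
H = X_design @ np.linalg.inv(X_design.T @ X_design) @ X_design.T
h = np.diag(H)                            # leverages h_ii, one per observation

# Large h_ii mark points far from the centroid of the x-space; the
# average of the h_ii always equals (k + 1) / n.
print("leverages:", np.round(h, 3))
```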

Note, however, that not all such points will be influential on the regression coefficients. Observations with both large diagonal elements and large residuals are the ones likely to be influential.

When using the cutoff value 2(k + 1)/n, one must also be careful to evaluate the magnitudes of both k + 1 and n; there will be cases in which 2(k + 1)/n > 1, and in such cases the cutoff does not apply.
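
A short continuation of the sketch above applies this cutoff, including the guard for the case 2(k + 1)/n > 1:

```python
# Minimal sketch: flagging high-leverage points with the 2(k + 1)/n cutoff,
# reusing X_design and h from the sketches above.
n_obs, p = X_design.shape                 # p = k + 1 coefficients
cutoff = 2 * p / n_obs

if cutoff > 1:
    print("2(k + 1)/n exceeds 1; the cutoff rule does not apply here")
else:
    print("high-leverage observations:", np.where(h > cutoff)[0])
```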

The calculation of this measure of influence, Cook's distance Di, is based on comparing the coefficient vector b with the vector b(i) obtained when the i-th observation has been removed.

Points with large values of Di have great influence on the least-squares estimate b; a point is commonly regarded as influential when Di > 1. If no such case occurs, we conclude that no observation has an unusual influence.
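
Continuing the sketch, Di can be computed without refitting the model n times, via the standard leverage shortcut Di = e_i² h_ii / (p s² (1 − h_ii)²), which is algebraically equivalent to comparing b with b(i):

```python
# Minimal sketch: Cook's distance via the leverage shortcut, reusing
# X_design, y, b, h, n_obs, and p from the sketches above.
e = y - X_design @ b                      # ordinary residuals
s2 = e @ e / (n_obs - p)                  # residual mean square
D = e**2 * h / (p * s2 * (1 - h) ** 2)

# Commonly, a point is called influential when D_i > 1.
print("max Cook's distance:", round(D.max(), 3))
print("influential points (D_i > 1):", np.where(D > 1)[0])
```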

Multicollinearity in regression is a condition that occurs when some predictor variables included in the
model are correlated with other predictors. Severe multicollinearity is problematic, because it can
increase the variance of the regression coefficients, making them unstable. The following are some
of the consequences of unstable coefficients:
The coefficients may appear insignificant even when there is a significant relationship between the predictor and the response.

The coefficients of highly correlated predictors vary widely from sample to sample.

Removing any strongly correlated term from the model markedly changes the estimated coefficients of the other highly correlated terms. The coefficients of highly correlated terms may even have the wrong sign.

To measure multicollinearity, you can examine the correlation structure of the predictor variables. You can also examine the variance inflation factors (VIF). The VIF measures how much the variance of an estimated regression coefficient increases when the predictors are correlated. If all VIFs are 1, there is no multicollinearity, but if some VIFs are greater than 1, the predictors are correlated. When a VIF is greater than 5, the regression coefficient for that term is poorly estimated.
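
Each VIF_j equals 1/(1 − R_j²), where R_j² comes from regressing the j-th predictor on all the others. A minimal sketch, reusing the predictor matrix X (without intercept) from the first example:

```python
# Minimal sketch: variance inflation factors, VIF_j = 1 / (1 - R_j^2),
# reusing the predictor matrix X from the fitting sketch above.
def vif(X):
    """VIF for each column of X (predictors only, no intercept column)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        xj = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, xj, rcond=None)
        resid = xj - others @ coef
        r2 = 1 - resid @ resid / np.sum((xj - xj.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

print("VIFs:", np.round(vif(X), 2))       # values > 5 signal trouble
```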

A transformation may be necessary when the residuals exhibit non-constant variance or non-normality.

Transformations can also be useful when the model exhibits significant lack of fit, which is particularly important in the analysis of response surface experiments. Suppose you include all the significant interactions and quadratic terms in the model, but the lack-of-fit test still indicates the need for higher-order terms. A transformation may eliminate the lack of fit.
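
One widely used family of response transformations is Box-Cox; the sketch below applies it with SciPy. The shift applied to make y positive is an assumption for this synthetic example, since Box-Cox requires strictly positive values.

```python
# Minimal sketch: a Box-Cox transformation of the response, one common way
# to stabilize variance or remove lack of fit. Reuses y from above.
from scipy import stats

y_pos = y - y.min() + 1.0                 # shift so all values are positive
y_trans, lam = stats.boxcox(y_pos)
print("estimated Box-Cox lambda:", round(lam, 3))
# Refit the regression on y_trans and re-examine the residual diagnostics.
```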

If the transformation corrects the problem, you can use regression analysis instead of other, possibly more complicated tests. A suitable text on regression analysis or on designed experiments can provide guidance on which transformations solve which problems.
