The General Linear Model is a phrase used to indicate a class of statistical models that includes simple linear regression analysis. Regression, also called linear regression analysis, is the predominant statistical tool used in the social sciences due to its simplicity and versatility.
Simple Linear Regression: The Basic Mathematical Model
Regression is based on the concept of the simple proportional relationship, also known as the straight line. We can express this idea mathematically!
Theoretical aside: All theoretical statements of relationship imply a mathematical theoretical structure. Just because it isn't explicitly stated doesn't mean that the math isn't implicit in the language itself!
Alternate mathematical notation for the straight line (don't ask why!):
10th Grade Geometry: $y = mx + b$
Statistics Literature: $Y_i = a + bX_i + e_i$
Econometrics Literature: $Y_i = B_0 + B_1 X_i + e_i$
There is 1 essential goal and there are 4 important concerns with any OLS Model
Why squared errors? Because (1) the errors expressed as raw deviations would sum to zero, just as deviations from a mean do, and (2) some feel that big errors should be more influential than small errors.
Therefore, we wish to find the values of a and b that produce the smallest sum of squared errors.
In mathematical jargon we seek to minimize the Unexplained Sum of Squares (USS), where: $USS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$
In order to do this, we must find parameter estimates which accomplish this minimization.
In calculus, if you wish to know where a function is at its minimum, you take the first derivative and set it equal to zero. In this case we must take partial derivatives since we have two parameters (a and b) to worry about.
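The minimization described above can be written out as a short derivation. This is a sketch in the statistics-literature notation ($Y_i = a + bX_i + e_i$); the symbols $\bar{X}$ and $\bar{Y}$ for the sample means are introduced here for illustration, and the result is the standard textbook solution rather than something recovered from the original slides.

```latex
% Write USS in terms of the two parameters a and b
USS(a, b) = \sum_{i=1}^{n} (Y_i - a - bX_i)^2

% Set both partial derivatives to zero (the "normal equations")
\frac{\partial\, USS}{\partial a} = -2 \sum_{i=1}^{n} (Y_i - a - bX_i) = 0
\frac{\partial\, USS}{\partial b} = -2 \sum_{i=1}^{n} X_i (Y_i - a - bX_i) = 0

% Solve the two equations for a and b
b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
a = \bar{Y} - b\bar{X}
```

The first normal equation also shows why the raw errors always sum to zero when the line includes an intercept.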
Tests of Inference
t-tests for coefficients
F-test for entire model
T-Tests
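Since we wish to make probability statements about our model, we must do tests of inference. Fortunately, the ratio of an estimated coefficient to its standard error follows a t distribution with n - 2 degrees of freedom when the true coefficient is zero:
$\frac{B}{se_B} \sim t_{n-2}$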
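As a rough illustration of the t-test for a slope, the sketch below fits a simple regression by least squares and forms the ratio of the slope to its standard error. The function name `simple_ols`, the data, and the standard-error formula (the usual textbook one, the square root of USS / (n - 2) divided by the sum of squared deviations of X) are assumptions added here, not taken from the original slides.

```python
import math

def simple_ols(x, y):
    """Fit Y = a + b*X by least squares and return the slope's t statistic.

    Hypothetical helper for illustration only.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of X and cross-products of X and Y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope estimate
    a = y_bar - b * x_bar         # intercept estimate
    # Unexplained sum of squares around the fitted line
    uss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2                    # two parameters were estimated
    se_b = math.sqrt(uss / df / sxx)
    return {"a": a, "b": b, "se_b": se_b, "t_b": b / se_b, "df": df}

# Made-up data, purely illustrative
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = simple_ols(x, y)
print(fit)
```

The resulting `t_b` would then be compared against a t distribution with `df` degrees of freedom, using a table or a statistics package.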
Goodness of Fit
Since we are interested in how well the model performs at reducing error, we need to develop a means of assessing that error reduction. Since the mean of the dependent variable represents a good benchmark for comparing predictions, we calculate the improvement in the prediction of Yi relative to the mean of Y (the best guess of Y with no other information).
Sums of Squares
This gives us the following 'sum-of-squares' measures: Total Variation = Explained Variation + Unexplained Variation
Note: Occasionally you will run across ESS and RSS, which generate confusion because each abbreviation can carry opposite meanings. ESS can be the error sum of squares, or the estimated or explained sum of squares. Likewise, RSS can be the residual sum of squares or the regression sum of squares. Hence the use of USS for the Unexplained sum of squares in this treatment.
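Written out symbolically, the decomposition can be sketched as follows. Here TSS stands for the total, ESS for the explained, and USS for the unexplained sum of squares; $\hat{Y}_i$ is the fitted value and $\bar{Y}$ the mean of Y. These are the standard definitions, supplied here as an assumption about what the original slide displayed.

```latex
% Total variation: observed values around the mean
TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2

% Explained variation: fitted values around the mean
ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2

% Unexplained variation: observed values around the fitted values
USS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2

% The decomposition
TSS = ESS + USS
```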
R2 (r-square)
The r2 (or R-square) is also called the coefficient of determination.
$r^2 = \frac{ESS}{TSS} = 1 - \frac{USS}{TSS}$
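A small numerical sketch may make the two forms of R-square concrete. The function name `sums_of_squares` and the data are made up for illustration; ESS is used here in the "explained" sense.

```python
def sums_of_squares(x, y):
    """Return (TSS, ESS, USS) for a least-squares line through x and y.

    Hypothetical helper for illustration only.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    y_hat = [a + b * xi for xi in x]                          # fitted values
    tss = sum((yi - y_bar) ** 2 for yi in y)                  # total
    ess = sum((yh - y_bar) ** 2 for yh in y_hat)              # explained
    uss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))     # unexplained
    return tss, ess, uss

# Made-up data, purely illustrative
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
tss, ess, uss = sums_of_squares(x, y)

r2_from_explained = ess / tss
r2_from_unexplained = 1 - uss / tss
print(tss, ess, uss, r2_from_explained, r2_from_unexplained)
```

Both expressions give the same value because the total variation splits exactly into its explained and unexplained parts when the line is fitted by least squares with an intercept.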
Goodness of fit
The correlation coefficient
A measure of how close the observations are to the regression line
It ranges between -1.0 and +1.0
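For illustration, the correlation coefficient can be computed directly from the deviations of X and Y. The function name `pearson_r` and the data are assumptions made for this sketch. In a simple regression with one explanatory variable, the square of this coefficient equals the R-square of the fitted line.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Hypothetical helper for illustration only.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data, purely illustrative
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(r, r ** 2)
```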
r2 (r-square)
Since R2 never decreases with the addition of a new variable, the adjusted R2 compensates for added explanatory variables.
In addition, the F test for the entire model must be adjusted to compensate for the changed degrees of freedom. Note that F increases as n or R2 increases and decreases as k (the number of explanatory variables) increases. Adding a variable will never decrease R2, but it can decrease the adjusted R2 or F. In addition, values of adjusted R2 below 0.0 are possible.
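The behavior described above can be checked numerically. The sketch below uses the common textbook forms of the adjusted R2 and the overall F statistic, with k taken to be the number of explanatory variables excluding the intercept. The original slides do not show their formulas, so these forms and the function names are assumptions.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for n observations and k explanatory variables (assumed form)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """Overall F statistic with k and n - k - 1 degrees of freedom (assumed form)."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# One explanatory variable, five observations, R2 = 0.6 (made-up values)
adj_small = adjusted_r2(0.6, n=5, k=1)
f_small = f_statistic(0.6, n=5, k=1)

# A weak model with several variables: the adjusted R2 drops below zero
adj_negative = adjusted_r2(0.05, n=10, k=3)

# Holding R2 fixed, F falls as k rises and grows as n rises
f_more_vars = f_statistic(0.6, n=5, k=2)
f_more_obs = f_statistic(0.6, n=50, k=1)
print(adj_small, f_small, adj_negative, f_more_vars, f_more_obs)
```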