
Basic Econometrics Chapter 2:

THE NATURE OF REGRESSION ANALYSIS

Historical origin of the term Regression


The term REGRESSION was introduced by Francis Galton. He observed a tendency for tall parents to have tall children and for short parents to have short children, but the average height of children born to parents of a given height tended to move (or "regress") toward the average height in the population as a whole (F. Galton, "Family Likeness in Stature").

Galton's law was confirmed by Karl Pearson: the average height of sons of a group of tall fathers was less than their fathers' height, and the average height of sons of a group of short fathers was greater than their fathers' height, thus "regressing" tall and short sons alike toward the average height of all men (K. Pearson and A. Lee, "On the Laws of Inheritance"). In Galton's words, this was "regression to mediocrity".

Statistical vs. Deterministic Relationships


In regression analysis we are concerned with STATISTICAL DEPENDENCE among variables (not functional or deterministic dependence). We essentially deal with RANDOM or STOCHASTIC variables, that is, variables that have probability distributions.

Regression vs. Causation


Regression does not necessarily imply causation. A statistical relationship cannot logically imply causation: "A statistical relationship, however strong and however suggestive, can never establish causal connection: our ideas of causation must come from outside statistics, ultimately from some theory or other" (M. G. Kendall and A. Stuart, The Advanced Theory of Statistics).

Regression vs Correlation
Correlation Analysis: the primary objective is to measure the strength or degree of linear association between two variables (both are assumed to be random).

Regression Analysis: we try to estimate or predict the average value of one variable (dependent, and assumed to be stochastic) on the basis of the fixed values of other variables (independent, and nonstochastic).
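The contrast between the two analyses can be illustrated with a small sketch. The data below are made up for illustration, and the helper names (`deviation_sums`, `correlation`, `slope`) are hypothetical, not taken from the slides. The sketch assumes the standard two-variable formulas: the correlation coefficient treats both variables symmetrically, while a regression slope depends on which variable is treated as dependent.

```python
from math import sqrt
from statistics import mean

def deviation_sums(x, y):
    """Return sums of squared and cross deviations from the sample means."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxx, syy, sxy

def correlation(x, y):
    """Sample correlation coefficient; symmetric in x and y."""
    sxx, syy, sxy = deviation_sums(x, y)
    return sxy / sqrt(sxx * syy)

def slope(dependent, explanatory):
    """Slope from regressing `dependent` on `explanatory`; not symmetric."""
    sxx, _, sxy = deviation_sums(explanatory, dependent)
    return sxy / sxx

# Made-up illustrative data (an assumption, not from the source)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r_xy = correlation(x, y)      # same value as correlation(y, x)
r_yx = correlation(y, x)
slope_y_on_x = slope(y, x)    # regression of Y on X
slope_x_on_y = slope(x, y)    # regression of X on Y, generally different
```

Swapping the roles of the variables leaves the correlation unchanged but changes the regression slope, which is why regression requires deciding which variable is dependent.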

1-6. Terminology and Notation


Dependent Variable      Explanatory Variable(s)
Explained Variable      Independent Variable(s)
Predictand              Predictor(s)
Regressand              Regressor(s)
Response                Stimulus or control variable(s)
Endogenous              Exogenous

Prof.VuThieu

May 2004

The Nature and Sources of Data for Econometric Analysis

1) Types of Data: time series data; cross-sectional data; pooled data
2) The Sources of Data
3) The Accuracy of Data

The method of ordinary least squares (OLS)


OLS estimators are expressed solely in terms of observable quantities. They are point estimators. The sample regression line passes through the sample means of X and Y.
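The properties above can be checked numerically with a minimal sketch for the two-variable case. It assumes the usual textbook formulas (slope = sum of cross deviations divided by sum of squared X deviations, intercept = mean of Y minus slope times mean of X). The function name `ols_fit`, the symbols `b1` and `b2`, and the data are illustrative assumptions, not from the slides.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) for the two-variable model y = b1 + b2*x.

    Both estimates are computed only from the observed x and y values.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("X values must not all be the same")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up illustrative data (an assumption, not from the source)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b1, b2 = ols_fit(x, y)

# The fitted line evaluated at the mean of X equals the mean of Y.
fitted_at_mean = b1 + b2 * mean(x)

# Residuals from the fitted line; with an intercept they sum to zero.
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
```

Each call returns a single pair of numbers, which is what makes these point estimators: one value per parameter rather than an interval.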

The assumptions underlying the method of least squares


Ass 1: Linear regression model (linear in the parameters)
Ass 2: X values are fixed in repeated sampling
Ass 3: Zero mean value of $u_i$: $E(u_i \mid X_i) = 0$
Ass 4: Homoscedasticity, or equal variance of $u_i$: $\mathrm{Var}(u_i \mid X_i) = \sigma^2$ [vs. heteroscedasticity]
Ass 5: No autocorrelation between the disturbances: $\mathrm{Cov}(u_i, u_j \mid X_i, X_j) = 0$ for $i \neq j$ [vs. positive or negative correlation]
Ass 6: Zero covariance between $u_i$ and $X_i$: $\mathrm{Cov}(u_i, X_i) = E(u_i X_i) = 0$
Ass 7: The number of observations n must be greater than the number of parameters to be estimated
Ass 8: Variability in X values: they must not all be the same
Ass 9: The regression model is correctly specified
Ass 10: There is no perfect multicollinearity between the Xs
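Assumptions 7, 8, and 10 concern the data rather than the unobserved disturbances, so they can be checked directly. The following is a rough sketch under stated assumptions: the function names are hypothetical, the data are made up, and the collinearity check covers only the case of two regressors plus a constant, where an exact linear relation holds when the Cauchy-Schwarz inequality for the deviations becomes an equality.

```python
from statistics import mean

def enough_observations(n, n_params):
    """Ass 7: more observations than parameters to be estimated."""
    return n > n_params

def has_variation(x):
    """Ass 8: the X values must not all be the same."""
    return len(set(x)) > 1

def perfectly_collinear(x2, x3, tol=1e-12):
    """Ass 10 (two-regressor case only): True if x3 is an exact linear
    function of x2, or if either regressor is constant."""
    m2, m3 = mean(x2), mean(x3)
    s22 = sum((a - m2) ** 2 for a in x2)
    s33 = sum((b - m3) ** 2 for b in x3)
    s23 = sum((a - m2) * (b - m3) for a, b in zip(x2, x3))
    # Cauchy-Schwarz: s23**2 <= s22*s33, with equality for exact collinearity.
    return abs(s22 * s33 - s23 ** 2) <= tol * max(s22 * s33, 1.0)

# Made-up illustrative regressors (assumptions, not from the source)
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
x3_exact = [2.0, 4.0, 6.0, 8.0, 10.0]   # exactly 2 * x2
x3_other = [2.0, 4.0, 5.0, 4.0, 5.0]    # not an exact linear function of x2

ok_n = enough_observations(len(x2), n_params=2)
ok_var = has_variation(x2)
collinear_exact = perfectly_collinear(x2, x3_exact)
collinear_other = perfectly_collinear(x2, x3_other)
```

The remaining assumptions involve the disturbances $u_i$, which are not observed, so they cannot be verified by inspecting the data in this way.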
