Regression and its extensions are the most commonly used tools in finance.
They help us understand the relationships between many variables.
We begin with simple regression, which describes the relationship between two
variables, X and Y.
[Figure: XY-plot with a vertical axis from 0 to 180,000 and a horizontal axis from 0 to 18,000]
Example:
Y = house price, X = lot size
$Y = 34000 + 7X$
$\alpha = 34000$
$\beta = 7$
$e = Y - \alpha - \beta X$
e = error
$u = Y - \hat{\alpha} - \hat{\beta} X$
u = residual
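As a quick illustration, the example line can be evaluated in a few lines of Python. This is only a sketch: the function name `predicted_price` and the observed sale price of 75000 are hypothetical, chosen to show how a point on the line and a deviation from it are computed.

```python
# Example line from the text: Y = 34000 + 7 X
ALPHA = 34000  # intercept
BETA = 7       # slope


def predicted_price(lot_size):
    """Value of Y on the line for a given lot size X."""
    return ALPHA + BETA * lot_size


# Hypothetical house (made-up numbers): lot size 5000, observed price 75000.
lot_size = 5000
observed_price = 75000

on_line = predicted_price(lot_size)
deviation = observed_price - on_line  # Y - alpha - beta * X

print(on_line)
print(deviation)
```

A positive deviation means the observed point sits above the line; a negative one means it sits below.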
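[FIGURE 4.1: XY-plot of the line $Y = \hat{\alpha} + \hat{\beta} X$, with labels A and B and the residuals $u_1$, $u_2$ and $u_3$ marked]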
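Solution:
$$\hat{\beta} = \frac{\sum_i (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_i (X_i - \bar{X})^2}$$
and
$$\hat{\alpha} = \bar{Y} - \hat{\beta} \bar{X}$$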
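The two formulas translate directly into code. The sketch below uses plain Python and a small made-up data set of lot sizes and house prices (the numbers are hypothetical and were chosen so that the estimates come out at an intercept of 34000 and a slope of 7); the function name `ols` is likewise just an illustrative choice.

```python
def ols(x, y):
    """Return (alpha_hat, beta_hat) for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of (Y_i - Y_bar)(X_i - X_bar)
    s_xy = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
    # Denominator: sum of (X_i - X_bar)^2
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical data, for illustration only.
lot_size = [2000, 4000, 6000, 8000, 10000]
price = [49000, 60000, 78000, 88000, 105000]

alpha_hat, beta_hat = ols(lot_size, price)
print(alpha_hat, beta_hat)
```

Note that the slope is computed first, because the intercept formula depends on it.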
Jargon of Regression
Y = dependent variable.
X = explanatory (or independent) variable.
$\alpha$ and $\beta$ are coefficients.
$\hat{\alpha}$ and $\hat{\beta}$ are OLS estimates of the coefficients.
Run a regression of Y on X
Example:
X = lot size, Y = house price
$$\frac{dY}{dX} = \beta$$
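One way to see this numerically is to check that, along a straight line, the change in Y per unit change in X is the same wherever it is measured. The sketch below reuses the line $Y = 34000 + 7X$ from the house price example; the starting points and step sizes are arbitrary choices made for illustration.

```python
ALPHA, BETA = 34000, 7  # line from the house price example


def line(x):
    return ALPHA + BETA * x


# Change in Y divided by change in X, at different starting points and step sizes.
slopes = []
for x0, dx in [(1000, 1), (5000, 100), (12000, 2500)]:
    slope = (line(x0 + dx) - line(x0)) / dx
    slopes.append(slope)
    print(x0, dx, slope)
```

Every ratio equals the slope coefficient, which is why $\beta$ is read as the marginal effect of X on Y.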
$\hat{\beta} = 0.000842$
Interpretation of $\hat{\beta}$
Residual Analysis
How well does the regression line fit through the points on the XY-plot?
Are there any outliers which do not fit the general pattern?
Concepts:
1. Fitted value of dependent variable:
$\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$
$Y_i$ does not lie on the regression line; $\hat{Y}_i$ does lie on the line.
What would you do if you did not know $Y_i$, but only $X_i$, and wanted to predict $Y_i$?
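A natural answer is to use the fitted value. The sketch below computes fitted values and residuals for a small made-up data set (hypothetical lot sizes and prices) and then predicts the price for a lot size that is not in the data; the value 7000 is an arbitrary illustrative choice.

```python
# Hypothetical data, for illustration only.
lot_size = [2000, 4000, 6000, 8000, 10000]
price = [49000, 60000, 78000, 88000, 105000]

# OLS estimates of the intercept and slope.
n = len(lot_size)
x_bar = sum(lot_size) / n
y_bar = sum(price) / n
beta_hat = (sum((y - y_bar) * (x - x_bar) for x, y in zip(lot_size, price))
            / sum((x - x_bar) ** 2 for x in lot_size))
alpha_hat = y_bar - beta_hat * x_bar

# Fitted values lie on the regression line; residuals are the gaps to the data.
fitted = [alpha_hat + beta_hat * x for x in lot_size]
residuals = [y - f for y, f in zip(price, fitted)]

# Prediction for a lot size whose price is unknown.
new_lot_size = 7000
predicted_price = alpha_hat + beta_hat * new_lot_size

print(fitted)
print(residuals)
print(predicted_price)
```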
$$TSS = \sum_i (Y_i - \bar{Y})^2$$
Note similarity to formula for variance.
$$RSS = \sum_i (\hat{Y}_i - \bar{Y})^2$$
$$R^2 = 1 - \frac{SSR}{TSS}$$
where $SSR = \sum_i u_i^2$ is the sum of squared residuals,
or (equivalently):
$$R^2 = \frac{RSS}{TSS}$$
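The equivalence of the two expressions can be checked numerically. The sketch below uses a small made-up data set (hypothetical lot sizes and prices) and takes SSR to be the sum of squared residuals; it computes TSS, RSS and SSR and then both versions of $R^2$.

```python
# Hypothetical data, for illustration only.
x = [2000, 4000, 6000, 8000, 10000]
y = [49000, 60000, 78000, 88000, 105000]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
beta_hat = (sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
rss = sum((fi - y_bar) ** 2 for fi in fitted)          # variation captured by the line
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # sum of squared residuals

r2_from_ssr = 1 - ssr / tss
r2_from_rss = rss / tss

print(tss, rss, ssr)
print(r2_from_ssr, r2_from_rss)
```

The two values agree because TSS = RSS + SSR when the regression includes an intercept.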
REMEMBER:
$R^2$ is a measure of fit (i.e. how well the regression line fits the data points).
Properties of $R^2$
$0 \le R^2 \le 1$
$R^2 = 1$ means perfect fit. All data points lie exactly on the regression line (i.e. SSR = 0).
$R^2 = 0$ means X does not have any explanatory power for Y whatsoever (i.e. X
has no influence on Y).
Bigger values of $R^2$ imply X has more explanatory power for Y.
$R^2$ is equal to the squared correlation between X and Y (i.e. $R^2 = r_{XY}^2$).
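The last property is easy to verify on a small example. The sketch below, using made-up data (hypothetical lot sizes and prices), computes the correlation between X and Y directly and compares its square with $R^2$ from the regression, where SSR is taken to be the sum of squared residuals.

```python
from math import sqrt

# Hypothetical data, for illustration only.
x = [2000, 4000, 6000, 8000, 10000]
y = [49000, 60000, 78000, 88000, 105000]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)  # this is also TSS

# Correlation between X and Y.
r_xy = s_xy / sqrt(s_xx * s_yy)

# R^2 from the simple regression of Y on X.
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar
ssr = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ssr / s_yy

print(r_xy ** 2, r_squared)
```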
Properties of $R^2$ (cont.)
Nonlinearity
So far we have considered the regression of Y on X:
$Y = \alpha + \beta X + e$
We could instead regress Y (or ln(Y) or $Y^2$) on $X^2$ (or 1/X or ln(X) or $X^3$, etc.), and the
same techniques and results would hold.
E.g.
$Y = \alpha + \beta X^2 + e$
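In practice this only requires transforming the explanatory variable before applying the usual OLS formulas. The sketch below uses made-up data generated exactly from $Y = 2 + 3X^2$ (a hypothetical relationship, with no error term, chosen so the result is easy to check) and regresses Y on the new variable $Z = X^2$.

```python
def ols(x, y):
    """Return (alpha_hat, beta_hat) for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    beta_hat = (sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
                / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - beta_hat * x_bar, beta_hat


# Hypothetical data following Y = 2 + 3 X^2 exactly.
x = [1, 2, 3, 4]
y = [5, 14, 29, 50]

# Transform the explanatory variable, then run the same regression as before.
z = [xi ** 2 for xi in x]
alpha_hat, beta_hat = ols(z, y)

print(alpha_hat, beta_hat)
```

The estimation step is unchanged; only the variable passed to it differs. The interpretation of the slope does change, since it now measures the effect of a one-unit change in $X^2$ rather than in X.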
Chapter Summary
1. Simple regression quantifies the effect of an explanatory variable, X, on a dependent
variable, Y.
2. The relationship between Y and X is assumed to take the form $Y = \alpha + \beta X$, where $\alpha$ is the
intercept and $\beta$ the slope of a straight line. This is called the regression line.
3. The regression line is the best fitting line through an XY graph.
4. No line will ever fit perfectly through all the points in an XY graph. The vertical distance
between each point and the line is called a residual.
5. The ordinary least squares, OLS, estimator is the one which minimizes the sum of
squared residuals.
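The minimizing property in point 5 can be illustrated by comparing the sum of squared residuals at the OLS estimates with the value obtained from other candidate lines. The sketch below uses made-up data (hypothetical lot sizes and prices) and a handful of arbitrarily chosen alternative intercepts and slopes.

```python
# Hypothetical data, for illustration only.
x = [2000, 4000, 6000, 8000, 10000]
y = [49000, 60000, 78000, 88000, 105000]


def sum_sq_residuals(alpha, beta):
    """Sum of squared residuals for the line Y = alpha + beta * X."""
    return sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))


# OLS estimates from the standard formulas.
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
beta_hat = (sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
alpha_hat = y_bar - beta_hat * x_bar

best = sum_sq_residuals(alpha_hat, beta_hat)

# Arbitrary alternative lines: shift the intercept, the slope, or both.
alternatives = [
    (alpha_hat + 1000, beta_hat),
    (alpha_hat - 1000, beta_hat),
    (alpha_hat, beta_hat + 0.5),
    (alpha_hat, beta_hat - 0.5),
    (alpha_hat + 3000, beta_hat - 0.5),
]
others = [sum_sq_residuals(a, b) for a, b in alternatives]

print(best)
print(others)
```

Each alternative line gives a larger sum of squared residuals than the OLS line. A few spot checks are of course not a proof, only an illustration of the property.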