b)
r = SSay / sqrt(SSaa × SSyy) = 0.8696
b = SSay / SSaa = 1.1
c = ȳ - bĀ = -4
Ŷi = -4 + 1.1Ai
c) Using the equation, predicted income at age 70 is -4 + 1.1(70) = 73.
Intuitively, however, the data show income starting to drop again at age 60, so income at 70 would
probably be less than 50, most likely because of retirement.
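The prediction in part c) is just the fitted line evaluated at age 70. A minimal sketch, assuming the estimates from part b); the function name `predict_income` is hypothetical and used only for illustration:

```python
def predict_income(age, intercept=-4.0, slope=1.1):
    """Fitted value from the estimated line Y = -4 + 1.1 * A (part b)."""
    return intercept + slope * age


# Evaluate the line at age 70, as in part c).
prediction = predict_income(70)
print(round(prediction, 2))
```

The line is only a description of the sample it was fitted on, which is why the intuitive answer in part c) can differ from the number the equation gives.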
2 a)
H0: B2 = 0
H1: B2 ≠ 0
t crit = t(35, 0.05/2) = 2.0301
Rejection region: T < -2.0301 or T > 2.0301
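The two-sided rejection rule above can be written as a one-line check. This is only a sketch: the critical value is copied from the answer, and the two t statistics passed in at the end are hypothetical values chosen to show each outcome, not results from the assignment.

```python
T_CRIT = 2.0301  # two-sided 5% critical value for t with 35 df, from the answer


def reject_null(t_stat, t_crit=T_CRIT):
    """Return True when t_stat falls in the rejection region |T| > t_crit."""
    return abs(t_stat) > t_crit


# Hypothetical test statistics, for illustration only.
inside_region = reject_null(2.5)
outside_region = reject_null(-1.2)
print(inside_region, outside_region)
```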
iv)
c) R^2 = 0.2763
d) The model's functional form should be changed: the U shape of the residuals in the residual
plot suggests that the relationship between the variables is not linear.
3.
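A)
      Source |         SS    df          MS        Number of obs =       7
-------------+--------------------------------     F(1, 5)       =    0.85
       Model | 79.1208602     1  79.1208602        Prob > F      =  0.3994
    Residual | 466.593426     5  93.3186851        R-squared     =  0.1450
-------------+--------------------------------     Adj R-squared = -0.0260
       Total | 545.714286     6   90.952381        Root MSE      =  9.6602

             |     Coef.  Std. Err.      t   P>|t|   95% CI upper limit
-------------+----------------------------------------------------------
           b |  .2892977   .3141838   0.92   0.399             1.096933
       _cons |   73.6583   10.20055   7.22   0.001             99.87966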
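The summary statistics in this output all follow from the sums of squares and degrees of freedom. A short sketch that recomputes them from the printed values, as a consistency check on the reconstructed table; the variable names are my own, and the number of observations is taken as model df + residual df + 1:

```python
import math

# Values copied from the regression output in 3 A).
ss_model, df_model = 79.1208602, 1
ss_resid, df_resid = 466.593426, 5

ss_total = ss_model + ss_resid
n_obs = df_model + df_resid + 1

ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid

f_stat = ms_model / ms_resid                      # F(1, 5)
r_squared = ss_model / ss_total                   # share of variation explained
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid
root_mse = math.sqrt(ms_resid)                    # standard error of the regression

print(f"F = {f_stat:.2f}, R2 = {r_squared:.4f}, "
      f"adj R2 = {adj_r_squared:.4f}, Root MSE = {root_mse:.4f}")
```

The adjusted R-squared comes out negative because the single regressor explains less of the variation than the lost degree of freedom costs.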
B)
[Figure: plot with fitted values; horizontal axis b (ticks 10 to 50), vertical axis ticks 75 to 95]
C)
. ttest b==0

One-sample t test
Variable |  Obs       Mean  Std. Err.  Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------
       b |    7   30.31571    4.74434   12.55234    18.70673    41.9247

mean = mean(b)                       t = 6.3899
Ho: mean = 0        degrees of freedom = 6

Ha: mean < 0     Pr(T < t) = 0.9997
Ha: mean != 0    Pr(|T| > |t|) = 0.0007
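The t statistic reported by the one-sample test can be reproduced by hand from the mean and standard deviation. A minimal sketch, assuming 7 observations (degrees of freedom plus one); the variable names are illustrative:

```python
import math

# Values copied from the ttest output; n_obs is inferred as df + 1.
n_obs = 7
mean_b = 30.31571
sd_b = 12.55234

std_err = sd_b / math.sqrt(n_obs)   # standard error of the mean
t_stat = (mean_b - 0) / std_err     # test of Ho: mean = 0

print(f"Std. Err. = {std_err:.5f}, t = {t_stat:.4f}")
```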
res
8.453517
-11.49829
-.1867665
-6.822286
9.648299
-7.644862
8.050388
Therefore the largest residuals are Boston (-11.4983) and Milwaukee (+9.6483).
4.
A)
. summ prpblck income

Variable |  Obs       Mean  Std. Dev.     Min        Max
---------+-----------------------------------------------
 prpblck |  409   .1134864   .1824165       0   .9816579
  income |  409   47053.78   13179.29   15919     136529
B)
Estimated equation: psoda = 0.9563196 + 0.1149882(prpblck) + 1.60e-06(income)
R^2 = 0.0642
Sample size = 401
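      Source |         SS    df          MS        Number of obs =     401
-------------+--------------------------------     F(2, 398)     =   13.66
       Model | .202552215     2  .101276107        Prob > F      =  0.0000
    Residual | 2.95146493   398  .007415741        R-squared     =  0.0642
-------------+--------------------------------     Adj R-squared =  0.0595
       Total | 3.15401715   400  .007885043        Root MSE      =  .08611

       psoda |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
-------------+------------------------------------------------------------
     prpblck |  .1149882   .0260006   4.42   0.000   .0638724    .1661039
      income |  1.60e-06   3.62e-07   4.43   0.000   8.91e-07    2.31e-06
       _cons |  .9563196    .018992  50.35   0.000   .9189824    .9936568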
C)
The coefficient on prpblck is smaller in this simple regression because income and prpblck are
negatively related, so with income left out the prpblck coefficient captures the effects of both
variables. Income has a positive effect on psoda, so omitting it biases the prpblck coefficient
downward. Holding income constant gives a larger and more accurate estimate of the relationship
between prpblck and psoda. This is also reflected in the R-squared values: both are low, but the
previous regression produced the higher one (0.0642 against 0.0181).
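The direction of the change can be checked against the standard omitted-variable identity for OLS on the same sample: simple-regression slope = multiple-regression slope + (income coefficient) × (slope from regressing income on prpblck). The sketch below backs out the implied slope of income on prpblck from the three printed coefficients. That slope is not reported anywhere in this answer, and because the income coefficient is printed to only three significant digits the result is a rough figure, not an estimate.

```python
# Coefficients copied from the two psoda regressions in this answer.
beta_prpblck_multiple = 0.1149882  # psoda on prpblck and income
beta_income_multiple = 1.60e-06    # as printed, three significant digits only
beta_prpblck_simple = 0.0649269    # psoda on prpblck alone

# Change in the prpblck coefficient when income is omitted.
bias = beta_prpblck_simple - beta_prpblck_multiple

# Implied slope from regressing income on prpblck (approximate, see above).
implied_slope = bias / beta_income_multiple

print(f"bias = {bias:.7f}, implied slope of income on prpblck = {implied_slope:.0f}")
```

A negative implied slope is consistent with the explanation given: areas with a higher proportion of Black residents have lower income on average in this sample, and income raises psoda, so the simple regression understates the prpblck coefficient.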
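      Source |         SS    df          MS        Number of obs =     401
-------------+--------------------------------     F(1, 399)     =    7.34
       Model | .057010466     1  .057010466        Prob > F      =  0.0070
    Residual | 3.09700668   399  .007761922        R-squared     =  0.0181
-------------+--------------------------------     Adj R-squared =  0.0156
       Total | 3.15401715   400  .007885043        Root MSE      =   .0881

       psoda |     Coef.  Std. Err.       t   P>|t|   [95% Conf. Interval]
-------------+-------------------------------------------------------------
     prpblck |  .0649269    .023957    2.71   0.007   .0178292    .1120245
       _cons |  1.037399   .0051905  199.87   0.000   1.027195    1.047603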
D)
. reg lpsoda prpblck lincome

      Source |         SS    df          MS        Number of obs =     401
-------------+--------------------------------     F(2, 398)     =   14.54
       Model | .196020672     2  .098010336        Prob > F      =  0.0000
    Residual | 2.68272938   398  .006740526        R-squared     =  0.0681
-------------+--------------------------------     Adj R-squared =  0.0634
       Total | 2.87875005   400  .007196875        Root MSE      =   .0821

      lpsoda |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
-------------+------------------------------------------------------------
     prpblck |  .1215803   .0257457   4.72   0.000   .0709657    .1721948
     lincome |  .0765114   .0165969   4.61   0.000   .0438829    .1091399
       _cons |  -.793768   .1794337  -4.42   0.000  -1.146524   -.4410117

R squared = 0.0681
N = 401
Log(psoda) = 0.1215803
E)
      Source |         SS    df          MS        Number of obs =     401
-------------+--------------------------------     F(3, 397)     =   12.60
       Model | .250340622     3  .083446874        Prob > F      =  0.0000
    Residual | 2.62840943   397  .006620679        R-squared     =  0.0870
-------------+--------------------------------     Adj R-squared =  0.0801
       Total | 2.87875005   400  .007196875        Root MSE      =  .08137

      lpsoda |     Coef.  Std. Err.      t   P>|t|   95% CI upper limit
-------------+----------------------------------------------------------
     prpblck |  .0728072   .0306756   2.37   0.018             .1331141
     lincome |  .1369553   .0267554   5.12   0.000             .1895553
      prppov |    .38036   .1327903   2.86   0.004             .6414201
       _cons | -1.463333   .2937111  -4.98   0.000            -.8859092
The estimate of B1 falls from 0.1215803 to 0.0728072, suggesting that the effect of prppov was
previously being picked up by prpblck: with prppov omitted it sat in the error term U, which was
therefore correlated with both prpblck and Y (lpsoda). However, the correlations between the three
variables show a low value for psoda and prppov (0.0260) and a fairly high correlation between
prpblck and prppov (0.6795), so it may also be a multicollinearity issue.
. corr psoda prpblck prppov
(obs=401)

         |   psoda  prpblck   prppov
---------+---------------------------
   psoda |  1.0000
 prpblck |  0.1344   1.0000
  prppov |  0.0260   0.6795   1.0000
F)
. corr lincome prppov
(obs=409)

         | lincome   prppov
---------+------------------
 lincome |  1.0000
  prppov | -0.8385   1.0000
Yes, it is what I would expect. A correlation close to -1 makes sense: as income goes up,
poverty goes down.
G) Including log(income) and prppov in the same regression model causes multicollinearity, the
problem of two or more independent variables in a regression being highly correlated (here the
correlation is -0.8385). But as prpblck is not too highly correlated with either of these variables,
it is still viable to include both in the regression when looking at the causal effect of prpblck
on psoda.
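One way to put a number on how strongly two regressors overlap is a variance inflation factor, 1 / (1 - r^2), computed from their pairwise correlation. The sketch below applies it to the two correlations reported above. It is a simplification I am assuming for illustration: in the three-regressor model the proper VIF uses the R-squared from regressing each variable on all the others, so these pairwise figures are lower bounds. The function name `pairwise_vif` is hypothetical.

```python
def pairwise_vif(correlation):
    """Variance inflation factor implied by a single pairwise correlation."""
    return 1.0 / (1.0 - correlation ** 2)


# Correlations copied from the corr output in parts E) and F).
vif_lincome_prppov = pairwise_vif(-0.8385)
vif_prpblck_prppov = pairwise_vif(0.6795)

print(f"lincome/prppov: {vif_lincome_prppov:.2f}")
print(f"prpblck/prppov: {vif_prpblck_prppov:.2f}")
```

The lincome/prppov pair inflates coefficient variances considerably more than the prpblck/prppov pair, which matches the argument that the collinearity mainly affects the two control variables.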