
Personal income and Gross Domestic Product in the United States, 1929–2008

Description of the proposed model


In this model we try to find the dependence of personal income (PI) on Gross Domestic Product (GDP). From http://www.bea.gov we took data on personal income and Gross Domestic Product in the United States for the years 1929 to 2008. We assume that personal income is in some way connected with Gross Domestic Product. From a simple classical economic model we can postulate the functional dependence:
PI = a * GDP + b
PI is the dependent variable of our data set, and GDP is the independent variable of the model. a and b are the coefficients we will try to estimate. A model of this form is called simple linear regression.
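The coefficients of such a model can be estimated by ordinary least squares, which has a closed form for a single regressor: a = cov(GDP, PI) / var(GDP) and b = mean(PI) - a * mean(GDP). A minimal sketch in Python (the numbers are made-up illustrative data, not the BEA series):

```python
# Closed-form ordinary least squares for a simple linear regression
# y = a * x + b.  The data below are illustrative, not the BEA series.

def ols(x, y):
    """Return (a, b) minimizing sum((y - a*x - b)**2)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    a = cov / var
    b = my - a * mx
    return a, b

gdp = [1.0, 2.0, 3.0, 4.0]
pi  = [3.0, 5.0, 7.0, 9.0]   # constructed so that pi = 2*gdp + 1 exactly
a, b = ols(gdp, pi)
print(a, b)
```

On this constructed data the estimator recovers a = 2 and b = 1 exactly, since the points lie on a line.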
Data set

year  personal   gdp       year  personal   gdp
      income                     income
1929      85,1    8,813    1969     778,5    38,356
1930      76,3    8,054    1970     838,8    38,422
1931      65,3    7,537    1971     903,5    39,713
1932      49,9    6,557    1972     992,7    41,815
1933      46,9    6,473    1973   1 110,7    44,224
1934      53,7    7,173    1974   1 222,6    44,001
1935      60,4    7,812    1975   1 335,0    43,916
1936      68,7    8,828    1976   1 474,8    46,256
1937      74,1    9,281    1977   1 633,2    48,391
1938      68,4    8,961    1978   1 837,7    51,085
1939      72,9    9,684    1979   2 062,2    52,699
1940      78,5   10,534    1980   2 307,9    52,579
1941      96,1   12,337    1981   2 591,3    53,904
1942     123,5   14,622    1982   2 775,3    52,860
1943     152,2   17,021    1983   2 960,7    55,249
1944     166,0   18,402    1984   3 289,5    59,220
1945     171,7   18,196    1985   3 526,7    61,666
1946     178,6   16,190    1986   3 722,4    63,804
1947     191,0   16,039    1987   3 947,4    65,958
1948     209,8   16,738    1988   4 253,7    68,684
1949     207,1   16,651    1989   4 587,8    71,116
1950     229,0   18,104    1990   4 878,6    72,451
1951     258,0   19,507    1991   5 051,0    72,329
1952     275,4   20,254    1992   5 362,0    74,734
1953     291,9   21,183    1993   5 558,5    76,731
1954     294,5   21,039    1994   5 842,5    79,816
1955     316,1   22,541    1995   6 152,3    81,814
1956     339,6   22,979    1996   6 520,6    84,842
1957     358,7   23,440    1997   6 915,1    88,658
1958     369,0   23,217    1998   7 423,0    92,359
1959     392,8   24,868    1999   7 802,4    96,469
1960     411,5   25,484    2000   8 429,7   100,000
1961     429,0   26,077    2001   8 724,1   100,751
1962     456,7   27,658    2002   8 881,9   102,362
1963     479,6   28,868    2003   9 163,6   104,931
1964     514,6   30,545    2004   9 727,2   108,748
1965     555,7   32,506    2005  10 269,8   111,944
1966     603,9   34,625    2006  10 993,9   115,054
1967     648,3   35,496    2007  11 663,2   117,388
1968     712,0   37,208    2008  12 100,6   118,692
(in mln $)

Estimation of the regression model by STATA

. regress GDP PI

      Source |       SS       df       MS              Number of obs =      80
-------------+------------------------------           F(  1,    78) = 1114.74
       Model |  79955.8935     1  79955.8935           Prob > F      =  0.0000
    Residual |  5594.65044    78  71.7262876           R-squared     =  0.9346
-------------+------------------------------           Adj R-squared =  0.9338
       Total |   85550.544    79  1082.91828           Root MSE      =  8.4691

------------------------------------------------------------------------------
         GDP |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          PI |   .0094535   .0002831    33.39   0.000     .0088898    .0100172
       _cons |   20.64661   1.203405    17.16   0.000     18.25081    23.04241
------------------------------------------------------------------------------

The R-squared is 0.93 and the adjusted R-squared is also 0.93. Both coefficients are close to 1, which means the regression model describes the connection between PI and GDP well: most of the variation of the variables is explained by the model. The constant coefficient b has an economic interpretation: it is the so-called autonomous income. The coefficient a is very small, about 0.01; its interpretation is that roughly 1% of GDP is personal income. The standard errors are small, so the coefficients are estimated quite precisely. At the 5% significance level (and even below it) both coefficients (a and b) are statistically significant: the null hypothesis that a coefficient is zero is clearly rejected for each of them. With 95% probability a lies between 0.009 and 0.010 (rounded to three decimal places), which we consider precise enough. Coefficient b lies between 18 and 23, also a reasonably tight interval. Finally, note that the Root MSE is about 8.5. It serves as a loss measure describing the errors of the model; the number is small, which is another argument that the linear regression model fits well. So we obtained, approximately:
PI = 0.01 * GDP + 20
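The t statistics and confidence bounds in the output can be reproduced from a coefficient and its standard error: t = coef / se, and the 95% interval is coef ± t* · se, where t* is the 0.975 quantile of Student's t with 78 degrees of freedom (assumed here from a t table to be about 1.9908). A quick check in Python against the printed numbers for the PI coefficient:

```python
# Reproduce the t statistic and 95% confidence interval for the PI
# coefficient from the Stata output above.
# Assumption: t_crit ~ 1.9908 is the 0.975 quantile of t(78), from a table.

coef, se = 0.0094535, 0.0002831
t_crit = 1.9908

t_stat = coef / se
low = coef - t_crit * se
high = coef + t_crit * se
print(round(t_stat, 2), round(low, 7), round(high, 7))
```

The results agree with the output (t = 33.39, interval roughly [.0088898, .0100172]) up to rounding of the critical value.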
Errors
In a linear regression model errors of course occur, but we assume that those errors have a normal distribution. We can check this assumption e.g. with the Jarque–Bera test. To do this we first calculate the residuals.
. predict e, residuals
year          e     year          e     year         e     year          e
1929  -12.6381      1949  -5.953425     1969  10.34986     1989   7.098722
1930  -13.31391     1950  -4.707457     1970   9.845813    1990   5.684651
1931  -13.72692     1951  -3.578607     1971  10.52517     1991   3.932871
1932  -14.56134     1952  -2.996098     1972  11.78392     1992   3.39784
1933  -14.61698     1953  -2.22308      1973  13.07741     1993   3.537231
1934  -13.98126     1954  -2.391659     1974  11.79657     1994   3.937443
1935  -13.4056      1955  -1.093854     1975  10.649       1995   3.006756
1936  -12.46806     1956   -.8780112    1976  11.6674      1996   2.55304
1937  -12.06611     1957   -.5975726    1977  12.30497     1997   2.639642
1938  -12.33223     1958   -.9179434    1978  13.06573     1998   1.539221
1939  -11.65177     1959    .5080638    1979  12.55743     1999   2.062571
1940  -10.85471     1960    .9472837    1980  10.11471     2000   -.3365959
1941   -9.218089    1961   1.374848     1981   8.760592    2001  -2.3687
1942   -7.192114    1962   2.693987     1982   5.977152    2002  -2.249459
1943   -5.064429    1963   3.687502     1983   6.613477    2003  -2.343503
1944   -3.813887    1964   5.03363      1984   7.476173    2004  -3.854484
1945   -4.073772    1965   6.606092     1985   7.679808    2005  -5.787941
1946   -6.145001    1966   8.269435     1986   7.967762    2006  -9.523205
1947   -6.413224    1967   8.7207       1987   7.99473     2007  -13.51642
1948   -5.89195     1968   9.830514     1988   7.82513     2008  -16.34737

. sktest e

                    Skewness/Kurtosis tests for Normality
                                                 ------- joint ------
    Variable |  Pr(Skewness)   Pr(Kurtosis)  adj chi2(2)    Prob>chi2
-------------+-------------------------------------------------------
           e |      0.335          0.000         11.55        0.0031

Since Prob>chi2 = 0.0031 is below 0.05, at the 5% significance level we must reject the hypothesis of normally distributed errors; the rejection is driven by kurtosis (Pr(Kurtosis) = 0.000), while the skewness alone is acceptable (Pr(Skewness) = 0.335). This weakens the case for the linear regression model in the analyzed case.
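The Jarque–Bera statistic mentioned above combines the sample skewness S and kurtosis K as JB = (n/6) * (S**2 + (K - 3)**2 / 4) and is compared against a chi-squared distribution with 2 degrees of freedom. A self-contained sketch in Python (the sample is illustrative, not the residuals above):

```python
# Jarque-Bera statistic from sample moments.  The test data below are
# illustrative numbers, not the regression residuals from the text.

def jarque_bera(x):
    """(n/6) * (S**2 + (K - 3)**2 / 4) from the central moments of x."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)

# A perfectly symmetric sample: skewness is 0, so JB reduces to the
# kurtosis term (n/6) * (K - 3)**2 / 4.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
jb = jarque_bera(x)
print(jb)
```

For this symmetric sample K = 1.7, so JB = (5/6) * (1.3**2 / 4), roughly 0.352, far below the 5% critical value of chi-squared(2) (about 5.99), as expected for such a tiny, symmetric sample.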
Linearity test
. estat ovtest

Ramsey RESET test using powers of the fitted values of GDP
       Ho:  model has no omitted variables
                F(3, 75)  =    132.84
                Prob > F  =    0.0000

Because the probability is very small, we must reject the null hypothesis that the model has no omitted variables. The RESET test therefore signals that the purely linear specification may not be adequate.
Heteroskedasticity
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of GDP

         chi2(1)      =     0.76
         Prob > chi2  =   0.3832

With a probability value of 0.38 we cannot reject the null hypothesis of constant variance, so this test supports homoskedasticity. We also perform another heteroskedasticity test.

. imtest, white

White's test for Ho: homoskedasticity
         against Ha: unrestricted heteroskedasticity

         chi2(2)      =    11.16
         Prob > chi2  =   0.0038

Cameron & Trivedi's decomposition of IM-test

---------------------------------------------------
              Source |       chi2     df         p
---------------------+-----------------------------
  Heteroskedasticity |      11.16      2     0.0038
            Skewness |       3.35      1     0.0674
            Kurtosis |       7.67      1     0.0056
---------------------+-----------------------------
               Total |      22.18      4     0.0002
---------------------------------------------------

This test, in contrast, rejects homoskedasticity: Prob > chi2 = 0.0038 is below 0.05, so the two tests disagree. (Throughout we have used a significance level of 5%.)
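The idea behind the Breusch–Pagan test used above can be sketched in a few lines: regress the squared residuals on the fitted values and take LM = n * R-squared of that auxiliary regression, which under homoskedasticity is approximately chi-squared(1). A toy illustration in Python (made-up numbers, not our model):

```python
# Sketch of the Breusch-Pagan LM statistic: n * R-squared from the
# auxiliary regression of squared residuals on fitted values.
# The data below are toy numbers, not the GDP/PI regression.

def aux_r2(x, y):
    """R-squared of a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    b = my - a * mx
    ssr = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1 - ssr / sst

def breusch_pagan_lm(fitted, resid):
    """LM = n * R-squared of resid**2 regressed on fitted."""
    e2 = [e * e for e in resid]
    return len(resid) * aux_r2(fitted, e2)

# Toy example where the error variance grows with the fitted value.
fitted = [1.0, 2.0, 3.0, 4.0]
resid = [1.0, 2.0 ** 0.5, 3.0 ** 0.5, 2.0]   # e**2 = 1, 2, 3, 4
lm = breusch_pagan_lm(fitted, resid)
print(lm)
```

In this toy data e**2 grows exactly linearly with the fitted values, so the auxiliary R-squared is 1 and LM = n = 4, above the 5% critical value of chi-squared(1) (about 3.84): heteroskedasticity is detected.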
Graphs

We see that when GDP is low, PI is a smaller part of GDP, and as GDP rises, PI becomes a bigger part of GDP. We still consider the linear model good enough as a first approximation, but, as the graph suggests, an exponential function would probably fit better; that, however, would be another model, and what we analyzed here was the linear one.
As a consequence we see that the model first underestimates and later overestimates the predicted variable.

Conclusions
One way to explain this is to say that when Gross Domestic Product is higher, the expansion of the economy forces personal income to rise more dynamically than Gross Domestic Product itself has risen. Another explanation is that over the 1929–2008 period tax policies differed, so there is another factor, which we did not use in our model, but which has an important impact on personal income.
