
Saturday June 14 21:09:03 2014 Page 1

Statistics/Data Analysis
name: <unnamed>
log: C:\Users\computer\Downloads\anna2.smcl
log type: smcl
opened on: 14 Jun 2014, 21:07:11
1 . clear
2 . import excel "C:\Users\computer\Downloads\f982d65a5ee478446edf97c434941dca_8f9439a8f
> 9bb86b41cc087e881ecde20 (1).xls", sheet("Data all") firstrow case(lower)
3 . do "C:\Users\computer\AppData\Local\Temp\STD03000000.tmp"
4 . // We have imported the spreadsheet using File > Import > Excel or ODBC.
5 .
6 . destring gdp,replace force
gdp contains nonnumeric characters; replaced as double
(156 missing values generated)
7 . misstable summarize
                                                   Obs<.
Variable     Obs=.   Obs>.   Obs<.   Unique values         Min         Max

gdp            156             132             132    4.23e+08    1.55e+13
cab            156             132             132     -65.113    208.3497
fdi            156             132             132   -2.904237    85.36791
ggc            156             132             132    7.33e+07    2.53e+12
gdc            156             132             132   -45.67934    75.23929
infl           156             132             132   -3.653483     53.2287
pop            156             132             132       52971    1.34e+09
m2             156             132             132    15.57571    499.1002

8 . //summarise missing data
9 . // Whether to start with descriptive statistics or with cleaning is case-based. In our case,
10. // the missing observations are all missing in the same rows, so we resort to cleaning first.
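// A minimal sketch for checking the "all missing in the same row" claim before dropping anything.
// nmiss is a hypothetical helper variable, not part of the original do-file. countryname is left
// out on the assumption that it is a string variable (summarize reports 0 observations for it).

```stata
* Count, for each row, how many of the eight numeric variables are missing.
egen nmiss = rowmiss(gdp cab fdi ggc gdc infl pop m2)

* If only the values 0 and 8 appear, missingness is all-or-nothing by row,
* so dropping on any single variable (here m2) removes exactly those rows.
tabulate nmiss

* Remove the helper again so the rest of the do-file is unaffected.
drop nmiss
```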
11. drop if m2 ==.
(156 observations deleted)
12. summarize
Variable Obs Mean Std. Dev. Min Max
countryname 0
gdp 132 5.13e+11 1.65e+12 4.23e+08 1.55e+13
cab 132 -2.448658 22.8535 -65.113 208.3497
fdi 132 6.261226 10.28633 -2.904237 85.36791
ggc 132 8.94e+10 2.78e+11 7.33e+07 2.53e+12
gdc 132 18.03794 18.77267 -45.67934 75.23929
infl 132 6.516843 6.036728 -3.653483 53.2287
pop 132 4.70e+07 1.62e+08 52971 1.34e+09
m2 132 78.61254 63.70257 15.57571 499.1002
13.
14. //means, max, min, count
15. // ggc and pop need to be scaled down. An educated guess for ggc is a log transformation.
16. // However this requires
17. sfrancia ggc
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
ggc 132 0.32645 77.182 8.747 0.00001
18. sfrancia gdp
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
gdp 132 0.30228 79.952 8.818 0.00001
19. sfrancia cab
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
cab 132 0.52421 54.521 8.047 0.00001
20. gen lggc = ln(ggc)
21. gen lgdp = ln(gdp)
22. gen lpop =ln(pop)
23. gen gcratio =ggc/gdp
24. sfrancia lggc
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
lggc 132 0.99288 0.816 -0.409 0.65883
25. sfrancia lgdp
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
lgdp 132 0.99114 1.015 0.030 0.48801
26. sfrancia cab
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
cab 132 0.52421 54.521 8.047 0.00001
27. sfrancia lpop
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
lpop 132 0.98547 1.665 1.026 0.15242
28. gen dfdi =ln(1/sqrt(fdi+6))
29. // So far, so good, but the ratio ggc/gdp is a no-no.
30. // If the independent variables are far from normally distributed, the errors may not be normally distributed either.
31. // Note that an LPM is also defined on a 0-1 dichotomous variable with a binomial or Poisson distribution. Even exponential distributions are sometimes considered
32. // near normal.
33. gen lm2= ln(m2)
34. gen dinfl =1/sqrt(infl+20)
35. graph box lgdp lggc m2 lpop dinfl fdi cab
36. summarize m2 fdi cab
Variable Obs Mean Std. Dev. Min Max
m2 132 78.61254 63.70257 15.57571 499.1002
fdi 132 6.261226 10.28633 -2.904237 85.36791
cab 132 -2.448658 22.8535 -65.113 208.3497
37. reg lgdp dfdi gdc lggc lpop dinfl lm2
Source SS df MS Number of obs = 132
F( 6, 125) = 1284.51
Model 703.365604 6 117.227601 Prob > F = 0.0000
Residual 11.4077786 125 .091262229 R-squared = 0.9840
Adj R-squared = 0.9833
Total 714.773383 131 5.45628537 Root MSE = .3021
lgdp Coef. Std. Err. t P>|t| [95% Conf. Interval]
dfdi -.1679284 .1239446 -1.35 0.178 -.4132301 .0773733
gdc .0100278 .0016121 6.22 0.000 .0068372 .0132183
lggc .8468715 .0230011 36.82 0.000 .8013495 .8923936
lpop .1592765 .0234463 6.79 0.000 .1128734 .2056797
dinfl -.0271865 2.018549 -0.01 0.989 -4.022145 3.967772
lm2 .0511404 .0587354 0.87 0.386 -.0651042 .1673851
_cons 2.253579 .526545 4.28 0.000 1.211481 3.295677
38. predict ehat, res
39. sfrancia ehat
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
ehat 132 0.98077 2.203 1.590 0.05596
40. cor lgdp ehat
(obs=132)
lgdp ehat
lgdp 1.0000
ehat 0.1263 1.0000
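// A hedged sketch of graphical residual checks to complement the Shapiro-Francia test above.
// It assumes ehat from command 38 exists and that the regression in command 37 is still the
// most recent estimation (sfrancia and cor do not replace the stored regression results).

```stata
* Normal quantile plot: points close to the 45-degree line suggest near-normal residuals.
qnorm ehat

* Kernel density of the residuals with a normal density overlaid for comparison.
kdensity ehat, normal

* Residuals against fitted values; a funnel shape would hint at heteroskedasticity.
rvfplot, yline(0)
```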
41. reg lgdp gdc lggc lpop
Source SS df MS Number of obs = 132
F( 3, 128) = 2556.35
Model 703.039342 3 234.346447 Prob > F = 0.0000
Residual 11.7340409 128 .091672195 R-squared = 0.9836
Adj R-squared = 0.9832
Total 714.773383 131 5.45628537 Root MSE = .30277
lgdp Coef. Std. Err. t P>|t| [95% Conf. Interval]
gdc .0101792 .0016123 6.31 0.000 .006989 .0133693
lggc .853619 .0179367 47.59 0.000 .8181281 .8891099
lpop .1452605 .0190435 7.63 0.000 .1075797 .1829412
_cons 2.723731 .2771124 9.83 0.000 2.175416 3.272045
42. predict ehat2, res
43. sfrancia ehat2
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
ehat2 132 0.97612 2.737 2.026 0.02137
44. reg lgdp dfdi gdc lggc lpop dinfl lm2
Source SS df MS Number of obs = 132
F( 6, 125) = 1284.51
Model 703.365604 6 117.227601 Prob > F = 0.0000
Residual 11.4077786 125 .091262229 R-squared = 0.9840
Adj R-squared = 0.9833
Total 714.773383 131 5.45628537 Root MSE = .3021
lgdp Coef. Std. Err. t P>|t| [95% Conf. Interval]
dfdi -.1679284 .1239446 -1.35 0.178 -.4132301 .0773733
gdc .0100278 .0016121 6.22 0.000 .0068372 .0132183
lggc .8468715 .0230011 36.82 0.000 .8013495 .8923936
lpop .1592765 .0234463 6.79 0.000 .1128734 .2056797
dinfl -.0271865 2.018549 -0.01 0.989 -4.022145 3.967772
lm2 .0511404 .0587354 0.87 0.386 -.0651042 .1673851
_cons 2.253579 .526545 4.28 0.000 1.211481 3.295677
45. est store one
46. predict ehat3, res
47. sfrancia ehat3
Shapiro-Francia W' test for normal data
Variable Obs W' V' z Prob>z
ehat3 132 0.98077 2.203 1.590 0.05596
48. reg lgdp dfdi gdc lggc lpop dinfl lm2, robust
Linear regression Number of obs = 132
F( 6, 125) = 1870.58
Prob > F = 0.0000
R-squared = 0.9840
Root MSE = .3021
Robust
lgdp Coef. Std. Err. t P>|t| [95% Conf. Interval]
dfdi -.1679284 .1108799 -1.51 0.132 -.3873735 .0515168
gdc .0100278 .0023949 4.19 0.000 .005288 .0147675
lggc .8468715 .0355762 23.80 0.000 .7764618 .9172813
lpop .1592765 .0357984 4.45 0.000 .0884272 .2301259
dinfl -.0271865 2.026325 -0.01 0.989 -4.037534 3.983161
lm2 .0511404 .0607948 0.84 0.402 -.06918 .1714609
_cons 2.253579 .5171786 4.36 0.000 1.230018 3.277139
49.
50. //check sign and significance for heteroskedasticity
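// Comparing the robust and non-robust standard errors is an informal check. A sketch of the
// formal tests follows, assuming the same specification as command 44. The model is re-estimated
// without robust because estat hettest is not available after regress with robust standard errors.

```stata
* Re-fit the OLS model with conventional standard errors.
quietly regress lgdp dfdi gdc lggc lpop dinfl lm2

* Breusch-Pagan / Cook-Weisberg test; Ho: constant error variance.
estat hettest

* White's general test for heteroskedasticity.
estat imtest, white
```

// A small p-value in either test would support reporting the robust standard errors.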
51. //bootstrap _b _se, reps(10000) nodots bca mse verbose : reg lgdp gdc lggc lpop
52. //bootstrap to check for reduced model, not discussed in class
53. reg lgdp dfdi gdc lggc lpop dinfl ehat
Source SS df MS Number of obs = 132
F( 6, 125) = .
Model 714.704197 6 119.117366 Prob > F = 0.0000
Residual .069186336 125 .000553491 R-squared = 0.9999
Adj R-squared = 0.9999
Total 714.773383 131 5.45628537 Root MSE = .02353
lgdp Coef. Std. Err. t P>|t| [95% Conf. Interval]
dfdi -.1986008 .0092544 -21.46 0.000 -.2169163 -.1802852
gdc .0101033 .0001254 80.59 0.000 .0098552 .0103515
lggc .8565428 .0015685 546.07 0.000 .8534385 .8596472
lpop .1524919 .0017221 88.55 0.000 .1490835 .1559002
dinfl .5624837 .1480869 3.80 0.000 .2694013 .855566
ehat 1 .0069655 143.56 0.000 .9862143 1.013786
_cons 2.199354 .040718 54.01 0.000 2.118768 2.27994
54. // Endogeneity is there.
55. // None of the variables is uncorrelated with lgdp.
56. // No IV
57. // Think of an IV
58. // Malaria, British rule, etc.
59. // We have completed up to where we were supposed to be in 2 hrs;
60. // the rest is to be discussed in the next class.
61. // 1. Separate estimations for 2 subsets of countries. LIVE
62. // How you have been writing things out
63. // Send me a PPT of all this to show what you understood.
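// A rough sketch of the "separate estimations for 2 subsets" idea. The log does not say how the
// subsets are defined, so the median split on pop and the variable name bigpop are assumptions
// made purely for illustration.

```stata
* Hypothetical grouping: countries above vs. at or below the median population.
summarize pop, detail
generate byte bigpop = pop > r(p50)

* Run the reduced model from command 41 separately within each subset.
bysort bigpop: regress lgdp gdc lggc lpop
```

// Any other grouping variable (income group, region) can replace bigpop in the by prefix.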
64.
65. hist ggc
(bin=11, start=73302640, width=2.296e+11)
66. hist gdp
(bin=11, start=4.230e+08, width=1.412e+12)
67. hist cab
(bin=11, start=-65.112996, width=24.860248)
68. hist ehat2
(bin=11, start=-1.1364665, width=.17306202)
69. hist lggc
(bin=11, start=18.110107, width=.9497809)
70. hist lgdp
(bin=11, start=19.862972, width=.95555149)
71. hist cab
(bin=11, start=-65.112996, width=24.860248)
72. hist lpop
(bin=11, start=10.8775, width=.92195572)
73. hist ehat
(bin=11, start=-1.091697, width=.17301518)
74. hist ehat2
(bin=11, start=-1.1364665, width=.17306202)
75. hist lm2
(bin=11, start=2.7457125, width=.31519038)
76. hist dfdi
(bin=11, start=-2.2574472, width=.15385727)
77. hist gdc
(bin=11, start=-45.679341, width=10.992603)
78. hist infl
(bin=11, start=-3.6534827, width=5.1711074)
79. //http://apps.eui.eu/Personal/Researchers/decio/OLD%20SITE/papers/appeconapril.pdf
80. // Talks about endogeneity and IV in GDP
81. foreach var of varlist dfdi gdc lggc lpop{
2. spearman ehat `var'
3. }
Number of obs = 132
Spearman's rho = 0.0291
Test of Ho: ehat and dfdi are independent
Prob > |t| = 0.7403
Number of obs = 132
Spearman's rho = -0.1494
Test of Ho: ehat and gdc are independent
Prob > |t| = 0.0873
Number of obs = 132
Spearman's rho = -0.0566
Test of Ho: ehat and lggc are independent
Prob > |t| = 0.5192
Number of obs = 132
Spearman's rho = -0.0039
Test of Ho: ehat and lpop are independent
Prob > |t| = 0.9647
82. // Found the culprits: lggc and gdc
83. // Note: dfdi cannot be related to gdp. Is it correlated with lggc and gdc?
84. spearman cab lm2 gdc lggc
(obs=132)
cab lm2 gdc lggc
cab 1.0000
lm2 0.2202 1.0000
gdc 0.6722 0.2591 1.0000
lggc 0.5758 0.3747 0.5505 1.0000
85.
86. //2SLS
87. ivregress 2sls lgdp lpop dfdi (lggc gdc = lm2 cab), nocons
Instrumental variables (2SLS) regression Number of obs = 132
Wald chi2(4) = .
Prob > chi2 = .
R-squared = .
Root MSE = 1.7056
lgdp Coef. Std. Err. z P>|z| [95% Conf. Interval]
lggc 1.297955 .6561642 1.98 0.048 .0118972 2.584014
gdc -.0910203 .1769682 -0.51 0.607 -.4378716 .2558309
lpop -.1643236 .5802573 -0.28 0.777 -1.301607 .9729598
dfdi .5526195 2.245363 0.25 0.806 -3.84821 4.953449
Instrumented: lggc gdc
Instruments: lpop dfdi lm2 cab
88. // Do not rely too much on the t-values; concentrate on the direction of change.
89. est store two
90. hausman one two
Coefficients
(b) (B) (b-B) sqrt(diag(V_b-V_B))
one two Difference S.E.
dfdi -.1679284 .5526195 -.7205479 .
gdc .0100278 -.0910203 .1010481 .
lggc .8468715 1.297955 -.4510838 .
lpop .1592765 -.1643236 .3236001 .
b = consistent under Ho and Ha; obtained from regress
B = inconsistent under Ha, efficient under Ho; obtained from ivregress
Test: Ho: difference in coefficients not systematic
chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
= 40.46
Prob>chi2 = 0.0000
(V_b-V_B is not positive definite)
91. // Small p-value: OLS is inconsistent; IV is preferred.
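// A hedged sketch of how this conclusion could be cross-checked with ivregress's own
// postestimation tools. It assumes the same 2SLS specification as command 87 and the stored
// estimates one (OLS) and two (2SLS). With two endogenous regressors and two excluded
// instruments the model is exactly identified, so no overidentification test is attempted.

```stata
* Re-fit the 2SLS model so that it is the active estimation.
quietly ivregress 2sls lgdp lpop dfdi (lggc gdc = lm2 cab), nocons

* Durbin and Wu-Hausman tests; Ho: lggc and gdc are exogenous.
estat endogenous

* First-stage statistics, to judge whether lm2 and cab are strong instruments.
estat firststage

* hausman expects the always-consistent estimator first, so 2SLS (two) precedes OLS (one).
hausman two one
```

// If the instruments turn out to be weak, the large 2SLS standard errors above are no surprise.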
92.
end of do-file
93. log close
name: <unnamed>
log: C:\Users\computer\Downloads\anna2.smcl
log type: smcl
closed on: 14 Jun 2014, 21:07:59
