
Common STATA Applications for Microeconomists: Panel Data, Endogenous RHS Variables and Quantile Estimation

Hui-chen Wang

Prepared for the Stata User Group Meeting, December 14, 2012, Taipei, Taiwan

Presentation Outline
Longitudinal (panel) data analysis
  Nature of panel data
  Basic linear models:
    Cluster-robust standard errors
    Fixed effects and random effects
  Extension: IV in panel data models
    Dynamic panel models

Presentation Outline (cont)


Other common applications
  More models with endogenous regressors
    IV regressions and beyond
  Quantile regression

Panel/Longitudinal Data
A data set in which the same entities are observed across time
  Entities: persons, firms, states, countries, etc.
  Examples
    Micro data: PSID, NLSY, IRS data (of the same persons or firms)
    Aggregate data: time series of multiple countries, states, etc.

Example 1: States Observed Over Time


Data used in: Wang and Hsieh (2012), Tobacco Politics: The Role of Voters and Special Interests in Cigarette Tax Setting

Example 2: Firm-level Panel Data


(Data drawn from National Center for Charitable Statistics core files; see Mayer et al. (2012).)

Econometric Issues in Panel Data


Example: $Cigtax_{it} = \beta_0 + \beta_1 Voterview_{it} + \beta_2 Income_{it} + \dots + e_{it}$
  $i$: states 1, 2, 3, ..., 50
  $t$: years 1, 2, ..., 8

Common concerns:
1. Within-group serial correlation: $\mathrm{cov}(e_{it}, e_{is}) \neq 0$ for $t \neq s$
   Standard inferences incorrect; OLS estimator (OLSE) inefficient
2. Omitted variable bias

Cluster-Robust Standard Errors


Classic inferences assume no serial error correlation
  Unlikely to hold in panel data
  OLS coefficients still consistent, but the usual SE and inferences would be wrong
Cluster-robust standard errors: SE adjusted for clustering
  Positive (negative) correlation within cluster: unadjusted SE is understated (overstated)
  Cluster adjustment => SE larger (smaller)

OLSE without cluster adjustment

OLSE with Cluster-Robust SE

Relative to the unadjusted output: identical coefficients, bigger SE, lower significance
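
The comparison on these two slides can be reproduced along the following lines. This is a minimal sketch, not the presenter's code: it assumes Stata's nlswork example dataset (fetched by webuse, so an internet connection is needed) as a stand-in for the cigarette-tax panel, and the regressors are picked only for illustration.

```stata
* Assumption: nlswork is a stand-in for the state-level tax panel
webuse nlswork, clear

* Pooled OLS with the default (unadjusted) standard errors
regress ln_wage age tenure hours
scalar b_ols  = _b[tenure]
scalar se_ols = _se[tenure]

* Same regression, SE clustered on the person identifier
regress ln_wage age tenure hours, vce(cluster idcode)

* Coefficients should match; only the standard errors change
display "relative difference in coefficient: " reldif(_b[tenure], b_ols)
display "cluster SE / default SE: " _se[tenure] / se_ols
```

Clustering on the panel identifier is the usual choice here because the serial correlation of concern is within the same entity over time.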


Linear Panel-Data Models: Basics


But cluster-robust SE does not help efficiency
Individual-effects models

$y_{it} = \beta' x_{it} + \varepsilon_{it} = \beta' x_{it} + \alpha_i + u_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T$

OLSE
  At least inefficient ($\mathrm{Cov}(\varepsilon_{it}, \varepsilon_{is}) \neq 0$)
  May be biased/inconsistent (if $E(x_{it}\alpha_i) \neq 0$)

Linear Panel-Data Models: Basics


Fixed effects: the dummy variable model
  Treats $\alpha_i$ as the individual-specific constant term
  LSDV is equivalent to the covariance estimator: OLS regression of $(y_{it} - \bar{y}_i)$ on $(x_{it} - \bar{x}_i)$
    Time-demeaned; called the within (FE) transformation
    $\alpha_i$ drops out of the transformed equation
    Uses only the within variation; great loss of DF
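
The within transformation can be written out in a few lines. The sketch uses the notation of the individual-effects model; a bar denotes the average over t for a given i.

```latex
\begin{align*}
y_{it} &= \beta' x_{it} + \alpha_i + u_{it} \\
\bar{y}_i &= \beta' \bar{x}_i + \alpha_i + \bar{u}_i,
  \qquad \bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it} \\
y_{it} - \bar{y}_i &= \beta'(x_{it} - \bar{x}_i) + (u_{it} - \bar{u}_i)
\end{align*}
```

Subtracting the second line from the first removes $\alpha_i$, and with it any regressor that does not vary over time.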

Linear Panel-Data Models: Basics


Random effects: the error components model
  Treats $\alpha_i$ as a component of the error term
  Estimated with GLS (or FGLS)
    Quasi-demeaned (RE transformation)
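
For reference, the quasi-demeaning step is usually written as below. The symbols $\theta$, $\sigma_\alpha^2$ and $\sigma_u^2$ do not appear on the slide; they are standard textbook notation for a balanced panel and are assumed here, with the intercept left implicit.

```latex
\begin{align*}
y_{it} - \theta \bar{y}_i &= \beta'(x_{it} - \theta \bar{x}_i)
  + (1-\theta)\alpha_i + (u_{it} - \theta \bar{u}_i) \\
\theta &= 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_\alpha^2}}
\end{align*}
```

Setting $\theta = 0$ gives pooled OLS and $\theta = 1$ gives the within (FE) transformation, so RE sits between the two.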

Linear Panel-Data Models: Basics


If $E(x_{it}\alpha_i) = 0$
  Both estimators are unbiased and consistent, but the RE estimator is efficient while FE is not.
If $E(x_{it}\alpha_i) \neq 0$
  Only the FE estimator is unbiased/consistent

Estimating panel-data models with Stata


The xt commands
Step 1. Declare your dataset to be panel data (xtset)
  Required before any other xt command can be used
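
A minimal sketch of Step 1, assuming Stata's nlswork example dataset as a stand-in for the presenter's data (webuse needs an internet connection):

```stata
* Assumption: nlswork stands in for the data used in the talk
webuse nlswork, clear

* Declare the panel: idcode identifies persons, year is the time variable
xtset idcode year

* Summarize the participation pattern of the panel
xtdescribe
```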


Linear FE Estimation: xtreg, fe


Squared correlation between the actual dependent variable and its fitted value (excluding u_i)
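
The output on this slide comes from a call of the following form. The sketch below assumes the nlswork example dataset and an illustrative set of regressors, not the model shown in the talk.

```stata
* Assumption: nlswork and these regressors are illustrative only
webuse nlswork, clear
xtset idcode year

* Within (fixed-effects) estimator
xtreg ln_wage age tenure hours, fe

* The three R-squared measures reported in the output header
display "within:  " e(r2_w)
display "between: " e(r2_b)
display "overall: " e(r2_o)
```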


Linear RE Estimation: xtreg, re
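
A sketch of the corresponding call, again assuming the nlswork example dataset and illustrative regressors. Note that in Stata's output sigma_u is the standard deviation of the individual effect (the alpha_i of these slides) and sigma_e that of the idiosyncratic error.

```stata
* Assumption: nlswork and these regressors are illustrative only
webuse nlswork, clear
xtset idcode year

* Random-effects (GLS) estimator; theta reports the quasi-demeaning weight
xtreg ln_wage age tenure hours, re theta

* Estimated error components
display "sigma_u: " e(sigma_u) "  sigma_e: " e(sigma_e)
```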


Can we specify cluster in FE regression?


Yes!
  FE and clustering address different issues
  FE may not eliminate all forms of serial correlation in the error terms

Fixed-Effects with Cluster-robust SE
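
A minimal sketch of combining the two, assuming the nlswork example dataset and illustrative regressors:

```stata
* Assumption: nlswork and these regressors are illustrative only
webuse nlswork, clear
xtset idcode year

* Fixed effects with standard errors clustered on the panel identifier
xtreg ln_wage age tenure hours, fe vce(cluster idcode)
```

The point estimates are those of the plain FE regression; only the standard errors are recomputed.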


Additional Remarks
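Interpretation of the constant term in the FE model
  Arbitrary value
  Under Stata's constraint that the individual effects average to zero, it equals the overall average of y net of the fitted contribution of the regressors
Interpretation of R2
Hausman test for fixed effects: hausman
Within and between variation: xtsum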
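
The Hausman test mentioned above is typically run by storing both sets of estimates and comparing them. A sketch, assuming the nlswork example dataset and illustrative regressors; the names fe and re are arbitrary labels.

```stata
* Assumption: nlswork and these regressors are illustrative only
webuse nlswork, clear
xtset idcode year

* Consistent under both hypotheses
xtreg ln_wage age tenure hours, fe
estimates store fe

* Efficient if the individual effects are uncorrelated with the regressors
xtreg ln_wage age tenure hours, re
estimates store re

* Test whether the two sets of coefficients differ systematically
hausman fe re
```

hausman relies on the default (non-robust) variance estimates, so the two models are fitted here without vce(cluster).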

Within and Between Variation
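
A sketch of the call behind this kind of output, assuming the nlswork example dataset:

```stata
* Assumption: nlswork is a stand-in for the data used in the talk
webuse nlswork, clear
xtset idcode year

* Overall, between and within standard deviations for each variable
xtsum ln_wage tenure hours
```

A variable with little within variation will be poorly identified by the FE estimator, which is why this table is worth checking before choosing a model.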


Panel IV Estimation
Static IV
  xtivreg, xtivreg2
    2SLS after the within (FE) transformation
    Uses exogenous instruments from the current period
    Cannot estimate coefficients on time-invariant regressors
  xthtaylor: the Hausman-Taylor estimator
    Allows time-invariant regressors
    Draws instruments from periods other than the current one

Dynamic IV: Arellano-Bond Estimator
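
For the static case listed above, a minimal sketch with the official xtivreg command follows (xtivreg2 is a user-written alternative and is not used here). It assumes the nlswork example dataset; treating tenure as endogenous and union and south as instruments is purely illustrative and not a claim about their validity.

```stata
* Assumption: dataset, endogenous variable and instruments are illustrative
webuse nlswork, clear
xtset idcode year

* 2SLS on the within-transformed data: tenure instrumented by union and south
xtivreg ln_wage age not_smsa (tenure = union south), fe
```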



Dynamic Panel Model: Arellano-Bond Estimator


$y_{i,t}$ depends on the individual FE as well as $y_{i,t-1}$
Take first differences to eliminate the FE
$\Delta y_{i,t-1}$ is correlated with $\Delta u_{it}$ by construction
  Because $y_{i,t-1}$ in $\Delta y_{i,t-1}$ is a function of $u_{i,t-1}$ in $\Delta u_{it}$
  Instrumented with further lags of $y$, say, $y_{i,t-2}$ and beyond

Stata commands: xtabond, xtabond2, xtdpd, xtdpdsys
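
The argument on this slide can be laid out step by step. The coefficient $\rho$ on the lagged dependent variable is a symbol assumed here for illustration; the remaining notation follows the individual-effects model, and $u_{it}$ is assumed to be serially uncorrelated.

```latex
\begin{align*}
y_{it} &= \rho\, y_{i,t-1} + \beta' x_{it} + \alpha_i + u_{it} \\
\Delta y_{it} &= \rho\, \Delta y_{i,t-1} + \beta' \Delta x_{it} + \Delta u_{it} \\
\Delta y_{i,t-1} &= y_{i,t-1} - y_{i,t-2}, \qquad
\Delta u_{it} = u_{it} - u_{i,t-1} \\
E\!\left[\, y_{i,t-s}\, \Delta u_{it} \,\right] &= 0 \quad \text{for } s \ge 2
\end{align*}
```

Since $y_{i,t-1}$ contains $u_{i,t-1}$, the differenced regressor is endogenous, while $y_{i,t-2}$ and earlier lags are valid instruments under the stated assumption.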



Mayer, Wang, Egginton and Flint (2012), The Impact of Revenue Diversification on Expected Revenue and Volatility for Nonprofit Organizations, NVSQ forthcoming


xtabond
(Default: one lag, all available IVs)
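
A sketch of a default xtabond call, assuming Stata's abdata example dataset (the Arellano-Bond employment panel) rather than the nonprofit data used in the talk; the regressors are illustrative.

```stata
* Assumption: abdata stands in for the nonprofit panel; it ships already xtset
webuse abdata, clear

* One lag of the dependent variable, all available lags of n as instruments
xtabond n w k, lags(1)

* Overidentification test and test for autocorrelation in differenced errors
estat sargan
estat abond
```

estat sargan is available only after the default (non-robust) VCE, which is why no vce() option is given.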


Other Models with Endogenous RHS variable(s)


Linear IV models: ivreg, ivreg2
IV models with a dichotomous endogenous regressor: treatreg
Sample selection model: heckman
Models with a dichotomous outcome
  ivprobit: probit model with endogenous regressors
  biprobit: maximum-likelihood two-equation probit models
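
A few of the commands above can be sketched with Stata's example datasets. These are illustrative calls only: the datasets (hsng2, womenwk, laborsup) and model specifications are assumptions taken for demonstration, and ivregress, the current official linear IV command, is used in place of the older ivreg syntax.

```stata
* Linear IV: housing value instrumented by family income and region
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region)
estat endogenous
estat firststage

* Sample selection: wage observed only for women who work
webuse womenwk, clear
heckman wage educ age, select(married children educ age)

* Probit with a continuous endogenous regressor
webuse laborsup, clear
ivprobit fem_work fem_educ kids (other_inc = male_educ)
```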


Some Other Applications: Quantile Regressions


Effects of X evaluated at a conditional quantile (e.g., the median)
  Compared to effects measured at the conditional mean
  More robust against outliers, especially in small samples
Coefficient differences across the conditional distribution
Stata commands: qreg, bsqreg, sqreg

Stata commands: qreg
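
A minimal sketch, assuming the auto dataset that ships with Stata in place of the cigarette-tax data:

```stata
* Assumption: auto.dta and these regressors are illustrative only
sysuse auto, clear

* Median regression (the default quantile is .5)
qreg price weight length foreign

* Same model at the lower quartile
qreg price weight length foreign, quantile(.25)
```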


bsqreg (bootstrapping VCE)

Identical Coefficients as in qreg

Different SE and t
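
The comparison can be checked directly. The sketch assumes the auto dataset; the seed and the number of replications are arbitrary choices.

```stata
* Assumption: auto.dta, seed and reps are illustrative only
sysuse auto, clear

* Median regression with the default standard errors
qreg price weight length foreign
scalar b_qreg = _b[weight]

* Same point estimates, bootstrapped standard errors
set seed 12345
bsqreg price weight length foreign, reps(100)

display "relative difference in coefficient: " reldif(_b[weight], b_qreg)
```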


sqreg (simultaneous quantile regression with bootstrapped VCE)
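
Because sqreg estimates several quantiles jointly, coefficients can be tested across quantiles. A sketch assuming the auto dataset, with arbitrary seed and replications:

```stata
* Assumption: auto.dta, seed and reps are illustrative only
sysuse auto, clear
set seed 12345

* Estimate the 25th, 50th and 75th percentiles in one step
sqreg price weight length foreign, quantiles(.25 .5 .75) reps(100)

* Is the effect of weight the same at the lower and upper quartiles?
test [q25]weight = [q75]weight
```

The cross-quantile test is the reason to prefer sqreg over separate qreg runs, which give no covariance between equations.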


Wang and Hsieh (2012), Tobacco Politics: The Role of Voters and Special Interests in Cigarette Tax Setting
$Tax_{i,t+1} = \beta_0 + \beta_1 VoterAttitude_{it} + \beta_2 (VoterAttitude_{it} \times PartyDom_{it}) + \beta_3 Tobacco_{it} + \beta_4 AntiSmoking_{it} + \gamma' X_{it} + u_{it}$
  $Tax_{i,t+1}$: per-pack state cigarette tax (cents)
  $t+1$ = 96, 97, 99, 00, 02, 03, 07, 08
Data source: State Tobacco Activities Tracking & Evaluation (STATE)

(Example from Wang and Hsieh, 2012) Differential Politics across Tax Distribution: OLS and Quantile Regression Results


(Example from Wang and Hsieh, 2012)

Marginal Effect Estimates in OLS and Quantile Regression (pre 2012)


(Example from Wang and Hsieh, 2012)

Marginal Effect Estimates in OLS and Quantile Regression (pre 2012)


How about FE Quantile?


Some development in the literature, no consensus to date.
Some Stata users propose specifying individual-specific dummies
  Not an appropriate method
  Incidental parameters problem
  Inconsistent coefficients, even the slopes
