
Methodological Workshop 3:

Fixed Effects Models and Multi-Level Models

Yu Xie, University of Michigan

What's Common?

Both the fixed effects model and the multilevel model utilize clustered data.
Both the fixed effects model and the multilevel model are designed to handle cross-context heterogeneity.

Different Objectives
The fixed effects model and the multi-level model are very different research designs:
The fixed effects model controls for (or absorbs) pre-treatment heterogeneity (type I heterogeneity).
The multi-level model models both forms of heterogeneity across contexts.

Application of Different Principles


The fixed effects model is essentially an application of the social grouping principle (with a group being a cluster).
The multi-level model is essentially an application of the social context principle.

Using Different Assumptions


The fixed effects model assumes no type II heterogeneity bias (often a constant effects model), or additive effects of heterogeneity across contexts (i.e., clusters).
The multi-level model relaxes the homogeneity assumption at the individual level but assumes that both forms of heterogeneity are at the context level and can be modeled adequately with contextual covariates.

A General Lesson: Tradeoff between Data and Assumption


When observed data are thin, it takes strong assumptions to yield sharp results.
There is no free information in statistics. Either you collect it, or you assume it. (Xie 1996, AJS)

Fixed effects model

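Sibling model as an example

Family SES and environment are shared by the two siblings (1 and 2) in family i:

$$Y_{i1} = \beta_0 + \beta_1 X_{i1} + \alpha_i + \varepsilon_{i1}$$
$$Y_{i2} = \beta_0 + \beta_1 X_{i2} + \alpha_i + \varepsilon_{i2}$$

$\alpha_i$ and X may be correlated. Take the difference between the two equations:

$$Y_{i2} - Y_{i1} = \beta_1 (X_{i2} - X_{i1}) + (\varepsilon_{i2} - \varepsilon_{i1})$$

This results in a more robust equation, because the shared family term cancels.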
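The differencing argument can be checked with a small simulation. The sketch below is only an illustration under assumed values: the sample size, the true slope of 0.5, and the strength of the correlation between X and the shared family term are all hypothetical, not taken from any study cited here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_families = 5000
true_intercept, true_slope = 1.0, 0.5   # hypothetical values

# Shared family term: identical for both siblings in a family.
family = rng.normal(size=n_families)

# X is deliberately correlated with the family term (the source of bias).
x = 0.8 * family[:, None] + rng.normal(size=(n_families, 2))
noise = rng.normal(size=(n_families, 2))
y = true_intercept + true_slope * x + family[:, None] + noise

# Pooled OLS ignores the family term, so its slope absorbs part of it.
x_all, y_all = x.ravel(), y.ravel()
xc, yc = x_all - x_all.mean(), y_all - y_all.mean()
pooled_slope = float(xc @ yc / (xc @ xc))

# Sibling differences: the family term and the intercept cancel,
# so the differenced regression goes through the origin.
dx = x[:, 1] - x[:, 0]
dy = y[:, 1] - y[:, 0]
diff_slope = float(dx @ dy / (dx @ dx))

print(f"pooled OLS slope:         {pooled_slope:.3f}")
print(f"sibling-difference slope: {diff_slope:.3f}")
```

Under these assumptions the pooled slope is pulled well above 0.5, while the sibling-difference slope stays close to it. The price is that only within-family variation in X is used.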

Properties of the fixed effects approach:


All fixed characteristics are controlled.
It consumes a lot of information.
Unobserved heterogeneity (Type I) is controlled for at the group level (fixed effects).

Example: Critique of Zhou and Hou (1999): Positive Benefits of Send-Down?

More interestingly, our findings also reveal some positive consequences of the send-down experience. For instance, when compared with urban youth, a noticeably higher proportion of the send-down youth attained a college education after 1977. Partly as a result of their educational attainment, these sent-down youth, especially those with shorter rural durations, were equally likely to enter favorable employment (type of occupation and work organizations) in the urban labor force, despite their relatively short urban labor force experience. (Zhou and Hou 1999: 32)

Speculated Reason for the Beneficial Effects


The unusual hardship faced by sent-down youth forced them to be more adaptive and thus acquire skills to survive.

In Our Recent Study (Xie, Jiang, and Greenman 2008)


We analyze data from the survey of Family Life in Urban China that we conducted in three large cities (Shanghai, Wuhan, and Xi'an) in 1999.
We use some items designed for this study.

Statistical Analyses

(1) We present the differences in six socioeconomic indicators between respondents who experienced send-down and those who did not.
(2) We present results from a fixed-effects model capitalizing on the sibling structure in our data.
(3) We examine educational attainment closely as a time-varying covariate and its endogenous role in affecting early returns of sent-down youth.

Table 1: Descriptive Differences between Respondents with Send-Down Experience and Respondents without Send-Down Experience

                              Not Sent Down   Sent Down   Sent Down,     Sent Down,
                                                          Duration <6    Duration 6+
College Education (%)         10.9            11.9        15.2           3
Years of Schooling            11              10.8        11.3           9.4
Annual Salary (yuan)          5,318           4,983       4,567          6,083
Total Annual Income (yuan)    8,468           8,680       7,976          10,542
Cadre (%)                     5.3             6.3         6.6            5.3
SEI                           42.5            42          42.5           40.6
N                             651             481         349            132

Significance markers, Duration <6 column: * ** ***
Significance markers, Duration 6+ column: *** *** *** ***

Notes: *p<.1, **p<.05, ***p<.01
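As a quick consistency check on Table 1, the pooled "Sent Down" column should be the N-weighted average of the two duration groups. The sketch below redoes that arithmetic with the figures printed in the table; the dictionary layout and names are only illustrative.

```python
# Figures copied from Table 1; the key names are illustrative.
groups = {
    "duration_lt6": {"n": 349, "college_pct": 15.2, "schooling": 11.3,
                     "salary": 4567, "income": 7976},
    "duration_6plus": {"n": 132, "college_pct": 3.0, "schooling": 9.4,
                       "salary": 6083, "income": 10542},
}
reported_sent_down = {"college_pct": 11.9, "schooling": 10.8,
                      "salary": 4983, "income": 8680}


def pooled_mean(key):
    """N-weighted average of one indicator across the two duration groups."""
    total_n = sum(g["n"] for g in groups.values())
    return sum(g["n"] * g[key] for g in groups.values()) / total_n


pooled = {key: pooled_mean(key) for key in reported_sent_down}

for key, value in pooled.items():
    print(f"{key:12s} recomputed {value:10.2f}   reported {reported_sent_down[key]}")
```

The recomputed values agree with the reported column up to rounding, which shows that the modest overall differences between sent-down and non-sent-down respondents mask much larger differences between the short-duration and long-duration groups.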

After We Control for Covariates (Table 2)


There are no differences in salary or income.
Short-term sent-down youth still have higher levels of education than the other two groups (non-sent-down and long-term sent-down).

Potential Sources of Bias


Some sent-down youth did not return to cities or did not return to the same cities.
There can be unobserved family-level characteristics associated with both send-down and outcomes.
We use a fixed effects model based on sibling pairs to address both problems.
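To make the sibling-pair logic concrete, here is a minimal simulation sketch. It is not the study's data or code: the number of pairs, the way family background drives treatment, and the assumed true effect of zero are all hypothetical. It compares a naive treated-versus-untreated difference with the within-pair estimator, and shows that the within-pair estimator coincides with a regression that includes one dummy per family.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pairs = 500
true_effect = 0.0   # hypothetical: treatment has no causal effect

family = rng.normal(size=n_pairs)            # unobserved family-level effect
p_treat = 1.0 / (1.0 + np.exp(-family))      # families differ in treatment risk
d = (rng.random((n_pairs, 2)) < p_treat[:, None]).astype(float)
y = true_effect * d + family[:, None] + rng.normal(size=(n_pairs, 2))

# Naive comparison of treated and untreated individuals (confounded).
naive = float(y[d == 1].mean() - y[d == 0].mean())

# Within-pair estimator: regress sibling differences through the origin.
dd = d[:, 1] - d[:, 0]
dy = y[:, 1] - y[:, 0]
within = float(dd @ dy / (dd @ dd))

# Same estimate from a dummy-variable regression (one dummy per family).
fam_id = np.repeat(np.arange(n_pairs), 2)
dummies = (fam_id[:, None] == np.arange(n_pairs)[None, :]).astype(float)
design = np.column_stack([d.ravel(), dummies])
coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
lsdv = float(coef[0])

print(f"naive difference:  {naive:.3f}")
print(f"within-pair:       {within:.3f}")
print(f"family dummies:    {lsdv:.3f}")
```

Only pairs in which the siblings differ in treatment contribute to the within-pair estimate, which is one sense in which the approach "consumes a lot of information".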

Table 3: Unadjusted Differences by Send-Down Experience Using Sibling Pairs

                         Not Sent Down   Sent Down   d
College Education (%)    11.4            11.7        -0.3
Years of Schooling       10.9            10.8        0.1
Cadre (%)                8.9             5.4         3.5
SEI                      43.7            44.5        -0.7
N                        344             344

Notes: *p<.1, **p<.05, ***p<.01

What's Going On?

If there are no effects of send-down (from the fixed effects model), why do we observe differences in education between short-term sent-down youth and long-term sent-down youth?
The answer largely lies in pre-treatment differences.

Table 4: Unadjusted Differences by Duration

                                            Duration <6   Duration 6+   Sig.
HS Graduate at Send-Down (%)                53            13.6          ***
Years of Schooling at Send-Down             10.5          9.2           ***
Years of Schooling at Return                10.7          9.3           ***
College Enrollment in Year of Return (%)    13.2          1.5           ***
College Education (%)                       15.2          3             ***
  Truncated Sample                          11.9          2.3           ***
Current Years of Schooling                  11.3          9.5           ***
  Truncated Sample                          11.1          9.4           ***
N                                           349           132

Notes: *p<.1, **p<.05, ***p<.01

Conclusion
Did send-down experience benefit youth? -- No.
Our analyses of the new data show that the send-down experience did not benefit the youth who were affected.
Differences in social outcomes between those who experienced send-down and those who did not are either non-existent or spurious due to other social processes.

Accounting for Heterogeneous Responses with Social Context Principle


Possible with nested data, assuming that patterns of relationships are homogeneous (or follow a distribution) within social contexts (by time or space).
$\delta_k$ is allowed to vary across social contexts k (k = 1, ..., K) but is homogeneous within k, conditional on X.

Multi-level Model (MLM)

$$Y_{ik} = \alpha_k + \delta_k D_{ik} + \beta X_{ik} + \varepsilon_{ik}$$
$$\alpha_k = \lambda + \phi z_k + \mu_k$$
$$\delta_k = \gamma + \sigma z_k + \nu_k$$

Other names: hierarchical linear models, random-coefficient models, growth-curve models, and mixed models.
Units of analysis at a lower level are nested within higher-level units of analysis.
Examples:

Students within schools
Observations over time within persons (growth curve)
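One simple way to see what the two levels of the model mean is a two-step "slopes as outcomes" calculation: estimate the treatment coefficient separately in each context, then regress those estimates on the context-level covariate z. The sketch below does this on simulated data. All parameter values are hypothetical, and the two-step procedure is a simplification of a proper multi-level estimator (it lets the coefficient on X vary by context and does not weight contexts by precision).

```python
import numpy as np

rng = np.random.default_rng(2)
K, n = 200, 50                 # contexts, individuals per context
lam, phi = 0.0, 0.5            # intercept equation (hypothetical values)
gamma, sigma = 1.0, 0.5        # treatment-effect equation (hypothetical values)
beta = 0.3                     # common coefficient on X (hypothetical)

# Level 2: context-specific intercepts and treatment effects depend on z.
z = rng.normal(size=K)
alpha_k = lam + phi * z + rng.normal(scale=0.3, size=K)
delta_k = gamma + sigma * z + rng.normal(scale=0.3, size=K)

# Step 1: separate OLS within each context.
delta_hat = np.empty(K)
for k in range(K):
    D = rng.integers(0, 2, size=n).astype(float)
    X = rng.normal(size=n)
    Y = alpha_k[k] + delta_k[k] * D + beta * X + rng.normal(size=n)
    design = np.column_stack([np.ones(n), D, X])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    delta_hat[k] = coef[1]     # estimated treatment effect in context k

# Step 2: regress the estimated effects on the context covariate.
second_level = np.column_stack([np.ones(K), z])
(gamma_hat, sigma_hat), *_ = np.linalg.lstsq(second_level, delta_hat, rcond=None)

print(f"gamma_hat = {gamma_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```

The step-1 estimates mix true variation in the treatment effect with sampling error, which is the "parameter variability versus sampling variability" distinction that a full multi-level model handles explicitly.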

Problems without MLM


If we ignore higher-level units of analysis => we cannot account for context (individualistic approach).
If we ignore individual-level observations and rely on higher-level units of analysis, we may commit the ecological fallacy (aggregated data approach).
Without explicit modeling, sampling errors at the second level may be large => unreliable slopes.
Homoscedasticity and no-serial-correlation assumptions of OLS are violated (an efficiency problem).
No distinction between parameter variability and sampling variability.
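The ecological fallacy mentioned above can be illustrated with simulated clustered data in which the individual-level relationship inside every context is negative, while the relationship between context averages is positive. The setup is entirely hypothetical and chosen to make the contrast stark.

```python
import numpy as np

rng = np.random.default_rng(3)
K, n = 100, 50
within_slope_true = -0.5       # hypothetical individual-level slope

# Contexts differ in their average x, and the context intercept rises with it.
context_mean = rng.normal(size=K)
x = context_mean[:, None] + rng.normal(size=(K, n))
intercept_k = 2.0 * context_mean
y = intercept_k[:, None] + within_slope_true * x + rng.normal(size=(K, n))


def slope(u, v):
    """Bivariate OLS slope of v on u (with intercept)."""
    u = u - u.mean()
    v = v - v.mean()
    return float(u @ v / (u @ u))


# Aggregated-data approach: regress context means on context means.
aggregate = slope(x.mean(axis=1), y.mean(axis=1))

# Within-context approach: deviations from each context's own mean.
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
within = float((x_w * y_w).sum() / (x_w ** 2).sum())

# Individualistic approach: pooled OLS that ignores contexts entirely.
pooled = slope(x.ravel(), y.ravel())

print(f"aggregate slope: {aggregate:.2f}")
print(f"within slope:    {within:.2f}")
print(f"pooled slope:    {pooled:.2f}")
```

Neither the aggregate slope nor the pooled slope recovers the individual-level relationship here, which is why the context has to be modeled explicitly.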

Advantages of MLM
Cross-level comparisons
Controls for differences across higher levels

Example: Xie and Hannum (1996)


$$\log Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_2^2 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_1 X_5 + \varepsilon \qquad (1)$$

where Y = earnings, $X_1$ = years of schooling, $X_2$ = years of work experience, $X_4$ = a dummy variable denoting membership in the Communist Party of China (1 = party member), and $X_5$ = a dummy variable denoting gender (1 = female). Note the two interaction terms.

Consider regional heterogeneity

For the ith person in the kth city:

$$\log(y_{ik}) = \beta_{0k} + \beta_{1k} x_{1ik} + \beta_{2k} x_{2ik} + \beta_3 x_{2ik}^2 + \beta_{4k} x_{4ik} + \beta_{5k} x_{5ik} + \beta_6 x_{1ik} x_{5ik} + \varepsilon_{ik}$$

Instead of using fixed effects for the intercept $\beta_{0k}$ and full interactions for the slope parameters, Xie and Hannum modeled these parameters in a multilevel model. Let z be a city-level covariate that measures the degree of economic reform. Let us assume that the individual-level parameters depend on z in the following linear regressions:

Cross-City Model (meta analysis)


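$$\beta_{0k} = \alpha_0 + \lambda_0 z_k + \mu_{0k}$$
$$\beta_{1k} = \alpha_1 + \lambda_1 z_k + \mu_{1k}$$
$$\beta_{2k} = \alpha_2 + \lambda_2 z_k + \mu_{2k}$$
$$\beta_3 = \alpha_3$$
$$\beta_{4k} = \alpha_4 + \lambda_4 z_k + \mu_{4k}$$
$$\beta_{5k} = \alpha_5 + \lambda_5 z_k + \mu_{5k}$$
$$\beta_6 = \alpha_6$$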

Combining the two levels =>


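$$
\begin{aligned}
\log(y_{ik}) = {} & \alpha_0 + \alpha_1 x_{1ik} + \alpha_2 x_{2ik} + \alpha_3 x_{2ik}^2 + \alpha_4 x_{4ik} + \alpha_5 x_{5ik} + \alpha_6 x_{1ik} x_{5ik} \\
& + \lambda_0 z_k + \lambda_1 x_{1ik} z_k + \lambda_2 x_{2ik} z_k + \lambda_4 x_{4ik} z_k + \lambda_5 x_{5ik} z_k \\
& + (\mu_{0k} + \mu_{1k} x_{1ik} + \mu_{2k} x_{2ik} + \mu_{4k} x_{4ik} + \mu_{5k} x_{5ik} + \varepsilon_{ik})
\end{aligned}
$$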

We can see that the city-level covariate z interacts with most of the individual-level predictors.
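The cross-level interactions can be seen directly by building the combined design matrix. The sketch below uses a deliberately reduced version of the model with a single individual-level predictor x and a city-level covariate z, so that the combined equation contains 1, x, z, and x*z. All parameter values are hypothetical and the data are simulated. Pooled OLS is used only to recover the fixed part; it ignores the composite error structure, so it is not a substitute for a multi-level estimator.

```python
import numpy as np

rng = np.random.default_rng(4)
K, n = 200, 50                       # cities, persons per city
a0, a1 = 1.0, 0.08                   # hypothetical level-2 intercepts
l0, l1 = 0.5, 0.3                    # hypothetical coefficients on z

# City-level model: intercept and slope both depend on z plus a random term.
z = rng.normal(size=K)
b0k = a0 + l0 * z + rng.normal(scale=0.2, size=K)
b1k = a1 + l1 * z + rng.normal(scale=0.2, size=K)

# Individual-level model within each city.
city = np.repeat(np.arange(K), n)
x = rng.normal(size=K * n)
log_y = b0k[city] + b1k[city] * x + rng.normal(size=K * n)

# Combined model: z enters as a main effect and as an interaction with x.
z_i = z[city]
design = np.column_stack([np.ones(K * n), x, z_i, x * z_i])
coef, *_ = np.linalg.lstsq(design, log_y, rcond=None)
a0_hat, a1_hat, l0_hat, l1_hat = coef

print(f"a0={a0_hat:.3f}  a1={a1_hat:.3f}  l0={l0_hat:.3f}  l1={l1_hat:.3f}")
```

The coefficient on x*z is the estimate of how the return to x changes with the city-level covariate, which is the quantity of substantive interest in this kind of design.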

Special Cases
Special case 1: If all the coefficients of the city-level covariate (z) are zero, we have what is called a random coefficient model.
Special case 2: If all the coefficients of the city-level covariate (z) are zero and none of the slope coefficients has a random component (only the intercept does), we have what is called a variance component model. [See Table 3.]
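For the variance component model (special case 2), the quantities of interest are the between-context and within-context variances. The sketch below simulates a balanced design and recovers both with the classical one-way ANOVA estimators, then forms the intraclass correlation. The variances, the grand mean, and the sample sizes are hypothetical, and the ANOVA estimator is one standard choice among several.

```python
import numpy as np

rng = np.random.default_rng(5)
K, n = 300, 20                       # contexts, individuals per context
tau2_true, sigma2_true = 0.25, 1.0   # hypothetical variance components

# Random intercept only: y_ik = grand mean + u_k + e_ik.
u = rng.normal(scale=np.sqrt(tau2_true), size=K)
y = 2.0 + u[:, None] + rng.normal(scale=np.sqrt(sigma2_true), size=(K, n))

# One-way ANOVA mean squares for a balanced design.
group_means = y.mean(axis=1)
msb = n * group_means.var(ddof=1)                            # between contexts
msw = ((y - group_means[:, None]) ** 2).sum() / (K * (n - 1))  # within contexts

# Method-of-moments estimates of the two variance components.
tau2_hat = (msb - msw) / n
sigma2_hat = msw
icc = tau2_hat / (tau2_hat + sigma2_hat)

print(f"between-context variance: {tau2_hat:.3f}")
print(f"within-context variance:  {sigma2_hat:.3f}")
print(f"intraclass correlation:   {icc:.3f}")
```

With these assumed values the true intraclass correlation is 0.2, meaning that a fifth of the outcome variance lies between contexts.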

Summary: Four ways to conceptualize variability in parameters


Specification              Complete homogeneity   Random variation   Regression   Fixed
Degrees of freedom         1                      2                  1+Pk         K
Parsimony (DF for model)   High                                                   Low
Accuracy (like R2)         Low                                                    High

where Pk is the number of predictors at the 2nd level, and K is the number of units at the second level.

References

Xie, Yu. 1996. Review of Identification Problems in the Social Sciences by Charles Manski. American Journal of Sociology 101:1131-1133.
Xie, Yu, and Emily Hannum. 1996. Regional Variation in Earnings Inequality in Reform-Era Urban China. American Journal of Sociology 101:950-992.
Xie, Yu, Yang Jiang, and Emily Greenman. 2008. Did Send-Down Experience Benefit Youth? A Reevaluation of the Social Consequences of Forced Urban-Rural Migration during China's Cultural Revolution. Social Science Research 37:686-700.
