Abstract
Many papers use fixed effects (FE) to identify causal impacts of an intervention. When treatment status varies only within some FE groups (e.g., families, for family fixed effects), FE can induce non-random selection of groups into the identifying sample, which we term selection into identification (SI). This paper empirically documents SI in the context of several family fixed effects (FFE) applications with a binary treatment. We show that the characteristics of the FFE identifying sample differ from those of the overall sample (and the policy-relevant population); in particular, identifying-sample families are larger. The main implication is that when treatment effects are heterogeneous, the FE estimate may not be representative of the average treatment effect (ATE). We show that a reweighting-on-observables FE estimator can help recover the ATE for policy-relevant populations, and recommend its use either as a primary estimator or as a diagnostic tool to assess the importance of SI. We apply these insights to re-examine the long-term effects of Head Start in the PSID and the CNLSY using FFE. When we reweight the FFE estimates, we find that Head Start leads to a 2.1 percentage point (p.p.) increase (s.e. = 5.9 p.p.) in the likelihood of attending some college for white Head Start participants in the PSID. This ATE for participants is 83% smaller than the traditional FFE estimate (12 p.p.). We also find that the CNLSY Head Start participants’ ATE is smaller than the FE estimates. These findings raise new concerns about the external validity of FE estimates.
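To make the idea of selection into identification concrete, the following minimal Python sketch uses a toy data set. It is an illustration under stated assumptions, not the estimator used in this paper: the function names, the toy rows, the choice of family size as the only observable, and the cell-based weighting scheme are all hypothetical. The sketch finds the families whose members differ in treatment status (the identifying sample), then reweights their within-family contrasts so that the family-size distribution matches that of treated individuals.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (family_id, treated, outcome). Not from the paper.
rows = [
    ("a", 1, 1.0), ("a", 0, 0.0),                 # size 2, mixed treatment
    ("b", 1, 0.0), ("b", 0, 0.0), ("b", 0, 0.0),  # size 3, mixed treatment
    ("c", 1, 1.0), ("c", 1, 1.0),                 # size 2, all treated
    ("d", 0, 0.0),                                # size 1, never treated
]

def family_sizes(rows):
    """Number of observed members per family."""
    sizes = defaultdict(int)
    for fam, _, _ in rows:
        sizes[fam] += 1
    return dict(sizes)

def switcher_families(rows):
    """Families with within-family variation in treatment (identifying sample)."""
    statuses = defaultdict(set)
    for fam, treated, _ in rows:
        statuses[fam].add(treated)
    return {fam for fam, s in statuses.items() if len(s) == 2}

def within_family_effect(rows, fam):
    """Treated-minus-untreated mean outcome inside one family."""
    t = [y for f, d, y in rows if f == fam and d == 1]
    c = [y for f, d, y in rows if f == fam and d == 0]
    return mean(t) - mean(c)

def reweighted_effect(rows):
    """Cell-based reweighting on one observable (family size).

    Assumption: the target population is treated individuals, and cells
    without any switcher family are dropped and the weights renormalized.
    """
    sizes = family_sizes(rows)
    switchers = switcher_families(rows)
    if not switchers:
        raise ValueError("no family has within-family treatment variation")
    # Family-size distribution among treated individuals (the target).
    target = defaultdict(float)
    for fam, treated, _ in rows:
        if treated == 1:
            target[sizes[fam]] += 1
    # Within-family contrasts grouped by family-size cell.
    cell_effects = defaultdict(list)
    for fam in switchers:
        cell_effects[sizes[fam]].append(within_family_effect(rows, fam))
    supported = {s: n for s, n in target.items() if s in cell_effects}
    total = sum(supported.values())
    return sum(n / total * mean(cell_effects[s]) for s, n in supported.items())

sizes = family_sizes(rows)
switchers = switcher_families(rows)
mean_size_all = mean(sizes.values())
mean_size_identifying = mean(sizes[f] for f in switchers)
unweighted = mean(within_family_effect(rows, f) for f in switchers)
reweighted = reweighted_effect(rows)
```

In this toy example the identifying families are larger on average than families overall, and the reweighted contrast differs from the unweighted one because the effect varies with family size. The unweighted mean of within-family contrasts is used here only for comparison; it is not identical to a regression FE estimate, which also weights families by their within-family variance in treatment.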
This open access article is distributed under the terms of the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0) and is freely available online at: http://jhr.uwpress.org