Journal of Human Resources
Research Article
Open Access

Selection into Identification in Fixed Effects Models, with Application to Head Start

Douglas L. Miller, Na’ama Shenhav, and Michel Grosz
Journal of Human Resources, September 2023, 58 (5) 1523-1566; DOI: https://doi.org/10.3368/jhr.58.5.0520-10930R1
Douglas L. Miller
Doug Miller is at the Brooks School of Public Policy and the Economics Department, Cornell University.
Na’ama Shenhav
Na’ama Shenhav is at the Department of Economics, Dartmouth College.
Michel Grosz
Michel Grosz is at the Bureau of Economics, Federal Trade Commission.

Figures

  • Figure 1

    Within-Family Variation in Head Start and Attendance of Some College (PSID)

    Source: Panel Study of Income Dynamics 1968–2011 waves.

    Notes: This figure depicts the identifying variation used in a FFE regression of some college on an indicator for participation in Head Start. Each marker represents the number of individuals that exhibit a particular deviation from the mean Head Start attendance of their family and from the mean attendance of some college of their family. Deviations are defined as the difference between individual attendance of Head Start/some college (1 or 0) and mean of Head Start/some college of one’s family. The marker size represents the unweighted number of individuals. We also include a best-fit line, weighted by the number of individuals in each marker.
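
The construction described in these notes can be sketched in a few lines. This is a minimal illustration on hypothetical toy records, not the authors' code: the names `records`, `family_means`, `deviation_cells`, and `weighted_slope` are assumptions introduced here. It computes each individual's deviation from the family mean, counts individuals per marker, and fits the count-weighted line. In the no-covariate case, the slope of that line coincides with the coefficient from a regression with family fixed effects.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (family_id, head_start, some_college), each coded 0/1.
records = [
    ("f1", 1, 1), ("f1", 0, 0),
    ("f2", 1, 0), ("f2", 0, 0), ("f2", 0, 1),
    ("f3", 1, 1), ("f3", 1, 1),  # no within-family variation in Head Start
]

def family_means(rows):
    """Mean Head Start and mean some-college attendance for each family."""
    totals = defaultdict(lambda: [0, 0, 0])  # family -> [n, sum_hs, sum_college]
    for fam, hs, college in rows:
        totals[fam][0] += 1
        totals[fam][1] += hs
        totals[fam][2] += college
    return {fam: (t[1] / t[0], t[2] / t[0]) for fam, t in totals.items()}

def deviation_cells(rows):
    """Count individuals at each (Head Start deviation, some-college deviation) pair."""
    means = family_means(rows)
    cells = Counter()
    for fam, hs, college in rows:
        mean_hs, mean_college = means[fam]
        cells[(round(hs - mean_hs, 6), round(college - mean_college, 6))] += 1
    return cells

def weighted_slope(cells):
    """Slope of the least-squares line through the markers, weighted by count."""
    n = sum(cells.values())
    mx = sum(x * w for (x, _), w in cells.items()) / n
    my = sum(y * w for (_, y), w in cells.items()) / n
    sxy = sum(w * (x - mx) * (y - my) for (x, y), w in cells.items())
    sxx = sum(w * (x - mx) ** 2 for (x, _), w in cells.items())
    return sxy / sxx

cells = deviation_cells(records)   # marker sizes
slope = weighted_slope(cells)      # best-fit line through the markers
```

Family `f3` contributes only to the marker at the origin, so it adds nothing to the slope. That is the sense in which families without variation in Head Start do not contribute identifying variation.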

  • Figure 2

    Likelihood of Being a Switcher Family Increases with Family Size and Probability of Treatment

    Notes: This figure shows the probability of being in a switching family and the probability of “treatment” by family size using three data sets and varying treatments. Panel A plots the probability of being in a switching family and of attending Head Start by family size for the following groups in the PSID: whites, Blacks, children of mothers with at most a high school degree, and children of mothers with at least some college. Panel B is a simplified version of Panel A, using data on Head Start participation and family size from the CNLSY. Panel C shows the probability of being in a switching family and the probability of migrating to the northern United States, using a linking of the 1910–1930 censuses used in Collins and Wanamaker (2014).
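
As a rough illustration of the quantity plotted here, the sketch below flags switching families (families with within-family variation in a binary treatment) and computes the share of children in switching families by family size. The data and the name `switcher_share_by_family_size` are hypothetical, and family size is taken to be the number of children observed in the data, which is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical toy data: (family_id, head_start) for each child.
children = [
    ("a", 1), ("a", 0),
    ("b", 0), ("b", 0),
    ("c", 1), ("c", 1), ("c", 0),
    ("d", 0), ("d", 0), ("d", 0), ("d", 1),
    ("e", 1), ("e", 1), ("e", 1),
]

def switcher_share_by_family_size(rows):
    """Share of children who belong to a switching family, by family size."""
    treatment = defaultdict(list)
    for fam, hs in rows:
        treatment[fam].append(hs)

    counts = defaultdict(lambda: [0, 0])  # size -> [children, children in switching families]
    for values in treatment.values():
        size = len(values)
        is_switcher = 0 < sum(values) < size  # some, but not all, siblings treated
        counts[size][0] += size
        if is_switcher:
            counts[size][1] += size
    return {size: sw / n for size, (n, sw) in sorted(counts.items())}

shares = switcher_share_by_family_size(children)
```

With more children per family there are more chances for treatment status to differ across siblings, which is the mechanical pattern the figure title describes.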

  • Figure 3

    Family Fixed Effects Weights and Head Start Participant Representative Weights by Family Size and Some College β (PSID White Sample)

    Source: Panel Study of Income Dynamics 1968–2011 waves.

    Notes: Each marker in this figure indicates the FFE weights and Head Start participant representative (post-regression) weight for one white switching family. The color of the marker indicates whether the family has two to three children or four or more children. The size of the marker indicates the estimated family-specific beta from a regression of attainment of some college on interactions between Head Start and family ID fixed effects. A larger marker indicates an above-median beta, while a smaller marker indicates a below-median beta. The 45-degree line is included for reference. Observations above (below) the line are overweighted (underweighted) in the FFE sample relative to a representative Head Start sample.
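
The family-specific beta that sets the marker size can be illustrated under a simplifying assumption. With a binary treatment and no other covariates, the coefficient on Head Start interacted with a family indicator reduces to the difference in mean outcomes between participating and nonparticipating siblings in that family. The sketch below uses that no-covariate shortcut on hypothetical data. The authors' specification may include controls, and the names `family_betas` and `above_median` are introduced here for illustration only.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical toy records: (family_id, head_start, some_college).
records = [
    ("f1", 1, 1), ("f1", 0, 0),
    ("f2", 1, 0), ("f2", 0, 0), ("f2", 0, 1),
    ("f3", 1, 1), ("f3", 1, 1),  # not a switching family
    ("f4", 1, 1), ("f4", 0, 1),
]

def family_betas(rows):
    """Treated-minus-untreated mean outcome within each switching family."""
    outcomes = defaultdict(lambda: ([], []))  # family -> (treated, untreated)
    for fam, hs, y in rows:
        outcomes[fam][0 if hs == 1 else 1].append(y)
    # Only families with both treated and untreated siblings identify a beta.
    return {fam: mean(t) - mean(u) for fam, (t, u) in outcomes.items() if t and u}

betas = family_betas(records)
cutoff = median(betas.values())
# True corresponds to a larger marker (above-median beta) in this sketch.
above_median = {fam: b > cutoff for fam, b in betas.items()}
```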

Tables

  • Table 1

    Family Fixed Effects Articles in Top Applied Journals 2002–2017

                                        Binary Indep.  Binary Dep.  Both Binary  Total
    AEJ: Applied                                    6            4            3      8
    AEJ: Economic Policy                            1            1            1      1
    AER                                             3            1            1      5
    AER Papers and Proceedings                      2            2            1      3
    Journal of Health Economics                     5            3            2      7
    Journal of Human Resources                      7            2            2     12
    Journal of Labor Economics                      2            1            1      5
    Journal of Political Economy                    2            1            1      2
    Journal of Public Economics                     4            4            4      5
    QJE                                             1            4            1      4
    Review of Economics and Statistics              2            0            0      3
    Total                                          35           23           17     55

    Common Dependent Variables
    Schooling/Attainment                           23
    Test score                                     17
    Employment/earnings                            15
    Birth weight                                    6
    Health                                          6
    Behavioral issues/crime                         5

    Common Independent Variables
    Schooling                                       8
    Birth weight                                    5
    Health                                          5
    Parental traits                                 4
    Employment                                      3
    Birth order                                     3
    Means-tested public program                     2
    Death of Family Member                          2
    Bombing/radiation                               2

    Observations by Sample
                                           Siblings N      Total N
    p10                                           469        1,212
    p25                                         1,167        2,142
    p50                                         6,315       17,501
    p75                                       160,122      551,630
    p90                                       750,697    1,582,142

    Year publication min./max.                   2002         2017
    Articles with balance table if binary indep.    1
    • Notes: This table presents a summary of FFE articles published between January 2000 and May 2017 in 11 top applied journals, which are listed in the first panel of the table. For reference, between 2002 and 2017 the number of articles published in AEJ: Applied was 310; AEJ: Policy was 313; AER was 1722; AER P&P was 1676; JoLE was 434; Journal of Political Economy was 548; QJE was 639; JHR was 543; JPubE was 1688; REStat was 1033; JHE was 1017. Articles were initially identified using the search terms “family,” “within family,” “sibling,” “twin,” “mother,” “father,” “brother,” “sister,” “fixed effect,” “fixed-effect,” and “birthweight” using queries on journal websites. Siblings N is the number of observations reported for the sample of siblings, while Total N represents the number of total observations reported. See text for details.

  • Table 2

    Switchers and Nonswitchers Vary along Dimensions Other than Family Size

                                                Switch   Nonswitch     p-Value Beta Switch p-Value (4)
                                                   (1)         (2)   (1) = (2)         (4)         (5)
                                                                           (3)
    Panel A: Individual Covariates
    Fraction female                              0.562       0.495       0.001       0.024       0.472
    Fraction African-American                    0.516       0.111       0.000       0.249       0.000
    Mother’s years education                     9.283      11.230       0.000      −0.140       0.453
    Father’s years education                     9.190      11.371       0.000      −0.389       0.075
    Had a single mother at age 4                 0.252       0.099       0.000       0.055       0.011
    Family income (age 3–6)                      31809       52574       0.000       −4759       0.000
        (CPI adjusted)
    Mother employed, age 0                       0.508       0.570       0.013       0.055       0.019
    Mother employed, age 1                       0.517       0.543       0.281       0.058       0.018
    Mother employed, age 2                       0.536       0.554       0.439       0.118       0.000
    Household size at age 4                      5.487       4.451       0.000       0.755       0.000
    Fraction low birth weight                    0.077       0.058       0.075       0.010       0.483
    Observations                                  1103        5500        6603        7372        7372

    Panel B: Inverse Selection into Identification Weights
    Pr(switch)/Pr(Head Start), Whites   2.993 (2.17)   2.347 (1.95)
    Pr(switch)/Pr(Head Start), Blacks   1.969 (1.29)   1.137 (1.03)
    • Source: Panel Study of Income Dynamics 1968–2011 waves.

    • Notes: Panel A of this table presents comparisons of the characteristics of individuals in switching families and nonswitching families. Columns 1 and 2 show the mean characteristics of individuals in families that are switchers and individuals in families that are not switchers, respectively. Column 3 presents the p-value for the test that Columns 1 and 2 are equal. Column 4 shows the estimates from a regression of each row heading on an indicator for being in a switcher family, with the corresponding p-value shown in Column 5 and standard errors clustered on id1968. All controls from the main specification are included, except the variable shown in the row heading. All estimates are weighted to be representative of the 1995 population; see text for details. Panel B shows the mean and standard deviation (in parentheses) of the inverse of the post-regression propensity score weights when the target is Head Start participants. This gives a measure of how aligned the characteristics of switchers are with the characteristics of Head Start participants, the population of interest. An average value of one implies perfect alignment, while a higher value implies that the characteristics of switchers are overrepresented relative to the characteristics of Head Start participants. Pr(switch) and Pr(Head Start) are estimated from a multinomial logit model of these outcomes on family size and other covariates described in the text.

  • Table 3

    Change in Weighting of Regression Estimates across Sibling and Switcher Samples (PSID)

                                Number of Children in Family
                                       1       2       3       4      5+
    Panel A: Share of Sample
    All (no FFE)                   0.123   0.273   0.238   0.147   0.134
    Siblings sample (no FFE)       0.000   0.345   0.300   0.186   0.169
    Switchers sample (FFE)         0.000   0.210   0.271   0.197   0.322
    Panel B: Variance in Head Start
    All (no FFE)                   0.089   0.104   0.121   0.127   0.132
    Siblings sample (no FFE)       0.000   0.024   0.050   0.059   0.068
    Switchers sample (FFE)         0.000   0.045   0.098   0.131   0.174
    Panel C: Regression Weights
    All (no FFE)                   0.171   0.257   0.284   0.117   0.101
    Siblings sample (no FFE)       0.000   0.338   0.374   0.154   0.134
    Switchers sample (FFE)         0.000   0.256   0.307   0.190   0.248
    • Source: Panel Study of Income Dynamics 1968–2011 waves.

    • Notes: This table shows the change in the composition of the PSID sample moving from all individuals and estimating a model without controls (“All (no FFE)”), to individuals that have at least one other sibling in the sample and estimating a model without controls (“Siblings Sample (no FFE)”), to individuals in families that have variation in Head Start attendance and estimating a model with family fixed effects (“Switchers sample (FFE)”). Panel A shows the share of individuals in each sample that come from a family with one child (zero siblings), two children, etc. Panel B shows the variance in Head Start for each family size and sample. For the switcher sample, this is calculated net of family fixed effects. Panel C shows the “regression weight” given to each family size in a given sample, denoted as ωz and defined formally in Section III. The shares and regression weights do not sum to one for the “all sample” because this sample also includes an additional category of individuals who have an unknown number of siblings (due to a missing mother ID). These individuals account for 8.5 percent of the “all sample.”
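
The statement about the unknown-sibling category can be checked with simple arithmetic on the Panel A and Panel C rows. The short sketch below does only that. The variable names are introduced here, and the residual regression weight is an implication of the row sum, not a figure reported in the table.

```python
# Panel A, "All (no FFE)": shares for 1, 2, 3, 4, and 5+ child families.
all_shares = [0.123, 0.273, 0.238, 0.147, 0.134]
# Panel C, "All (no FFE)": regression weights for the same family sizes.
all_weights = [0.171, 0.257, 0.284, 0.117, 0.101]

# Remainder attributed to individuals with an unknown number of siblings.
unknown_share = round(1 - sum(all_shares), 3)
unknown_weight = round(1 - sum(all_weights), 3)

# The siblings and switchers samples exclude that category, so their shares sum to one.
siblings_total = round(sum([0.000, 0.345, 0.300, 0.186, 0.169]), 3)
switchers_total = round(sum([0.000, 0.210, 0.271, 0.197, 0.322]), 3)
```

The remaining share, 0.085, matches the 8.5 percent stated in the notes.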

  • Table 4

    Returns to Head Start by Family Size and Implications for Regression Estimates

                                                            PSID          PSID         CNLSY         CNLSY         CNLSY
                                                    Some College  Some College       HS Grad          Idle      Learning
                                                                                                              Disability
                                                              CX            FE            FE            FE            FE
                                                             (1)           (2)           (3)           (4)           (5)
    Panel A: Effects by Family Size
    Head Start × 1 child family                           0.169*
                                                         (0.091)
    Head Start × 2 child family                            0.038        −0.126         0.058        −0.075        −0.018
                                                         (0.079)       (0.099)       (0.050)       (0.060)       (0.025)
    Head Start × 3 child family                           −0.030       0.152**         0.042        −0.001        −0.073
                                                         (0.087)       (0.075)       (0.063)       (0.071)       (0.046)
    Head Start × 4 child family                           −0.053      0.251***         0.135        −0.063        −0.042
                                                         (0.100)       (0.091)       (0.087)       (0.118)       (0.052)
    Head Start × 5+ child family                        0.572***      0.348***      0.305***      −0.317**       −0.161*
                                                         (0.119)       (0.126)       (0.095)       (0.132)       (0.091)
    Head Start × Unknown child family                     −0.099
                                                         (0.108)
    Observations                                           4,258         2,986         1,251         1,251         1,247
    Head Start switchers                                                   213           668           668           668
    Effective obs. (individs. 2-person families)                         235.9         644.3         644.3         644.3
    Effective obs. (CX individs.)                                        731.8         558.9         558.9         558.9
    Panel B: Simulated Estimates across Samples Using Family-Size Regression Weights
    All                                                    0.046
    Siblings                                               0.037         0.083         0.081        −0.071        −0.047
    Switchers                                              0.069         0.123         0.093        −0.077        −0.054
    • Sources: Panel Study of Income Dynamics 1968–2011 waves and Children of the National Longitudinal Survey of Youth.

    • Notes: Panel A of this table shows the coefficients from regressions of outcomes on a series of indicators for whether an individual attended Head Start interacted with an indicator for the number of children in one’s family. The data source and specification vary across columns. Columns 1 and 2 use our main PSID sample, and the outcome is attainment of some college. Columns 3–5 use the CNLSY79 sample, and the outcomes are indicators for graduating from high school, being idle, and having a learning disability, respectively. Column 1 includes controls, but not mother fixed effects, and standard errors are clustered at the family ID level. Columns 2–5 include mother fixed effects, and standard errors are clustered by mother ID. The number of Head Start switchers is equal to the number of individuals in families that have variation in Head Start. “Effective Obs. (CX Individs.)” is the equivalent number of cross-sectional units that provide the same amount of variation as switchers. “Effective Obs. (Individs. 2-Person Families)” is the equivalent number of individuals in two-person switching families that provide the same amount of variation as switchers. Both of these are calculated using Equation 3, where the denominator is the variance of Head Start, residualized by the family mean of the covariates in the analysis, or 0.125, respectively. Panel B shows the weighted average of the coefficients when using regression weights, ωz (defined in Section III), determined by the overall distribution of families (“All”), the distribution of 2+ child families (“Siblings”), and the distribution of 2+ child families that have variation in Head Start attendance (“Switchers”).

    • * p < 0.10,

    • ** p < 0.05,

    • *** p < 0.01.

  • Table 5

    Monte Carlo Experiments: Bias of Reweighting and FFE Relative to True ATE, and Efficiency of Reweighting Relative to FFE

                        True ATE  Bias: FE  Bias: Reweight  Ratio: MSE of Reweight to MSE of FE
    Panel A: Constant TE; p-Score: Xig
    Switchers                 80      −0.3            −0.2                                 1.03
    Siblings                  80      −0.3            −0.5                                 1.19
    All                       80      −0.3            −0.5                                 1.20
    HS participants           80      −0.3            −0.3                                 1.04
    Panel B: Large Family TE; p-Score: Large Family
    Switchers               83.0    −11.1*            −0.6                                 0.92
    Siblings                49.6     22.2*            −0.1                                 0.70
    All                     40.3     31.6*             0.1                                 0.54
    HS participants         41.1     30.7*             0.1                                 0.55
    Panel C: TE Linear in Xig; p-Score: Xig
    Switchers               94.2     −2.0*            −0.6                                 1.03
    Siblings                80.1     12.2*            1.6*                                 0.99
    All                     80.0     12.2*            1.7*                                 1.00
    HS participants         91.5       0.8            −0.2                                 1.03
    Panel D: TE Linear in Xig; p-Score: Xig Spline
    Switchers               94.2     −1.5*            −0.3                                 1.04
    Siblings                80.1     12.7*            −0.4                                 1.08
    All                     80.0     12.8*            −0.4                                 1.09
    HS participants         91.5       1.3            −0.2                                 1.09
    • Notes: This table shows the results from 3,000 Monte Carlo simulations. Each panel of the table shows results from a different DGP and/or different covariates used in the p-score, and each row within a panel is for a different target population. The true DGP is linear and is discussed in Section IV.D. Panel A shows results where Head Start has a constant treatment effect (TE) for all individuals. Panel B shows results where Head Start (HS) has no effect on individuals from small families (three or fewer children) and a large effect for families with many children (four or more children). Panels C and D show results where treatment effects are linear in Xig. Column 1, “True Beta,” presents the true average increase in the probability of completing some college for participants in Head Start in the sample, which is a function of the DGP and sample composition. Columns 2 and 3 present the bias of various estimation strategies, defined as the difference between the estimated effects of Head Start and the true beta. The estimated effects come from an LPM and a propensity-score-weighted LPM, respectively. Column 4 presents the ratio of the mean squared error (MSE) of the reweighting estimators relative to LPM. Reweighted estimates are obtained using in-regression weighting, with weights adjusting for the representativeness of switchers (using the variable(s) indicated in each of the panel headings as predictors in the multinomial logit step) and the conditional variance of Head Start within families. All betas are multiplied by 1,000.
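
The summary statistics in Columns 2 to 4 follow directly from the definitions in these notes. The sketch below computes them from a handful of hypothetical simulation draws. The draws, the true value, and the function names `bias` and `mse` are illustrative assumptions, not output from the authors' 3,000 simulations.

```python
from statistics import mean

def bias(estimates, truth):
    """Average estimate minus the true effect."""
    return mean(estimates) - truth

def mse(estimates, truth):
    """Mean squared error of the estimates around the true effect."""
    return mean((e - truth) ** 2 for e in estimates)

# Hypothetical draws of the Head Start coefficient across simulation runs.
truth = 0.080
fe_draws = [0.070, 0.090, 0.080]        # fixed effects LPM
reweight_draws = [0.075, 0.095, 0.070]  # propensity-score-weighted LPM

# The table reports betas multiplied by 1,000, so the biases are scaled the same way.
fe_bias = 1000 * bias(fe_draws, truth)
reweight_bias = 1000 * bias(reweight_draws, truth)
mse_ratio = mse(reweight_draws, truth) / mse(fe_draws, truth)
```

A ratio above one means that, in these draws, reweighting is noisier than FE around the true effect, even when both are unbiased.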

    • * p < 0.01.

  • Table 6

    Head Start Impact for Representative Eligible Children, Participants, and Siblings Using Reweighting

    Columns: (1) FFE, Garces, Thomas, and Currie/Deming; (2) FFE, Expand Sample/Replicate; (3) Reweighted ATE, Target = HS Eligible; (4) Reweighted ATE, Target = Participants; (5) Reweighted ATE, Target = Siblings; (6) Diff. b/w FFE and Participant ATE

                                 (1)         (2)         (3)         (4)         (5)         (6)
    Panel A: Some College (PSID)
    Head Start               0.281**     0.120**       0.052       0.021       0.064     0.099**
                             (0.108)     (0.053)     (0.064)     (0.059)     (0.061)     (0.032)
    Y mean in target                       0.556       0.387       0.437       0.556
    Panel B: Economic Sufficiency Index, Age 30 (PSID)
    Head Start                            −0.023      −0.071      −0.040       0.021       0.017
                                         (0.102)     (0.101)     (0.099)     (0.113)     (0.066)
    Y mean in target                       0.213      −0.198      −0.485       0.213
    Panel C: High School Graduation (CNLSY)
    Head Start              0.086***    0.085***       0.033       0.048       0.020      0.037*
                             (0.031)     (0.030)     (0.042)     (0.037)     (0.044)     (0.023)
    Y mean in target                       0.776       0.734       0.766       0.776
    Panel D: Idle (CNLSY)
    Head Start               −0.071*     −0.072*      −0.061      −0.055      −0.067      −0.017
                             (0.038)     (0.037)     (0.045)     (0.042)     (0.050)     (0.026)
    Y mean in target                       0.197       0.221       0.201       0.197
    Panel E: Learning Disability (CNLSY)
    Head Start             −0.059***   −0.059***      −0.031     −0.042*      −0.040       0.017
                             (0.020)     (0.021)     (0.026)     (0.022)     (0.026)     (0.015)
    Y mean in target                       0.051       0.055       0.041       0.051
    Panel F: Poor Health (CNLSY)
    Head Start             −0.070***   −0.069***     −0.063*    −0.067**      −0.050      −0.003
                             (0.026)     (0.026)     (0.037)     (0.034)     (0.038)     (0.020)
    Y mean in target                       0.103       0.098       0.074       0.103
    • Notes: Column 1 of this table shows the FFE estimated impacts of Head Start for whites from Garces, Thomas, and Currie (2002) or for the whole sample from Deming (2009). Column 2 shows the FFE estimate using our expanded sample for PSID outcomes and using our replication sample for CNLSY outcomes. The outcomes in Panels A and B are taken from the PSID white sample, and the outcomes in Panels C to F are taken from the CNLSY sample. Columns 3–5 present reweighted estimates of the effect of Head Start for three target populations (shown in the column header) using the post-regression reweighting procedure, in which we multiply group-level estimates of the impact of Head Start by the representative weight for the target population of interest. Column 6 presents the difference in the estimate in Column 2 (FFE) and Column 4 (reweighted for participants). Sample size is N = 2,986 for the expanded sample, and 1,036 for Garces, Thomas, and Currie (2002). Standard errors obtained by bootstrapping.
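
The post-regression reweighting step described in these notes amounts to a weighted average of group-level estimates. The sketch below illustrates that arithmetic with hypothetical groups, estimates, and target weights. The function name `reweighted_effect` and all inputs in the first part are assumptions made for illustration. The last lines recompute Column 6 from the reported Column 2 and Column 4 values in Panels A and C.

```python
def reweighted_effect(group_estimates, target_weights):
    """Weighted average of group-level effects, with weights normalized to sum to one."""
    total = sum(target_weights.values())
    return sum(group_estimates[g] * w / total for g, w in target_weights.items())

# Hypothetical group-level Head Start estimates, for example by family-size group.
group_estimates = {"small_family": 0.02, "large_family": 0.20}

# Hypothetical representative weights for one target population (for example, participants).
participant_weights = {"small_family": 0.75, "large_family": 0.25}

participant_ate = reweighted_effect(group_estimates, participant_weights)

# Column 6 is Column 2 (FFE) minus Column 4 (reweighted for participants).
panel_a_diff = round(0.120 - 0.021, 3)  # Some College (PSID)
panel_c_diff = round(0.085 - 0.048, 3)  # High School Graduation (CNLSY)
```

Changing only `participant_weights` to the weights for another target population yields that population's reweighted estimate from the same group-level effects.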

    • * p < 0.10,

    • ** p < 0.05,

    • *** p < 0.01.

Additional Files

  • Free alternate access to The Journal of Human Resources supplementary materials is available at https://uwpress.wisc.edu/journals/journals/jhr-supplementary.html

    • 0520-10930R1_supp.pdf
Keywords

  • I38
  • I28
  • C23

© 2025 Board of Regents of the University of Wisconsin System
