Abstract
Using short snapshots of income in intergenerational mobility estimation causes “lifecycle bias” if the snapshots cannot mimic lifetime outcomes. We use uniquely long series of Swedish income data to show that this bias is large and to examine current strategies to reduce it. We confirm that lifecycle bias is smallest when incomes are measured around midlife, a central implication from a widely adopted generalization of the classical errors-in-variables model. However, the model cannot predict the ideal age of measurement or eliminate lifecycle bias at other ages. We illustrate how extensions of this model can reduce the bias further.
I. Introduction
Transmission of economic status within families is often measured by the intergenerational elasticity between parents’ and children’s lifetime income. A large and growing literature has estimated this parameter in order to analyze the extent of intergenerational mobility across countries, groups, and time.1 Unfortunately, the estimates in the early literature suffered greatly from measurement error in lifetime income, and successive methodological improvements led to large-scale corrections.2
While the early estimates were severely attenuated from approximation of lifetime values by noisy single-year income data for parents, Jenkins (1987) identifies systematic deviations of current from lifetime values over the life cycle as an additional source of inconsistency. Evidence by Haider and Solon (2006) and Grawe (2006) suggests that the latter is empirically important. Various refined methods to address such lifecycle bias have recently been presented. In particular, Haider and Solon proposed a tractable generalization of the classical errors-in-variables model that, while applicable also in other contexts, has strongly influenced how researchers make use of short-run income data in the intergenerational mobility literature.
But neither lifecycle effects as such nor the strategies to address them have yet been evaluated using actual lifetime incomes. In this paper we make use of Swedish data that contain nearly complete income histories of both fathers and sons, allowing us to derive a benchmark estimate and thus to directly expose the bias that results from approximation of lifetime by annual incomes. We test if current empirical practice can reduce this bias and examine how to improve elasticity estimates further.
First, we show that intergenerational elasticity estimates can vary substantially with the age at which sons’ incomes are observed, confirming that lifecycle effects are of serious concern. The elasticity is below 0.20 when sons’ incomes are measured at age 30 but above 0.40 at age 50, implying a drastically different degree of mobility. However, changes in fathers’ age have less dramatic consequences in our sample, illustrating that lifecycle bias is sensitive to cohort-specific patterns in the evolution of income inequality over age. Second, the bias is smallest when incomes are observed around midlife. We thus verify a central implication from Haider and Solon’s generalization of the classical errors-in-variables model, which is heavily relied on in the empirical literature. Third, while there is indeed an age at which the bias is zero, we find that the standard methodology fails to predict this “ideal” age. Small age deviations lead to notable shifts in elasticity estimates, suggesting that current empirical strategies may still be subject to substantial bias. Finally, we examine if modifications of the standard methodology can reduce lifecycle bias further. We present an extension of the generalized model that makes use of additional covariates, and find that the explicit consideration of human capital accumulation can strongly reduce the bias at early age.
Our analysis centers on Haider and Solon’s generalization of the textbook errors-in-variables model, which adds an age-dependent slope coefficient to true lifetime incomes but maintains that the remaining error is uncorrelated with true values. Under this assumption, lifecycle bias is eliminated at the age at which the slope coefficient equals one. Unfortunately, our data do not fully support this prediction: at this age, the bias from left-side measurement error alone amounts to about 20 percent of the true elasticity (0.21 versus 0.27). Conceptually, the assumption fails to hold because the shape of income profiles varies with parental background even for a given level of lifetime income. Lifecycle bias thus tends to be larger than the generalized model predicts. However, the model is rarely used for formal bias correction, and instead motivates what has become a widely applied rule of thumb in the literature: to measure income around midlife rather than at early or late age. Our results confirm that this strategy strongly improves intergenerational elasticity estimates, and illustrate how much bias we should expect to remain in applications.
We then examine if modifications of standard practice can reduce this bias further. We present an extension of the generalized errors-in-variables model in which the relationship between annual and lifetime income is allowed to vary across groups. For example, highly educated individuals are found to deviate substantially from the population-average relationship in their early career. Controlling for college education thus considerably improves elasticity estimates that are based on early-age income. We further show that lifecycle bias can be reduced by averaging over multiple observations on the lefthand side (for the offspring), a procedure that reduces the influence of low-income episodes.
Our results thus have positive and negative implications. They confirm that incomes should be measured in midlife, and that deviations from this rule of thumb have detrimental consequences. But they also imply that current methods to compensate for incomplete income data are still imperfect, and that mobility estimates are likely less accurate than commonly assumed. Well-established findings from the literature, such as the finding that the intergenerational elasticity is higher in the United States than in the Nordic countries, are not put into doubt. But attempts to detect more gradual differences, as in recent studies on mobility trends, are easily compromised by lifecycle bias. Alternative measures that abstract from changes in the variance, such as the Pearson correlation, appear more robust. We do find that simple extensions of the generalized errors-in-variables model can reduce the bias further. However, these improvements are partial and rely on data that may not always be available in practice. Lifecycle bias will thus often remain a concern in applications.
Lifecycle bias stems generally from the interaction of two factors: heterogeneity in income profiles cannot be fully accounted for, and unobserved idiosyncratic deviations from average profiles correlate with individual and family characteristics. For example, the offspring from poorer families may have higher initial incomes but flatter slopes if credit constraints affect human capital accumulation and job-search behavior in their early career. Such patterns are also of importance for other literatures that depend on measurement of long-run income and income dynamics. Examples include studies on the returns to schooling and the extensive literature that relates measures of stochastic income shocks to consumption or other outcomes.
The next section describes the methodology employed in the early literature. We examine the generalized errors-in-variables model theoretically in Section III and empirically in Section IV. We explore extensions of that model in Section V, and Section VI concludes.
II. The Intergenerational Mobility Literature
The target regression model in intergenerational mobility research is
$$y^*_{s,i} = \beta\, y^*_{f,i} + \varepsilon_i, \tag{1}$$
where $y^*_{s,i}$ denotes log lifetime income of the son in family i, $y^*_{f,i}$ log lifetime income of his father, $\varepsilon_i$ is an error term that is orthogonal to $y^*_{f,i}$, and variables are expressed as deviations from their generational means.3 The coefficient β captures a statistical relationship that is commonly referred to as the intergenerational income elasticity.4 Closely related is the intergenerational correlation, ρ, which is obtained by adjusting β for differences in the standard deviation of log income across generations.
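The link between the elasticity and the correlation can be made concrete with a short sketch. It assumes the usual textbook relation, ρ = β × (standard deviation of fathers' log income) / (standard deviation of sons' log income); the function name and the toy numbers are hypothetical and serve only as an illustration.

```python
import numpy as np

def elasticity_and_correlation(y_son, y_father):
    """Return (beta, rho) for log lifetime incomes of sons and fathers.

    Hypothetical helper: beta is the OLS slope of sons' on fathers' log
    income, rho rescales beta by the ratio of standard deviations.
    """
    ys = np.asarray(y_son, dtype=float)
    yf = np.asarray(y_father, dtype=float)
    # Express both variables as deviations from their generational means.
    ys = ys - ys.mean()
    yf = yf - yf.mean()
    sd_s = np.sqrt(np.mean(ys ** 2))
    sd_f = np.sqrt(np.mean(yf ** 2))
    beta = np.mean(ys * yf) / sd_f ** 2   # intergenerational elasticity
    rho = beta * sd_f / sd_s              # intergenerational correlation
    return float(beta), float(rho)

# Toy data (invented): sons' log income is an exact linear function of fathers'.
fathers = np.array([9.0, 10.0, 11.0, 12.0])
sons = 5.0 + 0.5 * fathers
beta, rho = elasticity_and_correlation(sons, fathers)
```

In this toy case the elasticity is 0.5 while the correlation is 1, because sons' log incomes are only half as dispersed as fathers'. The two measures coincide only when dispersion is equal across generations.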
A. Approximation of Lifetime Income
As commonly available data sets do not contain complete income histories for two generations, a major challenge is how to approximate lifetime income.5 Let $y_i$ be some observed proxy for unobserved log lifetime income of an individual in family i (for example, a single-year observation, an average of multiple annual income observations, or a more complex estimate based on such annual incomes). Observed values are related to true values by
$$y_{s,i} = y^*_{s,i} + u_{s,i},$$
where $y^*_{s,i}$ is the unobserved log lifetime income of the son in family i and $u_{s,i}$ is measurement error. Similarly, for the father we observe
$$y_{f,i} = y^*_{f,i} + u_{f,i}.$$
The probability limit of the ordinary least squares (OLS) estimator from a linear regression of $y_{s,i}$ on $y_{f,i}$ can be decomposed into
$$\operatorname{plim}\hat\beta = \frac{\mathrm{Cov}(y_{s,i}, y_{f,i})}{\mathrm{Var}(y_{f,i})} = \frac{\beta\,\mathrm{Var}(y^*_{f,i}) + \mathrm{Cov}(y^*_{f,i}, u_{s,i}) + \mathrm{Cov}(y^*_{s,i}, u_{f,i}) + \mathrm{Cov}(u_{s,i}, u_{f,i})}{\mathrm{Var}(y^*_{f,i}) + \mathrm{Var}(u_{f,i}) + 2\,\mathrm{Cov}(y^*_{f,i}, u_{f,i})}, \tag{2}$$
where we used Equation 1 to substitute for $y^*_{s,i}$ and applied the covariance restriction $\mathrm{Cov}(y^*_{f,i}, \varepsilon_i) = 0$. It follows that the estimator can be down- or upward biased and that the covariances between measurement errors and lifetime incomes impact on consistency. The empirical strategies employed in the literature in the last decades can be broadly categorized in terms of changes in identifying assumptions about these covariances.
B. First Two Waves of Studies
The first wave of studies, surveyed in Becker and Tomes (1986), neglected the problem of measurement error in lifetime status. Often just single-year income measures were used as proxies for lifetime income, thereby implicitly assuming that Var(uf) = 0. Classical measurement error violates this assumption so that estimates of the intergenerational elasticity suffered from large attenuation bias (Atkinson 1980). The second wave of studies, surveyed in Solon (1999), recognized that Var(uf) ≠ 0 but maintained the assumption that measurement errors are random noise.6 Under these assumptions, Equation 2 reduces to the classical errors-in-variables model
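$$\operatorname{plim}\hat\beta = \beta\,\frac{\mathrm{Var}(y^*_{f,i})}{\mathrm{Var}(y^*_{f,i}) + \mathrm{Var}(u_{f,i})}, \tag{3}$$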
with inconsistencies limited to standard attenuation bias. Researchers typically used averages of multiple income observations for fathers to increase the signal-to-noise ratio but gave less attention to the measurement of sons’ income.
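The attenuation logic of the classical model, and the gain from averaging fathers' incomes over several years, can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a replication: the true elasticity of 0.4, unit variances, and noise that is independent across years are all invented for illustration, and `simulate_attenuation` is a hypothetical name.

```python
import numpy as np

def simulate_attenuation(n=200_000, beta=0.4, var_noise=1.0, n_years=1, seed=0):
    """OLS slope of sons' true log lifetime income on a noisy proxy for fathers.

    Assumptions (illustrative only): lifetime incomes are standard normal,
    and each annual observation for the father equals his lifetime income
    plus classical noise that is independent across years.
    """
    rng = np.random.default_rng(seed)
    y_f_star = rng.normal(0.0, 1.0, n)                     # fathers' lifetime income
    y_s_star = beta * y_f_star + rng.normal(0.0, 1.0, n)   # Equation 1
    noise = rng.normal(0.0, np.sqrt(var_noise), (n, n_years))
    y_f_obs = y_f_star + noise.mean(axis=1)                # multi-year average proxy
    cov = np.cov(y_s_star, y_f_obs)
    return cov[0, 1] / cov[1, 1]

single_year = simulate_attenuation(n_years=1)  # theory: 0.4 * 1 / (1 + 1) = 0.20
five_years = simulate_attenuation(n_years=5)   # theory: 0.4 * 1 / (1 + 0.2) = 0.33
```

Averaging over more years raises the signal-to-noise ratio and moves the estimate toward the true elasticity, but under these assumptions it never removes the attenuation completely.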
C. Recent Literature
Recently the focus has shifted to nonclassical measurement error. An early theoretical discussion can be found in Jenkins (1987). Analyzing a simple model of income formation, he finds that usage of current incomes in Equation 1 will bias $\hat\beta$ as income growth over the life cycle varies across individuals. He concludes that the direction of this lifecycle bias is ambiguous, that it can be large, and that it will not necessarily be smaller if fathers’ and sons’ incomes are measured at the same age.
Haider and Solon (2006) demonstrate that lifecycle bias can explain the previously noted pattern that intergenerational elasticity estimates increase with the age of sampled sons.7 They show that the association between current and lifetime income varies systematically over the life cycle, contrary to a classical errors-in-variables model with measurement error independent of true values. Böhlmark and Lindquist (2006) find strikingly similar patterns in a replication study with Swedish data. In a meta-analysis of existing intergenerational elasticity estimates, Grawe (2006) concludes that the observed age-dependency can indeed be explained by the existence of lifecycle bias.
Haider and Solon also note that controlling for the central tendency of income growth in the population by including age controls in Equation 1 will not suffice, as variation around the average growth rate will bias estimates. Vogel (2006) provides an illustration based on the insight that highly educated workers experience steeper-than-average income growth. Because available data tend to cover annual incomes of young sons and old fathers, lifetime incomes of highly educated sons (fathers) will be understated (overstated), which is likely to bias $\hat\beta$ substantially downward if educational achievement is correlated within families. Indeed, the probability limit of $\hat\beta$ can be negative in extreme cases, as our data will confirm. Various refined estimation procedures have been proposed to address such lifecycle bias. We proceed to examine the most popular one in detail.
III. Measuring Income at a Certain Age
Haider and Solon (2006, henceforth HS) generalize the classical errors-in-variables model to allow for variation in the association between annual and lifetime income over the life cycle, which they document to be substantial. Their underlying intuition is that, for two individuals with different income trajectories, there will nevertheless exist an age t* where the difference between their log annual incomes equals the difference between their log (annuitized) lifetime incomes. The generalized model coincides with a classical errors-in-variables model at t*, suggesting that lifetime incomes should be approximated by annual incomes around this age.
The model is applicable to any analysis that relies on approximation of lifetime income by short-term measures, but we describe it here in the context of the intergenerational mobility literature. Assume that $y^*_{s,i}$ and $y^*_{f,i}$ are unobserved and proxied by $y_{s,it}$ and $y_{f,it}$, log annual incomes at age t (or age t for offspring and age t′ for parents). Haider and Solon’s generalization of the classical errors-in-variables model is given by the linear projection of $y_{s,it}$ on $y^*_{s,i}$,
$$y_{s,it} = \lambda_{s,t}\, y^*_{s,i} + u_{s,it}, \tag{4}$$
where $\lambda_{s,t}$ is allowed to vary by age and $u_{s,it}$ is orthogonal to $y^*_{s,i}$ by construction. Similarly, the linear projection of $y_{f,it}$ on $y^*_{f,i}$ is given by
$$y_{f,it} = \lambda_{f,t}\, y^*_{f,i} + u_{f,it}. \tag{5}$$
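The parameters $\lambda_{s,t}$ and $\lambda_{f,t}$ may vary over age due to changes in either the correlation or relative dispersion between annual and lifetime income. Under the generalized model, the probability limit of the OLS estimator from a linear regression of $y_{s,it}$ on $y_{f,it}$ becomes
$$\operatorname{plim}\hat\beta_t = \frac{\mathrm{Cov}(y_{s,it}, y_{f,it})}{\mathrm{Var}(y_{f,it})} = \frac{\beta\lambda_{s,t}\lambda_{f,t}\,\mathrm{Var}(y^*_{f,i}) + \lambda_{s,t}\,\mathrm{Cov}(y^*_{s,i}, u_{f,it}) + \lambda_{f,t}\,\mathrm{Cov}(y^*_{f,i}, u_{s,it}) + \mathrm{Cov}(u_{s,it}, u_{f,it})}{\lambda_{f,t}^2\,\mathrm{Var}(y^*_{f,i}) + \mathrm{Var}(u_{f,it})}. \tag{6}$$
As HS do, we first focus on left-side measurement error and assume that $y^*_{f,i}$ is observed (such that $\lambda_{f,t} = 1$ and $u_{f,it} = 0$ for all t). Then the probability limit in Equation 6 becomes
$$\operatorname{plim}\hat\beta_t = \beta\lambda_{s,t} + \frac{\mathrm{Cov}(y^*_{f,i}, u_{s,it})}{\mathrm{Var}(y^*_{f,i})} = \beta\lambda_{s,t} + \mathrm{Corr}(y^*_{f,i}, u_{s,it})\,\frac{\sigma_{u_{s,it}}}{\sigma_{y^*_{f,i}}}. \tag{7}$$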
HS note that under the assumption
$$\mathrm{Corr}(y^*_{f,i}, u_{s,it}) = 0 \tag{8}$$
left-side measurement error is innocuous for consistency if lifetime incomes of sons are proxied by annual incomes at an age t* where λs,t is close to 1. Their empirical analysis reveals that for an American cohort born in the early 1930s, λs,t is below 1 for young ages but close to 1 around midlife.
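The step from Equations 1 and 4 to the left-side-only probability limit can be written out explicitly. The derivation below is a sketch that writes $y^*$ for log lifetime income (a notational assumption) and uses only the orthogonality of the error term in Equation 1 to fathers' lifetime income.

```latex
\begin{align*}
\operatorname{plim}\hat\beta_t
  &= \frac{\mathrm{Cov}(y_{s,it},\, y^*_{f,i})}{\mathrm{Var}(y^*_{f,i})}
   = \frac{\mathrm{Cov}(\lambda_{s,t}\, y^*_{s,i} + u_{s,it},\; y^*_{f,i})}{\mathrm{Var}(y^*_{f,i})} \\
  &= \lambda_{s,t}\,\frac{\mathrm{Cov}(y^*_{s,i},\, y^*_{f,i})}{\mathrm{Var}(y^*_{f,i})}
   + \frac{\mathrm{Cov}(u_{s,it},\, y^*_{f,i})}{\mathrm{Var}(y^*_{f,i})}
   = \beta\,\lambda_{s,t} + \frac{\mathrm{Cov}(u_{s,it},\, y^*_{f,i})}{\mathrm{Var}(y^*_{f,i})}.
\end{align*}
```

The second term vanishes only if the projection residual for sons is uncorrelated with fathers' lifetime income. The projection in Equation 4 guarantees orthogonality to sons' lifetime income, not to fathers', which is why this condition has to be imposed as a separate assumption.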
The model, often referred to as the generalized errors-in-variables (GEiV) model, thus illustrates how lifecycle bias should be expected to vary with age. Apart from providing conceptual insight, this knowledge can be very useful in applications. Researchers often face the problem that long-run outcomes like lifetime income are of theoretical interest but that available data only contain short snapshots of income. The GEiV model offers a potential remedy because it implies that measurement of income at a certain age might suffice if long-run outcomes are not directly observed. Possible applications are, for example, the returns to schooling or, as emphasized by HS, the intergenerational mobility literature.
The model has indeed become the standard reference to motivate empirical strategies in the latter, where the implied procedure to measure income around midlife is now common practice.8 A variation of the model that relies on the same intuition has been presented in Lee and Solon (2009).
But like the classical model, the GEiV model depends on Assumption 8, as also noted by HS. The validity of this assumption has not been examined, and researchers often assume that following the broad recommendation of measuring incomes in midlife is enough to eliminate or nearly eliminate lifecycle bias. Yet there are reasons to suspect that assumptions like Assumption 8 might not hold.
Note first that for more than two workers we will generally not find an age t* where differences in annual income provide an undistorted approximation of differences in lifetime income. Figure 1 plots log income trajectories for workers 1, 2 (as in Figure 1 in HS), and an additional worker 3. At one age, the difference between the annual income trajectories equals the difference in lifetime income for workers 1 and 2, and at a different age it does so for workers 1 and 3. There exists no age where these differences are equal for all three workers at once.9 This example illustrates that λs,t only captures how differences in annual and lifetime income relate on average among all workers. Individuals, and groups of individuals, will nevertheless deviate from this average relationship so that their annual incomes systematically over- or understate their lifetime incomes compared to the population. A typical example is that highly educated individuals tend to experience steeper income growth over the life cycle, such that their annual incomes understate (overstate) lifetime incomes at young (old) ages relative to those with less education.
Figure 1. Illustrative Example of Log Annual Income Trajectories
Notes: For each worker, the upward-sloping line depicts log annual income by age, the horizontal line depicts log annuitized lifetime income.
For intergenerational mobility studies it is crucial that such idiosyncratic deviations might correlate within families or with parental income. For example, sons from poorer families may have higher initial incomes and flatter slopes if credit constraints affect human capital accumulation and job-search behavior in their early career. There are many other reasons to suspect dependency within families: Parents can transmit abilities and preferences, or influence their offspring’s educational and occupational choices, all of which may affect the shape of income profiles over the life cycle.10 The individual association between annual and lifetime income is thus likely to exhibit an intergenerational correlation itself and cannot be fully captured by a single population parameter like λs,t. Assumption 8 is then unlikely to hold, the probability limit of $\hat\beta_t$ does not equal βλs,t, and knowledge of the exact lifecycle pattern of λs,t cannot eliminate lifecycle bias.11 The basic implications of the GEiV model are not impaired by these arguments. It may still represent a large improvement over the classical model, which we will examine empirically. However, our arguments imply that lifecycle bias remains hard to address and that the search for an ideal age at which to measure income might not be an entirely satisfying path to follow.
There are various ways to probe our theoretical arguments. One can examine the validity of Assumption 8 formally by deriving the elements of $u_{s,it}$ for a given income formation model and analyzing its relation to the regressor $y^*_{f,i}$. Although it can be shown that $u_{s,it}$ is correlated with $y^*_{f,i}$ even for a simple log-linear income formation model (see Nybom and Stuhler 2011), such exercises will not be informative on the magnitude of lifecycle bias that should be expected in practice. In the next section, we therefore provide empirical evidence.
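The argument can be illustrated with a stylized simulation. The sketch below assumes a hypothetical log-linear income process in which both the intercept and the growth rate of sons' income profiles depend on fathers' lifetime income. All parameter values are invented, and lifetime income is approximated by the log of average annual income without discounting. It is not the model analyzed in Nybom and Stuhler (2011).

```python
import numpy as np

def simulate_geiv(n=50_000, seed=0):
    """Simulate sons' income profiles and compute GEiV quantities by age."""
    rng = np.random.default_rng(seed)
    ages = np.arange(25, 56)
    s = ages - ages[0]                                   # years since age 25
    y_f = rng.normal(0.0, 0.5, n)                        # fathers' log lifetime income
    intercept = 0.2 * y_f + rng.normal(0.0, 0.3, n)      # initial log income
    slope = 0.02 + 0.01 * y_f + rng.normal(0.0, 0.01, n) # annual growth rate
    y_annual = intercept[:, None] + slope[:, None] * s[None, :]
    y_life = np.log(np.exp(y_annual).mean(axis=1))       # log average annual income

    # Deviations from means, as in Equation 1.
    y_f = y_f - y_f.mean()
    y_life = y_life - y_life.mean()
    y_annual = y_annual - y_annual.mean(axis=0)

    beta = (y_life @ y_f) / (y_f @ y_f)                  # benchmark elasticity
    lam = (y_annual.T @ y_life) / (y_life @ y_life)      # lambda_{s,t}, Equation 4
    beta_t = (y_annual.T @ y_f) / (y_f @ y_f)            # annual-income estimates
    resid = y_annual - np.outer(y_life, lam)             # u_{s,it}
    resid_term = (resid.T @ y_f) / (y_f @ y_f)           # Cov(u, y_f) / Var(y_f)
    return ages, beta, lam, beta_t, resid_term

ages, beta, lam, beta_t, resid_term = simulate_geiv()
remaining_bias = beta_t / lam - beta                     # bias after GEiV adjustment
```

By construction, `beta_t` equals `lam * beta + resid_term` at every age, so any bias that remains after dividing by `lam` comes entirely from the correlation between the projection residual and fathers' income. In this parameterization the adjusted estimate at the youngest age stays clearly below the benchmark, because growth rates depend on family background more strongly than the single slope coefficient can capture.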
IV. Empirical Evidence on Lifecycle Bias
We use Swedish panel data containing nearly life-long income histories to provide direct evidence on lifecycle bias in estimates of the intergenerational elasticity that are based on annual incomes. We then apply the GEiV model and examine the size of the remaining bias. We evaluate both the rule of thumb to measure incomes at the predicted “ideal” age t* and the model’s ability to correct estimates at other ages.
A. Data Sources and Sample Selection
To the best of our knowledge, Swedish tax registry data offer the longest panel of income data, covering annual incomes across 48 years for a large and representative share of the population. Moreover, a multigenerational register matches children to parents, and census data provide information on schooling and other individual characteristics. All merged together, the data provide a unique possibility to examine lifecycle bias in intergenerational mobility estimation using actual income histories.
To select our sample, we apply a number of necessary restrictions. We follow the majority of the literature and limit our sample to sons and their biological fathers. To these we merge income data for the years 1960–2007. Since other income measures are available only from 1968, we use total (pretax) income, which is the sum of an individual’s labor earnings (and labor-related benefits), early-age pensions, and net income from business and capital realizations.
Our main sample is based on sons born 1955–57. Earlier cohorts could be used but then we would observe fewer early-career incomes for fathers. Conversely, later cohorts are not included as we want to follow the sons for as long as possible. Moreover, to avoid large differences in the birth year of fathers, we exclude pairs where the father was older than 28 years at the son’s birth. On other sampling issues we adopt the restrictions applied by HS and Böhlmark and Lindquist (2006).12
Our data come with a couple of drawbacks. To maximize the length of income histories we use total income, whereas HS use labor earnings. However, total income is a highly relevant measure of economic status and Böhlmark and Lindquist find that total income and earnings yield similar estimates of λs,t over the life cycle. Further, the use of tax-based data could raise concerns about missing data in the low end of the distribution. The Swedish system, however, provides strong incentives to declare some taxable income since doing so is a requirement for eligibility to most social insurance programs. Hence, this concern most likely only applies to a very small share of the population.
Our data also have many advantages. First, they are almost entirely free from attrition. Second, they pertain to all jobs. Third, in contrast to many other studies, our data are not right-censored. Fourth, we use registry data, which is believed to suffer less from reporting errors than survey data. Fifth, and most important, we have nearly career-long series of income for both sons and their fathers. Overall, we believe that the data are the best available for the purpose of this study.
Our main sample consists of 3,504 father-son pairs, with sons’ income measured from age 22–50 and fathers’ income measured from age 33–65, irrespective of birth years.13 Table 1 reports descriptive statistics. Rows 2 and 3 show that dispersions in lifetime income are of similar magnitudes for fathers and sons. Rows 4 and 5 show that on average there are more than 28 positive income observations for sons, and more than 30 for fathers, with relatively low dispersion in both cases.
Table 1. Summary Statistics by Birth Year of Sons
B. Empirical Strategy
To assess the size of lifecycle bias we compare estimates based on annual incomes with a benchmark estimate that is based on lifetime incomes. As in the theoretical discussion we focus on left-side measurement error (that is, in sons’ incomes), although we provide brief evidence on lifecycle bias due to right-side measurement error (that is, in fathers’ incomes) and measurement error on both sides in a later subsection. We do this for two reasons. First, left-side measurement error has until recently been neglected in the literature. Second, lifecycle bias is not confounded by attenuation bias from classical measurement error on the lefthand side, which simplifies the analysis.
We use our measures of log lifetime incomes $y^*_{s,i}$ and $y^*_{f,i}$ to estimate Equation 1 by OLS, which yields our benchmark estimate $\hat\beta$.14 We then approximate log lifetime income of sons $y^*_{s,i}$ by log annual income $y_{s,it}$ (left-side measurement error) to estimate
$$y_{s,it} = \beta_t\, y^*_{f,i} + \varepsilon_{it}$$
separately for each age t, to obtain a set of estimates $\hat\beta_t$. Finally, we estimate Equation 4, which provides us with estimates of $\lambda_{s,t}$. None of these estimations include additional controls.
Under the assumptions of the GEiV model, the probability limit of $\hat\beta_t$ equals $\lambda_{s,t}\beta$, and using annual income of sons at age t* where $\lambda_{s,t} = 1$ consistently estimates β. As discussed in the previous section, we suspect $\hat\beta_t$ to be biased even after adjustment by $\hat\lambda_{s,t}$. The remaining lifecycle bias after adjustment by the GEiV model, given by $\hat\beta_t/\hat\lambda_{s,t} - \hat\beta$, is thus of central interest.15 Note that we assume that $\lambda_{s,t}$ is known in order to evaluate the model’s theoretical capability to adjust for lifecycle bias under favorable conditions. A second (known) source of inconsistency can arise in that the age profile of $\lambda_{s,t}$ will typically not be directly estimable by the researcher.
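The estimation steps described above can be sketched in a few lines. This is an illustrative outline, not the code used for the paper: the function and argument names are hypothetical, annual incomes are assumed to be stored as an array with one column per age and NaN for missing or zero incomes, and the benchmark is re-estimated on the observations available at each age so that the comparison is balanced within age.

```python
import numpy as np

def ols_slope(y, x):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    y = y - y.mean()
    x = x - x.mean()
    return float(x @ y / (x @ x))

def lifecycle_bias_by_age(y_son_annual, y_son_life, y_father_life):
    """Benchmark, annual, and GEiV-adjusted elasticities for each age column."""
    results = []
    for t in range(y_son_annual.shape[1]):
        y_t = y_son_annual[:, t]
        keep = ~np.isnan(y_t)                                  # balanced within age
        beta = ols_slope(y_son_life[keep], y_father_life[keep])   # Equation 1
        beta_t = ols_slope(y_t[keep], y_father_life[keep])        # annual income on the left
        lam_t = ols_slope(y_t[keep], y_son_life[keep])            # Equation 4
        results.append({
            "beta": beta,
            "beta_t": beta_t,
            "lambda_t": lam_t,
            "adjusted": beta_t / lam_t,
            "remaining_bias": beta_t / lam_t - beta,
        })
    return results

# Toy data (invented): annual income is an exact multiple of lifetime income,
# so the GEiV adjustment recovers the benchmark and the remaining bias is zero.
father = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
son_life = np.array([0.1, 0.3, 1.2, 0.9, 2.0])
son_annual = np.column_stack([0.5 * son_life, 1.5 * son_life])
son_annual[2, 1] = np.nan
table = lifecycle_bias_by_age(son_annual, son_life, father)
```

In real data the remaining bias is nonzero whenever the residual from Equation 4 correlates with fathers' lifetime income, which is the quantity the empirical analysis sets out to measure.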
C. Empirical Results
1. Measurement error on the lefthand side
We first present estimates of $\lambda_{s,t}$. Figure 2 shows that $\hat\lambda_{s,t}$ rises over age and crosses 1 at around age t* = 33. Largely consistent with others, we find that income differences at young (old) age substantially understate (overstate) differences in lifetime income. We note that $\hat\lambda_{s,t}$ is close to 1 only for a short time around age 33, in contrast to the pattern found for older American and Swedish cohorts in HS and Böhlmark and Lindquist (2006), for which it remains close to 1 for an extended period through midlife. A general concern is, thus, that measuring annual income only a few years earlier or later can cause large differences in elasticity estimates. Figure 2 also plots the corresponding estimates of $\mathrm{Corr}(y_{s,it}, y^*_{s,i})$ and $\mathrm{Var}(y_{s,it})$, indicating that the monotone rise of $\hat\lambda_{s,t}$ stems from growth of the former at early and an increase of the latter at older age.
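The two components can be related to the slope coefficient through the standard decomposition of a regression coefficient into a correlation and a ratio of standard deviations. The expression below is a sketch of that identity, writing $y^*$ for log lifetime income (a notational assumption).

```latex
\[
\lambda_{s,t}
  = \frac{\mathrm{Cov}(y_{s,it},\, y^*_{s,i})}{\mathrm{Var}(y^*_{s,i})}
  = \mathrm{Corr}(y_{s,it},\, y^*_{s,i}) \,
    \frac{\sqrt{\mathrm{Var}(y_{s,it})}}{\sqrt{\mathrm{Var}(y^*_{s,i})}} .
\]
```

Since the variance of lifetime income does not depend on the age of measurement, the slope coefficient can rise over age only through a higher correlation between annual and lifetime income or through a higher variance of annual income.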
Figure 2. OLS Estimates of λs,t
Notes: The figure shows estimates of λs,t by sons’ age for cohorts 1955–57. λs,t is the regression coefficient in a regression of son’s log annual income on son’s log lifetime income, see Equation 4. The grey lines are estimates of the variance of log annual income of sons and the correlation between log annual and lifetime income of sons over age.
Our central estimates are presented in Figure 3, which plots $\hat\beta$ (the benchmark elasticity), $\hat\beta_t$ (estimates based on annual income of sons at age t), and $\hat\beta_t/\hat\lambda_{s,t}$ (estimates at age t adjusted by the GEiV model). For comparison, the figure also includes annual estimates of the intergenerational correlation, $\hat\rho_t$. Table 2 provides additional statistics in the most central age range around t*. Note that the sample is balanced within (but not across) each age. Zero or missing income observations that are not considered for estimation of λs,t and βt are not used to estimate β, which is reestimated for each age. The benchmark elasticity thus varies slightly over age. We list our key findings.
Figure 3. OLS Estimates of Elasticities with Left-Side Measurement Error
Notes: The figure shows the benchmark estimate of the intergenerational elasticity together with the unadjusted and adjusted (by the GEiV model) estimates based on sons’ annual income. The grey line shows corresponding estimates of the intergenerational correlation. The estimates are for cohort 1955–57, left-side measurement error only.
Table 2. OLS Estimates of Elasticities and Lifecycle Bias
First, our benchmark estimate of the intergenerational elasticity of lifetime income for our Swedish cohort is about 0.27 (see also Table 2). This is marginally higher than what most previous studies have found for Sweden, and should be closer to the population parameter due to our nearly complete income profiles.
Second, we confirm that the variation of $\hat\beta_t$ over age resembles the pattern of $\hat\lambda_{s,t}$, as predicted by the GEiV model. We therefore find that $\hat\beta_t$ increases with age and that the lifecycle bias is negative for young and positive for old ages of sons. One of the central predictions of the GEiV model, that current income around midlife is a better proxy for lifetime income than income in young or old ages, is thus confirmed.
Third, the magnitude of lifecycle bias stemming from left-side measurement error alone can be striking. For example, the elasticity is below 0.20 when sons’ incomes are measured at age 30 but above 0.40 at age 50, thus resulting in drastically different characterizations of the degree of mobility. Analysis based on income below age 26 yields a negative elasticity. We therefore find direct evidence on the importance of lifecycle bias in intergenerational mobility estimates, as has been hypothesized in the recent literature.
Fourth, the lifecycle bias is larger than implied by the GEiV model. While the adjustment of estimates according to this model leads on average to sizable improvements, it cannot fully eliminate the bias. This finding holds true even under the assumption that the central parameters λs,t are directly estimable.
Fifth, the lifecycle bias is not minimized at age t*, the age at which the current empirical literature aims to measure income, but at an age t > t*. We report a similar pattern for other cohorts in Section IVD.
Sixth, the remaining lifecycle bias around age t* is substantial and significantly different from zero. Table 2 shows that the remaining bias, $\hat\beta_t/\hat\lambda_{s,t} - \hat\beta$, is on average around −0.05 over ages 31–35, which corresponds to about 20 percent of our benchmark. Knowledge of age t* will thus not eliminate lifecycle bias.
Seventh, the pattern of the intergenerational correlation, $\hat\rho_t$, resembles that of the elasticity up until the mid 30s, but remains quite stable beyond that age. In contrast to the elasticity, the correlation is not directly affected by the rising variance of $y_{s,it}$ from midlife that is documented in Figure 2. Benchmark estimates of the correlation (not shown) are very similar but annual estimates are much below the corresponding estimates of the elasticity, as classical errors on the left side attenuate the former but not the latter (see Equation 3).
Our arguments apply likewise to the extension of the GEiV model presented in Lee and Solon (2009), which has been applied in much of the recent research on mobility trends (see Nybom and Stuhler 2011).
We briefly compare these empirical results with our theoretical discussion of the determinants of the remaining lifecycle bias. Table 3 shows its components according to Equation 7. Variation of the bias over age stems mostly from variation in the residual correlation $\mathrm{Corr}(y^*_{f,i}, u_{s,it})$, while the ratio of standard deviations, $\sigma_{u_{s,it}}/\sigma_{y^*_{f,i}}$, is close to one over most of the life cycle.16 Seemingly small residual correlations can thus translate into substantive biases. For example, a residual correlation of 0.03 translates into a lifecycle bias of more than 10 percent of the benchmark elasticity.
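The arithmetic behind this example can be spelled out. The calculation below is a sketch that treats the ratio of standard deviations as exactly one (an approximation suggested by the text) and uses the benchmark elasticity of about 0.27 reported above.

```latex
\[
\underbrace{\mathrm{Corr}(y^*_{f,i},\, u_{s,it})}_{0.03}
\times
\underbrace{\frac{\sigma_{u_{s,it}}}{\sigma_{y^*_{f,i}}}}_{\approx 1}
\approx 0.03,
\qquad
\frac{0.03}{0.27} \approx 0.11 .
\]
```

A residual correlation that would look negligible in most contexts thus shifts the elasticity by roughly 11 percent of its benchmark value, because the elasticity itself is small.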
Table 3. Decomposition of Lifecycle Bias
In the previous section we provided intuition for why the residuals from Equation 4 may correlate with parental income. For further evidence we examine if the residuals correlate also with various other characteristics, specifically: (i) father’s age at birth of his son, (ii) father’s education, (iii) son’s education, (iv) son’s scores on a test of cognitive ability, and (v) son’s country of birth. Table 4 describes how each variable is measured and presents the bivariate correlations. Most estimates are significantly different from zero, especially at early age. The residuals correlate particularly strongly with education, implying that the GEiV model cannot capture some of the heterogeneity in income profiles that arises from human capital investment. But the residuals correlate also with other variables, such as cognitive ability and immigrant status. The GEiV model should thus not be expected to eliminate lifecycle bias in other literatures, in which interest lies in different explanatory variables. It captures changes in the average association between annual and lifetime income in the population over age, but applications are typically based on comparisons of specific subgroups of the population. The model then cannot fully eliminate lifecycle bias, as the association between annual and lifetime income varies not only over age but also over groups defined by parental income, years of schooling, gender, or other characteristics.17
Table 4. Correlations Between Residuals and Characteristics
These results provide guidance for applied research, but some remarks about generalizability are warranted. Lifecycle bias will differ quantitatively across populations. The bias is determined by the degree of systematic differences in income profiles between sons from poor and sons from rich families. This mechanism is likely to vary across cohorts and countries. The question is if observed qualitative patterns over age can nevertheless be generalized. Figure 3 demonstrates that income at old age provides a more reliable base for the GEiV model than income at young age. Thus, the relationship between current and lifetime income differs with respect to family background particularly at the beginning of the life cycle. This result is intuitive if one considers potential causal mechanisms of intergenerational transmission. Sons from rich families might acquire more education or face different conditions that particularly affect initial job search (for example, regarding credit-constraints, family networks, or ex ante information on labor market characteristics). Such mechanisms are likely to apply to most populations. Although the size of the lifecycle bias is bound to differ across populations, its pattern over age is thus likely to hold more generally. This conclusion is supported by results for other Swedish cohorts as well as direct evidence on the role of human capital investments, both of which will be discussed later on.
2. Measurement error on the righthand side or both sides
For conceptual reasons, we focused on left-side measurement error, but evidence on the combined effects of lifecycle bias from both sides is also relevant for practitioners. For example, Grawe (2006) finds in his survey of the literature a negative relationship between estimates of βt and the age at which fathers are observed. One may ask whether we find similar lifecycle effects from the righthand side, and whether these tend to cancel out (or aggravate) the effects from left-side measurement error. We now base estimates of βt on lifetime income of sons and annual income for fathers (right-side measurement error) or annual incomes for both fathers and sons (measurement error on both sides). The probability limit of the OLS estimator is then affected by both attenuation and lifecycle bias. We adjust for both according to the GEiV model. Results are shown in Figures 4 and 5.18
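The adjustment for right-side and both-side measurement error can be illustrated with a small simulation. This is a minimal sketch, not the estimation code behind Figures 4 and 5: all parameter values and variable names are hypothetical, the residuals are drawn independently of lifetime incomes so that the GEiV assumptions hold by construction, and the reliability ratio θf,t is assumed to be estimated as the slope of a reverse regression of fathers’ lifetime on annual income.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 0.27               # hypothetical true elasticity
lam_f, lam_s = 0.8, 1.1   # hypothetical values of lambda_{f,t} and lambda_{s,t}

def slope(y, x):
    """OLS slope from a bivariate regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Lifetime log incomes (demeaned), linked by the elasticity beta.
y_f_star = rng.normal(0.0, 0.5, n)
y_s_star = beta * y_f_star + rng.normal(0.0, 0.45, n)

# Annual log incomes. Residuals are independent of lifetime incomes,
# i.e. the GEiV assumptions are imposed (an assumption of this sketch).
y_f_t = lam_f * y_f_star + rng.normal(0.0, 0.4, n)
y_s_t = lam_s * y_s_star + rng.normal(0.0, 0.4, n)

benchmark = slope(y_s_star, y_f_star)   # uses lifetime income on both sides

theta_f = slope(y_f_star, y_f_t)        # reliability ratio (reverse regression)
lam_s_hat = slope(y_s_t, y_s_star)      # slope of annual on lifetime income

beta_rhs = slope(y_s_star, y_f_t)       # right-side measurement error only
beta_bhs = slope(y_s_t, y_f_t)          # measurement error on both sides

adj_rhs = beta_rhs / theta_f
adj_bhs = beta_bhs / (lam_s_hat * theta_f)

print(f"benchmark {benchmark:.3f}, unadjusted RHS {beta_rhs:.3f}")
print(f"adjusted RHS {adj_rhs:.3f}, adjusted both sides {adj_bhs:.3f}")
```

In this artificial setting the adjusted estimates recover the benchmark up to sampling error. Any remaining bias in real data therefore reflects violations of the independence assumption imposed here.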
Figure 4. OLS Estimates of Elasticities with Right-Side Measurement Error
Notes: Cohort 1955–57, right-side measurement error only.
Figure 5. OLS Estimates of Elasticities with Both-Side Measurement Error
Notes: Cohort 1955–57, measurement error on both sides. For simpler presentation we only display results for annual incomes at the same distance from t* for sons and fathers. At s = 0 both are measured at their respective t*, at s = 5 both are measured five years after t*, and so on.
Figure 4 demonstrates the large attenuating effects from right-side measurement error. In contrast with findings by Grawe (2006), our estimates of βt (and λf,t) are surprisingly stable over age. The main explanation is that we do not find any substantial increase in the variance of log annual income over the age of sampled fathers, as central parts of their working lives coincided with a gradual decrease in overall income inequality in Sweden.19 This finding contrasts with the observed profile for sons (see Figure 2), suggesting that the pattern of lifecycle bias can substantially vary between populations. However, after adjustment by the GEiV model the remaining lifecycle bias follows a (qualitatively) similar pattern over age as for the case of left-side measurement error.
Figure 5 shows estimates for the case of measurement error on both sides, with fathers’ and sons’ incomes measured at similar ages. The remaining bias is overall larger than for left-side measurement error alone, thus indicating aggravating effects of measurement error on both sides.20 This result also holds when fathers’ and sons’ incomes are measured at their respective t*. As for left-side measurement error, we again find that the GEiV model is more successful in reducing the bias at later ages. Moreover, the estimates suffer from strong year-to-year variability. Reducing this variability is an additional motive for averaging over multiple income observations on both sides.
Figure 6 summarizes the validity of the GEiV model’s key assumption of uncorrelated idiosyncratic income deviations for left-side, right-side, and measurement error on both sides. The correlations between residuals and true income tend to be negative at earlier ages and midlife but are around zero or slightly positive thereafter, illustrating why the GEiV model performs better at later age. The correlation between the residuals us,it and uf,it tends to be closer to zero.
Figure 6. Residual Correlations for LHS, RHS and BHS Measurement Error
Notes: The figure shows estimates over age of the correlation between sons’ residuals from the GEiV model and fathers’ lifetime income (LHS ME), between fathers’ residuals and sons’ lifetime income (RHS ME), and between fathers’ and sons’ residuals (BHS ME). The estimates are for cohort 1955–57. For readability, we present three-year moving averages of age-specific estimates.
D. Robustness Tests
We perform various tests of the sensitivity of our main results. For brevity, we focus on left-side measurement error.
1. Treatment of outliers in the income data
Intergenerational elasticity estimates can be sensitive to how one treats extreme and missing incomes (Couch and Lillard 1998, Dahl and DeLeire 2008). We test the robustness of our results by (i) balancing the sample such that only sons with positive income at all ages 31–35 are included, (ii) bottom-coding very low incomes, and (iii) top-coding very high incomes. We compare the bias at ages 31–35 for these samples (summarized in Table 5) with our main results (Table 2). Estimates of the remaining lifecycle bias are on average a third lower in the balanced sample but still correspond to more than 10 percent of the benchmark elasticity. Bottom-coding increases the bias slightly, perhaps because observations with zero income are then included, while top-coding has very little effect on the results. Low incomes thus seem influential for the size of the bias, but it is not obvious what the right sampling choice would be.
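The three sample treatments can be sketched as follows. The thresholds, the simulated incomes, and all variable names are hypothetical and chosen only for illustration; they are not the cutoffs used for Table 5.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical annual incomes in SEK for 1,000 sons at ages 31-35.
income = rng.lognormal(mean=12.0, sigma=0.8, size=(1_000, 5))
income[:20, 0] = 0.0   # a few sons with a zero-income year

floor, cap = 10_000.0, 2_000_000.0   # hypothetical coding thresholds

# (i) Balanced sample: keep only sons with positive income in every year.
balanced = income[(income > 0).all(axis=1)]

# (ii) Bottom-coding: raise very low (including zero) incomes to the floor,
# so these observations stay in the sample once logs are taken.
bottom_coded = np.maximum(income, floor)

# (iii) Top-coding: cap very high incomes.
top_coded = np.minimum(income, cap)

# Without bottom-coding, zero incomes have no finite log and drop out.
with np.errstate(divide="ignore"):
    log_raw = np.log(income)
n_lost = int(np.isinf(log_raw).sum())

print(balanced.shape[0], n_lost, np.isfinite(np.log(bottom_coded)).all())
```

The sketch makes the mechanism noted above concrete: bottom-coding retains the zero-income observations that the log transformation would otherwise exclude.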
Table 5. Summary of Robustness Tests
2. Length of observed income profiles
It might be a concern that our measures of lifetime income are based on almost, but not fully, complete income histories. In our working paper (Nybom and Stuhler 2011), we investigate in detail whether our findings are sensitive to the observed spans of income data. In particular, we use neighboring offspring cohorts to study the influence of late-age (early-age) income data of sons (fathers). Changes in the fathers’ age span have little effect on the lifecycle bias. Changes in the sons’ age span matter slightly more, although the pattern over age remains stable. This difference is not unexpected since such changes are likely to alter both the benchmark elasticity and λs,t slightly. The exact profile of lifecycle bias therefore depends on the observed age span, but the major facts are stable: The remaining bias after adjustment by the estimated λs,t can be large and tends to be negative for young ages and around t*.
3. Cohort and population differences
We repeat our analysis for two other cohort groups (sons born 1952–54 and 1958–60) to examine potential variation across populations, which may, for example, stem from differential trends in income inequality. For comparability, we limit income profiles to the longest span observed in all three samples (ages 22–47 for sons and 36–65 for fathers). Table 6 presents the most central results. The 1958–60 cohort has a benchmark elasticity that is similar to our main cohort but a slightly larger remaining bias around age t*. For the 1952–54 cohort, both the benchmark elasticity and the remaining bias are substantially lower. Figure 7 plots estimates of βt over the full age range. Although the overall patterns are relatively similar, the differences between elasticity estimates at each age are volatile. These differences, which are substantial even for a fixed sampling procedure across Swedish cohorts, indicate that the bias in elasticity estimates can differ across populations even if incomes are measured at the same age.
Table 6. Summary of Cohort Differences, Averages over Ages 31–35
Figure 7. OLS Estimates of Elasticities for Various Cohorts
Notes: Left-side measurement error only.
V. Extensions
We proceed to examine whether alterations of the generalized errors-in-variables model and standard estimation procedures can reduce lifecycle bias further. We first offer an extension of the generalized model that makes use of additional covariates. Whether averaging over multiple annual income observations also on the lefthand side can reduce bias is addressed in the following subsection.
A. Extending the Generalized Errors-in-Variables Model
The model presented by Haider and Solon (2006) captures how differences in annual and lifetime incomes relate on average in the population of interest. We showed that knowledge of the average relationship may not be sufficient to eliminate lifecycle bias in applications, as idiosyncratic deviations from this average relate systematically to parental background and other variables. A strategy to reduce the bias further is to condition this relationship on additional covariates that may capture such heterogeneity.
We can extend the generalized model by allowing the intercept to vary with covariates $x_s$ and $x_f$, such that Equations 4 and 5 become (we omit the i subscript)
$$y_{s,t} = \lambda_{s,t}\, y_s^* + \mu_{s,t}' x_s + u_{s,t} \tag{9}$$
and
$$y_{f,t} = \lambda_{f,t}\, y_f^* + \mu_{f,t}' x_f + u_{f,t}. \tag{10}$$
The coefficients $\mu_{s,t}$ and $\mu_{f,t}$ capture whether, beyond the population-average relationship, annual incomes systematically under- or overstate lifetime incomes for certain groups. Define residual incomes of sons by
$$\tilde{y}_{s,t} = y_{s,t} - \mu_{s,t}' x_s \tag{11}$$
and the residuals for fathers correspondingly. The probability limit of the OLS estimator from a linear regression of $\tilde{y}_{s,t}$ on $\tilde{y}_{f,t}$ is then again given by Equation 6, and subsequent arguments follow as in the standard GEiV model.
We examine if such an extension of the generalized model performs better, both in predicting the ideal age to measure incomes at and in minimizing the bias when using income observations beyond that age. In the previous section we found the bias to be particularly large at young age, and noted that differences in human-capital investment are a likely explanation. We thus include in each of Equations 9 and 10 a single covariate that equals one if the respective person has attended university or college and zero otherwise.
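The logic of the extended adjustment with a college dummy can be sketched in a simulation for the case of left-side measurement error. The data-generating process, parameter values, and variable names below are hypothetical: college attendance is made to depend on fathers’ lifetime income and to shift sons’ early-career annual income downward, which is one stylized way to generate the heterogeneity discussed above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

def slope(y, x):
    """OLS slope from a bivariate regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Fathers' lifetime log income and a college dummy for sons that is
# more likely when the father is rich (hypothetical process).
y_f_star = rng.normal(0.0, 1.0, n)
college = (0.8 * y_f_star + rng.normal(0.0, 1.0, n) > 0).astype(float)

# Sons' lifetime log income (demeaned).
y_s_star = 0.2 * y_f_star + 0.3 * college + rng.normal(0.0, 0.8, n)
y_s_star -= y_s_star.mean()

# Early-career annual income: college graduates fall short of the
# average relationship (negative mu), as in the left part of a profile.
lam, mu = 0.8, -0.4
y_s_t = lam * y_s_star + mu * college + rng.normal(0.0, 0.5, n)

benchmark = slope(y_s_star, y_f_star)

# Standard GEiV: adjust the annual estimate by lambda from a bivariate fit.
lam_std = slope(y_s_t, y_s_star)
adj_std = slope(y_s_t, y_f_star) / lam_std

# Extended model: estimate lambda and mu jointly, remove the college
# component from annual income, then adjust by lambda.
X = np.column_stack([np.ones(n), y_s_star, college])
coef, *_ = np.linalg.lstsq(X, y_s_t, rcond=None)
lam_ext, mu_ext = coef[1], coef[2]
y_tilde = y_s_t - mu_ext * college
adj_ext = slope(y_tilde, y_f_star) / lam_ext

print(f"benchmark {benchmark:.3f}, standard {adj_std:.3f}, extended {adj_ext:.3f}")
```

Because education is the only source of heterogeneity in this sketch, the extended adjustment removes the bias entirely. In actual data other unobserved sources remain, so the improvement should be expected to be partial.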
Figure 8 shows the estimated lifecycle profiles of λs,t and μs,t, the coefficient on the college dummy for sons. The pattern of λs,t is quite similar for the standard GEiV (solid line) and our extended model (dashed line), as is also the case for fathers (not shown). The pattern of μs,t illustrates that highly educated individuals deviate strongly from the average relationship that is captured in λs,t: Their annual incomes tend to understate lifetime incomes strongly at early ages and overstate them at old ages, even after the average relationship between annual and lifetime income is taken into account.
Figure 8. OLS Estimates of Coefficients in Standard and Extended GEiV Model
Notes: Panel A shows estimates of λs,t by age of sons for the standard GEiV model (Haider and Solon 2006) and an extended variant with additional covariates (see main text). Panel B shows estimates of μs,t for the extended specification. Cohort 1955–57, left-side measurement error only. For readability, we present three-year moving averages of age-specific estimates in both panels.
Figure 9 compares the remaining lifecycle bias in annual elasticity estimates after adjustments based on the two versions of the GEiV model, separately for the case of left-side and both-side measurement error.21 The extended model strongly reduces lifecycle bias in early age; estimates are in the vicinity of the true elasticity from the mid-20s (for sons), in contrast to the standard model. As early-age income is a poor signal of lifetime income particularly for college graduates, the approximation of lifetime by annual income differences is strongly distorted when education is not taken into account. However, controlling for education reduces the bias only mildly around t* and the intergenerational elasticity is still underestimated at mid- and old ages.
Figure 9. Remaining Lifecycle Bias in Standard and Extended GEiV Model
Notes: The figure shows the difference between the benchmark estimate of the intergenerational elasticity and adjusted estimates for the standard GEiV model and an extended variant (see main text), based on annual income for sons (Panel A) or sons and fathers (Panel B). Cohort 1955–57. For readability, we present three-year moving averages of age-specific estimates in both panels.
How can practitioners make use of these findings? Application of the above procedure is straightforward if the data allow for direct estimation of Equation 9.22 If the relationship between annual and lifetime incomes cannot be directly estimated, then external evidence on the pattern of λs,t and μs,t may provide a useful approximation. As noted by HS, importing external estimates can be problematic if the central relationships differ across populations. But while income-education associations vary considerably across countries, we may expect similarities in their broad patterns (for example, education tends to decrease early-career but increase late-career income).
Simple extensions of HS’s model can thus substantially decrease lifecycle bias: The average absolute deviation in adjusted annual estimates across ages 25–45 corresponds to 23.6 percent of the true elasticity for the standard model but falls to 13.8 percent with education controls.23 Note, however, that we cannot reduce the bias much further by using more detailed information: Variables that distinguish finer educational classes, years of education, or cognitive ability provide results that are largely similar to our simple college dummy, and the combination of multiple of these controls does not perform much better either.24 This observation implies that the income trajectories of sons with high- and low-income fathers remain different even when many individual characteristics are controlled for. In addition, only few practitioners can do such detailed bias corrections in their data, and our results here can then only provide partial improvements. Lifecycle bias will thus often remain a concern in applications.
B. Multiyear Averages of Current Income
The importance of dealing with transitory noise in short-run income measures on the righthand side, for example by using multiyear averages, is well recognized in the literature (see Mazumder 2005). But some recent studies that refer to the GEiV model (see Footnote 8) also average over multiple income observations on the lefthand side (for sons). Yet the theoretical motivation for doing so is not clear. One rationale could be that researchers do not know the exact age at which λs,t equals one. Our finding that lifecycle bias can be substantial even at this age raises the question of whether and how averaging can help to reduce the bias.
We therefore estimate βt using the logs of three-, five-, and seven-year averages of sons’ income. These averages are also used to estimate λs,t and the remaining lifecycle bias after adjustment by the estimated λs,t. Figure 10 presents its size for averages that are centered around different ages. The remaining lifecycle bias falls in the number of income observations but is not eliminated. With seven-year averages the true elasticity is underestimated by about 0.03 at ages 31–35, compared to about 0.05 using one-year measures. The standard deviation of the residuals us,t, which is a central component of the bias, decreases by about one-third as we move from one- to seven-year measures, and diminishes the estimated bias proportionally. The residual correlation falls only slightly and estimates of λs,t are marginally lowered up until about age 40. Because of the log transformation, averaging reduces the influence of episodes with very low incomes, which as we noted in the previous section can contribute strongly to lifecycle bias in the GEiV framework.
Figure 10. Remaining Lifecycle Bias in GEiV Model with Multiyear Averages on LHS
Notes: The figure shows the difference between the benchmark estimate of the intergenerational elasticity and GEiV-adjusted estimates based on the log of three-year, five-year, or seven-year averages of sons’ annual income (see main text). Cohort 1955–57, left-side measurement error only.
In addition, estimates based on annual measures may suffer from strong year-to-year variability (see Figure 3). Reducing this variability is a second motive for averaging over multiple income observations on both sides. Our results thus provide two separate arguments in support of averaging over income observations also on the lefthand side, when possible. Note, however, that these results pertain to using the log of multiyear averages, not to multiyear averages of log annual incomes. As noted by Haider and Solon (2006), estimates based on the latter are algebraically equivalent to the simple average of the single-year estimates, and will thus only smooth estimates.
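The algebraic equivalence for averages of logs follows from the linearity of OLS in the dependent variable and can be checked numerically. The sketch below uses simulated data with hypothetical parameter values and variable names; it contrasts the average of log annual incomes with the log of averaged incomes on the lefthand side.

```python
import numpy as np

rng = np.random.default_rng(2)
n, years = 10_000, 5

def slope(y, x):
    """OLS slope from a bivariate regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Fathers' log income and sons' log annual incomes over five years,
# with a year-specific association (hypothetical values).
y_f = rng.normal(0.0, 0.5, n)
loadings = np.linspace(0.15, 0.35, years)
log_y_s = loadings * y_f[:, None] + rng.normal(0.0, 0.6, (n, years))

# Single-year estimates, one regression per year.
single_year = np.array([slope(log_y_s[:, t], y_f) for t in range(years)])

# Multiyear average of log annual incomes as dependent variable.
avg_of_logs = slope(log_y_s.mean(axis=1), y_f)

# Log of the multiyear average of income levels as dependent variable.
log_of_avg = slope(np.log(np.exp(log_y_s).mean(axis=1)), y_f)

print(f"mean of single-year estimates: {single_year.mean():.6f}")
print(f"average of logs:               {avg_of_logs:.6f}")
print(f"log of average:                {log_of_avg:.6f}")
```

The first two numbers coincide up to floating-point error, while the log of the average generally gives a different estimate because it weights income levels rather than logs.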
VI. Conclusions
Using snapshots of income over shorter periods in the estimation of intergenerational income elasticities causes a so-called lifecycle bias if the snapshots cannot mimic lifetime outcomes (Jenkins 1987). We use nearly career-long income data of fathers and their sons, allowing us to estimate a benchmark elasticity and to directly expose the large magnitude of this bias in practice. We confirm that Haider and Solon’s (2006) generalization of the classical errors-in-variables model and their widely adopted suggestion to measure incomes around mid-age can strongly improve elasticity estimates. However, we also show that the failure of another errors-in-variables assumption prevents correct prediction of the ideal age of measurement and thus full elimination of lifecycle effects. The bias that persists in our Swedish data even after application of the generalized model is strongly negative when using annual income below age 30 and remains negative up until the early 40s. Estimates can understate the true elasticity substantially also when income is measured around the “ideal” age as predicted by the generalized model.
Comparisons of intergenerational mobility estimates across countries, groups, or cohorts may thus be of limited reliability if based on short-run income data.25 Still, some of the major conclusions from cross-country studies are not put into question. For example, the findings that income mobility is much lower than found by the early literature, and that the elasticity differs strongly across countries (for example, being higher in the United States than in the Nordic countries and Canada), would be robust even to sizable revisions of the underlying estimates. It might, however, be necessary to revisit those conclusions that are based on more marginal differences. Studies on mobility trends are potentially affected since even moderate lifecycle biases may be sufficient to mask gradual changes of mobility over time. The comparison of elasticities across subgroups of a population can be compromised when the age pattern in income profiles differs, which may, for example, be the case when groups are classified by education, sex, or immigration status. Our results suggest that measures such as the Pearson correlation, which abstract from changes in the variance, are more robust, in particular at later ages.
These results are mostly negative, but our analysis also points to potential improvements. We find evidence that incomes at later ages (for example, age 40–50) provide a more reliable base for application of the GEiV model. Moreover, the bias can be reduced by averaging over multiple income observations from midlife (if available) for both fathers and sons. Using logs of multiyear averages also for sons may counteract the disproportionate influence of occasional low-income episodes, and has the added value of reducing year-to-year volatility from annual measures.
These simple suggestions lead to modest bias reductions, but we also propose and test a simple extension of HS’s generalized errors-in-variables model that may improve elasticity estimates further. The standard model captures how differences in annual and lifetime incomes relate on average in the population, but knowledge of the average relationship is not sufficient if idiosyncratic deviations relate systematically to parental background and other factors. We show how additional covariates can be incorporated into the model to capture some of those deviations, and illustrate that bias in early age can be reduced considerably by conditioning the average relationship on education. This finding suggests that human capital investments are one reason why early-career differences in income predict lifetime differences so poorly, and that the explicit consideration of such investments can strongly increase the signaling value of short-run income differentials.
Further refinements of empirical practice with restricted use of income observations around a specific age can thus improve upon previous estimates but will typically not eliminate lifecycle bias. Development of a more structured approach that aims to capitalize on all available income data seems desirable. Future research could in particular benefit from a more comprehensive exploitation of partially observed income growth patterns. Intergenerational mobility estimates are often based on multiple income observations per individual, but researchers tend to disregard the idiosyncratic income growth across these observations. Such partially observed growth patterns are determined by both observable and unobservable characteristics of the individual and may hence contain more information on lifetime income than what current income levels and observable characteristics can provide.
Our results add to a general conclusion that can be drawn from the intergenerational mobility literature: Addressing heterogeneity in income profiles is an important, difficult, and recurrently underestimated task. The widespread practice of measuring annual income at a certain age as a surrogate for unobserved lifetime income is still prone to lifecycle bias because the most appropriate age for measurement is hard to predict and estimates can be sensitive to small age changes. These issues are potentially important for other literatures that rely on measurement of long-run income or income dynamics.
Footnotes
Martin Nybom is an assistant professor at the Swedish Institute for Social Research at Stockholm University and researcher at the Institute for Evaluation of Labor Market and Education Policy. Email: martin.nybom{at}sofi.su.se. Jan Stuhler is an assistant professor at Universidad Carlos III de Madrid and affiliated with the Swedish Institute for Social Research. Email: jan.stuhler{at}uc3m.es.
↵1. See Solon (1999) for a comprehensive evaluation of the early empirical literature. Recent surveys include Björklund and Jäntti (2009) and Black and Devereux (2011).
↵2. For example, the intergenerational elasticity of earnings for fathers and sons in the United States was estimated to be less than 0.2 in early studies (surveyed in Becker and Tomes 1986), ranged between about 0.3 and 0.5 in the studies surveyed in Solon (1999), and is estimated to be around 0.6 or above in more recent studies like Mazumder (2005) and Gouskova, Chiteji, and Stafford (2010).
↵3. We use the terms earnings and income interchangeably (since the issues that arise are similar), and examine fathers and sons since this has been the baseline case in the literature. A growing literature exists on intergenerational mobility in other family dimensions (such as mothers, daughters, or siblings) and in other income concepts (such as household income), for which our arguments are likewise relevant.
↵4. Equations akin to Equation 1 may also appear as structural relationships to study causal mechanisms of intergenerational transmission. The structural relationship relates typically not to ex post measures of long-run economic status but to the ex ante concept of “permanent income.” The two concepts are not always clearly distinguished, and some studies adopt the term “permanent income” even while focusing on the measurement of mobility. Our analysis relates to the statistical relationship, but incomplete measurement of long-run status impedes identification of both types.
↵5. Note that the availability of better data would not generally solve the identification problem as data sets cannot contain complete income histories for contemporary populations.
↵6. That lifecycle variation had to be accounted for was recognized, but it was generally assumed that including age controls in the regression equation would suffice.
↵7. For a summary, see Solon (1999). Age-dependency of elasticity estimates could also arise if the dispersion in transitory income and thus the attenuation bias vary over the life cycle, as has been documented in Björklund (1993) for Sweden.
↵8. Among others, in Gouskova, Chiteji, and Stafford (2010) for the United States; Björklund, Lindahl, and Plug (2006) and Björklund, Jäntti, and Lindquist (2009) for Sweden; Nilsen et al. (2012) for Norway; Raaum et al. (2007) for Denmark, Finland, Norway, the United Kingdom and the United States; Nicoletti and Ermisch (2007) for the United Kingdom; Piraino (2007) and Mocetti (2007) for Italy. More examples are covered in the surveys of Björklund and Jäntti (2009) and Black and Devereux (2011).
↵9. This result does not depend on a high degree of complexity in income growth processes but also holds, for example, for a simple log-linear income formation model as analyzed in HS (see Nybom and Stuhler 2011).
↵10. For example, individual deviations from the average rate of income growth over the life cycle may be correlated within families, as considered by Jäntti and Lindahl (2012).
↵11. Corresponding biases arise in the case of right-side measurement error in which unobserved lifetime income of fathers is approximated by annual income or if approximations are made for both fathers and sons, as can be derived from Equation 6.
↵12. We restrict the sample to fathers and sons who report positive (nonzero) income in SEK in at least 10 years. We exclude those who died before age 50 and sons who immigrated to Sweden after age 16 or migrated from Sweden on a long-term basis (at least 10 years).
↵13. We express all incomes in 2005 prices (in SEK), apply an annual discount rate of 2 percent, and divide the sums by the number of nonmissing income observations to construct our measures of annuitized lifetime income (there is a small number of negative annual incomes in our data). In the last step, we take the logs of annual incomes and annuitized lifetime income.
↵14. Of course, this estimate is not exactly true since we still lack some years of income. However, this does not affect our approach to use the estimate as a benchmark. The GEiV model is not restricted to any specific population, and should therefore be applicable to our variant of the Swedish population in which we truncate income profiles at some age. It is nevertheless advantageous that we have long income histories. First, our benchmark estimate will be close to the true value. Second, since the income profiles contain most of the idiosyncratic heterogeneity that leads to lifecycle bias, we expect our estimate of the bias to be representative for a typical application. We discuss its sensitivity to the exact length of observed income histories in Section IVD.
↵15. The arguments of HS relate to the probability limit. In a finite sample we need to consider the distribution of the estimator. Reported standard errors are based on a Taylor approximation and take the covariance structure of the underlying estimates into account.
↵16. The previously documented increase in λs,t over age is offset by an increase in σus,t.
↵17. The observation that the residuals correlate most strongly with education indicates that the GEiV model may perform worse in applications in which education plays a central role. Bhuller, Mogstad, and Salvanes (2011) examine lifecycle bias in returns to schooling estimates and also analyze the applicability of the GEiV model in this context.
↵18. From Equation 6, the probability limit of the OLS estimator equals θf,tβ for right-side and λs,tθf,tβ for both-side measurement error under the assumptions of the GEiV model. The remaining lifecycle biases are the deviations from β of the estimates adjusted by θf,t and by λs,tθf,t, respectively. For brevity we use only one age subscript t and show combinations of annual income for sons and fathers with equal distances to their respective t* in Figure 5.
↵19. From the perspective of the GEiV model, βt is expected to decrease when λf,t increases over age, as long as $\sigma_{u_{f,t}}/\sigma_{y_f^*}$ is sufficiently small compared to λf,t (see also Haider and Solon 2006, p. 1312). However, this ratio is estimated to be close to one in our data, such that an increase in λf,t, as we observe at earlier ages, has only marginal effects on βt. The relationship between βt and λs,t is more straightforward for left-side measurement error, as ∂βt / ∂λs,t = β.
↵20. This result holds also if estimates are only adjusted for classical attenuation bias (see Figure 13 in Nybom and Stuhler 2011). The results confirm the theoretical predictions of Jenkins (1987) that measuring fathers’ and sons’ income at similar ages might not necessarily reduce lifecycle bias.
↵21. With errors on both sides, estimates from the GEiV model are quite volatile over age (see Figure 5). For readability, we here show three-year moving averages of age-specific estimates.
↵22. For example, intergenerational data may include educational attainment, and annual and lifetime incomes for one generation but not the other. Alternatively, an external data source that only covers one generation of the population of interest may be exploited for this purpose.
↵23. We find that separate treatment of low-income episodes, which transformed to logs are highly influential in the estimation of λs,t and λf,t, decreases the bias further in our data. This finding is hard to generalize, but we suggest probing the robustness of estimation results in this respect.
↵24. Variants of Equation 9 that allow for the slope parameter λs,t to differ across groups did not perform better than the simpler specification with heterogeneous intercepts and are thus not presented here.
↵25. One might hope that the bias is of similar size across populations and thus not consequential for comparative studies. Cross-country comparisons would, for example, be reliable if the dispersion and intergenerational correlation in the shape of income profiles were of similar magnitude in each country. However, such assertions are put into question by our finding that lifecycle bias varies even across Swedish cohorts born in the same decade.
- Received March 2014.
- Accepted August 2014.