Abstract
We use a large sample of Swedish-born adoptees and their biological and adopting parents to decompose the persistence in health inequality across generations into pre-birth and post-birth components. We use three sets of measures for health outcomes in the second generation: mortality, measures based on data on hospitalization, and measures using birth outcomes for the third generation. The results show that the persistence in mortality is transmitted entirely via pre-birth factors, while the results for the hospitalization measures suggest that at least three-quarters of the intergenerational persistence in health is attributable to the biological parents.
I. Introduction
There is a long tradition of studies on intergenerational persistence in longevity and other health outcomes dating back to Beeton and Pearson (1899).1 However, it is not clear from these studies to what extent this persistence can be attributed to genetic factors or to environmental ones, for example, healthier parents transmitting health-promoting behavior to the next generation or having better economic resources to invest in their children’s health development. Although there is a vast literature in epidemiology on the heritability of a large number of health conditions, the focus in these studies is on the etiological background of diseases rather than on understanding the causes of the intergenerational persistence in overall health.
In this work, we study the importance of pre- versus post-birth factors in the intergenerational persistence in health by using a large sample of Swedish adoptees for whom we observe health measures of both biological and adopting parents. We study how the health of the biological parents—related to genetic factors and in utero health (“pre-birth factors”)—and the health of the adopting parents—related to health formation during childhood and adolescence (“post-birth factors”)—affect the child’s health later in life. Our data set is constructed by matching several different administrative registers containing information on health outcomes for biological and adopting parents and their children. We study adopted children born between 1940 and 1967 in Sweden and are able to follow the health of the adoptees up until 76 years of age. For comparison, we also present results on the same outcomes obtained using the population of children raised by their biological parents and born in the same time period as the adoptees.
The main outcome of interest is the health status of the children as adults. We use three sets of measures of this outcome: (i) mortality and premature death, (ii) health indexes based on hospitalization data, and (iii) for females in the sample, birth outcomes of their first-born child obtained from the Swedish birth register. Mortality is measured either by longevity, using Cox regression to account for censoring, or by premature death, captured by binary variables for dying before age 60, 65, and 70, defined for both children and parents. The health indexes are based on hospitalization data from the Swedish inpatient register: one is based on hospital visits, and the other is based on hospitalization causes, where each cause is weighted by the probability of dying of that cause. Both measures are standardized by year and gender and transformed to percentile ranks. Because age-specific effects are residualized out, they can be interpreted as measures of lifetime health. The third measure is motivated by the fact that birth outcomes reflect the health status of the mother giving birth (see, for example, Currie 2011). Perhaps even more importantly, it allows us to gauge the persistence of health transmission over three generations.
In our analysis of nonadoptees, we report strong evidence in favor of the intergenerational transmission of health, although the strength of the persistence is weaker than the intergenerational transmission of education or income (see Solon 1999; Black and Devereux 2011). Our mortality estimates are in line with findings in the literature on parent–child associations in longevity, including those in the epidemiological literature, which are often based on findings from samples of twins in the child generation. For our two health indexes we estimate rank correlations of about 0.13–0.15. We also find evidence of positive health associations across three generations, where the health status of the grandparents is positively associated with the birth outcomes of their grandchildren.
Our decomposition results show that the intergenerational association in mortality and premature death can be fully attributed to pre-birth factors: the association between the longevity of the biological parents and the mortality of children placed for adoption is as strong as the corresponding association for children raised by their biological parents. There is no significant association between the longevity of the adopting parents and the mortality risk of the adopted children, nor between adopting parents and adopted children in death before age 60, 65, and 70, respectively. In addition, we show that these results survive several sensitivity tests on sample selection and selective placement. Hence, we are able to confirm results on general mortality from studies by epidemiologists using Danish data on adoptees (Sørensen et al. 1988; Petersen, Kragh Andersen, and Sørensen 2005). The decomposition results for the health indexes based on hospitalization attribute some of the intergenerational association to post-birth factors. However, a much larger share (75–85 percent) is still attributed to pre-birth factors captured by the health measure of the biological parents.
Our strategy of separating pre- and post-birth effects for intergenerational associations follows previous research that has applied a regression-based approach to Swedish data on adopted children and their biological and adoptive parents. This approach has been applied to a number of outcomes, such as education and income (Björklund, Lindahl, and Plug 2006); financial risk taking (Black et al. 2017); wealth, savings and consumption (Black et al. 2020); crime (Hjalmarsson and Lindquist 2013); entrepreneurship (Lindquist, Sol, and van Praag 2015); voting (Cesarini, Johannesson, and Oskarsson 2014); and political candidacy (Oskarsson, Dawes, and Lindgren 2018). Most recently, Black et al. (2020) present a coherent analysis for nine different outcomes, including wealth, risky investments, and years of schooling. Post-birth factors are much more important than pre-birth factors for outcomes such as wealth and savings rate, and somewhat more important for outcomes such as income, risky market participation, and consumption. The sole exception is years of schooling, where pre-birth factors are slightly more important. The findings in our work constitute an important complement to their results, as we find pre-birth factors to be much more important than post-birth factors for premature death and health as measured by the hospitalization indexes. Our results thus add estimates that contribute to our understanding of the relative roles of genetic and environmental factors in the intergenerational transmission of well-being, and they can be compared to results on other important economic and social outcomes from earlier papers using the same adoption design and study population.
There are additional economic motivations for the research question of this study. First, the recent interest in health inequality (see, for example, Chetty et al. 2016) relates closely to the question of the importance of pre- versus post-birth factors in the child–parent association in health, since this intergenerational persistence is an important element in the formation of health inequality. Second, our research question is closely related to the intergenerational persistence in human capital outcomes, income, and wealth. Previous research has shown that health is very important for the formation of human capital, strongly associated with earnings ability in the labor market, and indeed an important determinant of individual well-being (see, for example, Deaton 2003). A strong influence of pre-birth factors in intergenerational health persistence would limit the possibilities to affect the intergenerational mobility in economic outcomes and well-being through policy measures that affect post-birth environmental transmission channels.2 Finally, the research question relates to the literature on the effect of various health- and family-related interventions on later outcomes (Almond and Currie 2011; Campbell et al. 2014) as to whether or not they are implemented early in life or during the prenatal period.
We are aware of only two previous studies, from the same research group (Sørensen et al. 1988; Petersen, Kragh Andersen, and Sørensen 2005), that analyze the intergenerational transmission of premature death using data on adopted children and their adopting and biological parents. The authors use Danish data on 960 and 2,365 adoptees, respectively, and find a significant association between the likelihood that the biological parents are still alive at age 50 or at age 70 and the child being alive at age 58 (Sørensen et al. 1988) or at age 70 (Petersen, Kragh Andersen, and Sørensen 2005). For the adopting parents, no such associations were found.3 In addition to the studies on mortality, there is an extensive epidemiology literature on the heritability of specific diseases and psychological conditions using (also Swedish) data on adoptees.4 The studies on cancer and circulatory diseases show that adoptees with at least one biological parent suffering from the disease under study have a significantly elevated risk of getting the disease. No such associations were found for the adopting parents. The study on suicides gives similar results. Research on drug abuse and alcohol usage, however, shows significant associations for both the biological and the adopting parents.
There are only a few studies on the intergenerational transmission of health that use data on adoptees in economics. Sacerdote (2007) uses data on 1,650 Korean-American adoptees placed by Holt International Children’s Services during 1964–1985. He finds physical outcomes (height, overweight) are not transmitted at all from the adopting parents, whereas health-related behaviors (drinking alcohol and smoking) are transmitted from the adopting parents. Thompson (2014) uses data from the National Health Interview Survey (NHIS) to study the intergenerational correlation in health conditions for asthma, hay fever, diabetes, and chronic headaches. He finds a significant association in the prevalence of medical diagnoses between adopting parents and their adoptive children. Classen and Thompson (2016) use the same data set as in Thompson (2014) and perform a similar analysis on BMI and obesity measures. For these outcomes, they find (similarly to Sacerdote 2007) no association between adoptees and their adoptive parents.
Our study is able to reconcile the findings from the previous literature. We replicate the findings from the literature on premature death that shows environmental factors not to be important for intergenerational transmission. At the same time, there are studies in the epidemiological and economics literature that find that, although genetic factors are the explanation for many health measures, there is also a role for the environment, especially regarding some health-related behaviors (such as smoking, drug, and alcohol abuse). To reconcile these findings requires a longer follow-up compared to the study of hereditary diseases, which has been the focus in the epidemiological literature, and a richer set of health outcomes measured throughout the lives of the adopted children. By using hospitalization-based health measures capturing health status through decades of healthcare utilization, we are able to estimate the importance of genetic and environmental factors for overall health. Although genetic factors account for a larger share in the intergenerational transmission of health, we do find some evidence that environmental factors are also important. Another notable difference with the previous epidemiological literature is that we compare our estimates for adoptees to those obtained for the population of children raised by their biological parents.
Our study differs from Sacerdote (2007), Thompson (2014), and Classen and Thompson (2016) along several dimensions. The most important difference is that our data include information on the biological parents of the adoptees, which enables us to decompose the pre- and post-birth parental influences on child health.5 Because we have a much longer follow-up period, we are able to study long-run health outcomes rather than self-reported health outcomes and health-related behavior measured at younger or middle ages.6 Finally, our sample size is much larger than those used in these past studies, potentially allowing us to identify smaller effects due to improved statistical power.
The rest of the paper is organized as follows. Section II presents our econometric models. Section III presents the data and descriptive statistics. The main results and sensitivity analyses are laid out in Section IV. Section V concludes. Finally, the paper contains two Online Appendixes. Online Appendix A provides a brief historical background and a description of institutions related to the adoption process in Sweden. Online Appendix B presents the results of various sensitivity analyses.
II. Empirical Specifications
We first estimate the following intergenerational model on the population of individuals:

$$H_j^{bc} = \beta_0 + \beta_1 H_j^{bp} + \varepsilon_j^{bc} \qquad (1)$$

where $H_j^{bc}$ represents adult health status for the biological child and $H_j^{bp}$ the biological parents’ health. Subscript j indexes the family in which the child is born and raised, and superscripts bc and bp denote the biological child and parent, respectively. $\varepsilon_j^{bc}$ is the child-specific error term assumed to be uncorrelated with $H_j^{bp}$. The coefficient β1 measures the strength of the association between the adult health of the child and the health of the parents and is a combined effect of many different factors, such as genetics, prenatal environment, and environment during childhood and adolescence.
As we have data on the characteristics of adoptees and their biological and adoptive parents, we estimate the following model on the population of adoptees:7

$$H_i^{ac} = \alpha_0 + \alpha_1 H_j^{bp} + \alpha_2 H_i^{ap} + \varepsilon_i^{ac} \qquad (2)$$

where H once more measures health that is transmitted from the biological parent bp, or the adoptive parent ap, to the adopted child ac born in family j and adopted and reared in family i. $\varepsilon_i^{ac}$ is a child-specific error term uncorrelated with $H_j^{bp}$ and $H_i^{ap}$.
Before we discuss how we can interpret α1 and α2, let us state the following key assumptions of the adoption design:
1. Adoptees are conditionally randomly assigned to adoptive families.
2. The adoption takes place close to birth, so that pre- and post-birth effects can be separated accurately. If this is not the case, the postnatal pre-adoption environment (for example, the quality of the nursery home) must be uncorrelated with the genetic background and the post-adoption environment (or have no influence on the health of the adopted child).
3. The biological parents have no contact with the adopted child post-adoption.
Under these three assumptions, we are able to provide internally valid estimates of the share of the intergenerational association in health status that is due to pre- and post-birth factors by estimating Equation 2 by ordinary least squares (OLS) using data on adopted children and their biological and adoptive parents. Since α2 captures not only the importance of adoptive parental health, $H_i^{ap}$, but also everything else in the adoptive family that is correlated with $H_i^{ap}$, we do not interpret an estimate of α2 as a causal effect, but instead as a measure of the importance of transmission channels stemming from the post-birth influences (a similar interpretation can be made for α1).
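As a rough illustration of how Equation 2 could be estimated, the sketch below simulates adoptee data and recovers the two slope coefficients by OLS. Everything in it is an assumption made for illustration: the variable names (`h_ac`, `h_bp`, `h_ap`), the simulated coefficients 0.12 and 0.03, the standard normal health measures, and the independence of the two parental measures (which mimics random assignment). It is not the authors' code or data.

```python
import random


def ols(y, columns):
    """Return OLS coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[r] for col in columns] for r in range(n)]
    k = len(X[0])
    # Build X'X and X'y.
    xtx = [[sum(X[r][a] * X[r][b] for r in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[r][a] * y[r] for r in range(n)) for a in range(k)]
    # Solve (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting.
    aug = [row + [rhs] for row, rhs in zip(xtx, xty)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[c])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def simulate_adoptees(n, alpha1, alpha2, seed=0):
    """Simulate child health from biological and adoptive parental health."""
    rng = random.Random(seed)
    h_bp = [rng.gauss(0, 1) for _ in range(n)]  # biological parents
    h_ap = [rng.gauss(0, 1) for _ in range(n)]  # adoptive parents, drawn independently
    h_ac = [alpha1 * b + alpha2 * a + rng.gauss(0, 1)
            for b, a in zip(h_bp, h_ap)]
    return h_ac, h_bp, h_ap


# Hypothetical coefficients, chosen only for illustration.
h_ac, h_bp, h_ap = simulate_adoptees(20000, alpha1=0.12, alpha2=0.03)
alpha0_hat, alpha1_hat, alpha2_hat = ols(h_ac, [h_bp, h_ap])
print(alpha0_hat, alpha1_hat, alpha2_hat)
```

Because the simulated parental measures are independent, the estimates of α1 and α2 are not contaminated by selective placement; introducing a correlation between `h_bp` and `h_ap` together with an omitted family characteristic is one way to explore how violations of the first assumption would matter.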
The first assumption listed above, that adoptees are conditionally randomly assigned to adoptive families, can be questioned in all empirical studies using data for adoptees (see the discussion in Section IV.D.2). As we will see in Section III.D, we find evidence of less selective placement for our longevity and health measures than what has been found for most other outcomes analyzed in previous adoption studies (such as education and income). Nevertheless, we perform several sensitivity analyses to check the robustness of our main results with respect to this assumption. First, we look at the robustness with respect to changes in the set of confounding parental characteristics included in the model (Section IV.D.2).
Second, we restrict the sample to include only adoptees who moved away from their municipality of birth (Section IV.D.4). We cannot directly observe whether relatives or friends of the biological parents adopted some of the children, but in such cases, children are more likely to stay in the municipality where they were born. Moreover, adopted children who move from their municipality of birth are much less likely to interact with their biological parents post-adoption (hence making Assumption 3 more likely to hold as well).
In the third sensitivity analysis, we restrict the sample of adoptees to first-borns of their biological mothers (Section IV.D.4). The motivation for this restriction is to exclude adoptees who were placed for adoption because of illness, poverty, or other reasons that might make the biological parents unable to accommodate a large family, which, in turn, will increase the probability that the adopting parents are related to the biological ones. That is, first-borns are more likely to be placed for adoption simply because they are less likely to have been planned by their biological parents or born into established families.
Note also that Equation 2 can easily be extended to account for “nature–nurture interactions” by adding the product of $H_j^{bp}$ and $H_i^{ap}$ to this specification (see Björklund, Lindahl, and Plug 2006).8 We investigate the importance of such interactions in Section IV.D.3.
Assuming that adoptees and nonadoptees are drawn from the same distribution, we are also able to decompose an estimate of β1 into separate entities of pre- and post-birth factors, captured by estimates of α1 and α2, which are then interpretable for the population of children. The degree of generalizability of the estimates increases if the intergenerational parameter is linear and if the sum of the estimates of α1 and α2, using the sample of adoptees, equals an estimate of β1, obtained in the population of children. We also perform a test of the external validity of the adoption coefficients by estimating these parameters on the sample of families where at least one child has been adopted out from the family and at least one child was not adopted but was instead reared by the biological mother (see Section IV.D.1).
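The decomposition described here can be written out explicitly. The share formula below is one natural convention and is an assumption of this sketch, since the text does not state the formula it uses; the symbols $s_{\text{pre}}$ and $s_{\text{post}}$ are introduced only for illustration, as are the numbers 0.12 and 0.03.

```latex
\[
s_{\text{pre}} = \frac{\hat{\alpha}_1}{\hat{\alpha}_1 + \hat{\alpha}_2},
\qquad
s_{\text{post}} = \frac{\hat{\alpha}_2}{\hat{\alpha}_1 + \hat{\alpha}_2} = 1 - s_{\text{pre}},
\qquad
\hat{\alpha}_1 + \hat{\alpha}_2 \approx \hat{\beta}_1 .
\]
% Hypothetical example: \hat{\alpha}_1 = 0.12 and \hat{\alpha}_2 = 0.03 give
\[
s_{\text{pre}} = \frac{0.12}{0.12 + 0.03} = 0.80 .
\]
```

The last relation in the first display is the linearity check mentioned in the text: the closer the sum of the adoptee coefficients is to the population estimate of β1, the more plausibly the shares generalize to children raised by their biological parents.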
III. Data and Descriptive Statistics
A. Sample Definition
We use data from different national registers in Sweden and include all males and females born in Sweden between 1940 and 1967.9 We use the Multigenerational Register (see Statistics Sweden 2012) to identify whether a person was adopted as a child. This register also contains a personal identifier of the biological mother and father (if known to the authorities), as well as of the adopting mother and father.
Table 1 shows the number of observations for the two populations used in this study (adoptees and, as a comparison, nonadoptees) at different stages of the sample selection process. In total, there are 64,889 adoptees whom we can identify in our data. Approximately 30,000 of them were adopted by only one parent, in most cases the husband of the child’s biological mother. We excluded these individuals from all samples used in this study. For the main analysis, we restrict the sample to adoptees for whom we have information on both biological parents. Since the inpatient register starts in 1987, we require that members of the family have not died before then.10
Table 1. Number of Observations Remaining after Different Sample Restrictions
Figure 1 shows the number of adoptees we are able to identify in our data by year of birth and different categories. The top curve shows the total number of adoptees with two adopting parents that we are able to identify. We see an increase in the number of adoptees between 1940 and 1945. This primarily reflects the increase in the overall fertility rate in Sweden. As discussed in Online Appendix A, there are several reasons for the decline in adoptions between 1945 and 1967.11 The decrease in domestic adoptions towards the end of our study period was offset by an increase in international adoptions.
Figure 1. Swedish Domestic Adoptions by Year of Birth of the Adoptees
The dashed and the thick solid lines below the top curve in Figure 1 show the observations that we are able to include, given the different data requirements indicated below the figure. It is evident from the figure that for those born in the first half of the 1940s, we are able to use only a small share of the observations because we do not observe data on their biological parents.
B. Measures of Health
1. Mortality
Information on date of death, used for constructing dependent variables that apply to the child generation as well as to the parent generation, is obtained from the national Cause of Death Register (see Socialstyrelsen 2009a). The Cause of Death Register records dates of death and International Classification of Diseases codes for the underlying cause of death from 1947 onwards and with full coverage for all deaths in Sweden from 1961 onwards. Our observation period stops in August 2016. This implies that for the child generation, we can observe the oldest person in our sample until age 76 and the youngest until age 49.
We use two different measures of mortality for both parents and children. First, longevity is constructed using dates of birth and death. Second, we construct indicator variables for the incidence of death before age 60, 65, and 70, respectively. Figure A2 in Online Appendix A shows the share of individuals who died before the end of the observation window by year of birth.
2. Hospitalization
Data for our measures of hospitalization are obtained from the national inpatient register (see Socialstyrelsen 2009b). The national inpatient register includes dates for all hospital stays at Swedish hospitals. This register offers national coverage starting in 1987, and we have access to data for the entire period until 2012.12 Because the first birth cohort included in our data was born in 1940, we observe hospital stays for children from age 47 and until age 72. For parents, we observe hospitalizations at older ages. The inpatient register includes ICD codes for a maximum of eight different medical causes of each hospital stay.
We construct two measures of health utilizing the hospitalization data. The first, labeled “hospitalization,” is based on the residuals from a linear probability model in which an indicator variable for whether the individual has been in hospital care in a given year of the observation window is regressed on calendar year and year-of-birth indicators. Person-years after death are treated as missing. In a second step, we average the residuals for each individual to obtain the measure. This procedure accounts for differences in the probability of hospitalization over the life cycle, and we may therefore interpret the resulting variable as a measure of lifetime hospitalization.
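The two-step construction of the “hospitalization” measure can be sketched as follows. One simplification is labeled here and in the code: instead of a linear probability model with additive calendar-year and year-of-birth indicators, the sketch subtracts the mean within each (birth year, calendar year) cell, which is the fully saturated version of that regression. The function name and the record layout are hypothetical.

```python
from collections import defaultdict


def lifetime_hospitalization(records):
    """Average residualized hospitalization per person.

    records: list of (person_id, birth_year, calendar_year, hospitalized),
    with hospitalized equal to 0 or 1. Person-years after death are simply
    left out of the list (treated as missing).
    """
    # Step 1 (simplification): mean of the indicator within each
    # (birth year, calendar year) cell instead of an additive LPM.
    totals = defaultdict(lambda: [0.0, 0])
    for _, birth_year, calendar_year, hospitalized in records:
        cell = totals[(birth_year, calendar_year)]
        cell[0] += hospitalized
        cell[1] += 1

    # Residual = own indicator minus the cell mean.
    residuals = defaultdict(list)
    for person_id, birth_year, calendar_year, hospitalized in records:
        total, count = totals[(birth_year, calendar_year)]
        residuals[person_id].append(hospitalized - total / count)

    # Step 2: average the residuals over the years each person is observed.
    return {pid: sum(res) / len(res) for pid, res in residuals.items()}


# Toy data: two people born in 1940, observed in 1987 and 1988.
toy_records = [
    ("a", 1940, 1987, 1), ("b", 1940, 1987, 0),
    ("a", 1940, 1988, 0), ("b", 1940, 1988, 0),
]
toy_measure = lifetime_hospitalization(toy_records)
print(toy_measure)
```

A positive value means the person was hospitalized more often than others of the same age in the same year, so the sign would need to be reversed before ranking if higher values are meant to indicate better health.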
The second measure, labeled “health index,” is constructed in three steps.13 First, for every year, we use a Probit model to regress an indicator variable, equal to one if the individual has died within five years and zero otherwise, on the information from the inpatient register for that year (days, visits, and diagnoses) and indicators of year of birth and gender.14 In a second step, we use these coefficients to create a health index ranging between zero and one by predicting the risk of dying within five years. An individual is assigned the value of one in all years after death has occurred, whereas individuals not making any hospital visits and still alive are assigned the value of zero. Then, we average over all years for each person. On the basis of this index, we obtain a percentile rank for each individual within each birth cohort and gender separately. This measure differs from the other hospitalization measure in that it weights the various diagnoses by severity, based on how likely the person is to die within five years.
Both “hospitalization” and the “health index” are ranked so that higher percentiles mean better health. As the hospitalization and health index measures are adjusted for age effects and ranked by gender and cohort, we effectively compare lifetime health for individuals born in the same year. Note that in our main analysis we use the average of the health measures for mothers and fathers. In a sensitivity analysis we also show results for mothers and fathers separately.
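The ranking step can be illustrated with a short sketch that converts a per-person average risk into a percentile rank within birth cohort and gender, oriented so that a higher percentile means better health. The mid-rank treatment of ties, the 0–100 scale, and all names are assumptions of this sketch, not details given in the text.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def health_percentile_ranks(people):
    """Percentile rank within (birth year, gender); higher = better health.

    people: list of (person_id, birth_year, gender, mean_risk), where
    mean_risk is the person's averaged risk index between 0 and 1.
    """
    groups = defaultdict(list)
    for person_id, birth_year, gender, mean_risk in people:
        groups[(birth_year, gender)].append((person_id, mean_risk))

    ranks = {}
    for members in groups.values():
        risks = sorted(risk for _, risk in members)
        n = len(risks)
        for person_id, risk in members:
            # Count group members with strictly higher risk (worse health).
            higher = n - bisect_right(risks, risk)
            # Members with exactly the same risk, including the person.
            ties = bisect_right(risks, risk) - bisect_left(risks, risk)
            # Mid-rank convention (assumption): ties share the middle rank.
            ranks[person_id] = 100.0 * (higher + 0.5 * ties) / n
    return ranks


toy_people = [
    ("a", 1940, "f", 0.0), ("b", 1940, "f", 0.1),
    ("c", 1940, "f", 0.5), ("d", 1940, "f", 1.0),
    ("e", 1941, "m", 0.3),
]
toy_ranks = health_percentile_ranks(toy_people)
print(toy_ranks)
```

Ranking within cohort and gender means a person is only ever compared with others of the same sex born in the same year, which is what makes the resulting measure comparable across generations.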
3. Measures based on birth outcomes for the third generation
Previous research has established that birth outcomes to a large extent reflect the health status of the mother (see, for example, Currie 2011). This relation enables us to use the birth outcomes of the children of the females included in our sample as a health measure. Further, weight at birth, and in particular low birth weight (below 2,500 grams), is highly correlated with health outcomes later in life. Studying health at birth for the third generation enables us to test for multigenerational transmission of health.
Using the Multigenerational Register, we are able to link births to all children (adopted and biological) included in our sample. Our data source for studying health at birth is the National Swedish Birth Register (see Socialstyrelsen 2009b). This birth register contains a large amount of information on all births in Sweden from 1973 and onwards.15 We use three different birth outcome measures: (i) birth weight measured in grams (scaled in percentile ranks); (ii) an indicator for low birth weight, that is, birth weight below 2,500 grams; and (iii) an indicator of an APGAR score below 9 at five minutes after the birth.16
C. Descriptive Statistics
Table 2 contains sample means and standard deviations (within parentheses) for the main outcome and control variables in the sample of nonadoptees and adoptees, respectively. The first panel shows information on the children in the two samples. On average, the adopted children have worse health compared to the nonadopted. The same pattern can be seen for the children of the mothers in the child generation, with lower birth weights for the children of the adopted mothers in the child generation, compared to the same outcomes for the children of the population of nonadopted mothers. However, the mean differences are quite small. The third panel shows descriptive statistics for the biological parents. On average, the biological parents of adopted children have much worse health compared to the parents of nonadopted children. The fourth panel shows descriptive statistics of the adopting parents. The adopting parents have somewhat better health than the parents of nonadopted children.17
Table 2. Summary Statistics of Main Outcome and Control Variables
D. The Association between Biological and Adopting Parent Characteristics
A possible concern with the interpretation of the coefficient estimates is that of selective placement of adoptees. There are at least two reasons why we would observe a positive correlation for characteristics of biological and adoptive parents. First, this correlation could be due to some children being adopted by relatives of one of the biological parents. Second, there could be matching on characteristics known to the adoption agency, either because of the demands of parents or because of the view that an adopted child would be better off in an adoptive family with similar characteristics as the biological parents. One way to check the likely severity of this issue with regard to our main results—made possible by the fact that we can observe health for both adoptive and biological parents of the adoptees—is to simply correlate the health measures for these two parental types. Table 3 shows the correlations of mortality and health measures based on hospitalization data between adopting and biological parents of adoptees.
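The placement check described above amounts to correlating a health measure across the two sets of parents of each adoptee. A minimal sketch is given below; it assumes a Pearson correlation (the text does not say which correlation coefficient is reported), and the data values and variable names are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical health-index percentiles, one pair per adoptee:
# biological parents' average and adopting parents' average.
biological = [10, 40, 60, 90]
adoptive = [55, 45, 60, 50]

placement_corr = pearson(biological, adoptive)
print(placement_corr)
```

A correlation near zero is consistent with little selective placement on the measured health dimension, although it cannot rule out matching on characteristics that are not observed.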
Table 3. Correlations between Biological and Adoptive Parents’ Mortality and Health Measures
We obtain very small and statistically insignificant correlations for the hospitalization-based measures. This differs from the correlations reported for most other outcomes in adoption studies using Swedish data.18 This finding is very important for the purpose of this study because it suggests that selective placement is unlikely to generate biased estimates of intergenerational health correlations using adoption data. That said, because selective placement is still possible on unobservable characteristics, we discuss this issue and also perform some sensitivity analyses of the likely impact of selective placement on our main estimates in Section IV.D.2. The correlation of our mortality measures varies more, but values are still very low for our indicators for premature death.
IV. Results
A. Mortality
We start this section by studying the intergenerational persistence in mortality. Sørensen et al. (1988) and Petersen, Kragh Andersen, and Sørensen (2005) focus exclusively on mortality outcomes, and we therefore compare our results to these previous findings as a point of departure before showing results for the other health outcomes under study. In the mortality analysis, we extend the work of the two papers mentioned above primarily by having a longer follow-up period (the oldest child cohort is 76 when we stop observing them) and also by comparing our results to those obtained on the population of children raised by their biological parents born in the same birth cohorts.
Table 4 shows the results from the Cox proportional hazard model for the persistence in longevity across generations. The dependent variable in these models is the age at death (measured in months) of the individual in the child generation, and the independent variables are the ages at death of the biological and the adopting parents, respectively. The Cox model relies on the proportional hazard specification, but not on any particular functional form for the baseline hazard. The results are presented as hazard ratios and should be interpreted as the relative difference in the hazard resulting from a one-unit (one year) change in the independent variable.
Table 4. Cox Proportional Hazard Model Estimates of the Associations between Child Mortality and Parental Age at Death
Censoring, on both the dependent and the independent variables, is a main concern for our choice of econometric model as well as principles for sample selection. We use a hazard model to deal with the high proportion of right censoring on the dependent variable. However, since in the full sample we do not observe date of death for 36 percent of the biological mothers, 21 percent of biological fathers, 26 percent of the adopting mothers, and 15 percent of adopting fathers, we also have a problem of censoring on the independent variables. To deal with this, we have restricted the sample to those observations for which we could observe the date of death of all parents, that is, we impose a selection on Complete Cases (CC).
Rigobon and Stoker (2007) show, in the framework of a linear regression model, that a sufficient condition for consistency of the Complete Case regression estimates is that the selection, conditional on observables, is exogenous. That is, an indicator variable for sample inclusion must be conditionally independent of the error term in the linear regression. Although there are no obvious reasons why this assumption would not apply in our application and to a nonlinear proportional hazard model, we provide a sensitivity analysis of our results. In the first column of Table 4 we present the complete case results obtained when we use the entire sample born between 1940 and 1967. In the second column, we show the results obtained when we restrict the sample to those born in the first half of the sampling window defined by year of birth, that is, those born before 1953. In this subsample, we observe date of death for a much larger share of the parents (87 percent of children have parents who are all deceased), which makes the potential inconsistency from censoring on the independent variable much smaller.
Comparing the estimates in Columns 1 and 2 of Table 4, it is apparent that the results are almost identical. This result suggests that we can maintain the hypothesis of exogenous selection conditional on the independent variables. The results furthermore suggest that there is a strong intergenerational persistence in longevity in the population of those raised by their biological parents. The hazard ratio estimate in Column 1 shows that an additional year in the parents’ average length of life corresponds to a reduction of about 1.8 percent in the child’s mortality hazard.
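As a reading aid for the hazard ratios, the following Python sketch converts a hazard ratio into the implied percent change in the hazard. The value 0.982 is a hypothetical hazard ratio, chosen only because it is consistent with the "about 1.8 percent" reduction quoted above; it is not copied from Table 4.

```python
def percent_change_in_hazard(hazard_ratio: float, units: float = 1.0) -> float:
    """Percent change in the hazard implied by a `units` change in a covariate.

    Under proportional hazards, a change of `units` multiplies the hazard
    by hazard_ratio ** units.
    """
    return (hazard_ratio ** units - 1.0) * 100.0


# Hypothetical hazard ratio per additional year of parental longevity.
hr_per_year = 0.982
print(round(percent_change_in_hazard(hr_per_year), 1))  # -1.8
```

A hazard ratio of exactly one corresponds to no change in the hazard, which is the no-effect benchmark used for the adopting parents.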
Turning to the estimates for adoptees, Column 3 shows the results for the entire sample and Column 4 for those born before 1953. The estimates are very similar, and they unambiguously suggest that the entire persistence in mortality can be attributed to pre-birth differences. The hazard ratio estimates for the biological parents are similar to those obtained in the sample of children raised by their biological parents, and the estimates for the adopting parents are all insignificantly different from the no-effect hazard ratio of one. Since we require all four parental types to be deceased before the end of our sample period, the sample is limited to about two-thirds of all parents of adoptees born before 1953. Finally, Column 5 shows the result when we use the complete cases sample for the adopting parents only in the sample born before 1953. Since the adopting parents are in general older than the biological ones, we only need to exclude 4 percent. Reassuringly, the estimates from this model are very similar to the estimates for adoptive parents shown in Columns 3 and 4.
As an additional sensitivity analysis, Table 5 presents linear probability model estimates for intergenerational persistence in deaths before ages 60, 65, and 70, respectively.19 The advantage of these models vis-à-vis the hazard models presented in Table 4 is that they can be estimated without any censoring on either the dependent or the independent variables. We restrict the samples to the cohorts that allow us to follow the included individuals to each of the ages. This means that for the model for intergenerational association in mortality before age 60, we restrict the sample in the child generation to those born before August 1956. For mortality before age 65, we restrict to those born before August 1951, and for mortality before age 70, those born before August 1946.
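The cohort restrictions follow mechanically from the end of the observation window in August 2016. A minimal Python sketch of that arithmetic is given here; the function name is illustrative, and the day of the month is an assumption because the text only states the month.

```python
from datetime import date

# End of follow-up; the text gives August 2016, the day (1st) is assumed.
FOLLOW_UP_END = date(2016, 8, 1)


def latest_birth_date(age_limit: int, follow_up_end: date = FOLLOW_UP_END) -> date:
    """Children must be born before this date to be observed up to `age_limit`."""
    return follow_up_end.replace(year=follow_up_end.year - age_limit)


for age in (60, 65, 70):
    print(age, latest_birth_date(age))
```

For the three age limits this gives cutoffs in August 1956, August 1951, and August 1946, matching the restrictions stated above.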
Linear Probability Model Estimates of Intergenerational Association of Dying before Age 60, 65, and 70, Respectively
The results for nonadoptees, shown in Columns 1, 3, and 5, reveal that the intergenerational association in premature death becomes stronger as the age limit increases from age 60 to age 70. The results for adoptees, shown in Columns 2, 4, and 6, suggest that the association can be fully attributed to the biological parents, which confirms our previous results, as well as those obtained by Sørensen et al. (1988) and Petersen, Kragh Andersen, and Sørensen (2005).
Online Appendix B shows the results from a number of alternative specifications and sample restrictions. Table B2 shows the estimates with mothers and fathers separately, and we also include those with unknown biological father in the sample of adoptees. The results show that there is a marginally stronger association between mothers and their children’s longevity than between fathers and their children’s longevity (conditional on the other parent’s longevity). To investigate how the estimates for mortality translate into effects on life expectancies, we need to assign a parametric distribution for the baseline hazard. We use the Gompertz distribution for the baseline hazard rather than the Cox model. The hazard ratio estimates from this model turned out to be very similar to those of the Cox model presented in Table 4; see Online Appendix Table B3. Using these estimates for adoptees, we find that the prediction of one additional year of longevity for the biological parents extends the child’s median life expectancy by 0.25 additional years.20 In Table B4 we show results that are obtained on the entire original sample, and instead of excluding individuals with parents still alive when we stop observing them in August 2016, we include dummy variables for them being alive at that time. All results shown in these tables support our conclusion that the intergenerational persistence in mortality can be attributed to the biological parents.
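To illustrate how a hazard ratio maps into median life expectancy under a Gompertz baseline, the following Python sketch uses the closed-form median of the Gompertz distribution. The parameterization h(t) = a·exp(b·t)·exp(shift) and the values of `a`, `b`, and the hazard ratio 0.982 are assumptions made purely for illustration; they are not the estimates in Online Appendix Table B3 and are not meant to reproduce the 0.25-year figure.

```python
import math


def gompertz_survival(t: float, a: float, b: float, log_hazard_shift: float = 0.0) -> float:
    """Survival function for the hazard h(t) = a * exp(b * t) * exp(log_hazard_shift)."""
    scale = a * math.exp(log_hazard_shift)
    return math.exp(-(scale / b) * (math.exp(b * t) - 1.0))


def gompertz_median(a: float, b: float, log_hazard_shift: float = 0.0) -> float:
    """Age at which survival equals 0.5, obtained by solving S(t) = 0.5 for t."""
    scale = a * math.exp(log_hazard_shift)
    return math.log(1.0 + b * math.log(2.0) / scale) / b


# Hypothetical baseline parameters (per-year hazard) and hazard ratio.
a, b = 2e-5, 0.1
baseline_median = gompertz_median(a, b)
shifted_median = gompertz_median(a, b, log_hazard_shift=math.log(0.982))
print(shifted_median - baseline_median)  # gain in median age at death, in years
```

The gain in the median depends on the shape parameter `b` as well as on the hazard ratio, which is why a parametric baseline is needed for this translation.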
B. Health Measures Based on Hospitalization Data
Figure 2 shows the relation between percentiles of the parental and child hospitalization and health index (with higher percentile ranks indicating better health). We use a local linear kernel regression, instead of scatter plots, given that the adoption sample is relatively small. The graphs for nonadoptees, shown in the upper panel, reveal a strong intergenerational persistence in health, which is well approximated by a linear relationship (except at the very top of the distribution). The middle panel shows the graphs for the relation between child health and the health of the biological parents in the adoptee sample. The relation is almost equally strong as the one shown for the children raised by their biological parents. Finally, the figures in the bottom panel show the relation between the health status of the adopting parents and their children. The relation is slightly positive, but clearly weaker than for the biological parents.
Relationship between Percentile Rank of Child and Parental Hospitalization and Health Index for Nonadoptees and Adoptees
Notes: The figures show results from bivariate local linear kernel regressions using an Epanechnikov kernel and rule-of-thumb bandwidths. The shaded area represents the 95 percent confidence interval.
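For readers unfamiliar with the estimator behind these figures, the following is a minimal pure-Python sketch of a local linear regression with an Epanechnikov kernel. It is not the code used for Figure 2: the bandwidth is passed in by hand rather than chosen by a rule of thumb, no confidence interval is computed, and the function names are illustrative.

```python
def epanechnikov(u: float) -> float:
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_linear(x, y, x0: float, bandwidth: float) -> float:
    """Local linear estimate of E[y | x = x0].

    Fits a kernel-weighted least squares line of y on (x - x0) and
    returns its intercept, which is the fitted value at x0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        d = xi - x0
        w = epanechnikov(d / bandwidth)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    denom = s0 * s2 - s1 * s1
    if denom <= 0.0:
        raise ValueError("too few observations within the bandwidth")
    return (s2 * t0 - s1 * t1) / denom


# Hypothetical data: child rank as an exactly linear function of parental rank.
parent_rank = list(range(1, 101))
child_rank = [40.0 + 0.2 * p for p in parent_rank]
print(local_linear(parent_rank, child_rank, x0=50.0, bandwidth=10.0))
```

When the underlying relation is exactly linear, as in this toy example, the local linear fit reproduces the line, which is why an approximately straight curve in the figures can be read as a linear relationship.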
Table 6 reports OLS regression results from models using the hospitalization and health indexes as health measures for the child and parental generations. Columns 1 and 3 report the results for nonadoptees. As both measures are scaled in percentile ranks, we are estimating rank correlations. The estimates are somewhat larger for the hospitalization measure than for the health index, suggesting that a one-percentile increase in the parents’ relative health is associated with a 0.12–0.14 percentile increase in the child’s health. Hence, confirming findings from previous research, we find that the intergenerational transmission of health in the population is positive but smaller than what is typically found for outcomes such as education and income (see Black and Devereux 2011; Black et al. 2020).21
OLS Estimates of Associations between Percentile Rank of Parental and Child Lifetime Health Measured by Indexes Based on Hospitalization Data
The results for adoptees are reported in Columns 2 and 4. As opposed to the estimates for mortality, the coefficient estimate for the hospitalization measure of the adopting parents is statistically significantly different from zero at the 1 percent level. These results allow us to decompose the intergenerational association in health into pre- and post-birth influences. For the hospitalization measure, such a decomposition attributes about three-fourths of the association to pre-birth and one-fourth to post-birth influences. However, for the health index, the estimate for the adopting parents is smaller and again insignificantly different from zero. The latter result is in line with our findings for mortality above, which is not surprising given that the health index is partly based on cause-of-hospitalization specific mortality probabilities.
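The decomposition amounts to expressing each coefficient as a share of the sum of the two. A short Python sketch is given here; the coefficient values are hypothetical and chosen only so that the shares match the three-fourths and one-fourth split described above. They are not the estimates in Table 6.

```python
def decompose(beta_biological: float, beta_adopting: float) -> tuple[float, float]:
    """Return the (pre-birth, post-birth) shares of the summed association."""
    total = beta_biological + beta_adopting
    if total == 0.0:
        raise ValueError("the coefficients sum to zero; shares are undefined")
    return beta_biological / total, beta_adopting / total


# Hypothetical rank-rank coefficients for biological and adopting parents.
pre_share, post_share = decompose(0.09, 0.03)
print(round(pre_share, 2), round(post_share, 2))  # 0.75 0.25
```

Shares computed this way are only informative when both coefficients are estimated with reasonable precision, which is why no decomposition is attempted where the estimates are too noisy.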
In Online Appendix B, Table B5, we show results for mothers and fathers separately. We also present results from an extended sample where we include adoptees with an unknown biological father. The results show a slightly stronger association between biological mothers’ health and their children’s health, than between biological fathers’ health and their children’s health, both for adoptees and nonadoptees.22 This is similar to our results for mortality. When increasing the sample of adoptees to include adoptees with an unknown biological father, the sample size more than doubles, which improves the precision of our estimates. This results in the health index measures of the adopting parents becoming statistically significant (p-value: 0.0116). In Table B6 we present separate results for males and females. The results reveal that there is a significant association between adopting parents’ health index and the health of male, but not the female, adoptees.
C. Birth Outcomes
The mother’s health is likely to be at least partly reflected in the birth outcomes of her children (Currie and Moretti 2007). This is the first reason why we use the birth weights and APGAR scores of children as proxies for women’s health. The second reason is that birth weight is known to correlate strongly with later-life health. It can thus serve as an additional measure of the intergenerational transmission of health going into the third generation.23
Table 7 shows results from intergenerational regressions where we use two measures of the birth weight of the first-born child as a health measure of the mother: actual birth weight for the first-born child transformed into percentile scores to facilitate the interpretation and the probability of low birth weight (<2,500 grams), as well as an indicator for an APGAR score below 9 at five minutes.24 Because we have to restrict the sample to females, and additionally to those who give birth, the sample sizes for these regressions are approximately halved compared to those shown in the previous sections.
Associations between Percentile Rank of Parental Health Index and Firstborn Grandchild’s Health at Birth
We find highly statistically significant associations between the hospitalization measure and health index of the biological parents and all birth outcomes of their grandchildren in the sample of nonadoptees. In Online Appendix Table B7 we show that these associations remain very similar if we control for the health status of the child. Hence, there is only a very weak mediating role of the child’s health in explaining the associations between grandchild’s birth outcomes and parents’ health. Since the previous literature (see Almond and Currie 2011; Barker 1990, 1995) has shown that there is a strong association between birth outcomes, in particular birth weight, and adult health, these results contribute further support that there is a multigenerational association in health outcomes.25
Turning to the samples of adoptees, the results in Columns 1, 2, and 4 show a significant association between the health of the adopting grandparents and birth weight.
The estimates for low APGAR scores in the adoptee sample are in general too imprecise to give significant estimates. Only the health measure of the biological grandparents turned out significantly different from zero at the 5 percent level for this outcome measure. For all sets of results shown in Table 7, the precision of the estimates is not sufficient for a meaningful decomposition of the pre- and post-birth influences on health formation.
D. Sensitivity Analyses
1. External validity
As we discussed in Section II, a way of assessing the similarity between the adoptees and the rest of the population is to compare the sum of the estimates for biological and adoptive parents with those obtained for nonadoptees for the biological parents. The results in Tables 4–6 reveal that the sums of the estimates of adoptees are always larger than the population estimates. This is true for the estimates using premature death as outcome variable in particular.26
We do two different checks of the similarity between the adopted and nonadopted children. First, we compare the results for the decomposition of pre- versus post-birth factors for adoptees, with the intergenerational association for the nonadopted children of the mothers who gave up their first-born child for adoption, in the subsample of adoptees with at least one biological sibling reared by the biological mother. Second, we compare the causes of death for adoptees with those of nonadoptees and do the pre- and post-birth decomposition in the framework of a competing risk analysis.
The results shown in Table 8 from the first exercise for the two health indexes reveal two interesting results. First, the results for the importance of pre- versus post-birth factors are qualitatively very similar to the main ones in Table 6. Second, we now find that the sum of the estimates in the second column is very similar to the magnitude of the population-based estimate in the first column, for both health indexes. Hence, for our main health outcomes, our previous conclusions are unchanged.
Comparison of the Intergenerational Association in Health for the Nonadopted and Adopted Children with the Same Biological Mother
In Online Appendix Tables B8 and B9 we show mortality results for the sample of adoptees and nonadoptees with the same biological mothers. Our result that intergenerational association in mortality can be attributed to pre-birth factors is maintained in these samples. However, the sum of the estimates for the adoption sample is still much larger than the population-based estimates.
Online Appendix Tables B11 and B12 show the results for the competing risk analysis for different causes of death.27 The results show evidence of some important differences between adoptees and nonadoptees. For instance, the positive intergenerational association for the mortality measures between the adoptees and their biological parents is to a high degree due to death from external causes, circulatory diseases, and treatable conditions, whereas for nonadoptees, the positive intergenerational association is mostly due to associations in cancer and circulatory diseases. Hence, we posit that for mortality outcomes, at least based on premature death, external validity is limited, possibly because of differences in the causes of death between adopted and nonadopted children.
2. Parameter robustness with respect to selective placement
In Section III.D, we mentioned two reasons for selective placement of adoptees. First, some adoptions could be made by relatives of one of the biological parents. Second, there could be matching on characteristics known to the adoption agency but unknown to us as researchers. As discussed in Online Appendix A, Section A.4, the empirical importance of the first reason—adoptions by relatives—is likely to be very limited because of the rule prohibiting people with their own biological children from adopting. This rule, to a large extent, precluded parents and siblings of the biological parents from adopting.28
The second reason, matching, is possibly a more important mechanism. However, as reported in Table 3, health status (measured either as hospitalization or the health index) is not correlated for the adopting and biological parents, supporting the absence of selective placement on observable health characteristics. Note also that for mortality, where the results reported in Table 3 suggest a statistically significant positive selection (for some of the mortality measures), implying a positive bias for the estimates for the adopting parents, we do not observe any significant effects for adopting parents in the results. We will, nevertheless, test for parameter robustness with respect to matching based on a broad set of characteristics observable in the data.
A simple, and informal, way of empirically testing the assumption of independence between the biological and adopting parents is to include and exclude the observable parental characteristics to check the stability of the coefficient estimates of main interest (see Björklund, Lindahl, and Plug 2006). Table 9 reports results from such a robustness check for the two key results obtained in Section IV.B for our health measures based on hospitalization data. Column 1 shows the results for hospitalization for the biological parents when we include no other parental controls except indicators for the birth cohort of the biological parents, and Column 2 reports the results when we successively add variables for the observable characteristics of the adopting parents: hospitalization, years of education, cohort indicators, and regional indicators of both adopting parents. The estimates for the biological parents barely change with added controls. Column 3 shows the results for the adopting parents when we only include indicators for year of birth of the adopting parents in the model. Column 4 shows the results when we add variables measuring the characteristics of the biological parents. Columns 5–8 report the corresponding results for the health index. The estimates for the adoptive parents remain unchanged with these added controls. Hence, we conclude that there is no evidence that selective placement on observables affects our results.
Sensitivity Analyses among Adoptees
Another potential threat to the random assignment assumption is that adoptees may be nonrandomly assigned to adoptive families based on health endowments at birth. This is particularly troubling if, for example, adoptive parents with better health are somehow able to “pick out” healthier children. While we cannot directly test for this because we lack data on health at birth for the index cohorts, it is unlikely to happen for several reasons. First, the institutional set-up at the time was such that adoptive families were approached as soon as a candidate for adoption became available, and there was an excess of candidate adoptive parents relative to available children. Second, unhealthy infants that were surrendered by their biological mothers were not offered for adoption (see also Online Appendix A). Finally, Holmlund, Lindahl, and Plug (2008), using a sample of adoptees mostly born in the 1970s, show that there is no significant correlation between adoptive parents’ education and the gender of the adoptee and the biological mother’s age at birth, the only two pre-existing characteristics that are available in the data that could potentially proxy for infant health at birth.
3. Is there any evidence of “nature–nurture interactions”?
An advantage of the regression-based approach to decomposing pre- and post-birth associations is that the model can very easily be extended to allow for interactions between pre- and post-birth characteristics (“nature–nurture interactions”). This can be done by adding interaction terms between the health measures of the adoptive parents and the health measures of the biological parents.
The results are reported in Online Appendix Table B13. For the hospitalization index, the interaction between biological and adopting parents’ hospitalization is significantly negative, but the magnitude of the coefficient estimate is not very large. The negative interaction effect means that the adoptive parents’ health becomes relatively more important the lower the health of the biological parents. For an adopted child with biological parents of mean health, the adoptive parents would have to be in the 98th percentile of health in order for pre- and post-birth factors to be equally important for intergenerational health transmission, using the hospitalization measure.29 For the health index, the estimate is insignificantly different from zero. Taken together, the results suggest that the additive model provides a good approximation of the relation between child health and the health of the biological and adopting parents.
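The logic behind the 98th-percentile statement can be sketched as follows. The notation is an assumption introduced here for illustration (the paper's own equation is not reproduced in this section): $H^{c}$, $H^{bp}$, and $H^{ap}$ denote the percentile ranks of the child, the biological parents, and the adopting parents, and $\gamma$ is the interaction coefficient.

```latex
% Assumed additive model extended with an interaction term (illustrative notation)
H^{c} = \alpha + \beta_{b} H^{bp} + \beta_{a} H^{ap} + \gamma \, H^{bp} H^{ap} + \varepsilon

% Marginal associations with each parental health measure
\frac{\partial H^{c}}{\partial H^{bp}} = \beta_{b} + \gamma H^{ap},
\qquad
\frac{\partial H^{c}}{\partial H^{ap}} = \beta_{a} + \gamma H^{bp}

% Setting the two equal and solving for the adopting parents' rank
\beta_{b} + \gamma H^{ap} = \beta_{a} + \gamma H^{bp}
\quad\Longrightarrow\quad
H^{ap} = H^{bp} + \frac{\beta_{a} - \beta_{b}}{\gamma}
```

With $\beta_{b} > \beta_{a}$ and $\gamma < 0$, the last term is positive, so the adopting parents must rank above the biological parents for the two marginal associations to be equal. Under this sketch, the statement in the text corresponds to evaluating the expression at the mean rank of the biological parents.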
4. First-born adoptees and adoptees who move from their municipality of birth
A concern discussed in Section II is that the adoptee might still maintain significant contact with the biological parents even after adoption, and thus, the characteristics of the biological parents would have effects beyond the in-utero period. A related concern is that the biological parents may have pre-adoption contact with the adopting parents and are thereby able to intervene in the adoption process. One way of limiting the effect of this concern is to restrict the sample to include only those adoptees who move away from their municipality of birth after the adoption. The results shown in Online Appendix Table B14 are almost identical to the ones obtained for the entire sample shown in Table 6.
In the final sensitivity analysis, we restrict the sample to include first-born adoptees only. As discussed in Section II, it is more likely that first-born children are placed for adoption simply because they were not planned by their biological parents, and they are less likely to have any contact with their biological parents. Again, Table B14 shows similar estimates compared with the main results in Table 6, with only slightly stronger associations between the biological parents and the child for the hospitalization measure.
V. Conclusions
This study uses data on adoptees and decomposes the intergenerational persistence in health outcomes into pre-birth factors—reflected by the health outcome of the biological parents—and post-birth factors—reflected in the health outcomes of the adopting parents. Our results for mortality confirm previous findings—primarily obtained in epidemiology studies—that intergenerational persistence in longevity can be fully attributed to pre-birth factors. Our main contribution is to decompose the association in intergenerational health status. The results for the hospitalization measure suggest a significant effect of post-birth influences. However, a decomposition of the overall health persistence still attributes a much larger share (75–85 percent) to pre-birth factors captured by the health measure of the biological parents.
Our data do not allow us to distinguish between two possibilities: that, had we observed mortality over the entire life cycle, we would also have estimated a significant effect of post-birth factors; or that the health measure based on hospitalization data captures a broader aspect of health than mortality does. Although this is a limitation, our results are still able to reconcile the previous evidence and results obtained in the economics literature that health-related behaviors are affected by the adopting parents (see Sacerdote 2007, for drinking and smoking behavior, and Thompson 2014, on health problems related to environmental exposure), as well as the epidemiology literature on health-related behavior (see, for example, Kendler et al. 2012, 2015).
Similarly to other empirical studies, the results could depend to some extent on the social and physical environment of the country where the data were obtained. In particular, Sweden’s universal and practically free-of-charge healthcare system, low poverty rate, and, compared to most industrialized countries, small income differences, can be important in this context. One could argue that the genetic influences may be more important in such an environment. For example, Turkheimer et al. (2003) find that genetic differences are more important in IQ determination for high socioeconomic status (SES) children than for low SES ones, since high SES children are more equal on other IQ determinants. In the same spirit, one could argue that genetic differences are “all that is left,” or at least given a more prominent role, in more equal societies such as Sweden and Denmark. Following this line of argument, our results could be interpreted as upper bounds for the share of pre-birth influences on the intergenerational persistence in health outcomes in other countries.
Footnotes
The authors are grateful for comments from Orazio Attanasio, Gerard van den Berg, Richard Blundell, Dalton Conley, Gabriella Conti, Janet Currie, Hans Grönqvist, James Heckman, Krzysztof Karbownik, Magne Mogstad, Therese Nilsson, Robert Östling, Erik Plug, André Richter, Torsten Santavirta, Marianne Simonsen, Helena Svaleryd, Anthony Wray, and Björn Öckert, and two anonymous referees, as well as for those from participants at seminars at University College London, Uppsala University, University of Copenhagen, Aarhus University, NBER Summer Institute 2015, Nordic Summer Institute in Labor Economics, The Family and Education Workshop 2016, The Ce2 Workshop in Warsaw, Nordic Health Economists’ Study Group meeting 2015, and Essen Health conference. Evelina Björkegren (nee Lundberg) gratefully acknowledges financial support from Handelsbanken’s Research Foundations. Mikael Lindahl is the Torsten Söderberg research professor at the School of Business, Economics and Law, Gothenburg University and acknowledges financial support from the Torsten Söderberg and Ragnar Söderberg Foundations and the European Research Council [ERC starting grant 241161]. Mårten Palme gratefully acknowledges financial support from the Swedish Research Council. Emilia Simeonova gratefully acknowledges financial support from the Swedish Research Council and the National Science Foundation. The authors have no financial interest related to this project. The access to the data used in this article is restricted. Anyone who wishes to gain access for replication purposes can contact the corresponding author for assistance (Mikael.Lindahl{at}economics.gu.se).
Supplementary materials are freely available online at: http://uwpress.wisc.edu/journals/journals/jhr-supplementary.html
Mikael Lindahl https://orcid.org/0000-0003-0618-8035
Mårten Palme https://orcid.org/0000-0002-0867-6967
↵1. Intergenerational longevity associations are typically estimated to be positive but less than 0.15 (Beeton and Pearson 1899; Pearl 1931; Cohen 1964; Wyshak 1978; Iachine et al. 1998; Gavrilov and Gavrilova 2001), whereas intergenerational correlations in earnings and educational attainment differ between countries, but are rarely below 0.25 (Solon 1999; Black and Devereux 2011). Studies of intergenerational associations in overall health are quite rare (see, for example, Pascual and Cantarero 2009; Halliday, Mazumder, and Wong 2018; Andersen 2019), although there is a larger literature that has used infant and child health outcomes as measures of the health in the child generation (see, for example, Bhalotra and Rawlings 2011; Currie and Moretti 2007).
↵2. It is important to emphasize that a dominant role for pre-birth factors does not eliminate the role for policy, although it makes it more important to design policies that limit the role for genetic and prenatal environmental factors in transmitting health inequality across generations. A famous example is given in Goldberger (1979), criticizing heritability studies estimating the shares of variance in an outcome that are due to nature or nurture, where he makes the point that variation in eyesight that is due to genetic differences can be remedied by supplying eyeglasses. Hence, finding nature to be dominant in explaining variation in an outcome does not mean that policies necessarily are ineffective. A central issue in this discussion is the importance of nature–nurture interactions, something we test for later in the paper.
↵3. In addition, there is a small literature on mortality using data on adoptees and their biological siblings (such as Petersen, Kragh Andersen, and Sørensen 2008) that essentially confirms the findings from the intergenerational adoption studies. A separate but related branch of research examines genetic influences on longevity using samples of twins (see, for example, Herskind et al. 1996; Hjelmborg et al. 2006). For a discussion about the advantages and disadvantages of the twins and adoption approaches to inferring “nature” and “nurture” effects, with a focus on economic and social outcomes, see Sacerdote (2011).
↵4. Zöller et al. (2014) studies prostate, breast, and colorectal cancer. Sundquist et al. (2011) studies coronary heart disease, Zöller et al. (2015) chronic obstructive pulmonary disease, von Borczyskowski et al. (2011) suicides, Kendler et al. (2012) drug abuse, and Kendler et al. (2015) alcohol use disorders.
↵5. Sacerdote (2007) has information on approximately 100 biological parents. This information is not used in the main analysis of his study.
↵6. In Thompson (2014), the outcomes are measured for children, on average, at age ten and in Sacerdote (2007), when those in the child generation are, on average, age 28.
↵7. Our strategy of separating pre- and post-birth effects closely follows Björklund, Lindahl, and Plug (2006), who estimated their relative importance for the intergenerational transmission of education and income.
↵8. There can be various reasons for nature–nurture interactions to be present. One of these is epigenetic mechanisms: environmental factors can affect gene expression in that genes are present, but are either “switched on” or “switched off” depending on environmental factors.
↵9. The lower cohort restriction is motivated by data availability and the upper one by the fact that domestic adoptions in Sweden decreased rapidly in the late 1960s.
↵10. In Online Appendix B, Table B1, we display the sample restrictions for the mortality analysis sample.
↵11. Figure A1 in Online Appendix A shows the ratio of adopted children in birth cohorts 1940–1967, which documents the same trends.
↵12. This implies that only individuals that have survived until 1987 have a health measure based on the hospitalization data.
↵13. The first two follow Cesarini et al. (2016).
↵14. We use the first two digits in the ICD10 diagnosis codes (one letter and one number), which constitute approximately 200 different categories. We do this for the first two diagnoses for each hospital stay. In addition, we include linear variables for the number of hospital stays and an indicator of more than a week in hospital care. We control for gender and stratify on birth cohort.
↵15. This means that we are not able to include individuals born before 1973 in the third generation in the analysis.
↵16. The APGAR score is a summary measure recorded by the midwife very shortly after birth and at given times, with the purpose of summarizing the health status of newborn children. It uses five different criteria: complexion, pulse rate, reflex irritability grimace, activity, and respiratory effort. It is named as a backronym of the included indicators (appearance, pulse, grimace, activity, and respiration), as well as after the anesthesiologist Virginia Apgar, who suggested the score in 1952.
↵17. In the adoptee sample, biological parents are, on average, younger than adoptive parents. Biological mothers are on average 24 years old at birth, and adoptive mothers are on average 34 years old.
↵18. For instance, Björklund, Lindahl, and Plug (2006) find a correlation of 0.14 for the mother’s and father’s years of schooling for children born 1962–1966.
↵19. Probit estimates for intergenerational persistence in early deaths show very similar results.
↵20. For nonadoptees the corresponding figure is 0.24.
↵21. The relatively smaller intergenerational health associations, compared to intergenerational schooling and income associations, found here are in line with the results in the few other studies that report results for these different outcomes. In Halliday, Mazumder, and Wong (2018) the authors use PSID and estimate intergenerational rank correlations in health outcomes for the United States, using self-reported health averaged over the lifetime. They find rank correlations that are almost twice as large (0.26) as our estimates for Sweden. However, this finding is in line with differences of income persistence estimates for the United States and Sweden, which can differ by up to as much as a factor of two. These patterns are also in line with recent intergenerational estimates by Andersen (2019) for Denmark. Halliday and Mazumder (2017) and Mazumder (2011) also find smaller sibling correlations for health status than for education and family income for the United States.
↵22. This is in line with findings for some other outcomes, for example, years of schooling (see Björklund, Lindahl, and Plug 2006), which the authors interpret as evidence of the importance of prenatal environment for the intergenerational association, since biological fathers of adopted children often are absent during the pregnancy of the mothers of children that later are placed for adoption.
23. Selection into giving birth is likely driven by maternal health status, so that healthier women are more likely to conceive and deliver live children. It is, however, not obvious that this form of selection would bias our results if we make inference to the population of all women, or, if it does, in what direction. We therefore confine ourselves to making inference to the population of women who give birth in the given time window.
24. Six percent of children have an APGAR score at five minutes that is below 9. We choose APGAR below 9 instead of below 10 to follow the practice in medical research of looking at the lower part of the APGAR distribution and because these estimates are more precise. Estimates are qualitatively similar for APGAR below 10.
25. This confirms previous findings on longevity (Piraino et al. 2014; Maystadt and Migali 2017) and mental health (Johnston, Schurer, and Shields 2013).
26. We note that Sørensen et al. (1988) and Petersen, Kragh Andersen, and Sørensen (2005) present results for premature death using a sample of Danish adoptees, but they did not perform population-based estimations. Hence, we do not know the degree of external validity of their adoption estimates.
27. Online Appendix Table B10 shows the ICD codes for the disease categories used in the competing risk analysis.
28. As further discussed in Online Appendix A, Section A.4, Nordlöf (2001) estimated these adoptions to be around 1 percent of the total number of adoptions in the Stockholm area. Brandén, Lindahl, and Öckert (2018) confirm this conclusion, although their estimate of the share of adoptions by close relatives is slightly higher at 5.4 percent, applicable to the whole country. They are also able to eliminate those adopted by close relatives from their sample, and they find that the correlation in years of schooling between (unrelated) adoptive and biological parents of adoptees remains virtually unchanged.
29. This can be seen by equating two first derivatives of an extended version of Equation 2 where an interaction term is added, […]. First, take the derivative with respect to […]. Second, take the derivative with respect to […]. Third, equate the two derivatives and set […] equal to the mean in the adoption sample (44.96 according to Table 2). Fourth, solve for […], which gives 98.5.
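The four steps in note 29 can be sketched in symbols. The notation below is assumed for illustration only and is not taken from the paper: $H^{c}$ denotes the child's health measure, $H^{bp}$ and $H^{ap}$ the corresponding measures for the biological and adoptive parents, and $\beta_3$ the coefficient on an interaction between the two parental measures. Which of the two variables is held at the sample mean is likewise an assumption.

```latex
% Assumed form of the extended Equation 2 (hypothetical notation)
H^{c} = \alpha + \beta_1 H^{bp} + \beta_2 H^{ap} + \beta_3 H^{bp} H^{ap} + \varepsilon

% Steps 1 and 2: the two first derivatives
\frac{\partial H^{c}}{\partial H^{bp}} = \beta_1 + \beta_3 H^{ap},
\qquad
\frac{\partial H^{c}}{\partial H^{ap}} = \beta_2 + \beta_3 H^{bp}

% Step 3: equate the derivatives and fix H^{bp} at its sample mean \bar{H}^{bp}
\beta_1 + \beta_3 H^{ap} = \beta_2 + \beta_3 \bar{H}^{bp}

% Step 4: solve for the remaining variable (requires \beta_3 \neq 0)
H^{ap} = \bar{H}^{bp} + \frac{\beta_2 - \beta_1}{\beta_3}
```

Under these assumptions, inserting the sample mean of 44.96 and the estimated coefficients into the last line would yield the value reported in the note. If the roles of the two variables are reversed in the original, the same algebra applies with the superscripts swapped.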
- Received March 2018.
- Accepted October 2019.