Abstract
Research has shown a strong connection between birth weight and future outcomes. We ask how health problems after birth affect outcomes using data from public health insurance records for 50,000 children born between 1979 and 1987 in the Canadian province of Manitoba. We compare children to siblings born an average of three years apart. We find that health problems in early childhood are significant predictors of young adult outcomes. Early physical health problems are linked to outcomes primarily because they predict later health. Early mental health problems have additional predictive power even conditional on future health and health at birth.
I. Introduction
A large literature has established that low birth-weight babies are more likely to suffer various deficits, including lower average educational attainment. But prior research has not asked how poor child health after birth affects long-term outcomes. We provide a first look at this question using a unique administrative data set based on public health insurance records from the Canadian province of Manitoba. The data combine information from birth records, hospitalizations, and ambulatory physician visits with information from other provincial registers about educational outcomes and use of social assistance. The information about health and outcomes is much more complete, and in many ways more accurate, than what is typically available in survey data. And because Canada has universal public health insurance, this study sheds light on the consequences of disparities in child health in a setting that abstracts from differences in access to insurance coverage.
We follow 50,000 children and their siblings who were born in Manitoba between 1979 and 1987, until 2006, when they are young adults. We compare siblings with different childhood health problems conditional on health at birth. We are able to compare the impacts of health problems at different ages, and to examine the consequences of different types of health problems including asthma, major injuries, externalizing mental health conditions, and other major conditions.
Our results suggest that many physical health problems in early life are significant predictors of future adult outcomes. But this is largely because poor health in childhood predicts poor health in young adulthood. Short-term health events generally have little long-run impact. Mental health problems are a different story—we find that diagnoses of attention deficit hyperactivity disorder (ADHD) or conduct disorder at school entry are significant predictors of future outcomes whether or not future health problems occur. We conclude that poor health in childhood may be a significant source of socioeconomic disparities in adulthood.
II. Background
There is a large literature linking low birth weight to lower average scores on a variety of tests of intellectual and social development (see for example, Breslau et al. 1994; Brooks-Gunn, Klebanov, and Duncan 1996; Currie and Hyson 1999). Several recent studies use sibling comparisons to assess the relationship between low birth weight and future outcomes (Conley and Bennett 2000; Johnson and Schoeni 2007; Lawlor et al. 2006; Black, Devereux, and Salvanes 2005; Royer 2005; Currie and Moretti 2007). Oreopoulos et al. (2008) examine the Manitoba data used in this study and, consistent with the other studies, find that siblings of lower birth weight have worse outcomes. They did not, however, have any data on health after birth.
In fact, very few studies have examined the effect of child health after birth on future outcomes.1 Case, Fertig, and Paxson (2005) use data from the 1958 British birth cohort study and show that children who suffered chronic conditions as children had lower educational attainment, wages, and employment probabilities than other children.
Smith (2009) examines the long-term effects of child health using a retrospective health measure from the Panel Study of Income Dynamics (PSID). In 1999, the 25–47-year-old adult children of PSID respondents were asked whether their health when they were younger than 16 was excellent, very good, good, fair, or poor. In models with sibling fixed effects, Smith finds significant negative effects of poor overall health status in childhood on earnings.
Case and Paxson (2006) treat adult height as an indicator of childhood health, and show that much of the wage premium associated with adult height can be explained by children’s cognitive test scores. They interpret their results as evidence that early child health affects cognition, which in turn affects earnings.
Our study improves on this literature in several ways. First, we have a continuous measure of health that is taken from medical records and covers the child’s entire life. This is an improvement over retrospective measures, or the snapshots that are available in the British Cohort Study. Second, we are able to compare children to their own siblings, in order to control for fixed aspects of family background that may affect both health during childhood and future outcomes.
III. Data
Our main source of data is records that are routinely collected through the administration of Manitoba’s public health insurance system. These records include enrollment files, physician claims, and hospital claims for every person in Manitoba. These data are matched to administrative records on educational attainment and social-assistance (welfare) takeup and use.
The registry contains information on 96 percent of all children born in Manitoba over the sample period and tracks 99 percent of the original sample conditional on remaining in the province until June of their 18th year.2 We restrict our sample to families with more than one child born between 1979 and 1987 (excluding 1983, as we are unable to match this cohort to educational information). We track outcomes for these children through 2006. Oreopoulos et al. (2008) show that the sibling cohort and entire cohort of Manitoba births over this period are quite similar. We also exclude children who ever have a diagnosis of mental retardation. Further details on the construction of the data set are available in the data appendix.
Because the data set includes all hospitalizations and ambulatory visits, there are a very large number of potential health measures. Birth weight, congenital anomalies, and perinatal problems are obtained from hospital records and are used as measures of health at birth.3
In order to collapse the other available health measures in an objective and arm’s-length way, we use Adjusted Clinical Group (ACG) software developed by researchers at Johns Hopkins University (The Johns Hopkins University 2003).4 The software is designed to measure morbidity by creating constellations of diagnoses. Medical providers indicate diagnoses using International Classification of Disease codes (ICD9 or ICD10 codes depending on the year). The software groups 14,000 ICD codes into 32 groups (called Aggregated Diagnostic Groups or ADGs).5 Individuals are assigned an ADG code if they have been diagnosed with any of the ICD codes in the group in either an outpatient or hospital visit over the past year.6
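The assignment rule just described (a group is flagged if any diagnosis code belonging to it appears in a visit over the past year) can be illustrated with a short sketch. The mapping table, the group labels, and the 365-day window below are hypothetical placeholders chosen for illustration; they are not the ACG software's actual tables or logic.

```python
from datetime import date, timedelta

# Hypothetical toy mapping from ICD-9 code prefixes to group labels.
# These labels are placeholders, NOT the real ACG/ADG tables.
TOY_ICD_TO_GROUP = {"493": "GROUP_A", "314": "GROUP_B"}

def assign_groups(visits, as_of, window_days=365):
    """Return the set of group labels with at least one matching diagnosis
    in the window of `window_days` days ending on `as_of`.

    visits: iterable of (visit_date, icd_code) pairs, pooling outpatient
    and hospital contacts.
    """
    start = as_of - timedelta(days=window_days)
    groups = set()
    for visit_date, icd_code in visits:
        if start < visit_date <= as_of:
            group = TOY_ICD_TO_GROUP.get(icd_code[:3])
            if group is not None:
                groups.add(group)
    return groups

visits = [(date(1990, 3, 1), "493.9"), (date(1988, 1, 1), "314.0")]
flagged = assign_groups(visits, as_of=date(1990, 12, 31))
```

Only the first visit falls inside the one-year window, so only its group is flagged; the number of visits within the window does not matter.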
We use the ADG codes to construct several health measures. First, the software classifies some ADGs as major and some as minor for each age group, and we start by looking at its definition of major diagnoses for children. However, this definition excludes several diagnoses that are highly prevalent among children, and which are thought to have important effects: while the Johns Hopkins definition of major conditions includes acute, unstable mental health conditions such as psychosis, it excludes “stable” mental health conditions such as ADHD and conduct disorders, two of the most common mental health conditions among children. Also excluded are asthma and major injuries.7
These are potentially significant exclusions. Using large-scale national surveys of children from both the United States and Canada, Currie and Stabile (2006, 2009) show that mental health conditions in childhood are associated with lower future test scores and schooling attainments in both countries. Externalizing disorders such as ADHD and conduct disorders were found to have the largest effects on future outcomes.8
Injuries are the leading cause of death among children over one year of age in developed countries, notwithstanding a dramatic reduction in deaths due to injuries in the past 30 years (Glied 2001). Yet we have little information about the burden of morbidity caused by injuries among surviving children (Bonnie et al. 1999).
Asthma is the leading cause of school absence and pediatric hospitalizations in children, and one of the most common chronic conditions of childhood (U.S. Environmental Protection Agency 2006). The available evidence suggests, however, that when properly managed, asthma may have little impact on children’s functioning or academic achievement, so it is not clear that asthma in childhood should be expected to have consistent effects on young adult outcomes (Annett et al. 2000; Gutstadt et al. 1989).
In view of these literatures, we examine the effects of asthma, major injuries, two specific stable mental health problems (ADHD and conduct disorders), as well as the number of major health conditions as classified by the ADG software. In each case the measure is constructed to cover a specific age range starting from the date of birth of the child. So, for example, we define a child as having a major injury between ages 0–3 if the child has a diagnosis of a major injury at any point between birth and their fourth birthday. We construct similar measures for the age ranges 4–8, 9–13, and 14–18. We chose these age ranges to correspond to important stages of childhood: preschool years, early elementary school, early adolescence, and the late teen years.
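As an illustration of how such age-range indicators might be built from dated diagnoses, the following is a minimal sketch. The function names and the input layout (a birth date plus a list of diagnosis dates for one condition) are assumptions made for the example, not a description of the actual data construction.

```python
from datetime import date

# Age ranges in completed years, as described in the text.
AGE_RANGES = [(0, 3), (4, 8), (9, 13), (14, 18)]

def age_in_years(birth, event):
    """Completed years of age on the event date."""
    years = event.year - birth.year
    if (event.month, event.day) < (birth.month, birth.day):
        years -= 1
    return years

def age_range_indicators(birth, diagnosis_dates):
    """Return a 0/1 flag per age range: 1 if any diagnosis falls in it."""
    flags = {f"{lo}-{hi}": 0 for lo, hi in AGE_RANGES}
    for d in diagnosis_dates:
        age = age_in_years(birth, d)
        for lo, hi in AGE_RANGES:
            if lo <= age <= hi:
                flags[f"{lo}-{hi}"] = 1
    return flags

# A diagnosis one day before the fourth birthday counts toward ages 0-3.
example = age_range_indicators(
    date(1980, 5, 10), [date(1984, 5, 9), date(1990, 1, 1)]
)
```

In this example the first diagnosis falls just before the fourth birthday and the second at age nine, so the flags for 0–3 and 9–13 are set and the other two are not.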
Table 1 shows the means of our measures for each age range. Because the first age range is one year smaller than the others, the estimates are prorated so that they all apply to a five-year interval. The fraction of children with a medical contact for asthma ranges from 8 percent among 0–3-year-olds, to 14.4 percent among 9–13-year-olds. These numbers compare very closely with the best available evidence of asthma prevalence for the United States, which suggests that 13.1 percent of 0–17-year-old children have ever been told by a doctor that they have asthma (Bloom and Cohen 2007).
Major injuries are clearly the most common reason for seeking medical attention, with around 40 percent of children having at least one visit for a major injury over each age interval. U.S. data suggest that in 2000, 11.9 percent of children younger than 10, and 17.9 percent of children 10–19 received medical attention for an injury, which suggests that over a four- or five-year period, a rate of 40 percent is not unreasonable.
In comparison, contacts for externalizing mental health problems are not nearly as common: 3–4 percent of children receive a medical contact for ADHD or conduct disorders over each period. U.S. rates of ADHD prevalence are higher than this. For example, Froehlich et al. (2007) report that 8.7 percent of 8–15-year-old children in the National Health Interview Survey have been told by a doctor that they have ADHD. But there is evidence that Canadian rates of mental health diagnosis are lower than in the United States. Currie and Stabile (2009) found that 9.3 percent of children in the U.S. National Longitudinal Survey of Youth were being treated for mental health conditions compared to 4.5 percent in a similar Canadian survey. This disparity in the rates suggests that Canadian children who are diagnosed may display more severe behavior problems than treated children in the United States, a comparison that should be borne in mind when interpreting the results below.
About 15 percent of children 0–13 have other major conditions. Although it is relatively common to have a medical contact for a major health problem, most individual health problems have low prevalence. For example, the 629 children who had a medical contact related to hearing loss when they were 0–3 represent only 1.2 percent of the sample. This is the main reason that we group these major health conditions together. Still, 40–50 percent of young children categorized as having a major condition have hearing or vision problems, and it would not be surprising if these problems had effects on learning.
The average number of major health conditions is a little over 0.2 for children aged 0–13 (vs. 0.335 for 14–18-year-olds), which suggests that most children with a major health condition have only one such condition. A small number of children, however, have multiple conditions, with the maximum being 10–12 for 0–13-year-olds (14 for 14–18-year-olds). Finally, the table shows the effects of limiting our attention to major conditions. While the average number of major conditions identified is far less than one, the average child has five different major and minor conditions diagnosed over a five-year period.
Table 2 explores the temporal pattern of health problems for the children in our sample. For example, a child is assigned the pattern 0000 if he or she did not have a diagnosis for a particular health condition in any age range. The fractions ever diagnosed with asthma, ADHD/conduct disorder, or major injuries are 28.31, 10.18, and 82.06 percent, respectively, while 47.36 percent ever have a major condition. Relatively small numbers of children have a diagnosis related to the same condition in each period. However, 5.58 percent of sample children have a major injury in every period.
The variation in our data over time periods suggests that in each period, some children without initial health conditions develop them, and other children with health conditions recover from them. Such variation is essential if we are to use the data to examine the impact of health conditions at different ages.
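The pattern coding can be sketched as follows, assuming each child is represented by four 0/1 flags ordered by age range (0–3, 4–8, 9–13, 14–18). The function names and the toy data are hypothetical and serve only to show how pattern shares and the fraction ever diagnosed relate to one another.

```python
from collections import Counter

def temporal_pattern(flags):
    """Join four 0/1 age-range flags into a string such as '0100'."""
    return "".join(str(int(bool(f))) for f in flags)

def pattern_shares(children):
    """Percent of children with each pattern, and percent ever diagnosed."""
    counts = Counter(temporal_pattern(f) for f in children)
    n = len(children)
    shares = {p: 100.0 * c / n for p, c in counts.items()}
    # "Ever diagnosed" is everyone who is not in the all-zero pattern.
    ever = 100.0 * (n - counts.get("0000", 0)) / n
    return shares, ever

# Toy data: two children never diagnosed, one only at 0-3, one in every period.
toy = [(0, 0, 0, 0), (1, 0, 0, 0), (1, 1, 1, 1), (0, 0, 0, 0)]
shares, ever = pattern_shares(toy)
```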
The outcome variables we examine are created by linking the healthcare registry information to administrative data on education and social assistance. We use education enrollment records to determine whether a student has attained grade 12 by age 17. This measure is available for all birth cohorts. Overall, about 70 percent of children are in grade 12 by age 17. Not attaining grade 12 by this age could indicate that a student entered school late, has been held back in a grade at least once, or has dropped out.
We also have information from provincial language arts standards tests taken in grade 12. These tests contribute 30 percent to the students’ final course grade. Individuals pass the language arts test by scoring 50 percent or more on a comprehensive exam. The score on the test is normalized to have a mean of zero and a standard deviation of one for the entire population of students in Manitoba. Within each birth cohort, some test scores are missing and we have imputed scores for these children based on the reasons for the failure to write the test, as discussed further in the data appendix.
Students in the province can select into one of several math tracks. In each year we classify high school courses into college versus noncollege preparatory mathematics based on the difficulty of the course and the course material.9 We calculate that 22 percent of the sample took college-preparatory math courses.
Finally, the sample of Manitoba residents is matched to monthly social assistance records (the provincial welfare program). Residents become age-eligible to participate in welfare in their own right (rather than as a dependent child) on their 18th birthday. Our youngest birth cohort can be followed for only 1.25 years after the age of 18. While the older cohorts can be tracked for longer, we define our social assistance exposure window to be a consistent 1.25 years for each cohort (or 70 weeks). Using this exposure window, 6 percent of our sample went on social assistance in the 70 weeks after they became eligible on their 18th birthday.
The administrative records provide limited information on the characteristics of the mother at the birth of the child, and on the number of children in the family. We control for mother’s marital status, age, child gender, child birth order, the number of children in the family, and the child’s birth year. We use 2004 as the fixed point to determine family size and the birth order of the child. This year, many years after the final birth cohort used in the analysis, 1987, was chosen in order to try to ensure that families were past the childbearing phase. Unfortunately, there is no information about the family’s income or the mother’s education.
IV. Conceptual Framework
A growing body of research suggests that adverse conditions in early childhood may have particularly negative long-term effects. Cunha and Heckman (2008, forthcoming) hypothesize that this is because health human capital is complementary to skills and “skill begets skill,” so that children who suffer early disadvantages may fall behind and never catch up. On the other hand, to the extent that children are resilient and recover, one might expect more recent health conditions to have greater impacts on current outcomes. If both mechanisms are at work, then one might expect to find that both health insults in early childhood and recent health problems have particularly negative impacts on young adult outcomes.
This section presents a barebones model that captures these ideas and shows that early health shocks may have either larger or smaller effects than later ones, even in the simplest models:
(1) $Y_t = H_t^{\alpha} C_t^{\beta}$
(2) $c_t = b_1 c_{t-1} + b_2 h_{t-1}$
(3) $h_t = \gamma h_{t-1} + u_t$
where Yt is a young adult outcome, Ht is contemporaneous health, Ct is contemporaneous cognitive ability, and ht and ct are log health and log cognitive ability, respectively. We assume that outcomes are produced using inputs of health and cognitive ability, and that cognitive ability depends on ability last period and also on health last period. Finally, health depends on health last period, and is subject to random problems, ut. Note that in principle, cognitive ability can also be subject to random shocks, but including them would only complicate this illustrative model. Moreover, we cannot observe shocks to cognitive ability, but only shocks to health in our data.
Solving the model recursively yields an equation of the form:
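(4) $\ln Y_t = (\delta_1 c_{t-4} + \delta_2 h_{t-4}) + \delta_3 u_t + \delta_4 u_{t-1} + \delta_5 u_{t-2} + \delta_6 u_{t-3}$
where (δ1ct–4 + δ2ht–4) represents both cognitive and health endowments at birth (both will be assumed to be captured by our measures of health at birth in combination with the sibling fixed effects), ut is a contemporaneous health problem, and ut–3 is a health problem in the first three years of life. The coefficients δ3 to δ6 are given by:
$\delta_3 = \alpha$
$\delta_4 = \alpha\gamma + \beta b_2$
$\delta_5 = \alpha\gamma^2 + \beta b_2 (\gamma + b_1)$
$\delta_6 = \alpha\gamma^3 + \beta b_2 (\gamma^2 + \gamma b_1 + b_1^2)$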
The model yields several interesting special cases:
Case 1. b2 = 0, so that cognition does not depend on health. If γ < 1, then health depreciates, and the effects of health problems die out over time. In this case, more recent health problems always have a larger effect than problems further in the past.
Case 2. b2 > 0 and γ = 1. Now early health problems matter more than later health problems. The reason is that early health problems affect the development of cognitive ability through multiple periods.
Case 3. b2 > 0 and γ < 1. It is then possible to generate many interesting patterns. For example, if α = β = 0.5; b1 = 1.5; b2 = 0.2; γ = 0.7, then δ3 = 0.5, δ4 = 0.45, δ5 = 0.47, δ6 = 0.56 so that health problems in the first years of life and contemporaneous health problems matter most.
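The three cases can be checked numerically by tracing a one-time unit health shock through the model. The sketch below assumes the timing described in the text (log cognition depends on last period's log cognition and log health; log health depends on last period's log health plus a shock; the log outcome is α times log health plus β times log cognition). The function name is hypothetical.

```python
def shock_coefficients(alpha, beta, b1, b2, gamma, lags=4):
    """Effect on the log outcome at time t of a unit health shock that
    occurred 0, 1, ..., lags-1 periods earlier."""
    coeffs = []
    for lag in range(lags):
        h, c = 1.0, 0.0  # the shock moves log health; log cognition reacts next period
        for _ in range(lag):
            # Both updates use last period's values (tuple assignment).
            h, c = gamma * h, b1 * c + b2 * h
        coeffs.append(alpha * h + beta * c)
    return coeffs

# Case 3 parameters from the text.
case3 = shock_coefficients(alpha=0.5, beta=0.5, b1=1.5, b2=0.2, gamma=0.7)
# Case 1: cognition does not depend on health, so effects decay with the lag.
case1 = shock_coefficients(alpha=0.5, beta=0.5, b1=1.5, b2=0.0, gamma=0.7)
# Case 2: no health depreciation, so earlier shocks matter more.
case2 = shock_coefficients(alpha=0.5, beta=0.5, b1=1.0, b2=0.2, gamma=1.0)
```

Under these assumptions the Case 3 parameters give approximately 0.50, 0.45, 0.465, and 0.55 for shocks at lags zero through three. The first three match the values in the text after rounding; the last is slightly below the reported 0.56, which may reflect rounding in the original calculation. The qualitative pattern is the same: the contemporaneous shock and the earliest shock have the largest coefficients.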
Hence, the impact of health shocks may be bigger or smaller for health shocks further in the past depending on which of the two mechanisms highlighted above dominates: On the one hand, if early health problems affect early cognitive ability, which in turn affects future cognitive ability, then early health problems have cumulative effects. On the other hand, people tend to recover from health problems over time, so that the main effect of the health problem per se diminishes with time. The takeaway message is that the overall pattern of effects on outcomes is an empirical matter.
The empirical analogue of Equation 4 is given by:
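(5) $\text{OUTCOME} = a + bX + c\,\text{HEALTH}_0 + d\,\{\text{HEALTH}_{0\text{–}3}, \text{HEALTH}_{4\text{–}8}, \text{HEALTH}_{9\text{–}13}, \text{HEALTH}_{14\text{–}18}\} + e$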
where OUTCOME is one of the young adult outcomes described above, X is a vector of controls including marital status, sex of the child, and mother’s age at birth, dummy variables for birth order of the child, family size, and year of birth indicators, HEALTH0 are measures of health at birth and {HEALTH0–3, HEALTH4–8, HEALTH9–13, HEALTH14–18} is a vector of age specific health problems. We use a number of different measures of health, as described above.
Estimation of Equation 5 could be biased by omitted characteristics of families, including characteristics that affect young adult outcomes, the health of children in the family and the propensity of the family to seek medical care.
Hence, our main focus is on models of the following form:
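(6) $\text{OUTCOME} = a + bX + c\,\text{HEALTH}_0 + d\,\{\text{HEALTH}_{0\text{–}3}, \text{HEALTH}_{4\text{–}8}, \text{HEALTH}_{9\text{–}13}, \text{HEALTH}_{14\text{–}18}\} + f\,\text{MOTHER} + e$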
where MOTHER is an indicator for each mother in the data. The inclusion of mother fixed effects will help us to control for many unobserved family background characteristics that may be correlated with the propensity to use medical care, health status, and with young adult outcomes. The fact that we observe families over a relatively short period of time is helpful, as siblings will be less likely to be exposed to different environments over a short time than over a long time. The mean difference in age between two siblings is only three years, and 74 percent of children are born less than four years apart. We estimate all of our models using linear probability models for dichotomous outcomes both for ease of interpretation, and for ease of including fixed effects.
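One way to see what the mother fixed effects do is to note that, in a linear model, including an indicator for each mother is equivalent to subtracting each family's mean from every variable before running the regression. The following is a minimal sketch for a single health indicator and no other controls; the function names and toy data are hypothetical, and the actual specification includes the full set of controls and health measures.

```python
from collections import defaultdict

def demean_within(values, groups):
    """Subtract each group's mean from its members' values."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def within_mother_slope(outcome, health, mother_ids):
    """Slope of outcome on one health indicator, using only
    within-family (sibling) variation."""
    y = demean_within(outcome, mother_ids)
    x = demean_within(health, mother_ids)
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("no within-family variation in the health measure")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx

# Toy data: each family has its own baseline; the health problem lowers
# the outcome by 0.05 for the affected sibling.
mothers = ["m1", "m1", "m2", "m2", "m3", "m3"]
health = [0, 1, 1, 0, 0, 0]
family_effect = {"m1": 0.9, "m2": 0.5, "m3": 0.7}
outcome = [family_effect[m] - 0.05 * h for m, h in zip(mothers, health)]
slope = within_mother_slope(outcome, health, mothers)
```

In the toy data the family baselines differ widely, but the within-family slope recovers the 0.05 reduction. The third family, whose siblings have the same health measure, contributes nothing to the estimate, which is why within-family variation in health is needed for identification.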
V. Estimation Results
In order for models that include family fixed effects to be informative there must be variation within families in both the health problems children experience and the outcomes observed later in life. To explore the extent of this variation we report the average difference in each outcome for families with children who have different health measures in each age group. The results are reported in Table 3.
The first column of Table 3 reports the number of siblings with different health measures at each age. So, for example, there were 3,177 pairs of siblings with differences in whether they were ever treated for asthma between ages 0 and 3. The remaining columns report the average difference in outcomes for these pairs. In each case the difference is reported as the outcome for the child with the worse health measure minus the outcome for the child with the better health measure. Many of the mean differences reported in Table 3 are significantly different from zero. They are all of the anticipated sign except for the case of asthma at age 0–3, which is negatively related to future welfare utilization. Thus, this initial exploration of the data shows that there is a good deal of variation in health between siblings, and that siblings in worse health generally suffer worse outcomes. However, these simple comparisons do not control for health at birth, multiple health problems, or health problems at different ages. Hence, we turn to estimation of Equation 6.
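The sign convention for these pair differences can be made concrete with a short sketch. It assumes each sibling pair is given as two (health, outcome) tuples, with a larger health value meaning worse health; the function name and the toy data are hypothetical.

```python
def discordant_pair_differences(pairs):
    """Count sibling pairs whose health measures differ and return the mean
    outcome difference, computed as worse-health sibling minus
    better-health sibling."""
    diffs = []
    for (h_a, y_a), (h_b, y_b) in pairs:
        if h_a == h_b:
            continue  # concordant pairs carry no within-pair health contrast
        diffs.append(y_a - y_b if h_a > h_b else y_b - y_a)
    n = len(diffs)
    return n, (sum(diffs) / n if n else float("nan"))

# Toy pairs: (health indicator, in grade 12 by age 17).
pairs = [((1, 0), (0, 1)), ((0, 1), (1, 1)), ((0, 0), (0, 1))]
n_pairs, mean_diff = discordant_pair_differences(pairs)
```

Here two of the three pairs differ in health; the affected sibling fares worse in one pair and the same in the other, so the mean difference is negative.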
Table 4 shows estimates of a model similar to Equation 6 except that we include only measures of health at birth and health at 0–3. There has been a good deal of popular discussion of the idea that the earliest ages are a uniquely vulnerable period, so it is of interest to see if health conditions at these ages are associated with long-term outcomes.
Table 4 shows that, consistent with other studies, health at birth has significant effects on all four of our outcomes. Turning to health after birth, we find, perhaps surprisingly, that asthma at 0–3 is not predictive of adverse outcomes, while major injury is significant only in the regression for Grade 12 by 17. The most powerful and consistent predictors of poor outcomes are mental health problems and the number of other major health conditions.
Relative to their own healthy siblings, children with an early diagnosis of ADHD or conduct disorders are 1.6 percentage points more likely to end up on welfare immediately after becoming eligible (on a baseline of 5.5 percent). They are also 4.4 percent less likely to be in grade 12 by age 17. An additional major condition increases the probability of being on welfare by 18 percent, reduces the probability of being in grade 12 by age 17 by 1 percent, reduces the probability of taking college preparatory math by 3 percent, and reduces the literacy score by 0.15 of a standard deviation. Hence, poor mental health at 0–3 and other major conditions are associated with significantly poorer outcomes.
But how much of the effect of early health conditions is due to the fact that they are predictive of later health conditions, and how much operates through another channel? Table 5 takes up this question by showing estimates from a model of the form of Equation 6, which includes health conditions in all four age ranges.
Table 5 suggests the following conclusions:
A diagnosis of ADHD and/or conduct disorder at school entry (4–8) or later is associated with much more negative outcomes. Controlling for such diagnoses at ages four and above greatly attenuates the estimated effect of mental health diagnoses at 0–3, presumably because it is difficult to accurately diagnose mental health problems at such a young age.
The number of other major conditions at 0–3 and 4–8 remains predictive of social assistance use even when later diagnoses of major conditions are controlled. However, when diagnoses at 14–18 are included in the model, diagnoses at 0–3 and 4–8 are no longer significantly associated with future school outcomes.10
Health at birth remains a significant predictor of young adult welfare use and schooling attainment even when many intervening health measures are included in the model.
On average, major injuries in childhood do not have lingering effects on educational attainment or welfare use, though major injuries in adolescence place teens at higher risk of poor outcomes.
Asthma in early childhood is not predictive of diminished future educational outcomes or receipt of social assistance in young adulthood, though there are some weak associations between adolescent asthma and outcomes, and asthma at 14–18 has a significant effect on future social assistance use.
One difficulty with the interpretation of Equation 6 is that conditions that occur early can persist over more periods than ones that appear late in our observational window. An alternative way to look at the data is to compare the effects of health conditions which apparently lasted for only one period. The largest single category in Table 2 is children who never have a given condition. The next largest groups are children who had a condition only once. We might compare children who have a given condition at age 0–3 and then recover, to those who have the same condition at a different age. Table 6 includes indicators for having a condition only at 0–3, having it only at 4–8, having it only at 9–13, and having it only at 14–18. A fifth indicator is added for children who have a condition over multiple periods. The left out group for each condition is children who never had the condition.
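The indicator construction described here can be sketched as a mapping from a child's four age-range flags to five mutually exclusive indicators, with children who never had the condition as the omitted group. The labels and function name are hypothetical.

```python
AGE_LABELS = ("0-3", "4-8", "9-13", "14-18")

def single_period_indicators(flags):
    """Map four 0/1 age-range flags to mutually exclusive indicators.
    A child with all-zero flags gets all-zero indicators (omitted group)."""
    flags = [int(bool(f)) for f in flags]
    out = {f"only_{label}": 0 for label in AGE_LABELS}
    out["multiple_periods"] = 0
    total = sum(flags)
    if total == 1:
        out[f"only_{AGE_LABELS[flags.index(1)]}"] = 1
    elif total > 1:
        out["multiple_periods"] = 1
    return out

only_early = single_period_indicators((1, 0, 0, 0))
persistent = single_period_indicators((0, 1, 1, 0))
never = single_period_indicators((0, 0, 0, 0))
```

At most one of the five indicators is set for any child, so each coefficient is read relative to siblings who never had the condition.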
Table 6 confirms that for physical health problems, it is problems that occur late in adolescence or that last for multiple periods that have effects on schooling and welfare participation. Children who sustain health problems early in life and then recover do not appear to suffer lasting effects on these outcomes. Again, mental health proves an exception. ADHD/conduct disorders at 4–8 are associated with poorer outcomes even if the children have no other contacts for mental health problems, although the effects are largest for mental health problems that persist over multiple age ranges.
VI. Extensions
While the ADG system is a well-established way to construct health measures from underlying ICD9/ICD10 codes, some arbitrariness in grouping diagnoses and in classifying diagnoses as major or minor is inevitable. One obvious extension involves breaking down the injury category because it is so large. We have tried to estimate the effect of serious head injuries (skull fractures and intracranial injuries including concussion) since there is a literature positing long-term effects of such injuries (Hawley et al. 2008). However, although 4–7 percent of children have such injuries in each age group, we did not find any statistically significant effects. Detecting such effects may require larger samples.
Our findings with regard to ADHD/conduct disorders raise the question of how other mental illnesses affect child outcomes. We have repeated our analyses with a somewhat broader measure of mental health conditions defined using all of the conditions included in ADG 24, “recurrent or persistent, stable psychosocial conditions,” the category that includes both ADHD and conduct disorder. This exercise produced similar though somewhat larger estimates of the effects of mental health conditions.
In our administrative data we have no information about measures of socioeconomic status (SES) such as parents’ income or education. SES-related gaps in maternal reports of child health status tend to grow with child age in both the United States and Canada (Case, Lubotsky, and Paxson 2002; Currie and Stabile 2003). And poor children receive more insults to their health than richer children, including more injuries, chronic conditions, and acute conditions (see for example, Newacheck 1994; Newacheck and Halfon 1998; Currie and Lin 2007; Case, Lubotsky, and Paxson 2002). Hence, it would be of great interest to break down our results by SES.
We constructed a measure of SES using income measured at the enumeration area level in the 1986 Canadian Census (an enumeration area is similar to a U.S. Census tract). Our outcome measures vary with this measure. However, our estimation results were inconclusive. If we divide our sample by SES, the point estimates of the effects of health problems tend to be larger for the lower SES group, but we cannot reject the null hypothesis that the effects are similar for high and low SES groups.
VII. Discussion
The strengths of our study include: a large sample; coverage of the population from birth to followup; a long followup period; and the use of objective health measures rather than self-reports or retrospective measures. The fact that we observe multiple children from the same family is an additional strength because sibling comparisons enable us to control statistically for many unobserved characteristics of families that could be related to health and propensity to seek care.
Still, several issues arise when using family fixed effects models. First, there may be characteristics of the individual child that are correlated with health conditions and also with future outcomes. While we cannot control for all such factors, it is important to note that we have unusually thorough controls for health at birth, and that the degree of serial correlation in many of our outcomes (such as injuries) is modest.
Second, parents may treat siblings differently, and illness could cause parents to behave differently toward the sick child if, for example, illness changes parental perceptions about the marginal return to investments in children. Such bias could go either way. If parents favor the stronger child, then the estimated effect of health problems will reflect not only physiological effects, but also the result of the parents’ smaller investments in the sick child. If, on the other hand, parents of children with health problems invest in order to compensate disadvantaged children, then sibling fixed effects will tend to understate the true physiological effect of health. Rosenzweig and Zhang (2006) argue that in China, parents favor the stronger child, perhaps because many Chinese still expect to be supported by their children in old age. In the United States, the available evidence suggests that investments are usually compensatory (Behrman, Pollak, and Taubman 1982, 1989; Ashenfelter and Rouse 1998; Ermisch and Francesconi 2000; McGarry 1999; McGarry and Schoeni 1995, 1997) so that sibling comparisons are likely to yield underestimates of the true physiological effects of health. We believe that Canada is more similar to the United States than to China, so that our estimates may tend to understate the effects of health problems.
Third, health conditions of one sibling may have an impact on the other sibling. To the extent that the household is disrupted by a child’s illness, we might expect both children to be negatively affected in which case our sibling comparisons will also tend to underestimate the effects of health conditions.
Fourth, sibling fixed effects models may exacerbate the effects of measurement error. This last point highlights a strength of our analysis in that our measures of health are much more accurate than those used in previous studies, and less likely to be subject to bias due to self/parental reports, forgetting, and so on.
Still, several limitations are implicit in our reliance on administrative health records. First, the outcomes that can be examined are limited by the availability of administrative data sets that can be merged to the health records. It would be very interesting to be able to measure adult earnings or employment status, but this information is currently unavailable.
Second, what we can observe is whether a child had any interaction with the healthcare system for a specific ADG over a four-year interval. It is always difficult to construct measures of underlying health from data about utilization of care. There are two reasons to believe, however, that the measures we construct are good proxies for underlying health status: First, these children are all fully insured. It is reasonable to expect that a seriously ill child who is fully insured will interact with the healthcare system at least once, and in fact, over 98 percent of our sample children have a contact with the medical system in each period. Hence, the use of data from a country with universal health insurance mitigates the concern about access to care (and hence about the relationship between health and utilization). Second, our measure is not affected by the number of visits, as long as the child sees the doctor at least once.
Our Canadian setting provides a data set that would be virtually impossible to replicate in the U.S. It would be difficult to find a U.S. insurer that had comprehensive data on a similarly sized sample of children from birth to young adulthood. And as argued above, universal access to care makes it defensible for us to approximate underlying health status with data based on healthcare utilization. Still, readers may wonder whether our results can be generalized to a U.S. setting. We believe that if health problems have long-term impacts in the Canadian setting, then they are likely to have even greater impacts in the United States, where access to care is an issue for many children. More specifically, it is possible, for example, that asthma could have more deleterious effects in the United States, given the large numbers of children with inadequate access to preventive care.
VIII. Conclusions
Our research offers several striking conclusions. First, both poor health at birth and early mental health problems are associated with poorer long-term outcomes, even conditional on future health outcomes. Second, physical health problems in early childhood are associated with poorer long-term outcomes but this appears to be because they predict poorer future health. Unless they persist over time, even serious early health problems have little association with future schooling attainment or reliance on welfare. This is true of major injuries as well as illness. Finally, it is notable that we find very little effect of childhood asthma on the outcomes we examine.
Our overall conclusion is that health problems in early childhood may be significant determinants of adult socioeconomic status, even in a country like Canada where all children have access to health insurance. Hence, prevention and better care for children who have early health problems could make a significant difference to their life prospects.
Appendix 1 Data
The province of Manitoba was chosen for this study because of the unique ability to link the sources of data used in this paper. With a population of 1.17 million, Manitoba has the fifth largest population among Canada’s provinces and territories. Within Canada, Manitoba has generally ranked in the mid-range of a series of indicators of health status, socioeconomics, and healthcare expenditures.
The data used in this study come from a number of sources. The birth data originate from Manitoba Health hospital records. The registry contains information on all births in Manitoba since 1970. Siblings are linked to mothers using hospital birth record information. The registry data allow us to specify the mother in all cases. Fathers are specified in 85 percent of cases. When an individual turns 18 years old, he or she receives his or her own family identification number. On marriage, a female receives the identification number of her husband. Both the mother’s identification number (an encrypted Personal Health Identification Number) and the family identification number are used to define siblings.11 Several checks on this algorithm as applied to the nine years of birth cohorts (looking at missing data, the number of children designated as having the same mother and father, and complicated blended families) have indicated it to be highly accurate.
Information on the provincial language arts test is taken from education enrollment records and linked to the provincial registry. Taken in Grade 12, these tests contribute 30 percent to the students’ final course grade. Individuals pass the language arts test by scoring 50 percent or more on a comprehensive exam. The test focuses on reading comprehension, exploring and expanding on ideas from texts, the management of ideas and information, and writing and editing skills. For each birth cohort, we record the test score in five percentage point categories (13 in total, with a residual 14th for students scoring between 0 and 35 percent) in the year that most students write the test. Within each birth cohort, approximately 40 percent of test scores are missing. We impute scores for missing students based on the reason for missing information (ranking them below the lowest scoring category among those who wrote the test).
The missing data categories, listed from highest to lowest rank, are: absent (about 1 percent of each birth cohort sample); in Grade 12 but not tested (about 8 percent); in Grade 11 or lower (about 19 percent); not enrolled (about 2 percent); and withdrawn from school (about 10 percent). For the entire sample, we therefore have 19 test score categories. Following methods discussed by Mosteller and Tukey (1977) and Willms (1986), we compute a standardized score for each individual by assuming an underlying logit distribution, which is divided into pieces according to the percentage of cohort members in each category. Scores are calculated separately for each birth cohort because of small changes in the categories available and in the percentage distribution each year. In a typical year, the highest scorers are given an index score of 2.96, while those withdrawn from school are given a score of -1.84. The logit transform produces an index with an overall mean of zero and a standard deviation of one. The ordering on this index is closely correlated with the student’s eventual graduation status.
We remove children who ever had a diagnosis of mental retardation from the sample. This includes ICD9 codes 317–319 and ICD10 codes F70–F79.
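One plausible reading of this scoring procedure can be sketched as follows. The sketch is an assumption about the mechanics, not the code used for the paper: each ordered category is treated as a slice of a standard logistic distribution whose width equals the category's share of the cohort, the category is scored by the mean of the distribution within its slice, and the scores are then rescaled to mean zero and standard deviation one. The function name `logistic_category_scores` and the example shares (which pool all tested students into one category) are hypothetical.

```python
import math


def _h(p):
    """Antiderivative of the logit function log(p / (1 - p)); H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return p * math.log(p) + (1.0 - p) * math.log(1.0 - p)


def logistic_category_scores(shares):
    """Score ordered categories under an assumed logistic distribution.

    shares: positive cohort shares, ordered from lowest to highest rank.
    Returns one standardized score per category.
    """
    if any(s <= 0 for s in shares):
        raise ValueError("every category must have a positive share")
    total = sum(shares)
    shares = [s / total for s in shares]

    raw = []
    lower = 0.0
    for i, s in enumerate(shares):
        # Force the final boundary to exactly 1 to avoid rounding drift.
        upper = 1.0 if i == len(shares) - 1 else lower + s
        # Mean of logit(p) for p uniform on (lower, upper).
        raw.append((_h(upper) - _h(lower)) / (upper - lower))
        lower = upper

    # Standardize using the category shares as weights.
    mean = sum(s * r for s, r in zip(shares, raw))
    sd = math.sqrt(sum(s * (r - mean) ** 2 for s, r in zip(shares, raw)))
    return [(r - mean) / sd for r in raw]


# Illustrative shares only: withdrawn, not enrolled, Grade 11 or lower,
# in Grade 12 but not tested, absent, and all tested students pooled.
example_shares = [0.10, 0.02, 0.19, 0.08, 0.01, 0.60]
print(logistic_category_scores(example_shares))
```

Because the slice means of a symmetric distribution average to zero when weighted by slice width, only the rescaling by the weighted standard deviation changes the raw scores. Applying the function separately to each birth cohort would mirror the cohort-by-cohort calculation described above.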
Appendix Table A1 shows means of the “control variables” that are available in our administrative data. Note that while we start with approximately the same number of children in each birth cohort, the focus on comparing siblings means that in our sibling sample, children in the middle cohorts are more likely to be retained in the sample (because they are more likely to have a sibling in the sample).
In order to collapse the number of health measures to a manageable number in an objective and arms-length way, we use Adjusted Clinical Group (ACG) software developed by researchers at Johns Hopkins University (The Johns Hopkins University, 2003). The ACG is designed to measure morbidity by clustering individuals by their age, gender, and constellations of diagnoses. Medical providers indicate diagnoses using International Classification of Diseases, Ninth or Tenth Revision (ICD9 or ICD10) codes. This software groups 14,000 ICD9 codes into 32 groups (called Aggregated Diagnostic Groups or ADGs) on the basis of five criteria: (1) duration of the condition (acute, recurrent, or chronic), (2) severity of the condition (for example, minor and stable versus major and unstable), (3) diagnostic certainty (symptoms focusing on diagnostic evaluation versus documented disease focusing on treatment), (4) etiology of the condition (infectious, injury, or other), and (5) specialty care involved (medical, surgical, obstetric, and so on). Individuals are assigned an ADG code if they have been diagnosed with any of the ICD9/10 codes in the group in either a physician or hospital visit over the past year. A person can have from 0 to 32 ADGs. The system further classifies diagnoses as “major” or “minor,” a distinction we take advantage of in our study.
The ADG system has been extensively validated in the United States (Weiner et al. 1991; Weiner, Starfield, and Lieberman 1992; Powe et al. 1998; Wiener et al. 1996). The Manitoba Centre for Health Policy also has evaluated the application of the ACG software to the Manitoba administrative data (Reid et al. 1999). They found, for example, that the diagnostic codes used in Manitoba worked well with the ACG software, and that the fraction of people with no valid code in a given year (18 percent) was similar to that expected on the basis of previous analyses of Manitoba data. (People have no valid code if they did not see a doctor at all during the reference period.) About 16 percent of the population had four or more ADG codes in a year. The system also generated a distribution of relative expenditures similar to that seen in other data sets (Minnesota Medicaid recipients, and a large U.S. HMO), suggesting that relative expenditures for different types of illness are not very different in Canada and the United States. Finally, the MCHP study verified that areas with high rates of premature mortality also had higher morbidity as measured by the ACG system.
We use the ADG codes to construct the health measures used in the analysis. In each case the measure is constructed to cover a specific age range for the child defined by the date of birth for the child (rather than by calendar years). So, for example, we sum the number of major condition codes recorded in each year between ages 0–3 to get a measure of the number of major conditions in that age range. We construct this measure for the age ranges 0–3, 4–8, 9–13, and 14–18.
Major conditions are defined using ADG codes 3, 9, 11, 12, 13, 18, 25, and 32. These codes capture most of the chronic and acute major illnesses faced by children, including orthopedic problems; ear, nose, throat, and eye problems; cancers; and a variety of other acute major illnesses.
The definition of a “major ADG” comes directly from the Johns Hopkins software and depends on the age of the child. For children ages 0–17 it includes ADGs 3, 9, 11, 12, 13, 18, 25, and 32; and for children ages 18 and older it includes 3, 4, 9, 11, 16, 22, 25, and 32. For the sake of defining a consistent measure across age groups, we redefine the major ADG group using the 0–17 definition for all ages in the sample.
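The construction of the major-condition counts can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: it assumes each contact has already been assigned an ADG code, it counts distinct major ADG codes within each year of age and sums them across the years in each age range, and the names `age_in_years` and `major_condition_counts` are hypothetical.

```python
from datetime import date

# Major ADGs under the 0-17 definition, applied here to all ages.
MAJOR_ADGS = {3, 9, 11, 12, 13, 18, 25, 32}
# The age-specific definition for ages 18 and older, shown only for
# comparison; it is deliberately not used below.
MAJOR_ADGS_18_PLUS = {3, 4, 9, 11, 16, 22, 25, 32}

AGE_RANGES = [(0, 3), (4, 8), (9, 13), (14, 18)]


def age_in_years(dob, on):
    """Completed years of age on a given date, measured from date of birth."""
    years = on.year - dob.year
    if (on.month, on.day) < (dob.month, dob.day):
        years -= 1
    return years


def major_condition_counts(dob, contacts):
    """Count major ADG codes per age range for one child.

    contacts: iterable of (visit_date, adg_code) pairs.
    Returns a dict mapping each (low, high) age range to its count.
    """
    codes_by_age = {}
    for visit_date, adg in contacts:
        if adg in MAJOR_ADGS:
            age = age_in_years(dob, visit_date)
            codes_by_age.setdefault(age, set()).add(adg)

    return {
        (low, high): sum(
            len(codes_by_age.get(age, ())) for age in range(low, high + 1)
        )
        for low, high in AGE_RANGES
    }


dob = date(1980, 6, 15)
contacts = [
    (date(1981, 6, 14), 3),   # age 0
    (date(1981, 6, 15), 3),   # age 1 (first birthday)
    (date(1981, 7, 1), 3),    # age 1, same code: not counted twice
    (date(1985, 1, 1), 9),    # age 4
    (date(1990, 1, 1), 1),    # not a major ADG: ignored
]
print(major_condition_counts(dob, contacts))
```

Defining years of age from the date of birth rather than from calendar years means that two contacts in the same calendar year can fall in different years of age, as the first two contacts in the example do.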
ICD9 codes have been used for both physician claims and hospital separation abstracts through March 31, 2004. These will have generated the ADG scores for all nine birth cohorts up through age 14. Beginning April 1, 2005, ICD10 coding was adopted for hospital separation abstracts. Because of this, a relatively small number of ICD10 diagnoses (N=447) on abstracts from the 1985–87 birth cohorts were also used in categorizing major conditions.
Appendix Table A2 shows the most prevalent ICD9 codes generating major conditions for each age group. While the most common serious conditions change as children age, hearing and vision problems are important in each age group. Appendix Table A3 shows the most prevalent ICD9 codes generating major injuries. While the most common serious conditions change as children age, “open wound of the head” and “certain adverse conditions, not elsewhere classified” are important categories at all ages.
Appendix Table A4 shows the most prevalent ICD9 codes for congenital problems and perinatal problems (ICD9 740–779). By definition, these conditions occur at birth or slightly thereafter. However, children with serious congenital/perinatal problems continue to have contacts with the medical system that are related to these diagnoses. Hence, one can sum the number of contacts having to do with congenital/perinatal problems at each age. In our regression models, we control for the number of contacts related to congenital/perinatal problems at each age group.
The health measures are generated from physician visits and hospital separation abstracts. Emergency department and hospital outpatient visits are not uniformly included in the data sets. Some of these visits are captured as (physician) ambulatory visits. An earlier analysis using one year of Winnipeg data found that 4.9 percent of ambulatory care was provided by emergency departments and outpatient clinics and that residents of lower income neighborhoods were disproportionately likely to receive such care.12,13 In our records, 2.5 percent of physician claims are for emergency room visits, and 1.6 percent of hospital claims over the period 1979–2004 are for outpatient visits. This comparison suggests that about 1 percent of visits could be missing.
Our analysis is based on ADGs and numbers of ADGs rather than on numbers of visits. Hence, if a child with a missing visit had another contact for a diagnosis in the same ADG within a four-year period (for example, a follow up visit), that child’s condition would be included in our analysis. Nevertheless, in order to gauge the potential importance of missing visit records, we conducted the comparison shown in Appendix Table A5. This table shows the number of ADGs for each age group calculated first using the entire sample of visits available, and then excluding the ER and outpatient records that we do have. Clearly, this exclusion makes very little difference to the average number of ADGs. It also had no effect on the maximum number of ADGs observed, so we have chosen to conduct our analysis using the entire sample of visit records.
Finally, we have examined the number of children who do not have any medical contacts in each age range. It is extremely small: only 0.63 percent in the 0–3 and 4–8 age ranges, rising to 1.24 and 1.29 percent in the 9–13 and 14–18 age ranges. Only 10 children lack visits over the whole 0–18 interval. These cases may reflect data entry errors, or children who left the province but whose exit was not recorded. In any case, large numbers of children do not lack diagnoses because they lack access to medical care.
Footnotes
Janet Currie is a professor of economics at Columbia University. Mark Stabile is a professor of business economics and public policy at the University of Toronto. Phongsack Manivong is a data analyst at the University of Manitoba. Leslie L. Roos is a professor of community health science at the University of Manitoba. The authors are grateful to Randy Fransoo for sharing his expertise about the data and to Paul Newacheck, Louise Sequin, and participants in seminars at Brown University, the Wharton School, the NBER Labor Studies group, Peking National University, and the Chinese University of Hong Kong for helpful comments. They also thank Heather Hunter and Shirley McLellan of the Manitoba Ministry of Education; Harvey Stevens and Jan Forster of the Ministry of Family Services and Consumer Affairs; and Louis Barre of the Ministry of Health. Finally, the authors thank seminar participants at the Geary Institute at University College Dublin and the Claremont-McKenna Colleges, Sanders Korenman, and three anonymous referees for helpful comments. Funding was provided by the Partnership for America’s Economic Success, the Social Science and Humanities Research Council of Canada (MH 2007/2008–22), and the Canadian Institute for Advanced Research and the RBC Financial group. The results and conclusions presented are those of the authors. No official endorsement by Manitoba Health, the Partnership for America’s Economic Success or other funding agencies is intended or should be inferred. The data used in this article are housed at the Manitoba Centre for Health Policy. Information on access can be obtained by contacting Leslie Roos, Manitoba Centre for Health Policy, 4th Floor, 727 McDermot Avenue, Winnipeg, MB R3E 3P5 Canada; e-mail: Leslie_Roos@cpe.umanitoba.ca. For two years after publication of this article, all codes to replicate the tables in this paper will be kept on file.
1. Salm and Schunk (2008) use administrative data from a German city and show that six-year-old children with health problems also have lower test scores, but they are not able to track the children over time.
2. Approximately 20 percent of the sample leaves the province between the birth of the child and their 18th year. Oreopoulos et al. (2008) find that there is no correlation between poor child health and the family leaving the province. There also is a small amount of attrition from children who die. Children who died before age eight were much less healthy at birth and most of these deaths occurred within the first year of life. Excluding these children did not affect our results.
3. Perinatal refers to the period from birth to 22 weeks. We count only those congenital and perinatal problems that are considered “major” health conditions by the Aggregated Diagnostic Groups (ADG) system. Appendix Table A4 in the data appendix lists these conditions. Congenital problems may continue to generate diagnoses as the child ages. For example, a 10-year-old may have visits related to a congenital heart defect. We treat this as a congenital problem rather than as a new health condition.
4. The ADG system has been extensively validated in the United States (Weiner et al. 1991; Weiner, Starfield, and Lieberman 1992; Powe et al. 1998; Wiener et al. 1996). The Manitoba Centre for Health Policy also has evaluated the application of the ACG software to the Manitoba administrative data (Reid et al. 1999). See the data appendix for further details.
5. Groups are created on the basis of five criteria: (1) duration of the condition (acute, recurrent, or chronic), (2) severity of the condition (for example, minor and stable versus major and unstable), (3) diagnostic certainty (symptoms focusing on diagnostic evaluation versus documented disease focusing on treatment), (4) etiology of the condition (infectious, injury, or other), and (5) specialty care involved (medical, surgical, obstetric, and so on).
6. Further information about the most common ICD codes in the ADGs we use is available in the data appendix.
7. “Major injuries” are not included as a major ADG for 0–17-year-olds, but they are included for 18-year-olds. In order to construct a more consistent measure, we use the same definition of major conditions for 18-year-olds as we do for younger children. Further details are in the data appendix.
8. Duncan et al. (2006) report similar findings.
9. The number of college preparatory math courses available increased over our sample period. The inclusion of year fixed effects will help to account for this trend. We also estimated alternative models using a binary variable equal to one if a student obtained a grade of 80 percent or more on their math courses. Results using this specification are quite similar.
10. The same is true if we include health measures at 0–3, 4–8, and 9–13 only. That is, only the latest measure is predictive of schooling outcomes.
11. Siblings are noted as “full siblings” if they are children of the same mother (as noted on the birth record) and the same man is noted on the research registry (using the child’s family identification number) as “family head” at the time of the child’s birth. Slightly over 85 percent of those identified as siblings (from having the same mother) meet the criterion set out above.
12. Mustard, Cam, Anita Kozyrskyj, Morris Barer, and Sam Sheps. 1998. “Emergency Department Use as a Component of Total Ambulatory Care: A Population Perspective.” Canadian Medical Association Journal 158(1):49–55.
13. Roos, Leslie, Randy Walld, Julia Uhanova, and Ruth Bond. 2005. “Physician Visits, Hospitalizations, and Socioeconomic Status: Ambulatory Care Sensitive Conditions in a Canadian Setting.” Health Services Research 40(4):1167–85.
- Received November 2008.
- Accepted March 2009.