Abstract
I examine the impact of prenatal total suspended particulate (TSP) exposure on educational outcomes using county-level variation in the timing and severity of the industrial recession of the early 1980s as a shock to ambient TSPs (similar to Chay and Greenstone 2003b). I then instrument for pollution levels using county-level changes in relative manufacturing employment. A standard deviation decrease in TSPs in a student’s year of birth is associated with 2 percent of a standard deviation increase in high school test scores for OLS and 6 percent for IV. I also consider how migration and selection into motherhood relate to my results.
I. Introduction
The fetal origins hypothesis (FOH) suggests that the “kind and quality of nutrition [we] received in the womb; the pollutants, drugs and infections [we] were exposed to during gestation … shape our susceptibility to disease, our appetite and metabolism, our intelligence and temperament.” (Paul 2010b) Almond and Currie (2011) recently reviewed the steps economists have taken to cleanly identify the validity of the FOH, and provided substantial evidence that impacts of in utero treatments can continue to play out later in life. The FOH raises an interesting suggestion that in utero pollution exposure could carry long-term consequences such as decreased performance in school, lower educational attainment, and reduced earnings and lifespan. And while some impacts of in utero pollution on developmental factors may be observed using current metrics of infant health, factors with direct mental impacts that are unaccompanied by commonly observed physical traits go uncounted, and policy decisions made solely on the basis of contemporaneously observed physical measures may be understating the true costs of pollution.
Economists have found a number of pollutants to be hazardous to contemporaneous physical health.1 Less is known about how impacts extend further in the life cycle, though recent research finds that higher lead levels lower IQ scores, increase deviant behavior, and decrease educational performance, cognitive ability, and labor market outcomes (Reyes 2007; Nilsson 2009), and that prenatal radiation exposure lowers later test scores (Almond, Edlund, and Palme 2009). I further the research on pollution and health by considering the impacts of prenatal total suspended particulate matter (TSP) exposure on long-run outcomes, specifically educational achievement in high school. Unlike prior research on pollution and long-run events, I examine a common, currently regulated pollutant that remains ubiquitous even under today’s stricter environmental policies. And while a good deal has been learned about the negative health effects of particulates, little is known about how such a pollutant might impact mental development and cognitive ability. Given the ever-present debate over regulatory policy, such considerations are of considerable importance.
I combine several data sets containing economic, demographic, weather, pollution, and test information from the state of Texas and exploit a period of industrial recession in Texas from 1981 to 1983 with its dramatic impact on manufacturing production and associated ambient TSP levels, similar to the identification strategy first employed in Chay and Greenstone (2003b). Ordinary least squares (OLS) show that, during the recessionary period, a standard-deviation decrease in the mean pollution level in a child’s year of birth is associated with 2 percent of a standard deviation increase in high school test performance. As a correction for potential complications such as measurement error and omitted variables bias, I then instrument for TSPs using annual employment changes in the county-level manufacturing sector. Instrumental variables (IV) results are larger; a standard deviation decrease in TSPs is associated with approximately 6 percent of a within-county standard deviation increase in test performance. Results are statistically significant only in the years of greatest pollution variation, suggesting that the effect may be too subtle to identify using mild changes in ambient pollution caused by time variation or by making across-county cross-sectional comparisons. I also discuss challenges to identification including measurement error, systematic migration, and selection into motherhood and health behavior, and how each may bias my results.
Section II provides scientific background on the relationship between particulate matter and brain development. Section III describes the data. Section IV discusses the intuition behind the period of analysis, choice of instrument and some potential complications to identification. Section V describes the empirical methods used. Section VI presents my results. Section VII considers additional factors important in the interpretation of my results. Section VIII concludes.
II. Pollution, health, and intrauterine development
I focus on airborne TSPs, the measure of particulate pollution used by the EPA in the earlier years of the Clean Air Act and up through the late 1980s. This includes all suspended, airborne liquid or solid particles smaller than 100 micrometers in size. Suspended particulates can be naturally occurring (for example, dust, dirt, and pollen), which tend to make up the larger particles, or a byproduct of common economic activities such as fuel combustion (for example, coal, gasoline, and diesel), fires, and industrial activity, which tend to make up the smaller particles. Regulatory attention has shifted to finer sizes of particulate matter, smaller than 10 micrometers (PM10) and smaller than 2.5 micrometers (PM2.5). Both of these size classifications are contained within the older TSP measure. As the composition of particulate matter varies, there is no scientifically identified conversion metric between TSP levels and the more modern measures. However, it is useful to consider such a relationship given that much of the work in economics today uses the PM10 classification. A 1999 study by the World Bank Group noted that a commonly used ratio is PM10 = TSP × 0.55 (The World Bank Group 1999).
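As a rough illustration of the rule of thumb above, the following sketch converts a TSP concentration into an approximate PM10 equivalent. The function name and the example reading of 70 μg/m3 are hypothetical and chosen only for illustration; the 0.55 ratio is the World Bank Group figure cited in the text and should not be read as a scientifically established conversion.

```python
# Rule-of-thumb ratio cited in the text (World Bank Group 1999): PM10 = TSP * 0.55.
TSP_TO_PM10_RATIO = 0.55


def tsp_to_pm10(tsp_ug_m3, ratio=TSP_TO_PM10_RATIO):
    """Approximate PM10 concentration (ug/m3) implied by a TSP concentration.

    This is only a heuristic: particulate composition varies, so the true
    share of TSP made up by PM10 differs across places and years.
    """
    if tsp_ug_m3 < 0:
        raise ValueError("concentration must be non-negative")
    return tsp_ug_m3 * ratio


# Hypothetical annual mean TSP reading, for illustration only.
print(tsp_to_pm10(70.0))
```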
Inhaled particulates cause a number of health problems including difficulty breathing, decreased lung function, aggravated asthma, and cardiac trouble. Exactly how particulates might impact health depends partially on their size, which influences how the body responds to exposure. Particles can be classified as inhalable, thoracic, and respirable based on their expected deposition. Larger particles (inhalable) are more likely to cause physical problems in the mouth, nose, and trachea, both in the short term (for example, difficulty breathing) and the long term (for example, nasal and esophageal cancer). Particles that are small enough to enter the lungs can do damage to the lung tissue. Particles smaller than 4 μm are respirable, and can penetrate to the level of alveoli and interfere with oxygen gas exchange (EPA 2009).
Particulate matter may impact fetal development in a number of ways. Reduced oxygen or cellular and organ damage sustained by the mother as a result of particulate exposure can result in reduced resources to the developing fetus, interrupting or permanently altering brain development. Given that brain development can begin as early as four weeks into the pregnancy and continue up through birth, there is a large window for potential damages caused by reduced nutrients available to the fetus. Such factors are also likely to result in more physically observable factors such as lowered birth weight and shortened gestation (Chay and Greenstone 2003b; Currie and Walker 2011). Given the link between low birth weight and long-run outcomes such as education (Behrman, Rosenzweig, and Taubman 1994; Behrman and Rosenzweig 2004; Almond, Chay, and Lee 2005; Currie and Moretti 2007) prenatal pollution can impact educational outcomes through at least two channels: (1) pollution may cause lower birth weight, which in itself somehow causes students to perform worse, and (2) pollution may have a direct and separate impact beyond birth weight. For (1), low birth weight is currently observed and can be considered in formation of policy. For (2), my results suggest consideration of low birth weight alone will underestimate the true social costs of pollution.
Research on finer particles suggests particulate matter may indeed alter fetal development in manners less easily observed. For example, Perera, Li, Whyatt, Hoepner, Wang, Camann, and Rauh (2009) found a link between fuel-combustion related polycyclic aromatic hydrocarbons (PAHs), a specific type of particulate matter, and a number of pre- and early postnatal developmental problems including damage to the immune system, hindered neurological development, impairment of neuron behavior associated with long-term memory formation, and lower IQ scores at age 5. They found no statistically significant difference in birth weight or gestation length, suggesting such mental effects can be present even when observable physical effects are not. Other medical studies have found correlations between air pollution and brain damage, including cerebral vascular damage, neuroinflammation, neurodegeneration (Block and Calderón-Garcidueñas 2009), and brain lesions (Raloff 2010). Controlled studies on rats found those exposed to fine and ultrafine particles had elevated proinflammatory response in their brain tissue (Campbell, Oldham, Becaria, Bondy, Meacher, Sioutas, Misra, Mendez, and Kleinman 2005), and that long-run PM2.5 exposure led to decreased cognition and observable brain differences such as fewer dendrites and reduced cell complexity (Fonken, Xu, Weil, Chen, Sun, Rajagopalan, and Nelson 2011). Observational studies of children exposed to black carbon found higher levels of pollution exposure were associated with lower memory and cognition even after controlling for birth weight, sociodemographics, tobacco smoke exposure, and blood lead level (Suglia, Gryparis, Wright, Schwartz, and Wright 2008).
While these studies are not focused on in utero exposure, they suggest a link between particulates and brain damage. More specific in utero-driven mental effects have been detected for exposure to methylmercury, another form of particulate (Grandjean, Weihe, White, and Debes 1998; Debes, Budtz-Jørgensen, Weihe, White, and Grandjean 2006), and controlled studies using rats found in utero exposure to diesel exhaust, a major source of particulate matter, resulted in decreased locomotor activity and altered neurochemistry in the brain (Suzuki, Oshio, Iwata, Saburi, Odagiri, Udagawa, Sugawara, Umezawa, and Takeda 2010).
In summary, particulate matter could have lasting impacts on mental ability via a number of fetal mechanisms. Larger particles are likely to cause health problems in the mother, which could then harm fetal development. Based on prior research, such impacts are likely to be accompanied by physically observable damages such as premature birth and lower birth weight. But smaller particles, which have the ability to permeate protective membranes, could cause damage to fetal neural cells without physical damages that would be observed in commonly measured indicators of infant health.
III. Data
I combine several data sets on pollution, weather, school quality, economic conditions, demographics, and test scores. TSP data on the annual arithmetic mean come from the EPA database of historical air quality data. Readings from all pollution monitors within 20 miles of a county population centroid are collapsed to an annual mean and weighted by the inverse of their distance from said centroid. Thirty counties have population centroids within 20 miles of at least one monitor active for all years from 1979–85.2 I use only monitors that take readings regularly over the year (at least once every two weeks on average) so as not to have any one period of the year influence my results.3 Results were similar using all available monitors without restrictions, though the first stage was slightly weaker due to decreased precision of pollution measurement.
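The monitor-to-county assignment described above can be sketched as an inverse-distance-weighted mean. This is a minimal illustration under stated assumptions, not the procedure actually used to build the data: the function names, the great-circle (haversine) distance, and the small distance floor that avoids division by zero are all choices made for the example, and the coordinates are invented.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def county_tsp(centroid, monitors, max_miles=20.0, min_miles=0.1):
    """Inverse-distance-weighted mean of monitor annual means near a centroid.

    centroid: (lat, lon) of the county population centroid.
    monitors: iterable of (lat, lon, annual_mean_tsp).
    Returns None when no monitor lies within max_miles.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, tsp in monitors:
        dist = haversine_miles(centroid[0], centroid[1], lat, lon)
        if dist > max_miles:
            continue  # monitor too far away to inform this county
        weight = 1.0 / max(dist, min_miles)  # distance floor is an assumption
        weighted_sum += weight * tsp
        weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else None


# Invented coordinates and readings, for illustration only.
example_monitors = [
    (30.1, -97.0, 60.0),   # roughly 7 miles north of the centroid
    (29.9, -97.0, 80.0),   # roughly 7 miles south of the centroid
    (35.0, -97.0, 500.0),  # several hundred miles away, so excluded
]
print(county_tsp((30.0, -97.0), example_monitors))
```

Because nearer monitors receive larger weights, two counties located at similar distances from the same monitors end up with similar assigned values, which is one of the measurement concerns discussed in Section VI.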
Weather is a potential confounder in the regression of student test outcomes on pollution exposure due to health interactions (Deschênes and Greenstone 2007; Barreca 2008; Deschênes, Greenstone, and Guryan 2009; Stoecker 2010) and my first-stage regression (see Knittel, Miller, and Sanders (2009) for an in-depth discussion of weather and pollution). Using data from the Global Surface Summary of the Day, I control for third-order polynomials in average annual temperature, number of days with rain, and average annual specific humidity. To get county-level measures I follow a strategy similar to my pollution data, but allow for inclusion of all monitors within 50 miles due to the substantially smaller number of weather monitors present. In some specifications I allow for more nonparametric weather controls and consider wind speed as well. Yearly averages by birth year cohorts for pollution and weather variables are shown in table 1.
Industry level employment estimates (used for instrument construction as described in Section V), and per capita income are from the Regional Economic Information System (REIS). I also use the REIS annual population estimates and Census land area information to construct a county by year population density estimate to control for urbanization and its impact on both pollution levels and education.
Test score data are from the Texas Education Agency (TEA) monitored Texas Assessment of Academic Skills (TAAS). From 1994 to 2002, tenth graders were required to exhibit competency on both a TAAS math exam and reading exam, where competency is a score of 70 or higher on the Texas Learning Index (TLI), an annually adjusted score intended to equate difficulty of passing across test years.4 Students who have not achieved competency by twelfth grade cannot graduate from high school, making the TAAS a high-stakes exam. Note that, while I only use the score from each student’s first tenth grade attempt, they have additional opportunities in eleventh and twelfth grade to retake the exam if necessary. I focus on the math portion of the exam as an outcome variable, as math scores are often considered more informative of learning when discussing standardized exams and are used more frequently in the education literature.5 Figure 1 shows the distribution of the first-attempt test scores with an indicator line for the passing score of 70.
A number of factors make the TAAS data well suited for my analysis. First, the basic structure of the exit exam has remained consistent throughout its administration, making comparison across years more valid. Second, TAAS data have a wealth of information on each student, including race, ethnicity, gender, free lunch status, and special education status, which I include as controls. Assuming that the student population somewhat reflects the overall population, these variables also stand in for the population makeup of the counties and partially control for differences in outcomes stemming from variation in overall racial and ethnic makeup across regions.
The majority of students taking exit exams between 1994 and 2002 have years of birth between 1979 and 1985, allowing me to view birth cohorts in a period before the recession (1979–81), during the recession (1981–83), and very briefly after the recession during a period of recovery (1983–85). After matching students to counties for which I have all covariates and creating a balanced panel in both pollution and schools, the remaining sample consists of 757,507 students for the 1979–85 period and 334,956 students for the 1981–83 period at 416 schools across 30 counties. Table 1 shows means and standard deviations for student data across all included birth years.6 To control for changes in school quality I include school by year pupil-to-teacher ratios using data from the Common Core of Data (CCD).7 See Sanders (2011) for a more in-depth discussion of all data sets.
My TAAS data lack specific date of birth. My approximation of prenatal pollution exposure assigns students the average TSP level for their current county of residence in the year of their birth. The lack of exact date of birth also means that I cannot directly address the issue of students being “young” or “old” for their grade. If some of the effect of pollution exposure is the need to repeat a grade, this effect will be masked by the inclusion of year of birth by year of test fixed effects. A lack of recorded data for earlier cohorts means I cannot directly address the issue of who has been retained in prior grades or how many students have left school prior to taking the tenth grade exam and appearing in my sample. The dropout age in Texas is 17 years of age, which helps partially alleviate the concern of early dropouts as many students have not yet reached that age by grade 10.
IV. Method
I model test performance in tenth grade (TLI 10) as a function of the TSP level in the student’s year of birth,
TAAS data do not contain information on the student’s region of birth. I assume that the county in which I observe a student taking the exam is the county in which they were born (similar to Ludwig and Miller 2007), which introduces a potential source of measurement error in assignment. In situations where there are multiple schools per county, all schools within the same county are assigned the same birth year pollution treatment.
The economy of Texas underwent a sectoral shift as a result of the 1981–83 recession; the manufacturing sector saw decreases in both employment and capacity utilization, and employment shifted to sectors such as retail and services (Orrenius, Saving, and Caputo 2005). This led to a sharp drop in statewide TSPs in a short period of time. Average TSP levels exhibited the greatest changes during 1981–83, as shown in Figure 2, when state average TSP levels fell by almost 10 micrograms per cubic meter (μg/m³), a change of approximately 14 percent, and remained permanently lower. This is the largest permanent decrease in Texas in such a short period since the early 1970s.8
Counties with greater shares of their economy in manufacturing saw greater relative decreases in pollution. Almost 50 percent of particulate emissions in 1976 came from industrial production, and the decrease in TSPs during the industrial recession correlated strongly with a decrease in industrial and manufacturing production. By 1985 industry’s contribution to total national particulates was down to approximately 37 percent (Environmental Protection Agency 1985; Chay and Greenstone 2003b). I use this relationship as the basis of my IV analysis, where I use the relative share of county-level employment in manufacturing (manufacturing employment divided by all other employment sources) as an instrument for TSPs.
Figure 3 shows how pollution levels changed by changes in manufacturing employment levels. The figure divides counties into terciles based on the absolute change in relative manufacturing employment counties experienced during the recessionary period. Groups with greater decreases in relative manufacturing ended up with greater relative drops in their ambient TSP levels.
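The grouping behind a figure of this kind can be sketched as follows: counties are ranked by their change in relative manufacturing employment, split into three equal-sized groups, and the mean TSP change is computed within each group. The function names and all county values below are invented for illustration and are not taken from the data; rank-based tercile assignment is one of several reasonable conventions.

```python
def tercile_groups(changes):
    """Map each county to a tercile index (0, 1, or 2) by rank of its change.

    changes: dict of county -> change in relative manufacturing employment.
    Tercile 0 holds the counties with the largest decreases.
    """
    ranked = sorted(changes, key=changes.get)  # most negative change first
    n = len(ranked)
    return {county: min(3 * i // n, 2) for i, county in enumerate(ranked)}


def mean_by_group(groups, outcome):
    """Average an outcome (for example, TSP change) within each group."""
    totals, counts = {}, {}
    for county, group in groups.items():
        totals[group] = totals.get(group, 0.0) + outcome[county]
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in sorted(totals)}


# Hypothetical county-level changes over the recession, for illustration only.
mfg_change = {"A": -3.0, "B": -2.5, "C": -1.5, "D": -1.0, "E": -0.2, "F": 0.1}
tsp_change = {"A": -14.0, "B": -12.0, "C": -9.0, "D": -7.0, "E": -3.0, "F": -1.0}

groups = tercile_groups(mfg_change)
print(mean_by_group(groups, tsp_change))
```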
Using the recession and relative manufacturing employment as a source of identification introduces some complications. First, reduction in pollution was likely accompanied by job loss and decreased income, both of which could have negative impacts on prenatal development, and income controls may be insufficient to capture the full effect. The most likely effect, that the recession led to lower income, which in turn leads to lower fetal and infant health, would bias my results toward zero (as pollution and income are negatively correlated).9 Second, employment data at the county level are available on a yearly basis, so all pollution assignment is on the county by year of birth level. This could be problematic if pollution causes a shift in the timing of birth across years, which would mean the measurement error is correlated with treatment rather than random, and my results must be interpreted with this in mind. Third, the recession may have impacts on test outcomes that are correlated with positive changes in fetal health but not related to pollution. Dehejia and Lleras-Muney (2004) note that babies born in periods of high unemployment are more likely to have better birth outcomes. This could be attributable to maternal behavior modification during recessions or a selection into motherhood effect. This could bias results in the direction of my findings. Finally, the recession and change in county employment makeup could drive migration patterns that could in turn alter the makeup of students taking the test. This change in student composition might result in changes in test scores that I would incorrectly assign to the impacts of earlier pollution. I now directly address the motherhood selection and migration issues in depth.
A. Selection into motherhood
Dehejia and Lleras-Muney (2004) found much of the improved behavior was driven by selection and behavior of black mothers, who represent a small portion of my sample (approximately 15 percent of all births in the counties used), while white mothers saw a reduction in average health during recessions. In addition, any choice to engage in childbearing behavior must come with a lag. Unfortunately, I do not have the necessary means to directly address the issue of behavior modification, as I have no information about the mothers of the students analyzed. Instead I examine natality records from Texas to see how the composition of mothers may have changed. I consider how TSPs and my instrument are correlated with factors commonly related to socioeconomic status and child outcomes: maternal age and race, and in what month prenatal care began.10 Results are shown in Columns 1 through 4 of Panel A in table 2. I weight by the number of total live births per county and control for only county and year effects. I find a statistically significant relationship between TSP levels and the share of mothers who are white. This suggests regions with higher pollution have a higher share of white births. The effect is small: a 10-unit difference in TSPs (approximately the change seen during the 1981–83 period) is correlated with a slightly higher probability of a birth being to a white mother (approximately 6 percent of the mean). As an additional consideration I also checked to see if changes in my instrument are correlated with changes in mother effects (Panel B of table 2), and found no statistically significant effects for mother age or race. There is a correlation between relative manufacturing employment and prenatal care usage. Mothers in more manufacturing-intensive counties wait longer to begin prenatal care. Again, the effects are small: a 10-unit difference in TSPs is correlated with a tenth of a month delay.
B. Selective migration
Changes in job composition may have altered the makeup of families via migration patterns. For example, families of poorer performing children move out, or families of higher performing children move in. If those who have worse performing children either (1) moved out of counties that saw greater (lesser) pollution changes as a result of the recession, or (2) were less (more) likely to move in to greater pollution counties after the recession in systematic ways, my results would be biased upward (downward). Note that, if systematic migration were a confounding issue, I expect to see effects in all periods both during and after the recession. If there were systematic differences in the type of child brought in through migration after the recession, that effect should be present in the other, nonrecession periods. I show in Section VI that this is not the case.
A study by the Pew Research Center using American Community Survey data found that Texas has the lowest outward migration rate of the 50 states. Seventy-six percent of the population over age 18 living in the state in 2005–2007 was born there. Texas also ranked low on the inward migration scale (34th out of 50).11 Another potential measure of migration is a longitudinal data set that tracks mobility over time, such as the Panel Study of Income Dynamics (PSID). Using PSID data from 2007, approximately 59 percent of 80 responding household heads from Texas indicate they live in the same state in which they grew up (versus the average of 64 percent for all national respondents). Within that same sample, 28 percent of Texas respondents indicated they had not moved at all from 1981 to 2003 (versus 27 percent for all respondents). Jointly, these surveys indicate that mobility within Texas is either on par with or lower than the national average. However, neither addresses the issue of county-level mobility.
As a more concrete analysis of how migration might influence my findings, I consider (1) county-level migration trends across time, and (2) how changes in the student covariates are associated with changes in pollution during year of birth. I use Internal Revenue Service migration data on tax return information to track county-level migration. The benefit of these data is that they cover all individuals that file taxes, including all counties in Texas. The drawback is that data are not available until the 1990–91 tax year. This does, however, show if pollution changes during the recession are correlated with long-run shifts in migration patterns. The average annual outward migration rate varied between 4.9 and 5.8 percent per year, while the annual inward migration rate varied between 4.2 and 6.6 percent. To see how rates varied with pollution changes during the recession, I first calculate the 1981–83 change in pollution for each county. I then regress that change on inbound and outbound migration percentages (the number of migrating individuals, including primary filers and exemptions, divided by the population) for each tax year from 1990 through 2004. Of the resulting regressions, none found statistically significant results. I note that in each case after 1992 the estimated relationship between pollution change and migration rates was negative (counties that had the lowest pollution changes had greater inward and outward migration rates). Results are omitted for simplicity but are available upon request.
I next consider the statistical association between student covariates and pollution between 1981 and 1983. These results are shown in Columns 5 through 10 in Panel A of table 2. I run regressions with each of my demographic covariates as an outcome variable, controlling for only school and year of birth fixed effects, and weight by the number of students in each cell. Only the fraction of the school population that is black is significantly correlated with TSPs. A 10-unit difference in TSPs is associated with a 0.8 percentage point decrease in the share of students who are black, or a change of approximately 5 percent. This does suggest potential migration factors, and though I control for race in all regressions, race may be correlated with unobservables and my results should be interpreted with this in mind. When considering the relationship between student makeup and the instrument (Panel B), I found none to be statistically significant.
In summary, statistically significant correlations exist between some motherhood characteristics and the share of students who are black when considering the 1981–83 birth cohort. There may be unobservable factors also correlated with pollution that bias OLS regressions. There are no economically significant correlations between my instrument for pollution and any observable mother or student characteristics, and the use of the instrumental variables strategy may alleviate concerns over unobservables.
V. Econometric model
I collapse all student data by demographic group, school of attendance, year of birth, and year of test to limit potential omitted variables bias caused by higher levels of aggregation (see Hanushek, Rivkin, and Taylor 1996). I weight all regressions by the number of students in each cell.12 The OLS estimation model is:
where s, c, b, and t refer to school, county, year of birth, and year of the test, respectively. The parameter β is the estimated achievement impact of an additional unit of TSP exposure in the child’s year of birth, αs is a vector of school fixed effects, θb,t is a vector of year of birth by year of test fixed effects, Xs,t is a vector of (collapsed individual) school-level student and school covariates, Bc,b is a vector of economic and demographic covariates in the year of birth, Tc,t is a vector of economic and demographic covariates in the year of the test, Wc,b is a vector of county-level weather covariates in the year of birth, and ε is an error term. Pollution treatment varies at the county by year of birth level.
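As a sketch only, a specification consistent with the definitions above can be written as follows. The coefficient vectors on the covariate groups (γ, δ, λ, φ) are labels assumed here for illustration; the text names only β, the fixed effects, the covariate vectors, and the error term.

```latex
\begin{equation*}
TLI10_{s,c,b,t} = \beta\, TSP_{c,b} + \alpha_{s} + \theta_{b,t}
  + X_{s,t}\gamma + B_{c,b}\delta + T_{c,t}\lambda + W_{c,b}\phi
  + \varepsilon_{s,c,b,t}
\end{equation*}
```

In this form the treatment term TSP carries only county and year-of-birth subscripts, which reflects that pollution exposure is assigned at the county by year of birth level while outcomes are observed for school by birth year by test year cells.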
In my IV analysis, I model TSPs as a function of all workers in a county employed in the manufacturing industry (SIC code 400) over total county employment levels in all other sectors in a given year. Given a linear relationship where Π is defined as the marginal impact of changes in relative manufacturing employment, the relationship (minus other covariates) is:
I multiply the result by 100 to make Π interpretable as a percentage change. Clearly, controlling for income is important. The recession was likely accompanied by loss of income, which could in turn have an impact on fetal health and long-run cognitive growth. But income is likely to be correlated with the error term in the regression of pollution on test scores. I instrument for per capita income using changes in national crude oil prices and the strong link between crude oil prices and income in Texas.13 I theorize counties with larger oil extraction sectors prior to the recession had their per capita incomes change more drastically with crude oil prices (similar to the coal reserves and county wages instrument used in Black, Daniel, and Sanders 2002). Due to the limited availability of specific oil extraction employment, I use the more general mining employment (SIC code 200), which contains within it petroleum extraction, drilling, and other oil-mining employment sources. My final income instrument is the annual inflation-adjusted price of crude oil weighted by the fraction of county employment in the mining and extraction industry prior to the recession (using an average of 1976–78 values as the baseline):
While annual crude oil price variation occurs on the national level, the instrument varies by county due to the cross-county differences in prerecession mining sector size.
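The construction of the two instruments described above can be sketched as follows. This is a hedged illustration rather than the exact construction used: the function names are hypothetical, the input values are invented, and averaging the 1976–78 mining employment shares (rather than taking the share of averaged levels) is an assumption about how the baseline is formed.

```python
def relative_manufacturing(mfg_emp, total_emp):
    """Manufacturing employment over employment in all other sectors, times 100."""
    other_emp = total_emp - mfg_emp
    if other_emp <= 0:
        raise ValueError("non-manufacturing employment must be positive")
    return 100.0 * mfg_emp / other_emp


def oil_income_instrument(real_oil_price, baseline_mining_shares):
    """Real crude oil price scaled by the mean prerecession mining share.

    baseline_mining_shares: mining employment as a fraction of county
    employment in each baseline year (1976-78 in the text).
    """
    if not baseline_mining_shares:
        raise ValueError("need at least one baseline year")
    mean_share = sum(baseline_mining_shares) / len(baseline_mining_shares)
    return real_oil_price * mean_share


# Invented county figures, for illustration only.
print(relative_manufacturing(mfg_emp=20_000, total_emp=100_000))
print(oil_income_instrument(real_oil_price=30.0,
                            baseline_mining_shares=[0.04, 0.05, 0.06]))
```

Because the oil price is common to all counties in a given year, any cross-county variation in the second instrument comes entirely from the fixed baseline share, which matches the description in the text.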
VI. Results
In the discussion that follows, the term “standard deviation” refers to a within-county standard deviation. All regressions control for: student race, ethnicity, sex, special education, and free lunch status, the school-level pupil/teacher ratio in the year of the exam, income per capita in the year of birth, income per capita in the year of the exam, population density in the year of birth and year of the exam, cubics in temperature, rain, and humidity, and school and year of birth by year of test fixed effects. All standard errors are clustered on county to allow for county-specific correlated errors over time.
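One way to compute a within-county standard deviation, and to rescale a regression coefficient into the standard deviation units used in the discussion, is sketched below. The function names, the degrees-of-freedom adjustment (observations minus number of counties), and the example numbers are assumptions made for illustration; the text does not spell out the exact formula.

```python
from math import sqrt


def within_county_sd(values_by_county):
    """Standard deviation of values after removing each county's own mean.

    values_by_county: dict of county -> list of observations (for example,
    annual mean TSP by birth year).
    """
    squared_residuals = 0.0
    n_obs = 0
    for values in values_by_county.values():
        county_mean = sum(values) / len(values)
        squared_residuals += sum((v - county_mean) ** 2 for v in values)
        n_obs += len(values)
    dof = n_obs - len(values_by_county)  # one mean estimated per county
    if dof <= 0:
        raise ValueError("need more observations than counties")
    return sqrt(squared_residuals / dof)


def standardized_effect(beta, sd_treatment, sd_outcome):
    """Outcome change, in outcome SDs, per one-SD change in treatment."""
    return beta * sd_treatment / sd_outcome


# Invented values, for illustration only.
tsp = {"A": [60.0, 64.0, 56.0], "B": [80.0, 90.0, 70.0]}
print(within_county_sd(tsp))
```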
Using the 1979–85 cohorts as a whole (Column 1 of table 3) shows the impact of TSPs is not statistically different from zero. Due to the subtle relationship between test performance and ambient TSPs, the relationship may be undetectable when analyzing mild variations or gradual changes in TSPs driven by long-run trends. Instead I focus on the period of the largest variation, the recession period of 1981–83. This follows Chay and Greenstone (2003b), who exploit a similar methodology to identify the effects of pollution exposure on infant mortality.14 I also consider effects in the periods just prior to and just following the recession period to investigate the presence of trends. Columns 2, 3, and 4 show OLS results for a sample restricted to those born in the periods spanning 1979–81, 1981–83, and 1983–85, respectively (note this causes some overlap in students across samples). The results are statistically insignificant for 1979–81 and 1983–85. For the 1981–83 period the coefficient is statistically significant and has the anticipated negative sign. A standard deviation decrease in average TSPs in the year of birth is associated with 2 percent of a standard deviation increase in eventual test scores. Chay and Greenstone (2003b) note a similar time variation across pre, during, and postrecession periods, and note that the recession period is most useful as “there appears to be greater potential for confounding in cross-sectional analyses and analysis of changes in the surrounding nonrecession years.”
As a further check into the presence of background trends, I repeat the regressions for the 1981–83 period including one year lags and leads of TSPs. Results, shown in Column 5, are robust to the inclusion of both lags and leads, and only the oneyear lagged value is marginally significant. This is not in itself problematic. The TAAS data only allow me to identify year of birth, and for some individuals born early in the year the pollution exposure in the prior year is actually the most relevant. Adding two-year lags and leads leaves the coefficient on current TSPs of similar magnitude but increases the standard error enough to remove statistical significance. A joint test of significance on all lags and leads in this specification yields a p-value of 0.27, suggesting they have little explanatory power in this model.
I next consider the impact on the fraction of students passing the standardized exit exams on their first attempt.15 Column 7 considers the impact of prenatal pollution exposure on the probability of obtaining a passing math score (a TLI score greater than 70) on the first try for the 1981–83 cohort. A standard deviation decrease in ambient pollution in the year of birth is associated with an approximate one percentage point increase in cohort passage rates.
The shock of the recession is unable to overcome the complication of measurement error, four types of which may be present. First, true ambient pollution is measured with error at the monitor location. Second, pollution information from air monitors is assigned by using the weighted distance formula as described in Section III. If two counties are similarly located from the same monitors, those two counties will receive similar assignment of pollution levels, thus reducing the variation in county pollution levels beyond its true value. Third, I assign pollution levels to students by assuming the county in which they take the exam is the county in which they were born. Finally, I assign pollution based on year of birth, which introduces noise in the true level of pollution exposure seen by individuals. My instrument can help with the first and second error sources, but unfortunately cannot impact the third or fourth.
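As a point of reference, the textbook case of classical measurement error can be written out as follows. The notation is an assumption introduced here for illustration: x* is true exposure, x is measured exposure, u is a measurement error uncorrelated with x* and with the outcome error ε, and z is an instrument that is correlated with x* but uncorrelated with both u and ε. The error sources listed above need not all satisfy these conditions.

```latex
\begin{align}
y_i &= \beta x_i^{*} + \varepsilon_i, \qquad x_i = x_i^{*} + u_i, \\
\operatorname{plim}\ \hat{\beta}_{OLS} &= \beta \cdot \frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_{u}}, \\
\operatorname{plim}\ \hat{\beta}_{IV} &= \frac{\operatorname{Cov}(z_i, y_i)}{\operatorname{Cov}(z_i, x_i)} = \beta .
\end{align}
```

Under these assumptions the OLS coefficient is shrunk toward zero by the ratio of true variance to measured variance, while the IV estimate is not. With fixed effects included, the relevant variances are those of the variation that remains after the fixed effects are removed.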
OLS may also face omitted variables bias problems. As noted in Section IV, correlations exist between pollution changes and other factors associated with test scores, such as maternal behavior and eventual student demographics. To address such issues, I employ an instrumental variables strategy as discussed in Section V. Using a county-specific manufacturing-based instrument has the additional benefit of a greater level of between-county variation in ambient TSPs, as each county now has a unique source of variation. I report the first-stage coefficients in all IV tables in addition to the standard statistical significance metrics (discussed below). I use this IV strategy only for the 1981–83 period, as it is the period of greatest interest given the substantial pollution change, and in the 1979–81 and 1983–85 periods, the first stage is substantially weaker. This suggests that, similar to the subtlety of the effects of pollution on test scores, the relationship between manufacturing production and pollution is harder to discern in the presence of mild changes.
I use limited information maximum likelihood in all estimations due to its greater robustness to weak instruments. In order to assess the strength of the first stage, I report a variety of test statistics. Baum, Schaffer and Stillman (2007) suggest the Kleibergen-Paap F-statistic (Kleibergen and Paap 2006) as a cluster-robust test of overall significance, which I include in my primary tables. I also report the weak-instrument “Angrist-Pischke” multivariate F-test as described in Angrist and Pischke (2009) for both endogenous variables.16 Finally, I report the p-value for the Stock-Wright S-statistic as described in Stock and Wright (2000), which tests for joint significance of endogenous regressors in the case of weak-instrument robust inference.
Column 1 of table 4 shows the primary IV result is statistically significant and approximately 3 times the size of the OLS result. A standard deviation decrease in pollution is associated with almost 6 percent of a standard deviation increase in test scores. The Angrist-Pischke F-values for both endogenous regressors are close to the classic, single-endogenous-variable benchmark of F = 10, while the Stock-Wright S-statistic rejects at just above the 3 percent level. Finally, a comparison of the Kleibergen-Paap F-statistic to the Stock-Yogo weak identification critical values as reported in Stock and Yogo (2002) indicates that the instruments fall between the 15 and 20 percent maximal size threshold when using LIML estimation.17 As a whole, the tests suggest the instruments for both income and manufacturing are well defined. The first-stage coefficient on the manufacturing instrument is approximately 0.61; a one percentage point increase in the ratio of relative manufacturing employment increases ambient TSP levels by 0.61 μg/m³.18
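For intuition about what a first-stage F-statistic measures, here is a minimal sketch on simulated data. It computes the plain homoskedastic F for a single excluded instrument, which equals the squared t-statistic on that instrument. This is a simpler stand-in, not the cluster-robust Kleibergen-Paap or Angrist-Pischke statistics reported in the tables, and the function name, coefficients, and sample size are all illustrative assumptions.

```python
import numpy as np

def first_stage_f(x, z):
    """Homoskedastic F-statistic for instrument z in a regression of x
    on a constant and z (with one instrument, F equals t squared)."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones_like(z), z])
    coef, _, _, _ = np.linalg.lstsq(design, x, rcond=None)
    resid = x - design @ coef
    sigma2 = resid @ resid / (x.size - 2)            # residual variance
    var_coef = sigma2 * np.linalg.inv(design.T @ design)
    return float(coef[1] ** 2 / var_coef[1, 1])

# Simulated data: one instrument that moves the regressor strongly,
# and one regressor that is unrelated to the instrument.
rng = np.random.default_rng(1)
z = rng.normal(size=500)
x_strong = 0.6 * z + rng.normal(size=500)
x_weak = rng.normal(size=500)

f_strong = first_stage_f(x_strong, z)
f_weak = first_stage_f(x_weak, z)
```

A large value indicates that the instrument explains a meaningful share of the variation in the endogenous regressor. The cluster-robust statistics in the paper serve the same purpose while allowing errors to be correlated within county.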
The larger IV results suggest the presence of downward bias in the OLS estimates. Part of this may be classical measurement error in pollution assignment, though this is unlikely to explain the full difference. The IV estimates identify the local average treatment effect, which may be larger than the average treatment effect identified by OLS. It also suggests the presence of omitted variables bias correlated with pollution and test scores. For example, pollution may be higher in more urban areas where access to prenatal care is greater, which could offset the negative effects of pollution. The treatment of income as endogenous may also be important. There are complex relationships between income and test scores as well as income and pollution, and allowing income to be endogenous may influence results. I address this further below as I explore the robustness of my IV results.
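The measurement error channel described above can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are simulated, the true effect and noise levels are arbitrary, the error is purely classical, and the estimator is the simple single-instrument IV ratio rather than the LIML estimator with fixed effects and clustering used in the paper. With one instrument and one endogenous regressor the model is exactly identified, in which case LIML and two-stage least squares coincide.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = -0.06  # hypothetical true effect, chosen only for illustration

z = rng.normal(size=n)                     # instrument
x_true = 0.6 * z + rng.normal(size=n)      # true exposure
x_obs = x_true + rng.normal(size=n)        # exposure measured with error
y = beta * x_true + rng.normal(size=n)     # outcome

def ols_slope(x, y):
    """Slope from a regression of y on x with an intercept."""
    cov = np.cov(x, y)
    return float(cov[0, 1] / cov[0, 0])

def iv_slope(z, x, y):
    """Single-instrument IV estimate: Cov(z, y) / Cov(z, x)."""
    return float(np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1])

ols = ols_slope(x_obs, y)
iv = iv_slope(z, x_obs, y)

# Share of measured variance that is true signal (1.36 / 2.36 here).
attenuation = 1.36 / 2.36
```

In this setup the OLS slope settles near the true effect multiplied by the attenuation ratio, while the IV slope settles near the true effect itself, so the IV estimate is larger in magnitude. The simulation speaks only to the measurement error explanation; it says nothing about the local average treatment effect or omitted variables explanations also raised in the text.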
Column 2 controls for estimated nonmanufacturing employment levels (total county employment minus manufacturing employment, divided by population) to test if the first stage is a function of total employment rather than manufacturing employment. The first-stage coefficient is smaller, but second-stage results remain significant though slightly smaller in magnitude. I next add the number of days in the year of birth that were above 85 degrees and below 25 degrees, given the relationship between high temperature and birth weight found in Deschênes, Greenstone and Guryan (2009) and the relationship between low temperature and cognitive skills found in Stoecker (2010). I also add the number of days that the average wind speed was above 13 miles per hour, which corresponds to a 4 on the Beaufort scale and is considered fast enough to raise dust on land. Addition of these variables leaves results largely unchanged (Column 3).
Columns 4, 5, and 6 explore the importance of income per capita in the year of birth in my analysis. In Column 4 I treat only pollution as endogenous while including income per capita as a covariate. This substantially lowers the strength of the first stage by both increasing the standard error and decreasing the coefficient on the manufacturing instrument. The second-stage coefficient on TSPs is now large in magnitude but has a p-value of 0.11. This is a potential concern, particularly if it implies that IV results are driven by the inclusion of the income instrument discussed in Section V. However, this does not appear to be the case. Column 5 repeats Column 4 but excludes income completely. While the estimated impact of TSPs in the second stage is now lower, it remains statistically significant at the 5 percent level. In Column 6, I instrument only for income, leaving TSP as exogenous. Results are smaller than the baseline specification but remain larger than OLS, suggesting potential omitted variables bias in income as well as pollution. As a whole, these findings suggest that how I treat income in the year of birth is an important factor. The IV result is not driven by the inclusion of the income instrument, but failing to treat income as endogenous means both the first and second stage lack clean identification, as income is likely to be correlated with the error term in both stages of the regression.
Finally, in Column 7 I return to my main specification but use the TLI passage rate as the outcome variable. Results suggest that a standard deviation decrease in pollution is associated with a three percentage point increase in countywide passage rates, or around 5 percent of a standard deviation.
VII. Discussion
Using standardized test scores as a measure of cognitive development presents some additional complications. TAAS math exam passage rates increased from 57 percent in 1994 to 83 percent in 2002.19 This increase may be due to improved schooling, decreases in ambient pollution levels, or other, less socially productive changes such as “teaching to the test.” In order for these effects to bias results, such practices must vary across counties over time in a manner that is correlated with TSPs as well as my instrument, and must be present for the 1981–83 birth cohorts. The plausibly exogenous nature of the earlier recession shock provides some safeguard against such confounders.
Texas changed how special education students were treated in the 2000 test year. Prior to 2000, special education students did not have their test scores used in the calculation of school-wide passage rates, which were then used to grade schools and determine sanctions. After the 1999–2000 school year, special education scores were included. This could have caused schools/districts to change which populations of students were classified as special education, and the relevant policy change occurs during the testing time frame associated with birth cohorts during the recessionary period.20 Richardson (2010) notes that the policy change may have more generally influenced how teachers allocated their time, and caused them to focus on lower achieving students they may have ignored before due to exempt status. In prior drafts I controlled for this more flexibly by allowing the special education effect to vary by year of test, and results were unchanged.
An additional difficulty is the inability to observe parental behavior changes correlated with the level of treatment. Pollution could have observable impacts such as lowered performance or difficulty concentrating in class, which could in turn cause parents to change the allocation of resources. Almond, Edlund, and Palme (2009) found that, in the case of cognitive damage due to radiation exposure in utero, parental behavior reinforced the effects when considering differences between siblings. If this extends to my analysis, it would suggest the effect I find might be partially driven by adjusted parental response.
VIII. Conclusion
I find a statistically significant relationship between prenatal pollution exposure and educational outcomes, specifically performance on standardized high school exit exams. Results are statistically significant only in the periods of the most drastic pollution variation, suggesting a subtle relationship that may be difficult to separate from background trends using minor differences in pollution across counties or gradual changes driven by time. OLS results show a standard deviation decrease in the average annual ambient TSP level during the year of birth is associated with an increase of 2 percent of a standard deviation in test scores, and just under one percentage point increase in countywide test passage rates. Instrumental variables results suggest the same drop in TSPs is associated with an almost 6 percent of a standard deviation increase in test performance and an increase in county passage rates of almost three percentage points. I also address the issues of selection into motherhood and migration.
When gauging the magnitude of such estimates, one must consider the measurement of exposure—the average TSP level in the year of birth. The impacts should not be considered the effect of a brief shock (such as, for example, the effect of pollution on weekly mortality), but rather the impact of an overall lower level of pollution exposure during the entire process of fetal development. As an additional frame of reference, consider the recent finding in Rockoff (2004) that moving one standard deviation up in the distribution of teacher quality raises same-year test scores by approximately 10 percent of a standard deviation. Though the one-time impact of this finding makes it less directly comparable to the long-run effects found with pollution reduction, the magnitudes are nevertheless an interesting comparison.
Infants that survive in higher pollution environments may not escape the consequences of exposure simply because they avoid becoming low birth weight or mortality statistics. Instead, they continue to suffer the effects years later in the form of reduced educational performance. Given that such performance may impact total educational attainment, lifetime earnings, health, and longevity, there are substantial policy implications. For example, Currie and Thomas (2001) find a standard deviation increase in test scores is associated with 11–14 percent higher wages and a 3–7 percent higher employment probability at age 33. And if socially marginalized groups are more likely to grow up in polluted environments, pollution exposure may help to partially explain differences in test scores and other long-run outcomes seen across races and socioeconomic groups, and environmental improvement may help to close such gaps. As noted by Reyes (2007), environmental policy and social policy may at times be one and the same.
Footnotes
Nicholas J. Sanders is an assistant professor of economics at the College of William & Mary, but completed this research as a postdoctoral fellow at the Stanford Institute for Economic Policy Research, Stanford University. He thanks Hilary Hoynes, Christopher R. Knittel, Douglas L. Miller, Jed T. Richardson, and seminar participants at the All University of California Labor Conference, the University of California, Davis, Sacramento State University, Sonoma State University, Lewis & Clark College, University of Tennessee, Williams College, Stanford University, the Atmospheric Aerosols & Health Group, the NBER Summer Institute, the NBER Labor Meetings, and the SIEPR Postdoc Conference for helpful comments. Student-level data used in this analysis contain year of birth, information that is not publicly provided by the Texas Education Agency. It is, however, available for purchase upon request. The author may be contacted for further details at <njsanders{at}wm.edu>.
1. See, for example, Friedman, Powell, Hutwagner, Graham, and Teague (2001); Chay and Greenstone (2003a); Chay and Greenstone (2003b); Neidell (2004); Currie and Neidell (2005); Ponce, Hoggatt, Wilhelm, and Ritz (2005); Lleras-Muney (2010); Neidell (2009); Currie, Hanushek, Kahn, Neidell, and Rivkin (2009a); Currie, Neidell, and Schmieder (2009b); Knittel, Miller, and Sanders (2009); Currie and Walker (2011); Moretti and Neidell (2011); and Sanders and Stoecker (2011).
2. I have repeated my analysis using distances of 10 and 30 miles, and results are largely similar across distance choice. These results are not included but are available upon request.
3. Monitors that take few readings per year have a disproportionately large number of readings taken during the winter.
4. See Martorell (2004) for a more detailed discussion of the TAAS exit exams, and Haney (2000) for discussion of difficulty of the exam across test years.
5. Results for reading scores were also significant but are omitted for simplicity.
6. In the cleaning of test data, I drop all students with missing covariates, nonstandard test administration, English as a second language, and migrant students (approximately 200 students per year classified as frequent movers by the TEA). I also calculate an estimated “age at test” using the year of the exam minus the listed birth year and drop all students with ages suggesting coding errors (calculated ages younger than 15 and older than 18).
7. I drop schools with pupil-teacher ratios that are likely “coding errors,” where I call any given year a coding error if that year’s pupil-teacher ratio is at least three times the size of the average of all other years at that school.
8. From 1977 to 1978, the annual TSP mean dropped by approximately 15 μg/m³, which is likely attributable to the sizable temporary spike in pollution levels seen in 1977. This may have been caused by dust storms that took place in February and March of that year.
9. A possible exception is if job loss causes parents to spend more time at home, which has positive effects via either (1) reduced stress from not working during pregnancy, or (2) increased time spent with the child shortly after birth. Both such potential effects seem small in comparison to the additional stress and income loss caused by loss of employment.
10. Texas did not record the education level of either the mother or father during the period of interest. In prior drafts, I considered the proportion of mothers who are married and the number of prenatal visits as well. No differences in trends were visible.
11. http://pewsocialtrends.org/2009/03/11/sticky-states/.
12. For example, one cell would be nonspecial education white students on free lunch at school s born in year b taking the exam in year t.
13. Rising oil prices helped Texas partially avoid the earlier stages of the industrial recession, but by 1981, external oil supply increased substantially, leading to a rapid decrease in the real price of oil.
14. Chay and Greenstone (2003b) focus on the 1980–82 period, which is the time frame of the greatest variation on a nationwide level, and use a first-difference approach with an alternative instrument. Texas had the recession hit slightly later due to the oil price changes discussed in Section V, hence the later time frame of my analysis.
15. Failing the exit exams on the first attempt may drive changes in future student behavior, though prior research suggests such effects are not present for the Texas exams taken in tenth grade (Martorell 2004).
16. I obtain this statistic using the user-written Stata program xtivreg2 (Schaffer 2010).
17. The maximal critical values provided in Stock and Yogo (2002) are used to bound asymptotic bias and true rejection rates in the presence of potentially weak instruments. See Baum, Schaffer and Stillman (2007) for a discussion of LIML weak identification values and the use of the Kleibergen-Paap F-statistic.
18. This coefficient size is robust to choice of covariate sets, though substantial statistical power is gained with the addition of controls for population density. Results are available upon request.
19. Average passage rates are calculated using all first time test-takers.
20. This policy change means that considering the probability of a student being special education as an additional outcome is infeasible.
- Received July 2011.
- Accepted October 2011.