## Abstract

The probability of dropping out of high school varies considerably with parental education. Using a rich Canadian panel data set, we examine the channels determining this socioeconomic status effect. We estimate an extended version of Carneiro, Hansen and Heckman (2003)’s factor model, incorporating effects from cognitive and noncognitive ability and parental valuation of education (PVE). We find that cognitive ability and PVE have substantial impacts on dropping out and that parental education has little direct effect on dropping out after controlling for these factors. Our results confirm the importance of determinants of ability by age 15 but also indicate an important role for PVE during teenage years.

## I. Introduction

In this paper we investigate the forces determining the high school dropout decision using a rich panel data set that includes survey responses from children, parents, and high school administrators. Following recent work by Heckman, Stixrud, and Urzua (2006); Cunha and Heckman (2007, 2008); and Cunha, Heckman, and Schennach (2010)—hereafter, HSU, CH07, CH08, and CHS, respectively—we use a factor based model to capture the effects of pre-high school skill investments as reflected in cognitive and noncognitive ability measures. Our work confirms their findings of the importance of these abilities (and the investments they reflect), but it also identifies and measures the effects on dropping out of a third unobserved factor: parental valuation of education.

The idea that how much parents value education—either in terms of its intrinsic worth or in terms of their perceptions of the economic returns—affects their children’s educational outcomes is neither surprising nor new. Parental aspirations for their children’s education were seen to be a key factor of the “Wisconsin Model” of educational attainment developed in the 1960s, ’70s and ’80s (Davies and Kandel 1981). Recent work by Attanasio and Kaufmann (2009) and CHS also suggests that parental expectations and influences have a nonnegligible effect on early education decisions. We provide further evidence that parental aspirations are strongly correlated with dropping out of high school. However, the interpretation of these results is complicated. Parents’ answers to questions (such as we use) about the level of education they “hope” their child will attain could reflect their own valuation of education in general, an assessment of their child’s own capabilities, or some combination of the two. If they reflect the former, this would call for policy responses that can substitute for parental and family influences. If, instead, the answers reflect insider knowledge about a child’s own abilities, then policies should focus on how to generate those abilities in the first place. We address this central identification problem using a factor-model approach and provide a clear statement of the conditions under which our estimator permits identification of the causal impact of parental valuations on children’s dropping out. In particular, we show that under a standard index sufficiency condition, estimates of parental valuation effects, conditioning on a child’s existing stock of cognitive and noncognitive skills at age 15, can be interpreted as a direct impact during the teenage years. Specifically, the sufficiency condition requires that the dropout decision depends only on the cognitive and noncognitive skill index values at age 15, not on how those values were generated. 
In this sense, our work is complementary to Todd and Wolpin (2006), CH07, and CHS, which focus on the dynamic production of cognitive and noncognitive skills up to age 14.

One key feature in data on dropping out of high school is the strong correlation between family socioeconomic status and dropping out (for example, Eckstein and Wolpin 1999; Belley, Frenette, and Lochner 2008, for the United States and Canada respectively). In Canadian data, described below, teenage boys with two parents who are themselves high school dropouts have a 16 percent chance of dropping out, compared to a dropout rate of less than 1 percent for boys whose parents both have a university degree. Because of this, we implement a factor model that incorporates flexibility in parental education effects. More specifically, we implement a flexible version of the factor model presented by Carneiro, Hansen, and Heckman (2003)— hereafter, CHH—in which we explicitly account for nonlinearities in skill production by allowing the distributions of skills and parental valuations to vary in shape, as well as location, with parental education. Nonlinearities may reflect socioeconomic differences in endowments and investments, meaning that teenagers from highly educated families and those from less-educated families may be drawing from different ability and parental valuation distributions.

We use the Youth in Transition Survey (YITS), a rich longitudinal data set surveying a sample of over 20,000 Canadian teenagers. All youth were given the PISA reading aptitude test at age 15 (in the year 2000) and were then reinterviewed every two years thereafter. The key dependent variable is whether the child is a high school dropout (that is, is no longer in school and has not graduated) at age 19. We focus on the high school dropout decision partly because it has important implications for lifetime outcomes of youth (see, for example, Campolieti, Fang, and Gunderson 2009) and partly because, with a negligible direct cost of staying in high school, it allows us to focus on issues other than short-term liquidity constraints. The latter are central to discussions of educational choices beyond high school but their presence may obscure investigation of other factors, particularly relating to parental influences, that are more cleanly isolated in our context. As we will see, family income has a very small direct impact on dropping out of high school.

The YITS also includes surveys of a parent and of a school administrator when the child is 15. It contains a long list of questions related to individual characteristics often seen as reflecting noncognitive abilities, as well as questions related to peers, the home environment, and aspirations. Factor models provide an ideal vehicle for examining data of this type, which consist of many noisy measures of individual characteristics. We set out a system containing a dropout indicator function augmented by a set of measurement equations related to key underlying factors, interpreted as the stock of pre-existing abilities (both cognitive and noncognitive) and parental valuation of education. The parental valuation factor is constructed to be correlated with measures of parental aspirations for their children’s education and their willingness to save for that education after controlling for family income level. Thus, it is a factor related to parental valuation of education as revealed in parental statements and actions. Identification of the impacts of the factors in this class of models is obtained through covariance restrictions. We begin our investigation with a simple model of the dropping-out decision that allows us to describe and defend our identifying restrictions.

We find that the highest ability students are predicted never to drop out regardless of parental education or parental valuation of education. However, for teenagers with medium and low cognitive ability, family background plays an important role in explaining their dropout decisions. For the least skilled children with parents who are themselves high school dropouts, whether their parents value education or not makes a strong difference in their chances of dropping out. A low cognitive-ability teenager whose parents are high school dropouts has a probability of dropping out of 0.045 if his parents place a high value on education, and 0.40 if his parents’ valuation of education is low. Interestingly, when parents place a high value on their less-skilled children’s education, we find that parents’ own education level does not affect dropping out. Thus, differences in dropping out by parental education status arise because families with diverse education backgrounds have different distributions of abilities and of parental valuation of education. Interpreted within our factor model, this is evidence that parental education does not have a direct effect on dropping out during the teenage years, though it could still play a role in developing cognitive and noncognitive abilities up to age 15.

The paper is organized as follows. Section II contains a brief discussion of previous literature. In Section III, we present a simple life-cycle model describing the decision to drop out from high school. In Section IV, we map the model to data, setting out an estimable counterpart. In Section V, we describe the data, and Section VI contains results from the estimator described earlier. In Section VII, we summarize and conclude.

## II. Previous Literature

Our paper fits within a rich literature examining the high school dropout decision, particularly for the US. In papers dating back to the 1960s, researchers developed variants of what came to be called the “Wisconsin Model” of educational and occupational attainment (Sewell, Haller, and Portes 1969; Alexander, Eckland, and Griffin 1975; Haveman, Wolfe, and Spaulding 1991). A key element of this model was its emphasis on the development of educational aspirations during adolescence and the importance of parents and peers in shaping those aspirations. Parental aspirations for their children were seen to be of particular importance (see, for example, Davies and Kandel 1981).

Within the more recent economics literature, Eckstein and Wolpin (1999) uses a structural dynamic choice framework to examine dropping out in the United States. It finds that dropouts have lower ability and motivation as well as lower expectations about rewards from graduation. Todd and Wolpin (2006) investigates the form of the production function for cognitive skills using data from the NLSY79 Children Sample, focusing on implications for racial gaps in test scores. It finds that mother’s ability (as measured by the AFQT test) has a large impact on test score outcomes. Oreopoulos, Page, and Stevens (2006) finds that increases in parental education, stemming from changes in compulsory school laws in the United States, result in reduced dropping out for their children, though Black, Devereux, and Salvanes (2005) finds only limited evidence of such effects in Norwegian data.

Several papers find an association between parental education and time and money inputs into children’s education and then between those inputs and children’s educational outcomes. Carneiro et al. (2013) finds a causal impact of increased parental education on time inputs such as time spent reading to a child and whether the child has been taken to a museum. Haveman, Wolfe, and Spaulding (1991) and Del Boca, Flinn, and Wiswall (2014) argue that more time spent with a child, particularly when the child is young, improves educational outcomes.

Our paper contributes to this literature by focusing on the unobserved channels through which socioeconomic status operates. We distinguish between parental educational plans for a specific child (which partly reflects their assessment of the child’s abilities) and parents’ valuation of education in general. Thus, parents may generally value education very highly while also having low aspirations for what a child will achieve because they believe that child has low abilities. This distinction means that we can, under certain assumptions, test whether parents’ education and the value they place on their children’s education have a direct effect on children’s behaviour during their teenage years. Under our approach, increased parental time inputs are a reflection of parental valuation of education.

## III. Determinants of Dropping Out

### A. Model Outline

In this section we set out a model of the decision to drop out of high school in an intertemporal optimizing framework. We use the model to specify exactly what we are estimating and how to achieve identification. Before describing the model it is helpful to briefly sketch the nature of the data we employ because the ultimate goal of the theoretical framework is to guide our empirical analysis.

#### 1. Data preliminaries

We use data from the Youth in Transition Survey (YITS). The YITS is a longitudinal survey that tracks the experiences of two cohorts of Canadian youth. It provides a rich panel of information on the participants’ demographic background, their participation in education and work, as well as their beliefs, attitudes, and behaviours. The younger cohort was 15 years old when the first cycle of data was collected in 2000. Because schooling is legally required up to age 15, we use data from this cohort. The first cycle of data therefore provides a means to characterize a baseline. In the YITS, participants are surveyed every two years. We also use data from the third cycle when the youth were 19 years old.

The YITS is useful, in part, because it includes a parents’ survey completed by the parent or guardian who identified him or herself as “most knowledgeable” about the child. The responding parent provided data about their and their partner’s education, work, and income. Parents also answered questions about their attitudes toward, and aspirations for, their children. At the time of the first survey the children also completed a reading test that was administered through the Programme for International Student Assessment (PISA). PISA was an effort, coordinated by the Organisation for Economic Co-operation and Development (OECD), to generate internationally coherent measures of cognitive skills. We use data from the PISA reading cohort. The YITS also includes scores for math and science, but while all of the students wrote the reading test, only half of them wrote either the math or science test.

#### 2. A simple decision problem

Teenagers are assumed to make the dropout decision rationally based on expected returns given their levels of ability and their information on the returns to education. We recognize that modeling teenagers as rational, forward-looking agents may stretch credulity, so we also modify the model to allow parents to enforce a minimum effort level.^{1}

In setting out the model, we divide individual lives into three periods, numbered from zero to two. The middle, or teenage, period (Period 1) corresponds to the time after the legal school leaving age (16 in most Canadian provinces in our sample period) and before the typical graduation age (18). The dropout decision is made in this period and we model it as conditional on the ability the teenager has accumulated in Period 0 (that is, up to age 15) and on expected returns to high school graduation in the future (Period 2). We do not model optimizing decisions in Period 0 but we begin with a description of that period because assumptions related to the generation of ability in that period are relevant for the interpretation of our estimates.

### B. Period 0: The “Shaping” of Teenagers

We assume that a child is endowed with an ability vector, θ_{0}, at birth. The vector has two elements, corresponding to cognitive and noncognitive ability, and is determined by,

(1)

where *f*_{1}(.,.) is a (possibly nonlinear) function, θ_{F} is a (2 × 1) vector of hereditary cognitive and noncognitive abilities characterizing a family, and ι is a vector of individual-specific traits that are randomly assigned.

Ideally, we would like to separately account for the impact of youth’s ability and observable parental characteristics, such as education, on the dropout decision. However, this is complicated by:

1. The fact that parental education is likely a function of θ_{F}. Indeed, we will assume that parental education (*PE*) is determined as,

(2)

where ν_{p} corresponds to parental valuations of education and η summarizes all factors contributing to *PE*, which are orthogonal to θ_{F} and ν_{p}. To be more specific, we interpret ν_{p} as reflecting parents’ beliefs about returns to education, with parents who believe that returns are higher having higher values for ν_{p}. One could, alternatively, interpret ν_{p} as a preference parameter reflecting parental taste for education. There is nothing in the data that would allow us to delineate these interpretations and changing the interpretation does not affect the nature of our estimator.

2. The ability we observe in the data (ability at the start of the teenage period, denoted as θ_{1}) will itself be a function of parental inputs. In particular, we assume it is a function of θ_{0}, of parental education (either because it reflects family income effects or because hours of parental time from educated parents are more effective in generating children’s ability), and of parental valuation of education (because it helps determine how much effort parents invest in improving their child’s ability). That is,

(3)

where, following CH07 and CHS, it is possible that cognitive ability at age 15 is a function of both endowed cognitive and noncognitive abilities, and the same is true for age 15 noncognitive ability.

### C. Periods 1 and 2: The Dropout Decision and After

In Period 1, a teenager has two options: study toward a high school degree or work at the market wage for dropouts (denoted as wage *w*_{LHS}). To simplify the discussion, we assume that *w*_{LHS} is the minimum wage and, therefore, is not a function of a person’s abilities. In Period 2 (representing the remainder of life), dropouts earn *w*_{LHS}. The Period 2 earnings for graduates are higher and we will assume they are determined by,

(4)

where *grd* is a measure of academic performance, α_{0} and α_{2} are scalars, and α_{1} is a vector.

The superscript *p* in Equation 4 indicates that the equation represents a prediction conditional on the information available to the teenager and his or her family. We allow the information about market returns to differ across families and youths, by specifying,

(5)

where ν_{p} denotes the heterogeneous parental valuation of education. Thus, children’s notions of the returns to education increase with their parents’ perceptions of the same. Parental education is included on the assumption that more-educated parents may have better information on the returns to education (Junor and Usher 2003). Note that while this specification incorporates predictions about future returns, the model still doesn’t have any important uncertainty. Each family acts as if it knows the returns to education; it’s just that what they claim to know differs across families.

The academic performance measure in Equation 4 is determined by,

(6)

where the ψs are parameters or vectors of parameters as required. Thus, academic performance is potentially determined by school inputs, *x*_{s}, the child’s abilities, his or her effort level, *e*, and by parental inputs determined by *PE* and ν_{p}. The combination of Equations 4 and 6 implies that the return to effort in school comes in the form of higher earnings in Period 2. We do not explicitly model the related choices and outcomes in Period 2 but differences could arise, for example, if higher grades raised the probability of going to university with its attendant higher earnings. Equation 4 can be seen as a linearization of such processes. This linearization is also consistent with a richer, less restrictive model of earnings.

### D. Utility

We assume linear utility of consumption *U*(*c*) = *c*, in order to focus on perceived payoffs. Effectively, an agent chooses a consumption level *c*_{t} in each period *t* ∈ {1, 2} by choosing whether or not to stay in school in Period 1. Labor is inelastically supplied in each period. The labor endowment in the second period, *n*, reflects the expected length of working life after age 18. Labor income is consumed in full during each period and agents have no means of transferring wealth between the two periods, with the notable exception of completing education. We assume that student consumption in Period 1 is based on transfers from their parents determined by a combination of parental permanent income, *PI*, and current family income, *FI*. This allows transitory income shocks to have an impact on education decisions as one might expect in the presence of credit constraints (Coelli 2011). Students cannot increase their Period 1 consumption by working and optimally choose “schooling effort,” *e*, which has an impact on their future earnings. Effort affects utility negatively and, for convenience, we assume that it enters utility additively through a general function *g*(*e*) = −[γ*e* + (1/2)*e*^{2}], with −γ > 0 being a minimum level of effort. While not specifying their utilities directly, we assume that parents value their children’s consumption and are willing to exert their own effort to induce education effort by the children. Because of this, we assume that the minimum effort level for the child is a positive function of ν_{p}, implying that even myopic students may supply enough effort to graduate if their parents believe that returns to education are high.

### E. Empirical Specification for the Dropout Decision

The decision of whether to drop out of high school is determined by the difference between the lifetime utilities associated with dropping out and graduating, evaluated at the optimal effort level. Lifetime utility for an agent who does not drop out can be written as,

(7)

where *f*_{s}(*PI*, *FI*) is a function of parental permanent income and current income denoting consumption by a youth while in school, and β is the discount factor. Given the objective functions and constraints, we can easily show that the optimal effort of a student is

(8)

Optimal effort is a function of patience and expected working life of an agent, as well as the importance of effort in determining schooling outcomes. It will also be affected by parental valuations of the returns to education and peer characteristics through their impacts on γ.
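As a worked illustration of why −γ acts as a floor on effort, consider a deliberately simplified version of the student's problem. The scalar *b* is hypothetical shorthand rather than the paper's notation: it stands for the discounted Period 2 payoff to one unit of effort, assumed constant, as it would be if effort entered grades linearly and grades entered earnings linearly. Under that assumption the effort choice reduces to a concave quadratic problem:

```latex
\begin{aligned}
\max_{e}\;\; & b\,e + g(e) \;=\; b\,e - \gamma e - \tfrac{1}{2}e^{2} \\
\text{first-order condition:}\;\; & b - \gamma - e^{*} = 0 \;\;\Longrightarrow\;\; e^{*} = b - \gamma \\
\text{no perceived return } (b = 0):\;\; & e^{*} = -\gamma > 0
\end{aligned}
```

In this sketch a student who perceives no return to effort still supplies −γ, and anything that raises −γ (such as a higher ν_{p}) raises effort one for one. Equation 8 remains the paper's own expression for optimal effort.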

Lifetime utility for an agent who drops out of high school in Period 1 is simply,

(9)

where *f*_{w}(*PI*, *FI*) summarizes consumption transfers to a dropout youth in Period 1.

When deciding whether to drop out (*d* = 1) or not (*d* = 0), an agent examines the difference between *V*_{S}(*e*\*) and *V*_{W}, where *V*_{S}(*e*\*) is the value of staying in school while providing the optimal effort, *e*\*. Assuming that *f*_{S}(·) and *f*_{W}(·) are linear and that parental permanent income, *PI*, can itself be approximated as a linear function of parental education, this difference is given by,

(10)

where θ_{11} and θ_{12} are two ability factors that we will loosely call “cognitive” and “noncognitive” ability respectively, and *u*_{0} is an error term that incorporates an idiosyncratic component of current utility as well as any added randomness associated with the grade function and second period earnings for graduates. This index function completely determines dropping out, with *d* = 1 if *I*_{D} > 0, and it is the basis of our estimation. Notice that because we have substituted in for optimal effort, variables such as hours of studying do not belong in the index.

Our main interest is in estimating γ_{2} and λ_{dv}: the impacts of parental education and parental valuation of education. We treat the dropout decision in the teenage years as conditional only on the actual values of θ_{11} and θ_{12}, not on the specific combination of innate ability and family inputs that generated those values. If we assume, in addition, that there are no further factors that are both relevant for education decisions and a function of ν_{p} and *PE*, then the vector θ_{1} is a sufficient statistic for all the education-related decisions that were made before the teenage years. Under these assumptions, the estimated effects of parental education and ν_{p} are interpreted as effects not already reflected in θ_{11} and θ_{12}, that is, as new and incremental effects on the dropout decision after age 15. In the data, we observe proxies for the θs and ν_{p} at the same time (age 15). Thus, we effectively identify the parental valuation effect by comparing children with the same ability values but differences in parental valuation of education. This means that we are using variation in parental valuation of education that is orthogonal to the child’s abilities at the start of the teenage years. Given all this, we are able to interpret ν_{p} effects as reflecting the direct impact of parental valuation of education (operating, perhaps, through channels such as inducing children to put in more effort) rather than “insider information” about children’s abilities. Such an interpretation requires the additional assumption that our parental valuation measures do not reflect predictions about future changes in children’s abilities past what would be predicted based on their abilities at age 15. In general, these observations highlight that we are not estimating the total impact of parental education and ν_{p} since both could have played a role in the production of the age 15 abilities.

Estimation of Equation 10 is complicated by the fact that we do not directly observe θ_{11}, θ_{12}, or ν_{p}. In the next section, we present empirical approaches to address this problem using the model to interpret what we obtain from each approach.

## IV. Empirical Strategies

The discussion in the previous section suggests that estimation of Equation 10 without accounting for the θ and ν_{p} factors will yield biased estimates of γ_{2} because of the predicted correlation of parental education with those factors. One possible solution to this problem is to introduce a proxy for each of the unobserved factors. To understand the issues with this approach, consider a simplified example where dropping out depends only on one ability factor that is related to a test score. Our data include results for students taking the PISA tests at age 15 (described in the data section). Assume that the test score is generated according to,

(11)

where *PISA* is the PISA test score. In this equation, and the other measurement equations that follow, the δs and λs are either parameters or vectors of parameters, as required, and the *u*s are error terms that are assumed to be independent of covariates, the factors and the error terms in all other equations. Equation 11 says that the test score is a reflection of the true value of ability at age 15, observed with error.^{2} We loosely denote θ_{11} as “cognitive” ability as it features in a cognitive test score equation but describe its actual content in more detail below.

Consider estimating a regression specification for dropping out that includes PISA as a proxy for θ_{11}. To derive such a specification, we can solve Equation 11 for θ_{11} and substitute the result into Equation 10. The resulting specification will include PISA as a covariate but *u*_{1}, the error term determining PISA, will also appear in the error term of the new specification. Thus, estimates will be inconsistent. In particular, the coefficient on *PE* will reflect ability effects because the part of ability not fully captured in the test score is correlated with *PE*. We could address this problem and obtain consistent estimates if we had a second proxy for θ_{11} and used it as an instrument for PISA in the dropout equation (Chamberlain 1977). This is a reflection of the key point made by CHH that we can obtain consistent estimates if we have at least two proxies related to each factor. In our simple single factor example, one can show that the CHH systems estimator and the IV estimator using one proxy as an instrument for the other are equivalent. The CHH systems estimator goes further in allowing consistent estimates in the presence of multiple unobserved factors and, as that is our situation, we employ its estimator to obtain the main results in this paper. We also present some initial results using the simple proxy estimator. We view the latter as essentially a reduced form way to characterize the main patterns in the data, allowing the reader more direct insight into the variation we are using than is easily obtainable from the systems estimator.
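The single-factor argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the estimator used in the paper: the outcome is a linear stand-in for the dropout index rather than a binary indicator, all coefficient values are hypothetical, and the names `simulate`, `ols`, and `iv` are introduced here for illustration. Parental education is given no direct effect, so any nonzero coefficient on it is bias from the measurement error in the proxy.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Simulate one latent ability with two noisy proxies (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    pe = rng.normal(size=n)                # parental education, standardized
    theta = 0.6 * pe + rng.normal(size=n)  # latent ability, correlated with PE
    pisa = theta + rng.normal(size=n)      # first proxy: ability plus error
    grade = theta + rng.normal(size=n)     # second proxy with an independent error
    # Linear stand-in for the dropout index: ability matters, PE has no direct effect.
    y = -1.0 * theta + 0.0 * pe + rng.normal(size=n)
    return pe, pisa, grade, y


def ols(y, X):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def iv(y, X, Z):
    """Just-identified instrumental variables estimator, (Z'X)^{-1} Z'y."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)


pe, pisa, grade, y = simulate()
const = np.ones_like(pe)
X = np.column_stack([const, pe, pisa])   # regressors: PISA used as a proxy for ability
Z = np.column_stack([const, pe, grade])  # instruments: the second proxy replaces PISA

b_proxy = ols(y, X)
b_iv = iv(y, X, Z)
print("proxy regression, coefficient on PE:", round(b_proxy[1], 3))
print("IV regression, coefficient on PE:   ", round(b_iv[1], 3))
```

In this design the proxy regression attributes part of the ability effect to parental education (its coefficient converges to −0.3 rather than 0), while instrumenting one proxy with the other recovers both the zero direct effect and the ability coefficient of −1.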

Like many panel data sets, the YITS includes a large set of background variables, with the number expanded by the fact that parents and children are asked separate sets of questions. CHH proposes using extensive sets of variables such as these to construct a system of measurement equations in the spirit of factor analysis to identify and control for the effects of latent factors. As just stated, we require at least two such measurement equations related to each factor along with the main estimating Equation 10. This system is estimated jointly, imposing identifying covariance restrictions, which we discuss below.
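To see why two measurements plus the outcome equation are enough, it may help to look at a one-factor linear system in which the error terms are mutually independent (the covariance restriction). The sketch below is a moment-based illustration with hypothetical parameter values and function names; it is not the jointly estimated system described in the text, which has several factors and a binary outcome. With the first loading normalized to one, the three cross-covariances pin down the remaining loadings and the factor variance.

```python
import numpy as np


def simulate_system(n=500_000, seed=1, lam2=0.7, lam_d=-0.5, var_theta=1.5):
    """One latent factor, two measurements, one outcome index (values hypothetical)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(scale=np.sqrt(var_theta), size=n)  # latent factor
    m1 = theta + rng.normal(size=n)               # measurement 1, loading normalized to 1
    m2 = lam2 * theta + rng.normal(size=n)        # measurement 2, free loading lam2
    d_index = lam_d * theta + rng.normal(size=n)  # linear stand-in for the outcome index
    return m1, m2, d_index


def recover_from_covariances(m1, m2, d_index):
    """Back out loadings and the factor variance from the cross-covariances."""
    c = np.cov(np.vstack([m1, m2, d_index]))
    lam2_hat = c[1, 2] / c[0, 2]        # cov(m2, d) / cov(m1, d) = lam2
    lam_d_hat = c[1, 2] / c[0, 1]       # cov(m2, d) / cov(m1, m2) = lam_d
    var_theta_hat = c[0, 1] / lam2_hat  # cov(m1, m2) / lam2 = var(theta)
    return lam2_hat, lam_d_hat, var_theta_hat


lam2_hat, lam_d_hat, var_theta_hat = recover_from_covariances(*simulate_system())
print(lam2_hat, lam_d_hat, var_theta_hat)
```

The ratios work only because the errors are independent across equations; if two measurement errors were correlated, the corresponding covariance would no longer be a pure function of the loadings and the factor variance.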

For the case of the “cognitive” ability factor, the first measurement equation we use is the one for the PISA score, Equation 11. Another measurement of cognitive skills in the YITS is provided by students’ grades reported at age 15. An expression for this can be obtained by substituting the solution for optimal effort into Equation 6, yielding,^{3}

(12)

Specifying PISA as being related only to θ_{11} is a key identifying assumption. Several papers have shown that even results on low-stakes tests such as the PISA are related to individual traits such as a desire to please as well as cognitive ability. Because of this, we interpret θ_{11} as a combined factor that reflects all abilities that students employ in cognitive tasks, even if they would not normally be called pure cognitive ability. Our second factor, θ_{12}, will then capture other noncognitive traits that are, because of the way they are introduced into the estimator, orthogonal to θ_{11}. Essentially, we are not concerned with cleanly delineating cognitive and noncognitive abilities but, rather, are interested in capturing a combination of them as completely as possible so that we can hold them constant while isolating the effects of parental valuation of education. In this regard, it is important that Equation 11 also embodies an assumption that parental valuations (ν_{p}) do not affect the PISA score. We make this assumption because, as a test that does not affect school outcomes, it is not of direct concern to parents. In contrast, grades are potentially influenced by personal traits not related to cognitive tasks and parental valuations because, for example, teachers may reward noncognitive skills or high valuation parents may help children with their homework. Equation 12 allows for those effects.

To provide supporting evidence for the latter assumption—that grades are a function of parental valuations but PISA scores are not—we estimate versions of Equations 11 and 12 using observable proxies in place of unobservable factors. The results from these regressions are reported in Appendix 1 Table A1. We use PISA reading scores to proxy for cognitive ability. To proxy for parental valuations of education we use a measure of parental aspirations based on a variable built from parents’ responses to a question about the level of education they hope their child will achieve (more details on this variable are provided below). In addition to the PISA reading scores, we also have—and use—PISA math scores for half of the students and PISA science scores for the other half. We first regress PISA math scores on reading scores, parents’ aspirations for their children’s education, and controls for family characteristics and noncognitive skills. If cognitive related skills can be summarized by a single index then parental aspirations should have no impact on PISA math scores after controlling for the reading scores. That is, in fact, what we find. Children whose parents hope they will attend university score only two points higher on the PISA math test when compared to similar children whose parents have lower educational aspirations, and this effect is not significantly different from zero at any conventional significance level. To put these results into context, the mean math score is 545 and the standard deviation is 80 points. For comparison, we also regressed the students’ grade 10 math grades on the same set of covariates. In this regression, parental aspirations do have a statistically significant effect (at the 1 percent significance level): On average, math grades are 4.24 points higher (which is roughly one-third of a standard deviation) for children whose parents have university aspirations for them. 
We get very similar results when we repeat this exercise using PISA science tests and science grades. We view this evidence as providing substantive support for a specification in which PISA scores depend only on ability while school grades have more multi-faceted influences.

Choosing measurements for the second, “noncognitive”, element in the ability vector is complicated by the fact that noncognitive abilities are heterogeneous and difficult to reduce to one factor. Borghans et al. (2008) argues for classifying these abilities into the Big Five factor scheme favored by some psychologists. However, they also present evidence that among the Big Five factors “conscientiousness” is strongly related to education outcomes while several of the others are not. Rather than try to extract a factor from a set of disparate questions, we restrict ourselves to questions related to conscientiousness. Conscientiousness is associated with being achievement-oriented, self-disciplined, and confident. As a primary proxy for this, we use a question asking students how often the statement “I do as little work as possible. I just want to get by” is true for them. We code a variable equaling 1 if they answer “Never” and assume this is determined by an underlying index function,

(13)

Whether a child provides only the minimum effort depends on their level of conscientiousness, θ_{12}, but also on parental valuation of education since parents who value education highly may pressure children to do more than the bare minimum. Following CH08 and CHS, we assume that, among the ability factors, the current value of this measurement variable reflects only noncognitive ability, θ_{12}, though cognitive abilities may have been an input into the production of θ_{12} itself in the past.

Our second measure of noncognitive ability is based on a question asking the student whether he completes his assignments. This is related to the organization and goal-oriented dimensions of conscientiousness. We specify the index function determining this variable as,

(14)

where we have again assumed parents have an effect on achieving education-related outcomes, such as handing in homework.

As a measure for ν_{p}, we use the parental aspirations variable we introduced earlier. We will call that variable *parasp* and assume it is determined according to,

(15)

The aspirations that parents hold for their children’s education are clearly a function of the child’s ability, which is reflected in Equation 15. It is worth emphasizing that ν_{p} is by construction orthogonal to both components of θ_{1}. This presupposes that parents’ valuation of education is separable from their child’s ability. One might think of ν_{p} as the answer a parent would give to the aspiration question before their child was born. Put another way, if parents had insider information about their children’s abilities, that knowledge would not be reflected in ν_{p} (unless there were other unobserved skills that are orthogonal to θ_{11} and θ_{12}). In the results section we present direct evidence that estimates of ν_{p} are uncorrelated with other observed measures of skills not included in our model, specifically math and science test scores.

The second measure of parental valuations is an equation corresponding to parents’ answers to a question about whether they have saved for their child’s future education. We use this as a dummy variable, the value of which is determined by an underlying index function,

(16)

Thus, holding family income constant, parents who value education more highly will likely save for their children’s education. As with the *parasp* variable, savings behavior may partly reflect parents’ information about child’s ability.

Together, Equations 10–16 constitute a system in which the dropout process is specified jointly with measurement equations that help identify the role of abilities and the parental valuation factor. CHH discusses the conditions under which one can obtain identification for all the factor loadings on θ_{11}, θ_{12}, and ν_{p} in these equations as well as the parameters that define the distributions for θ_{11}, θ_{12}, and ν_{p}. In particular, in our system we obtain identification if one of the measurement equations includes only one of the factors. This condition is satisfied by the *PISA* equation, which includes only the θ_{11} factor, an assumption we explained, and indirectly tested, earlier. We also need to normalize one of the loadings for each factor to one. We set λ_{Tθ_1} (the loading on θ_{11} in the *PISA* equation), λ_{sυ} (the loading on ν_{p} in the *saved* equation), and λ_{cθ_2} (the loading on θ_{12} in the *hmwork* equation) to one.

Identification of the factor model also requires that the errors and factors are orthogonal to the observable characteristics determining dropping out and the measurements.^{4} As our model makes explicit, we expect that parental ability and education are inputs in the development of children’s ability. Moreover, as the results in Abbott et al. (2013) and CHS suggest, the relationship between parental and child ability is nonlinear. In our model, if the factor equations (1)–(3) are linear then the shape of the factor distribution is the same for all values of PE and we can write the likelihood with an estimated factor distribution not conditional on PE, which would be the standard version of this type of estimator. However, if, for example, θ_{1} is a nonlinear function of PE but we implement the more standard version of the estimator, then the effect of PE in determining the shape of the θ distribution could be reflected in the coefficient on PE in the dropout and measurement equations.

To address this issue, we specify and implement an “extended” factor estimator in which the points of support for the factor distributions are the same for every observation but the probabilities associated with those points are allowed to differ by parental education. If the factors and errors are orthogonal to the observed covariates conditional on parents’ education, then the identification proofs that are outlined in CHH will hold. To understand the intuition behind this extension, consider a model in which we fully condition on parents’ education. Running the factor model separately for each parental education category amounts to the standard factor model described in CHH. To reflect this intuition, we include parental education in all of our measurement equations. In our model, we restrict the coefficients on observed covariates, as well as the factor loadings and the locations of the points of support, to be the same for each parental education category so that it is possible to assess the direct effects of parents’ education. For comparison with the extended model, we report results from a standard factor model where the factor distributions are constrained to be the same for all levels of parental education.

We have 17 parameters related to the factor distributions to identify (counting the factor loadings that have not been normalized plus the variances of the factors). We have 19 unique covariances that are allowed to be nonzero in the structure among the errors of the dropout equation and the six measurement equations. Thus, the order condition for identification is met. The rank condition corresponds to whether the specific pattern of entry of factors in the various equations allows us to recover all 17 parameters. This is indeed the case. The different probability weights are identified by the extent to which the distributions of our factor-related measures do something other than simply shift proportionally when parental education changes.

We estimate the parameters using maximum likelihood, specifying the factors as having discrete distributions. Conditional on specific values for the factors, an individual’s contribution to the likelihood function is just the product of normal CDF evaluations (as all the dependent variables are actually discrete). This product is calculated for each possible combination of values for the factors, then these factor-conditional products are each multiplied by the associated probability of observing that set of factor values and then summed. Details are discussed in Appendix 2. The factors provide a flexible way to link the various equations, representing the joint distribution as a flexible mixture of normals. Maximizing the likelihood function provides estimates of the γ and δ vectors as well as the factor loadings (the λs) and the locations of the points of support and the associated probabilities for the factor distributions. It also provides consistent estimates of γ_{2}, the direct effect of parents’ education. Allowing the distributions of the factors to depend on parental education introduces an additional channel for socioeconomic status to affect dropping out: Two students with the same abilities and parental valuation might have similar probabilities of dropping out even if their parents have very different education levels. Notice, however, that their ex ante probabilities of having those factor values could be very different.
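The likelihood contribution described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the estimator used in the paper: every equation is treated as a binary probit, the joint factor probabilities are built as products of marginal probabilities, and all names and numbers (`support`, `marginals_by_pe`, the loadings and index values) are hypothetical.

```python
import itertools
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def joint_weights(marginals):
    """Joint probabilities over factor combinations (assumed: product of marginals)."""
    combos = itertools.product(*[range(len(m)) for m in marginals])
    return {c: math.prod(m[k] for m, k in zip(marginals, c)) for c in combos}


def individual_likelihood(outcomes, indexes, loadings, support, weights):
    """Sum over factor combinations of (probability of combination) x (product of probits).

    outcomes : 0/1 outcome for each equation
    indexes  : observable index x'beta for each equation
    loadings : per-equation tuple of loadings, one per factor
    support  : per-factor tuple of support points (shared by all observations)
    weights  : dict mapping a tuple of support-point positions to its probability
    """
    total = 0.0
    for combo in itertools.product(*[range(len(s)) for s in support]):
        factor_values = [support[f][k] for f, k in enumerate(combo)]
        conditional = 1.0
        for y, xb, lam in zip(outcomes, indexes, loadings):
            z = xb + sum(l * t for l, t in zip(lam, factor_values))
            p = norm_cdf(z)
            conditional *= p if y == 1 else 1.0 - p
        total += weights[combo] * conditional
    return total


# Hypothetical support points: three for cognitive ability, two each for
# noncognitive ability and parental valuation.
support = [(-1.0, 0.0, 1.0), (-0.5, 0.5), (-0.5, 0.5)]

# Extended estimator: same support points, probabilities differ by parental education.
marginals_by_pe = {
    "two_dropout": [(0.5, 0.3, 0.2), (0.5, 0.5), (0.7, 0.3)],
    "two_ba": [(0.1, 0.3, 0.6), (0.6, 0.4), (0.2, 0.8)],
}

outcomes = [1, 0, 1]            # hypothetical outcomes for three equations
indexes = [-1.5, 0.2, 0.4]      # hypothetical x'beta values
loadings = [(-0.8, -0.3, -0.6), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]

for pe, marginals in marginals_by_pe.items():
    lik = individual_likelihood(outcomes, indexes, loadings, support,
                                joint_weights(marginals))
    print(pe, lik)
```

The same outcomes and covariates yield different likelihood contributions across the two parental education categories only because the probability weights differ, which is the additional channel the extended estimator allows.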

## V. Data

As stated earlier, we use data from the Youth in Transition Survey (YITS). We focus on boys because the fraction of girls who drop out of high school in these data is very low. Reduced form results for girls are available from the authors. The original sample of 29,687 students was drawn from a two-stage sampling frame. Schools were sampled first. In the second stage, students were sampled within the 1,187 schools. Approximately 13 percent of the sample is lost due to nonresponse to the parental survey. The overall response rate to the third cycle was 66 percent. Some cases were also lost due to missing data or invalid responses to questions. The final sample is 7,755 boys. In all reported results, we use weights provided by Statistics Canada that account for oversampling, nonresponse to the parental survey, and longitudinal attrition.

We identify individuals as high school dropouts if, according to their self-report, they had not completed the requirements for a high school diploma and were not in school at the time of the Cycle 3 survey.^{5} The third wave of the YITS data was conducted between February and June 2004 when respondents were all age 19. In most provinces, this corresponds to the spring of the year following their normal graduation year.^{6}

The unconditional dropout rates at age 19 using our dropout definition are 0.055 for boys and 0.036 for girls. These compare with numbers from the OECD showing that 11 percent of 20 to 24-year-old Canadians (both genders combined) have not completed high school and are not currently in school (de Broucker 2005). Our rates are lower than the OECD numbers partly because some students who have not yet graduated at age 19 and are still in school will ultimately drop out, causing the dropout rate to be higher in the 20–24-year-old age window in the OECD data. Belley, Frenette, and Lochner (2008) also uses YITS data and finds that at the fourth (age 21) wave, the dropout rate for both genders combined is 0.07. We focus on dropping out at age 19 because we believe it provides a clearer picture of the role of family supports on the dropout decision and because it reduces the amount of sample attrition we face. Lower dropout rates in the YITS could also relate to dropouts being more likely to attrite from the sample. Sample weights used in all of our calculations are supposed to account for this but may not do so completely.^{7} Finally, to place Canada’s experience in context, Belley, Frenette, and Lochner (2008) uses the NLSY to show that with a comparable definition of dropping out at age 21, the U.S. dropout rate is 0.17, that is, 0.1 higher than for Canada.

We describe our other variables as they arise in our estimation. A table of sample means is provided in Appendix 1.

## VI. Results

We begin by quantifying the observed socioeconomic gradient in the data before accounting for any unobserved heterogeneity. In all specifications we include (but do not report) province indicators. All standard errors are clustered at the high school level because of the nature of the sampling scheme. We measure the socioeconomic gradient by estimating a probit model including a series of variables capturing socioeconomic status of the child’s family. Key among these variables are parental education and income. Income is defined as total before-tax family income including transfers, expressed in thousands of dollars, and put into adult-equivalent form by dividing by the square root of the number of people in the family. Parental education is captured with a set of six categorical variables corresponding to the highest level of education achieved by both parents: (1) both parents are high school dropouts; (2) one parent is a high school graduate and the other is a dropout; (3) both parents are high school graduates; (4) both parents have a post-secondary education below the BA level or one has postsecondary education below the BA level and the other has a lower level of education; (5) one parent has a BA and the other has some lower level of education; and (6) both parents have at least a BA. Lone parent families are assigned to Categories 1, 3, 4 or 6, depending on the parent’s education. The sample means in Table A1 indicate that approximately 10 percent of the sample falls in each of Categories 1 and 6. We also include variables capturing family structure, with indicators corresponding to lone parent families, two parent families in which both biological parents are present, and “other” two parent families that correspond, essentially, to stepparent families, and other family types (the omitted category is a two-biological parent family). Lone parent families may face a “poverty of time” and other stresses that affect school completion. 
We include a dummy variable for whether the person lives in a rural (as opposed to urban) location and a variable corresponding to the number of times the family has moved in the child’s lifetime up to age 15. We would expect more moves to correspond to a weakening of social connections that may be important in school completion. Finally, we also include variables corresponding to whether the child is an immigrant and whether the youth is of aboriginal descent.^{8}
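As a small illustration of the income definition above, the adult-equivalent adjustment divides total family income by the square root of family size. The function name and the example incomes below are hypothetical and are not taken from the YITS data.

```python
import math


def adult_equivalent_income(family_income_thousands, family_size):
    """Before-tax family income (in thousands) per adult equivalent.

    Uses the square-root equivalence scale: income / sqrt(number of people).
    """
    if family_size < 1:
        raise ValueError("family_size must be at least 1")
    return family_income_thousands / math.sqrt(family_size)


# A family of four needs 30 (thousand) in total income to reach 15 per adult
# equivalent, and 100 (thousand) to reach 50 per adult equivalent.
print(adult_equivalent_income(30.0, 4))
print(adult_equivalent_income(100.0, 4))
```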

To preserve space we do not report the coefficients on all family-related variables. Those coefficients tend to be small in size and follow predicted patterns: Dropping out increases with the number of moves and aboriginal status and declines with immigrant status. An increase in income per adult equivalent for a family of four from $15,000 to $50,000 reduces the probability of dropping out by less than 0.01. This fits with results in Belley, Frenette, and Lochner (2008) indicating that while there are family income effects on educational attainment in Canada, they are not strong. We view parental education as related to the permanent income of the family; current income, once we control for parental education, is therefore something closer to transitory income. In specifications where we do not control for parental education, the coefficient on family income is twice as large.

In Figure 1, we plot predicted probabilities of dropping out for the strongest socioeconomic predictor: parental education. As the figure demonstrates, parental education is strongly correlated with dropping out. Relative to a student whose parents have a BA or higher (a person whose probability of dropping out is 0.007), a student whose parents are both high school dropouts has a 0.14 higher probability of dropping out. Youth whose parents have a high school diploma have a 0.05 higher probability of dropping out compared to those whose parents both have a BA. The main conclusion from the figure is that there is a steep gradient associated with parental education that points toward a calcification of educational differences across generations. Belley, Frenette, and Lochner (2008) shows that dropout gradients with respect to parental education and family income are steeper in the United States, but the evidence in this figure indicates that intergenerational persistence is still an issue in Canada.

Next we present some reduced form evidence of correlations between observed characteristics and dropping out of high school. The purpose of this exercise is to demonstrate that the measures of skills and parental valuations that we use in the paper are strongly correlated with dropping out while other measures such as peer and school characteristics are not related to dropping out after controlling for socioeconomic status. In the first column of Table 1, we present the marginal effects on dropping out of parental income and education as well as PISA reading scores and the parent’s reported aspirations for their children. The reading test scores are entered in quartiles interacted with parents’ aspirations. Parental aspirations are measured by parental responses to the question “What is the highest level of education that you hope [child’s name] will get?” We code a dummy variable equalling one if the parent’s response was a “university degree or higher” and zero if their response corresponded to a “college degree or lower.”
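The coding of the regressors in the first column can be sketched as follows. This is a minimal illustration that assumes unweighted quartiles and made-up scores and response strings; the paper's estimates use survey weights, and the function names here are hypothetical.

```python
import bisect
import statistics


def code_parasp(response):
    """1 if the parent hopes for a university degree or higher, else 0."""
    return 1 if response == "university degree or higher" else 0


def interaction_cells(scores, responses):
    """One indicator per (PISA reading quartile, aspirations) cell for each student."""
    cuts = statistics.quantiles(scores, n=4)  # three unweighted cut points
    rows = []
    for score, response in zip(scores, responses):
        q = bisect.bisect_right(cuts, score)  # 0 = bottom quartile, 3 = top
        a = code_parasp(response)
        rows.append({(qq, aa): int(qq == q and aa == a)
                     for qq in range(4) for aa in (0, 1)})
    return rows


# Hypothetical reading scores and parental responses.
scores = [400, 450, 500, 520, 560, 600, 640, 700]
responses = ["college degree or lower", "university degree or higher"] * 4

cells = interaction_cells(scores, responses)
print(cells[0])
```

Each student falls in exactly one of the eight cells, so one cell must be omitted as the reference category when the indicators enter a probit with a constant.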

We plot the predicted probabilities of dropping out (evaluated at the mean) for the various categories of PISA and parental aspirations in Figure 2. These patterns are interesting in themselves. When boys scored in the top quartile on the PISA reading test they were very unlikely to drop out. For these boys, a change in parental aspirations is associated with a small and statistically insignificant change in the dropout probability. Boys who scored in the top PISA quartile and whose parents hoped they would achieve a university degree had less than a 1 percent chance (0.008) of dropping out, compared to a 3 percent chance for similar boys whose parents hoped for a lower level of educational attainment.

In the bottom three PISA quartiles, parental aspirations have significant impacts that increase in magnitude as we move lower in the PISA distribution. Figure 2 indicates that high parental aspirations not only reduce the likelihood of dropping out but also flatten the gradient across the PISA quartiles. This happens in a non-linear way: The largest proportional reductions in dropping out occur for students in the 2nd and 3rd quartiles.

One way to put these results in context is to consider where along the PISA distribution the probability of dropping out is closest to the unconditional probability and how that differs by parental aspirations. Overall, in the sample 0.055 of boys drop out. Boys whose parents have low aspirations will drop out at the unconditional average rate only when they have reading scores in the third quartile. If a boy’s parents hoped he would obtain a university degree, his chances of dropping out are similar to the average if he is in the bottom quartile of the PISA distribution. The overall implication is that high parental aspirations have a powerful influence on teenage educational outcomes for children whose base (that is, age 15) cognitive abilities lie in the lower three quartiles of the ability distribution.

It is worth noting that once PISA and parental aspirations are included in these reduced form estimates the socioeconomic gradient falls by roughly 40 percent. The difference in the dropout probability between those with two BA parents and those with two dropout parents is 0.13 when just controlling for income, parental education, and other socioeconomic variables but falls to 0.078 when also controlling for aspirations and PISA scores. The gradient conditional on PISA and parental aspirations is shown graphically in the second panel of Figure 1. This suggests that the parental education effects we estimated in earlier specifications partly subsume the factors measured by the aspirations and PISA variables.

In Column 2, we introduce the measure for noncognitive skills that we described earlier, and which takes on the value one if a child said that he never wanted to “just get by.” While this variable, which measures conscientiousness, is significantly related to dropping out, the effect is small in comparison to the effects of PISA and parental aspirations. Never wanting to just get by reduces the probability of dropping out by 0.024. The socioeconomic gradient as well as the PISA and parental aspiration gradients are relatively unchanged after including this proxy.

In the next column of Table 1, we add our second measure of noncognitive skills (an indicator variable equalling one if the student reports he always completes his assignments) and two scale measures of other noncognitive traits (self-esteem and self-efficacy). Self-esteem is measured using the ten-item Rosenberg’s self-esteem scale and captures the youths’ global feelings of self-worth or self-acceptance (see Rosenberg 1965). Because this measures overall psychological well-being, we anticipate that its relationship to behavioral outcomes may be weak. The YITS includes a self-efficacy scale adapted from Pintrich and De Groot (1990) that measures perceived competence and confidence in academic performance.

The results in Column 3 indicate that self-esteem has no direct effect on dropping out, but self-efficacy has a significant impact although a relatively small one. A one standard deviation increase in self-efficacy reduces the probability of dropping out by 0.01. The third column also shows that children who complete their assignments are less likely to drop out by a margin of 0.024. Including this second measure of conscientiousness reduces the effect of the “get by” variable, suggesting that these two measures are highly correlated.

Inclusion of self-efficacy, self-esteem, and the homework indicator does not affect the socioeconomic gradient but does reduce the impact of parental aspirations and PISA scores. As we mentioned earlier, the self-efficacy scale likely also captures cognitive ability. For example, one question included in the scale asks students to indicate how frequently this statement is true: “I’m confident I can understand the most complex material presented by the teacher.” This, along with the correlation between parents’ aspirations and PISA, would explain why including self-efficacy in Column 3 reduces the PISA-aspirations gradient. Nonetheless, the PISA and aspirations gradients remain steep.

In Column 4, we include variables corresponding to peer group characteristics (including whether close friends value school or have dropped out themselves), the local unemployment rate, and personal behaviours such as smoking. The coefficients on these variables indicate little or no association between dropping out and peer behaviour but some significant associations with having a dependent child and smoking. More importantly, introducing the peer, dependent child, and smoking variables has very little impact on the socioeconomic gradient impact estimates, though it does generate a reduction in the size of the aspiration/PISA effects. We also estimated specifications in which we included measures of hours of paid work for the students. None of the measures entered significantly nor changed key estimated marginal impacts. In the final column of Table 1, we incorporate school characteristics that were reported by the high school administrators as a part of the first wave of the YITS survey, including ratios of students to teachers and to computers. Including school characteristics has essentially no effect on the socioeconomic gradient and little impact on the PISA and parental aspirations effects. Thus, for both personal and school characteristics, their inclusion does not alter our main conclusions about the socioeconomic gradient and its relationship to aspirations and ability. Given this, in the remainder of the paper we examine the role of abilities, parental valuations, and parental education in determining dropping out without considering peer or school effects. This allows for a sharper focus on a set of relationships that, anyway, appear to be little or not at all affected by school or peer effects.

### A. Factor Estimators

In this section, we present results from the full factor model set out in Section IV. Recall that our goal with the factor model is to use the added measures of ability (cognitive and noncognitive) and parental valuations available in the YITS to better control for these factors.

A key decision in implementing these models is the number of points of support in the estimated factor distributions. We first estimated the models with two points of support for each factor then added additional points. Adding a third point of support for the cognitive ability factor distribution significantly improved the fit of the model, but adding a third point for the parental valuation and noncognitive ability distributions, and a fourth point of support for cognitive ability, were not helpful. More specifically, the model returned probability masses for the additional points of support that were close to zero and imprecisely estimated. Thus, we implement a specification with three points of support for cognitive ability and two each for parental valuation and noncognitive ability.

Table 2 reports the marginal effects and coefficients describing the relationship between parental education and family income estimated in two different factor models. The first column contains results from our extended system estimator that allows the distributions of the three factors to differ by parental education level. Note that this specification nests the more standard model, where the distributions of the three factors do not vary with parental education, as a special case. For comparison, in the second column we present results from the standard model. Using both the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) measures, the extended model represents a better fit of the data.
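For readers unfamiliar with the two criteria, the comparison can be sketched as below using the standard definitions (lower values indicate a better fit after penalizing for the number of parameters). The log-likelihoods and parameter counts are hypothetical placeholders, not the paper's estimates; only the sample size of 7,755 comes from the text, and the sketch ignores survey weighting.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


N_OBS = 7755  # final sample of boys reported in the Data section

# Hypothetical fits: the extended model has more parameters because the
# factor probabilities vary with parental education.
extended = {"log_likelihood": -9000.0, "n_params": 80}
standard = {"log_likelihood": -9100.0, "n_params": 65}

for name, fit in (("extended", extended), ("standard", standard)):
    print(name,
          aic(fit["log_likelihood"], fit["n_params"]),
          bic(fit["log_likelihood"], fit["n_params"], N_OBS))
```

Because the standard model is nested in the extended one, the extended model's log-likelihood can never be lower; the criteria ask whether the improvement is large enough to justify the extra parameters.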

In both columns of Table 2, the impact of family income is essentially zero: notably smaller than the already small impacts in Belley, Frenette, and Lochner (2008). The gradient in the marginal effects of parental education estimated in the flexible factor model, and shown in Column 1, is flatter than the reduced form and the standard factor model estimates. Examining the coefficients on the parental education variables in the second panel of Column 1 suggests that the effects on dropping out for the four lowest parental education categories are not significantly different from each other. We focus on the gradient between families where both parents have a BA and families where both parents are high school dropouts because that is the steepest gradient in the raw data.

Estimates of the factor loadings, locations, and associated probabilities are given in Table 3 (Table 2 in the case of the factor loadings for the dropout equation). The factor loadings indicate that all three factors have statistically significant and sizeable effects on both dropout and grades indexes. The noncognitive ability factor is marginally significant in the parental aspirations equation and negatively related to parents’ willingness to save for their children’s education. The cognitive ability factor enters the parental aspiration equation significantly but not the *saved* equation. Thus, parental responses and actions relating to their valuation of education do not appear to be a mere reflection of measured child’s abilities. To the extent this carries over to other possible unmeasured factors, the pattern implies that we really are capturing parental valuation of education rather than getting another measure of child abilities.

To further explore the effect of parents’ education and the direct impact of the factors on dropping out, Figure 3 presents fitted probabilities conditional on each possible combination of the unobserved characteristics, based on the estimates from our extended factor model. Predicted probabilities for two BA families and two dropout families are shown in the first and second columns, respectively. Probabilities for all other parental education categories are available from the authors. The fitted values are the predicted probability for each individual evaluated at each possible combination of the mass-points in the factor distributions. Because there are three points of support in the cognitive ability distribution and two points in each of the noncognitive and parental valuation distributions, this yields 12 fitted probabilities for each individual. To estimate the probability for each parental education category, we take the simple average of the fitted probabilities across teenagers whose parents fall within the particular education category. The top panel of Figure 3 shows the fitted dropout probabilities evaluated at high noncognitive ability and the bottom panel shows the same for low noncognitive ability.
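The construction of the fitted probabilities can be illustrated with a short sketch. It assumes a probit dropout index with additive factor terms; the support points, loadings, and index values are hypothetical and chosen only to show the mechanics of evaluating 3 × 2 × 2 = 12 combinations per individual and then averaging within a parental education category.

```python
import itertools
import math

# Hypothetical support points for the three factors.
COGNITIVE = {"low": -1.0, "medium": 0.0, "high": 1.0}
NONCOGNITIVE = {"low": -0.5, "high": 0.5}
VALUATION = {"low": -0.5, "high": 0.5}


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fitted_probabilities(xb, loadings):
    """Dropout probability for one individual at every factor combination."""
    probs = {}
    for (c, tc), (n, tn), (v, tv) in itertools.product(
            COGNITIVE.items(), NONCOGNITIVE.items(), VALUATION.items()):
        z = xb + loadings[0] * tc + loadings[1] * tn + loadings[2] * tv
        probs[(c, n, v)] = norm_cdf(z)
    return probs


def category_average(indexes, loadings):
    """Simple average of fitted probabilities across individuals in a category."""
    per_person = [fitted_probabilities(xb, loadings) for xb in indexes]
    return {key: sum(p[key] for p in per_person) / len(per_person)
            for key in per_person[0]}


# Hypothetical observable indexes for teenagers in one parental education category.
indexes = [-1.8, -1.5, -2.0]
loadings = (-0.9, -0.3, -0.7)  # hypothetical dropout-equation loadings

averages = category_average(indexes, loadings)
print(len(averages))
print(averages[("low", "low", "low")], averages[("high", "high", "high")])
```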

With few exceptions, high-cognitive ability teenagers do not drop out regardless of their parent’s education or the values of the other factors. For teenagers with low noncognitive skills whose parents are high school dropouts and place low value on education, moving from high to low cognitive ability increases the probability of dropping out from 0.019 to 0.40. Parental valuation effects are nearly as large for teenagers with medium or low cognitive abilities. A teenager with low cognitive and noncognitive abilities whose parents place a high value on education has a 0.045 probability of dropping out, which means the impact of parents’ valuations for a low-ability boy is 0.36. Moreover, a student whose parents place a high value on education has essentially a zero probability of dropping out unless he has both low cognitive and noncognitive abilities, and even then his dropout probability is not statistically different from zero. In comparison, noncognitive ability has effects that are substantial but smaller than either of the other two factors. Looking at the bottom right corner of the panels for the children of high school dropouts, increasing from low to high noncognitive ability reduces the probability of dropping out by 0.11.

Table 4 shows the estimated distribution of teenagers across the joint unobserved factor types for each parental education level. These estimates show that teenagers whose parents both have a BA have a relatively high probability (0.14) of having high ability and having parents who place a high value on education. In comparison, 0.014 of teenagers from two dropout families are estimated to possess the same characteristics. Only 0.004 of children with two BA parents fall within the low-ability low-parental-valuation factor type. Thus, raw comparisons of teenagers from these two types of families will capture the fact that the children in the two BA families are more able and have parents who care more about education.

The differences in ability reflect differences up to age 15. These could include intergenerational transmission of ability (Equation 1) but could also include the type of early childhood investment effects stressed in recent papers by Heckman and coauthors (for example, Heckman and Lochner 2000; Cunha, Heckman, and Schennach 2010) (Equation 3). Noncognitive ability does not show the same degree of correlation with parental education. In fact, teenagers whose parents both have a BA are less likely to have high noncognitive skills. This might reflect the notion that, holding cognitive ability and parental valuations constant, children who grow up in affluent families might not need to work as hard and as a result are less likely to develop traits that are related to conscientiousness. In unreported robustness checks, we investigate this finding by examining the correlation between measures of self-efficacy (an alternative proxy for cognitive and noncognitive skills) and parental education. Holding constant PISA scores, grades, and our proxies for parental valuations, parental education is not correlated with self-efficacy.

On the whole these results suggest that much of the observed socioeconomic gradient in the raw data is driven by differences in teenagers’ unobserved characteristics, including how much their families value education. For example, a child whose parents are both dropouts has effectively the same probability of dropping out (zero) as a child with the same ability from a highly educated family if both sets of parents value education highly. The true difference between these children is that the probability of the two dropout parents placing a high value on education is 0.30 while the probability of two BA parents doing so is 0.78. As we stated in our discussion in Section III, these latter differences may themselves reflect impacts of parental education and/or parental valuation of education on educational investments in earlier childhood years. However, the lack of a gradient with respect to parental education, once we condition on abilities and parental valuations of education at age 15, indicates that parental education does not have a direct effect on the dropout decision in the teenage years. We also estimated a version of the model in which we excluded the parental valuation of education factor. In the estimates from this restricted specification, which are available from the authors, there is a differential in dropping out by parental education. The difference between those results and the ones presented here implies that what appears to be a direct effect of parental education in the teenage years in the more restricted model (happening, for example, through more educated parents being better able to help with homework or through permanent income effects) really just reflects parental education proxying for what really matters: whether parents value education.
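The logic of this argument can be written compactly. As an illustrative sketch in notation introduced here for exposition (D for the dropout indicator, PE for parental education, and t for a joint factor type; these symbols are assumptions and are not taken from the estimated model), the law of total probability gives:

```latex
\Pr(D = 1 \mid PE) \;=\; \sum_{t} \Pr(D = 1 \mid t, PE)\,\Pr(t \mid PE)
```

If the conditional dropout probabilities Pr(D = 1 | t, PE) barely vary with PE, as our estimates suggest, then differences in raw dropout rates across parental education groups arise mainly through the type weights Pr(t | PE).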

In our data, what parents who care about education actually impart is a black box: they could devote more resources to their child’s education, they might convince their child that there is a return to effort in school, or they might enforce some level of effort. Our results are in apparent agreement with Behrman and Rosenzweig’s (2002) estimates of the impact of variation in maternal education on children’s educational outcomes holding constant family fixed factors (by using differences in education between twin mothers). However, that paper finds significant effects of paternal education. Carneiro, Meghir, and Parey (2013) finds significant effects of maternal education on grade retention in NLSY data when it instruments for maternal education using local labor market conditions and access to colleges when the mother was 17. To the extent that increasing parental education increases parental aspirations for their children’s education (which Carneiro, Meghir, and Parey 2013 finds is the case), the estimated parental education effects identified by these papers may ultimately occur through the channel we identify.

### B. Interpreting Parental Valuations

Our results would overstate the impact of parental valuations if dropping out were affected by some type of ability which is orthogonal to the PISA reading score but correlated with parental valuations. This could happen if there were other skills, orthogonal to the ones we measure, about which parents have private information.

The YITS includes PISA reading scores for all of the sampled children; however, half of the children also wrote the PISA science test and the other half wrote the PISA math test. The sample sizes are too small to use the science and math tests in the main analysis to estimate cognitive skills. However, we can use these data to perform a simple but informative out-of-sample robustness check to verify (1) whether our identifying assumption is sensitive to the choice of test scores and (2) whether any additional third factor, orthogonal to cognitive and noncognitive skills, exhibits patterns of comovement with the science and math PISA tests, which we have not used to identify θ_{11}. Given that “hard” skills related to math and science are often viewed as quite different from “softer” reading-related skills (with different people embodying high values of each, that is, “math nerds” versus “creative writer” types), we view this test as quite informative.

Assuming that cognitive skills can be summarized with the PISA reading score implies that, after conditioning on θ_{11}, ν_{p} should not predict either science or math skills. We test this implication using estimates of θ_{11} and ν_{p} from our extended factor model. Using the estimated parameters, the factor estimates are obtained by use of Bayes’ rule:

(17)

where Y is the matrix of outcomes (both dropping out and the measurement values), θ is a vector containing all the factors, *k* indexes the factor distribution’s points of support, and Γ is the matrix of all parameters in the system, including those defining the factor distributions.
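For readers who want the general shape of such a calculation, one standard way to write a Bayes’ rule posterior over discrete points of support is sketched below. This is an illustration rather than a reproduction of Equation 17: the symbols θ_k (the k-th point of support), p_k (its prior probability), and f (the likelihood of the outcomes given the factors) are notation assumed here, and the posterior mean is only one possible choice of point estimate.

```latex
\Pr(\theta = \theta_k \mid Y, \Gamma)
  = \frac{f(Y \mid \theta_k, \Gamma)\, p_k}
         {\sum_{k'} f(Y \mid \theta_{k'}, \Gamma)\, p_{k'}},
\qquad
\hat{\theta} = \sum_{k} \theta_k \, \Pr(\theta = \theta_k \mid Y, \Gamma)
```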

Figure 4 shows the relationship between the estimated factors and the students’ math scores. An analogous figure for science scores appears very similar and is available from the authors. Because the factors do not have a meaningful scale, we normalize both the estimated factors and the math scores. The left panel of Figure 4 shows the strong positive association between math scores and our estimates of cognitive ability. In the right panel, we relate the estimates of parental valuation (on the X-axis) with the residual variation in math scores after controlling for estimated cognitive ability. Graphically, it appears that parental valuations do not predict residual variation in math scores. The correlation, albeit significant, is tiny and negative. Based on this, we conclude that the parental valuation factor is not simply a reflection of other cognitive skills, at least, as captured by science and math measures not used in our estimation.
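The residual check described for the right panel can be sketched as follows. This is a minimal illustration on simulated data, not the code behind Figure 4: the function names, the linear residualization, and every simulated coefficient are assumptions made for the example.

```python
import numpy as np


def standardize(x):
    """Rescale a variable to mean zero and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def residual_correlation(score, cognitive, valuation):
    """Correlate valuation with the part of score not explained by cognitive.

    The score is regressed on a constant and the cognitive factor by OLS,
    and the residual is then correlated with the valuation factor.
    """
    s = standardize(score)
    c = standardize(cognitive)
    v = standardize(valuation)
    X = np.column_stack([np.ones_like(c), c])
    beta, *_ = np.linalg.lstsq(X, s, rcond=None)
    resid = s - X @ beta
    return float(np.corrcoef(v, resid)[0, 1])


# Simulated (hypothetical) data: valuation is correlated with ability,
# but the test score depends on ability only.
rng = np.random.default_rng(0)
n = 5000
cognitive = rng.normal(size=n)
valuation = 0.5 * cognitive + rng.normal(size=n)
math_score = 0.8 * cognitive + rng.normal(scale=0.6, size=n)

r = residual_correlation(math_score, cognitive, valuation)

# Contrast case: the score also loads directly on valuation.
math_score_alt = math_score + 0.5 * valuation
r_alt = residual_correlation(math_score_alt, cognitive, valuation)
```

In the first simulated case the residual correlation is close to zero even though valuation and ability are correlated, which is the pattern the check looks for; in the contrast case the correlation is clearly positive.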

Finally, one possible confounding factor that we have not so far discussed is unobserved school quality. Our reduced form estimates show no impact of observable school characteristics (for example, student-teacher ratios) on the main results. Nonetheless, it is possible that the ability and family valuation of education factors may partly pick up unobserved school characteristics. We examine this possibility in two ways. First, we reestimate our reduced form specification incorporating school fixed effects.^{9} In those estimates, we find that the inclusion of school fixed effects has very little impact on the key coefficients of the socioeconomic gradient and on the aspiration and ability variables. Second, we plot the fitted ability and parental valuation factor values against our observable school quality measures (students per teacher and students per computer). The scatter plots from this exercise exhibit no discernible relationships between estimated factors and school quality measures; the best fit lines through the clouds typically have slope coefficients that are not significantly different from zero and are quantitatively negligible. Based on this evidence, we conclude that our ability and parental valuation factors are not inadvertently picking up school quality.

## VII. Conclusion

In this paper, we build on insights from earlier work such as Sewell, Haller, and Portes (1969); Davies and Kandel (1981); Todd and Wolpin (2006); and Cunha and Heckman (2007) to investigate the importance of three underlying factors in determining the propensity of teenagers to drop out of high school: cognitive abilities, noncognitive abilities, and the value placed on education by the teenager’s parents. Our empirical approach follows Carneiro, Hansen, and Heckman (2003) in using a main dropout equation estimated jointly with a set of measurement equations comoving with the unobserved factors. Given arguments in Cunha, Heckman, and Schennach (2010) that ability production functions may be nonlinear in parental investments, we employ an extension to the estimator in Carneiro, Hansen, and Heckman (2003) in which the unobserved factor distributions are allowed to differ across families with different parental education. Implementing this flexible estimator results in four main findings. First, skills accumulated by age 15, as reflected in our ability measures, have a substantial impact on dropping out. The highest ability individuals are predicted never to drop out regardless of parental education or parental valuation of education. Second, for children of high school dropouts, parental valuation of education has a substantial impact on medium- and low-ability teenagers. For a low-skilled child, having a parent who places a high value on education affects dropping out by roughly the same amount as obtaining the highest skill level. Third, skills reflected in our noncognitive ability measures have impacts that are sizeable but much smaller than those of the other two factors. Fourth, parental-education effects on dropping out operate almost entirely through the distributions of unobserved factors. In other words, the teenagers who drop out are predominantly low-skilled children whose parents place a low value on education. Our estimates indicate that children with these unobserved characteristics are rarely found in highly educated families.

The interpretation of these results depends on the underlying economic structure. If we assume an index-sufficiency model in which cognitive and noncognitive abilities at age 15 fully summarize all relevant investments before that age, then, once we include measures of those abilities, all other effects should be interpreted as impacts of the particular characteristics after age 15.

Whether or not one accepts the assumption that cognitive and noncognitive abilities at age 15 are sufficient statistics for everything that has gone before, two striking results remain: First, parental valuation of education matters in the dropout decision, once we control for abilities; second, the quantitative impact of parental valuation is very large. Whichever way one interprets these results, it seems that parental valuation of education falls into the category of determinants of education (and through it, outcomes in later life) for which a youth cannot be held directly responsible and this provides some justification for policy intervention. In fact we view these results as hopeful in the sense that they suggest at least the hypothetical possibility that policy may have nonnegligible effects on dropout rates despite early under-investment in skills or weak incentives due to low valuation of education within families. In other words, there might exist a margin for late intervention above and beyond the slow, cross-generational process of raising parental education and early investment in children’s skills.

## Appendix 1

## Appendix 2

### Likelihood Function

In this appendix, we present an example contribution to the factor likelihood function for person *i* who is a dropout; has a test score in the lowest quartile; has parents who state they hope their child gets a BA; states he just wants to get by in effort; has an average overall grade below 59; has parents who saved for his education; and hands in homework late.

(1)

where the F()s are cumulative normal distribution functions; *j*, *k*, and *m* index the points of support in the θ_{11}, θ_{12}, and ν_{p} distributions, respectively; the *p*s are probabilities associated with the points of support; *z* corresponds to the vector of all observable covariates in the dropout equation with a vector of associated coefficients, γ; and *x*_{1} … *x*_{6} are the vectors of observable covariates in the measurement equations with vectors of associated coefficients, δ_{1} … δ_{6}. In our data, dropping out is a binary variable, as are *parasp* (a dummy equalling one if the parents hope their child will obtain a BA or more and zero otherwise), *saved* (which takes a value of one if the parents saved for their child’s education), *getby* (which equals 1 if children say they just want to get by in terms of effort), and *hmwork* (which equals 1 if the child always completes his assignments). These variables contribute simple Probit type expressions to the likelihood conditional on the factor values. We divide the PISA test scores into quartiles and use indicators for the quartile of the *PISA* variable. Thus, the contribution to the likelihood function is in the form of components of an ordered Probit. Here, *PISA*_{1} is the test score value that defines the upper bound of the first quartile. Similarly, we group grades in four categories (59 and less, 60 to 69, 70 to 79, and 80 and above). As a result, the contributions for this variable also take the form of ordered Probit expressions.
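The structure described above, a weighted sum over points of support of products of Probit and ordered Probit terms, can be sketched in code. This is a simplified illustration and not the estimation code: the function names, the joint-probability dictionary, the factor loadings, the cutoffs, and all numerical values are hypothetical, and only one binary and one ordered outcome are shown.

```python
import math
from itertools import product


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def probit_prob(y, index):
    """P(y | index) for a binary outcome: F(index) if y == 1, else 1 - F(index)."""
    p = norm_cdf(index)
    return p if y == 1 else 1.0 - p


def ordered_probit_prob(category, index, cutoffs):
    """P(category | index) for an ordered Probit with increasing cutoffs.

    Categories are 0-based; there are len(cutoffs) + 1 of them.
    """
    bounds = [-math.inf] + list(cutoffs) + [math.inf]
    return norm_cdf(bounds[category + 1] - index) - norm_cdf(bounds[category] - index)


def likelihood_contribution(binary, ordered, support, probs):
    """Sum over factor types of (type probability) x (conditional likelihood).

    binary:  list of (y, xb, loadings) tuples for Probit outcomes
    ordered: list of (category, xb, loadings, cutoffs) tuples
    support: three lists of points of support, one per factor
    probs:   dict mapping (j, k, m) to the probability of that factor type
    """
    total = 0.0
    grids = [list(enumerate(s)) for s in support]
    for (j, t1), (k, t2), (m, v) in product(*grids):
        factors = (t1, t2, v)
        cond = 1.0
        for y, xb, load in binary:
            cond *= probit_prob(y, xb + sum(a * f for a, f in zip(load, factors)))
        for cat, xb, load, cuts in ordered:
            idx = xb + sum(a * f for a, f in zip(load, factors))
            cond *= ordered_probit_prob(cat, idx, cuts)
        total += probs[(j, k, m)] * cond
    return total


# Hypothetical example: two points of support per factor, equal type weights.
support = ([-1.0, 1.0], [-1.0, 1.0], [-1.0, 1.0])
probs = {(j, k, m): 1.0 / 8.0 for j in range(2) for k in range(2) for m in range(2)}

# A dropout (y = 1) whose test score is in the lowest quartile (category 0).
binary = [(1, -1.5, (-0.8, -0.3, -0.6))]
ordered = [(0, 0.0, (1.0, 0.0, 0.0), (-0.7, 0.0, 0.7))]
L_i = likelihood_contribution(binary, ordered, support, probs)
```

Using a dictionary of joint type probabilities rather than a product of three marginal probabilities is a design choice of this sketch; it allows the factors to be correlated. A useful sanity check is that the contributions summed over every possible outcome combination equal one.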

## Appendix 3

### Specification of PISA and Grades Equations

## Footnotes

Kelly Foley is an assistant professor of economics at the University of Saskatchewan.

Giovanni Gallipoli is an associate professor of economics at the University of British Columbia.

David Green is a professor of economics at the University of British Columbia. The authors thank participants at the Conference on Structural Models of the Labour Market and Policy Analysis, and the CLSRN and IRP workshops at the University of Toronto and University of Wisconsin, respectively, for their comments. The data used in this article can be obtained from the authors beginning May 2015 through April 2018.

1. Given the relatively low monetary cost of attending high school in Canada, we do not explicitly model credit constraints. However, in the empirical analysis we do control for both short- and long-term financing constraints.

2. This variable and all other measurement variables related to the factors occur as categorical variables in our data, so these equations should be interpreted as index functions underlying actual realizations of the measurement variables.

3. One might be concerned that this measure, and others, partially capture school-specific effects. In the robustness section we examine the role of observable school characteristics in our reduced form estimates and discuss the potential importance of observed and unobserved school effects.

4. As CHH discusses, identification of the coefficients on the observed variables is given when they are orthogonal to the factors and errors. When that is the case, we can discuss identification of the factor loadings and variances in terms of the dependent variables net of the effects of the right-hand-side variables, that is, the broadly defined errors in all the equations.

5. Our dropout definition differs from the one used by some other authors (for example, Eckstein and Wolpin 1999) who include current students who have not graduated as dropouts. We view counting these ongoing students as dropouts as a potential mislabeling that could cause us to miss relationships such as parents pushing their children to complete their schooling in “whatever time it takes.” We reestimated our model using Eckstein and Wolpin’s definition and found similar results to those presented here with the main exception that the importance of parental valuation of education is somewhat reduced, though still economically substantial and statistically significant.

6. Some students in Quebec and, depending on their birth month, some students in Ontario may not have completed high school at the time of the third survey. Our results are similar if we exclude Ontario and Quebec from the sample.

7. Student attrition between Cycles 1 and 2 and between Cycles 2 and 3 implies that we have roughly 70 percent of the original sample with usable information by the third cycle. This is not an inordinately high attrition rate by the standards of most panel data. The weights provided to address this issue are the result of an estimated attrition process. Thus, the variables used in constructing the weights can potentially play the role of exclusion restrictions. If there are variables used in constructing the weights that are not used in the final estimation, then those variables effectively become instruments for addressing selection. The information we have obtained from Statistics Canada suggests that the variables used in constructing those weights are all variables that are either included in our final specifications or are strongly related to included variables. One exception to this is a variable based on a question to the parents about whether they were willing to have their data shared with another government department (HRSDC). We tried, unsuccessfully, to get access to this variable to allow us to model attrition explicitly ourselves. Instead, we tried implementing an estimator including an explicit attrition process and using as an exclusion restriction a variable equaling the proportion of times a respondent did not answer a question asked of everyone. This variable did not perform well in determining attrition, and we were forced to rely on the provided weights.

8. These variables are included because of evidence that recent immigrants are facing substantial barriers to integrating into the economy and society at large. The aboriginal descent variable is suggested by high rates of poverty in this community. In specifications not shown, we included indicators for whether the child is second generation (that is, born in Canada with at least one parent who is an immigrant) and whether the language spoken at home is an official language. Because these variables were never significant or economically substantial in a variety of specifications, we dropped them from the analysis.

9. Issues relating to the number of students per school in the data set and confidentiality restrictions imply that when we do this we must reduce the number of schools by over two-thirds, leaving a highly unrepresentative sample. For this reason, we do not report these as our main estimation results. These results are available from the authors.

- Received April 2012.
- Accepted September 2013.
