## Abstract

We estimate the effects of completing vocational rather than general upper secondary education programs on earnings at age 28 and, using surrogate index techniques, at age 40. We use longitudinal administrative data for Denmark and marginal treatment effect models, with distances to educational institutions as instruments. We find significant and substantial heterogeneity in earnings effects consistent with selection on gains. A policy shifting students at the margin towards vocational education tends to have small and insignificant long‐term effects for females and for males with low math skills, but negative long‐term effects for males with high math skills.

## I. Introduction

Young people’s choice of education is important for their future labor market outcomes and working life. Important choices include whether to enroll in and complete an upper secondary and a tertiary education, and the particular type of education.

We focus on the choice between vocational and general upper secondary education programs, and we estimate the effects of this choice on earnings at ages 28 and 40. We investigate heterogeneity in the effects using longitudinal administrative data for Denmark and marginal treatment effect (MTE) models, with distances to educational institutions as instruments in the educational choice model.

Upper secondary education systems differ between countries in both the relative size of general and vocational programs and the characteristics of vocational programs (Ryan 2001; Piopiunik and Ryan 2012). In the United States, for instance, there is no separate vocational track in upper secondary education, whereas most European countries do have separate general and vocational tracks. In Denmark, as in Germany, Switzerland, and Austria, vocational upper secondary education is characterized by a large apprenticeship system in which students alternate between school courses and apprenticeship training at firms. Compared to general upper secondary education, vocational education may give rise to higher earnings and employment early on in a career because the specific vocational skills acquired qualify one directly for jobs and because students are more closely connected to the labor market while in education, which may facilitate the school‐to‐work transition (Piopiunik and Ryan 2012). In Denmark, general upper secondary programs do not directly qualify participants for jobs in the labor market, but they provide access to further and higher education, leading to more years of education and later labor market entry and, consequently, higher costs in the form of forgone earnings. On the other hand, the higher level of education leads to higher earnings, especially later in the career when combined with more labor market experience. Another mechanism leading to relatively higher long‐term earnings and employment for those with a general upper secondary degree may be that general skills depreciate at a slower rate than vocational skills due to technological progress (Hanushek et al. 2017).

The Danish institutional setting is an interesting case because of the separate general and vocational tracks in upper secondary education and because students’ educational choices are less restricted by parental income than in most other countries. There is no tuition for upper secondary or post‐secondary programs, and students aged above 18 years are offered maintenance grants and loans by the state that are generous by international standards.

The main contribution of our study is to estimate the effects on earnings of completing a vocational rather than a general upper secondary education, focusing on heterogeneity in effects with respect to both observable covariates and unobservables. This is important for policy considerations because individuals may select into education programs based on idiosyncratic knowledge of comparative advantage, and individual heterogeneity is an important reason for offering different types of education programs. Specifically, we consider whether young people who choose a vocational (or general) upper secondary program generally benefit in terms of ultimate earnings as a result of their choices and whether students at the margin of indifference between vocational and general education would typically benefit from choosing vocational programs.

We use MTE methods that allow us to estimate conventional treatment effect parameters such as the average treatment effect (ATE), the average treatment effect on the treated (ATT), and the average treatment effect on the untreated (ATUT), as well as policy‐relevant treatment effects (PRTEs), which are average effects for “marginal students” whose educational choice is affected by a given policy change; see Heckman and Vytlacil (2005, 2007), and Cornelissen et al. (2016). We are not aware of any previous studies using MTE methods to compare the effects of vocational versus general upper secondary education. We also explore mechanisms underlying the earnings effects by investigating the effects on employment probability, hours worked, the hourly wage rate, and educational attainment at age 28.

There exists only a small literature on the effect of choosing vocational versus general upper secondary education. Based on ordinary least squares (OLS) regressions, Bishop and Mane (2004) find positive effects of taking vocational courses in U.S. high schools on earnings and employment eight years after high school graduation. The findings for the United States in Meer (2007) indicate that the choice of technical vocational courses is consistent with comparative advantage. Malamud and Pop‐Eleches (2010) find no significant effect on labor market participation or earnings of a reform in Romania that shifted students into more general and less vocational education.

Hanushek et al. (2017) use data for 11 countries from the International Adult Literacy Survey and a difference‐in‐differences (DID) framework to compare outcomes across different ages (16–65) for males who completed vocational or general secondary education. Using such cross‐sectional data, it is in general not possible to distinguish between age and cohort effects, but the authors argue that conditional selectivity into education types does not seem to vary over time and that their method therefore identifies how relative outcomes of the two types of education vary with age. As the authors discuss, however, their method cannot identify the effect of vocational versus general education at any given age because selection into type of education is influenced by unobserved characteristics. They find that employment probability and earnings both increase with age if a person has completed general instead of vocational education.

Similar results are found by Brunello and Rocco (2017) and Choi, Jeong, and Kim (2019), who use another international cross‐section survey, PIAAC, and propensity score weighting and matching techniques. Using Swedish administrative data and sibling fixed effects models, Golsteyn and Stenberg (2017) find that completion of vocational instead of general upper secondary education has positive effects on earnings early in the career, but negative effects later.

Torun and Tumen (2019) use standard instrumental variables (IV) techniques to estimate, for males, the effect on employment of completing a vocational instead of a general upper secondary education in Turkey. They find no significant effect. The instrument is a dummy for availability of vocational schools in the town of residence at age 13. They limit the sample to individuals with an upper secondary degree as the highest education level attained. Using MTE methods and distance to secondary schools as an instrument, Carneiro, Lokshin, and Umapathi (2017) find high returns to upper secondary schooling in Indonesia, but they do not distinguish between general and vocational education. Distance to educational institutions has been used as an instrument for educational choice in other contexts, notably in the literature on returns to college education, which we discuss in Section III.

When considering the choice of upper secondary education, student ability and skills at the end of lower secondary school are important. Here we study the first four cohorts for which we have information on test scores at the end of lower secondary school. We can follow these cohorts up to age 28–31, and we use earnings at age 28 as one of our outcomes.

It is important to investigate more long‐term earnings effects as well. This is highlighted by our discussion of earnings dynamics above and by the results in earlier studies, such as Hanushek et al. (2017), indicating relatively more favorable long‐term effects of completing general rather than vocational education. We use the “surrogate index” techniques discussed in Athey et al. (2019) to estimate earnings effects at age 40. Thus, we use a sample of older cohorts to estimate a model for earnings at age 40, conditioning on a large set of intermediate outcomes by age 28–31 that are observed in our main sample. Based on this model, we predict earnings at age 40 for the four cohorts in our main sample and use these predicted earnings as an outcome in the main analysis.

Earnings at about age 40 are in general a much better indicator of lifetime earnings than earnings around age 28 (Björklund 1993; Haider and Solon 2006). In our analysis, the intuition for considering earnings at age 40 as an indicator for lifetime earnings is that, whereas vocational education may lead to higher earnings early in the career, as discussed above (because of fewer years with forgone earnings, a more effective school‐to‐work transition, and earlier accumulation of job experience), the earnings advantage of a higher level of education for those who choose general upper secondary school followed by a post‐secondary degree will continue after age 40.

We find that the ATE (of completing vocational instead of general upper secondary education) on earnings at age 28 is positive for males and negative for females. For predicted earnings at age 40, the ATE is negative for both genders. On average, males gain $8,000 in annual earnings at age 28 by completing vocational instead of general education, but lose $8,700 at age 40. Females lose about $6,700 in earnings at age 28 and $12,200 at age 40. These ATE estimates are consistent with the results of earlier studies, including Hanushek et al. (2017).

However, we find substantial heterogeneity in effects consistent with selection on gains. On average, individuals selecting into vocational education benefit from this choice, and those selecting into general education also benefit from their choice. This heterogeneity stems from both covariates and unobservables. The difference between ATT and ATUT is clearly significant. For earnings at age 28, these estimates are $14,900 and $3,300 for males and $5,100 and –$10,200 for females, respectively. At age 40 they are $3,700 and –$17,300 for males and $8,100 and –$18,100 for females.

We consider different PRTEs, including the effects of a policy in which the propensity score for completing vocational instead of general education is increased by one percentage point for all individuals. The PRTE estimates indicate that, for most males at the margin, it may be a financial advantage in terms of long‐term earnings to choose general instead of vocational education, but for males with low math skills and for females, there does not appear to be any significant benefit or loss from differing educational choices for marginal students.

## II. Institutional Setting

In Denmark, compulsory schooling ends after the ninth grade, typically in the year the student turns 16. There is an exam at the end of ninth grade in the most important subjects (including math, science, Danish, and English). In addition to exam test scores, students receive marks for the year’s work evaluated by the teacher. The educational options after ninth grade (or the optional tenth grade) are vocational or general (academic) upper secondary education programs. General programs are typically three‐year programs. They qualify students for admission to further and higher education, but do not in themselves provide qualifications for jobs in the labor market. Most students choosing a general upper secondary program will therefore enroll after graduation in a university bachelor’s program (followed by a master’s program) or in a two‐ to four‐year professionally oriented post‐secondary program. There are no tuition fees, either for upper secondary or post‐secondary programs.

The vocational education and training system provides many different types of education, in seven main categories: the commercial field, technology, construction, craftsmanship, food production, mechanics, and service. Vocational programs last two to five years, typically about four years, starting with one year of school‐based courses followed by three years in which students alternate between more specialized school courses and apprenticeship training at firms. Vocational programs qualify students for jobs (for instance, as a carpenter, electrician, or hairdresser). Most students completing a vocational education will enter the labor market directly as skilled workers, although options also exist for enrolling in post‐secondary education programs. For the cohorts studied here there were no specific admission criteria to vocational upper secondary education programs. Criteria for entering general upper secondary programs were not very strict either.^{1}

## III. Empirical Strategy

We wish to estimate effects on earnings of the type of upper secondary education. This is complicated for several reasons. First, the choice of program is highly endogenous, and we cannot assume constant effects. Students differ in their preferences and abilities, and presumably selection on expected gains is important. Second, students can choose between many programs. In Denmark, after compulsory school, students can choose between four different main types of general upper secondary education programs and seven main types of vocational upper secondary programs, and within each main program they specialize by choice of various subjects in general programs and by choice of specific profession in vocational programs (typically after the first year). Young people also have the option of not enrolling in (or completing) any upper secondary program.

We simplify by analyzing the effects of completing a vocational rather than a general upper secondary education program (conditional on completing some upper secondary program).^{2} Thus, we have a simple treatment dummy that is one if the student completes a vocational education and zero if they complete a general education. Some students regret their initial choice of program and switch to another program, and some students even complete two different programs. We define treatment status by the first completed program.

### A. Exclusion Restrictions

To take account of selection on unobservables, we use as instruments distances to the two types of education (at age 15). Distance is assumed to affect the choice of upper secondary education (and the likelihood of completion), but conditional on controls, not directly the outcomes. Distance to educational institutions has been used as an instrument for choice of college by Card (1995); Kane and Rouse (1995); Kling (2001); Currie and Moretti (2003); Cameron and Taber (2004); Carneiro and Lee (2009); Carneiro, Heckman, and Vytlacil (2011); and Nybom (2017). Distance has also been used as an instrument by Carneiro, Lokshin, and Umapathi (2017), who estimate returns to upper secondary schooling in Indonesia, and by Torun and Tumen (2019), who estimate the effects of completing vocational instead of general upper secondary school in Turkey.

The validity of the exclusion restriction for distance instruments has been discussed in the literature mentioned above. We use rich administrative panel data that allow us to control for many important parental background variables, child characteristics (including academic ability indicators based on test scores at the end of compulsory school), and a large number of dummies for municipality of residence (see Section IV), which add credibility to our instruments. Furthermore, the educational choice under consideration is made when the students leave lower secondary school, typically at the age of 15 or 16, when they are living with their parents. Therefore, the choice of residence is made by the parents, not the child.^{3} We conduct specification checks supporting the validity of the instruments; see Section VI and Online Appendix C.5.

In addition to our main focus on the effect of completing vocational rather than general education, we also consider the effect of enrolling in vocational rather than general education. The distance instruments are strong for both the enrollment and completion margins (see Sections V.B.1 and VI and Online Appendix C.4). There are two reasons why we have chosen the completion margin for our main analysis. First, completion is economically more interesting because the type of upper secondary education completed has important consequences for job opportunities and post‐secondary education options. Second, the exclusion restriction may be more problematic for the enrollment margin because the distance instruments significantly predict the type of education completed conditional on enrollment (Online Appendix C.5). An important mechanism may be that long commuting distances both discourage enrollment and increase the probability of switching to the other type of education (conditional on enrollment). Thus, distance is not an ideal instrument that would randomly assign enrollment and then play no further role. On the other hand, it is reasonable to assume that it does not directly affect outcomes conditional on the type of upper secondary education completed.

### B. MTE Estimation

Our instruments are continuous, and we apply MTE methods; see Heckman and Vytlacil (1999, 2005, 2007), Heckman (2010), and Cornelissen et al. (2016). These methods allow estimation of heterogeneity in treatment effects with respect to observables and unobservables, including the expected idiosyncratic gains from treatment.

Standard instrumental variables (IV) estimates can identify the local average treatment effect (LATE) for those who comply with the instrument (Imbens and Angrist 1994), but the external validity of LATE is uncertain when treatment effects are heterogeneous (compliers may be unrepresentative of the sample), and with a continuous instrument the interpretation of LATE is not straightforward (it is a weighted average of LATEs for all pairs of values of the instrument). In contrast, it is possible by using MTE methods to estimate conventional treatment effect parameters such as the ATE, ATT and ATUT, as well as PRTEs for policies affecting the educational choice of different subgroups of the sample.

Consider a standard potential outcomes model with two types of education:

$$Y_j = \mathbf{X}\beta_j + U_j, \quad j = 0, 1 \tag{1}$$

$$D = 1\{\mu_D(\mathbf{Z}) \geq V\} \tag{2}$$

where *Y*_{1} and *Y*_{0} are potential outcomes (earnings in our application) in the treated state (vocational education) and the untreated state (general education); **X** is a vector of control variables; *D* is the indicator variable for treatment; **Z** = (**X**, *Z*^{e}) is a vector of variables affecting the educational choice, which include the controls **X** and the instrumental variables *Z*^{e} (distances to educational establishments) excluded from Equation 1. Finally, *U*_{0}, *U*_{1}, and *V* are unobservables. The observed outcome is *Y* = *DY*_{1} + (1 – *D*)*Y*_{0}.

Equation 2 can be rewritten as *D* = 1{*P*(**Z**) ≥ *U*_{D}}, where *P*(**Z**) is the propensity score (the probability of choosing vocational education given observables), and *U*_{D} represents the quantiles of *V*.^{4} By construction, *U*_{D} is distributed uniformly over the interval (0,1). *U*_{D} can be interpreted as unobserved resistance to treatment. An individual selects treatment (*D* = 1) if the net benefit of treatment [*P*(**Z**) – *U*_{D}] is positive. The model allows this net benefit to depend on *Y*_{0} and *Y*_{1} through dependence between (*U*_{0}, *U*_{1}) and *U*_{D}. The MTE is defined as the treatment effect at given values of *U*_{D} and **X**:

$$\mathrm{MTE}(\mathbf{x}, u_D) = E(Y_1 - Y_0 \mid \mathbf{X} = \mathbf{x}, U_D = u_D) \tag{3}$$

Upon holding **X** fixed, the MTE curve shows how the MTE varies with unobserved resistance to treatment, *U*_{D}. At each point *U*_{D} = *p*, the MTE (given **X**) can be interpreted as the mean treatment effect for persons at the margin of indifference between vocational and general education; that is, for persons with propensity score *P*(**Z**) = *p*; see Heckman (2010).
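The normalization to *U*_{D} can be illustrated with a minimal simulation. The sketch below draws *V* from a standard logistic distribution, forms *U*_{D} as the quantile of *V*, and checks that the choice rule stated in terms of *V* coincides with the rule stated in terms of the propensity score and *U*_{D}, and that *U*_{D} is approximately uniform. The logistic distribution, the linear latent index, and all numerical values are assumptions made for illustration only; they are not taken from the data or from the estimated choice model.

```python
import math
import random


def logistic_cdf(v):
    """CDF of the standard logistic distribution, F_V(v)."""
    return 1.0 / (1.0 + math.exp(-v))


random.seed(0)
n = 20_000
mismatches = 0
u_d_values = []

for _ in range(n):
    z = random.uniform(-2.0, 2.0)   # hypothetical scalar summary of Z
    mu = 0.5 + 1.2 * z              # hypothetical latent index for the choice
    # Draw V from a standard logistic distribution by inverse-CDF sampling.
    q = random.uniform(1e-12, 1.0 - 1e-12)
    v = math.log(q / (1.0 - q))

    d_latent = mu >= v              # choice rule written in terms of V
    p = logistic_cdf(mu)            # propensity score P(Z) = F_V(index)
    u_d = logistic_cdf(v)           # U_D = F_V(V), the quantile of V
    d_quantile = p >= u_d           # choice rule written in terms of U_D

    mismatches += d_latent != d_quantile
    u_d_values.append(u_d)

mean_u_d = sum(u_d_values) / n
share_below_quarter = sum(u < 0.25 for u in u_d_values) / n
print(mismatches, round(mean_u_d, 3), round(share_below_quarter, 3))
```

Because the transformation from *V* to *U*_{D} is monotone, the two rules select the same individuals, and the simulated mean of *U*_{D} and the share of draws below 0.25 should be close to 0.5 and 0.25, as expected for a uniform variable.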

Standard IV assumptions are needed for identification, namely that the distance instruments are significant in the educational choice model in Equation 2 and that they are statistically independent of (*U*_{0}, *U*_{1}, *V*) conditional on **X**. In practice it is necessary also to restrict the shape of the MTE curve, that is, for the relation between the MTE and *U*_{D} given **X**, to be independent of **X**, except for the intercept which may vary with **X**.^{5} With these assumptions we have

$$\mathrm{MTE}(\mathbf{x}, p) = \mathbf{x}(\beta_1 - \beta_0) + k(p) \tag{4}$$

where the first term represents heterogeneity in the observables (because the variables in **X** may affect the two potential outcomes differently and thereby influence the intercept of the MTE curve), and the second term, *k*(*p*) = *E*(*U*_{1} – *U*_{0} | *U*_{D} = *p*), represents heterogeneity in unobservables; that is, variation in the MTE by unobserved resistance to treatment. Upon defining *k*_{j}(*p*) = *E*(*U*_{j} | *U*_{D} = *p*), *j* = 0,1, we have *k*(*p*) = *k*_{1}(*p*) – *k*_{0}(*p*).

We estimate the conditional expectation of *Y* in the sample of treated and untreated separately using regressions of the form: *Y*_{j} = *x*β_{j} + *K*_{j}(*p*) + ɛ_{j}, *j* = 0,1, where the control functions *K*_{j}(*p*) are functions of the propensity score *p*, *K*_{1}(*p*) = *E*(*U*_{1} | *U*_{D} ≤ *p*), and *K*_{0}(*p*) = *E*(*U*_{0} | *U*_{D} > *p*). In this application, we use parametric models with polynomials in *p*. The estimation is a two‐step procedure, in which we estimate the propensity score function in the first step (using a logit model) and the outcome equations in the second step. We apply the Stata command mtefe (Andresen 2018) with varying specifications of the degree of the polynomial of *k*_{j}, and thereby *K*_{j}.

The MTE given by Equation 4 can be used to estimate treatment effect parameters such as the ATE, ATT, ATUT, and PRTEs. For instance, the (unconditional) ATE is calculated using the overall means of **X** (in the first term of Equation 4) and an equally weighted average over the unobserved component of the MTE (the last term). Similarly, the ATT is calculated using the means of **X** for the treatment group and a weighted average over the unobserved component of MTE, with more weight to individuals with a low unobserved resistance to treatment. For details, see Heckman and Vytlacil (2005, 2007), Heckman (2010), and Cornelissen et al. (2016).
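As an illustration of how such weighted averages work, the sketch below integrates a hypothetical MTE curve over unobserved resistance using the standard weights from the MTE literature: equal weights for the ATE, weights proportional to Pr(*P*(**Z**) > *u*) for the ATT, and weights proportional to Pr(*P*(**Z**) ≤ *u*) for the ATUT. It abstracts from heterogeneity in **X** (the observable term is held fixed), and the linear MTE curve and the evenly spread propensity scores are assumptions chosen only to make the arithmetic transparent; they are not estimates from the paper.

```python
def treatment_effects(mte, pscores, grid_size=1000):
    """Integrate an MTE curve over u in (0, 1) with ATE, ATT, and ATUT weights."""
    n = len(pscores)
    mean_p = sum(pscores) / n
    ate = att = atut = 0.0
    for i in range(grid_size):
        u = (i + 0.5) / grid_size                       # midpoint of grid cell
        share_above = sum(p > u for p in pscores) / n   # Pr(P(Z) > u)
        w_att = share_above / mean_p                    # more weight at low u
        w_atut = (1.0 - share_above) / (1.0 - mean_p)   # more weight at high u
        ate += mte(u) / grid_size
        att += mte(u) * w_att / grid_size
        atut += mte(u) * w_atut / grid_size
    return ate, att, atut, mean_p


def mte(u):
    """Hypothetical downward-sloping MTE curve, in USD 1,000."""
    return 10.0 - 20.0 * u


# Hypothetical propensity scores spread evenly over (0, 1).
pscores = [(i + 0.5) / 200 for i in range(200)]
ate, att, atut, mean_p = treatment_effects(mte, pscores)
print(round(ate, 2), round(att, 2), round(atut, 2))
```

With a downward-sloping curve the sketch yields ATT > ATE > ATUT, the pattern associated with selection on gains. By construction of the weights, the three parameters also satisfy ATE = E[*P*] · ATT + (1 – E[*P*]) · ATUT.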

### C. Long‐Term Effects on Earnings

As explained in Section IV below, our main sample consists of the cohorts born in 1986–1989, which we can follow in the registers until 2018, that is, until age 28–31 (beginning of year). This is rather early in a career, especially for those who choose higher education, and about 8 percent are still enrolled in education at age 28. Earnings at about age 40 are generally a much better indicator of lifetime earnings than earnings at age 28–31 (Björklund 1993; Haider and Solon 2006). As a supplement to observed earnings at age 28, we therefore construct an additional outcome variable: predicted earnings at age 40 conditional on earnings, labor market status, and detailed educational attainment by age 28–31 (and other variables); for details, see Section IV.B. We do this by using register data for older cohorts to estimate an OLS model, with earnings at age 40 as the dependent variable and short‐term outcomes by age 28–31 and pre‐treatment covariates as explanatory variables, and we then use the estimated coefficients to predict earnings at age 40 for the cohorts of our main sample.

Predicted long‐term earnings at age 40 is what Athey et al. (2019) refer to as a “surrogate index.” Because the long‐term outcome is not observed in our main sample, we use surrogates (short‐term outcomes measured up to age 28–31) to predict the long‐term outcome and estimate long‐term treatment effects. Athey et al. (2019) show that the ATE on the long‐term outcome is identified if three assumptions hold: unconfoundedness (independence between treatment and potential outcomes conditional on pre‐treatment controls), surrogacy (independence between treatment and the long‐term outcome conditional on short‐term outcomes and pre‐treatment controls), and comparability (the conditional distribution of the long‐term outcome given pre‐treatment controls and short‐term outcomes is the same in the main sample and the sample of older cohorts).
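The two-sample logic of the surrogate index can be sketched in a few lines. In the toy example below, a prediction model is fitted on an older sample in which both the surrogate and the long-term outcome are observed, and the fitted coefficients are then used to impute the long-term outcome in a main sample in which only the surrogate is observed. The single surrogate, the variable names, and all values are hypothetical simplifications; the actual prediction models condition on many surrogates, covariates, and interactions.

```python
def fit_ols(x, y):
    """Least-squares intercept and slope for a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Older cohorts: surrogate (earnings at 28) and long-term outcome
# (earnings at 40) are both observed. Toy values in USD 1,000.
old_earn28 = [10.0, 20.0, 30.0, 40.0]
old_earn40 = [25.0, 45.0, 65.0, 85.0]
intercept, slope = fit_ols(old_earn28, old_earn40)

# Main sample: only the surrogate is observed, so earnings at 40
# are predicted from the coefficients estimated on the older cohorts.
main_earn28 = [15.0, 35.0]
surrogate_index = [intercept + slope * s for s in main_earn28]
print(intercept, slope, surrogate_index)
```

The predicted values then serve as the outcome variable in the main sample. The comparability assumption is what justifies carrying the coefficients from the older cohorts over to the younger ones.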

Athey et al. (2019) also briefly discuss the use of instrumental variables, instead of invoking the unconfoundedness assumption. In an IV setting, it is assumed that the exclusion restriction holds for both short‐ and long‐term outcomes (no correlation between instruments and potential outcomes conditional on pre‐treatment controls), and the surrogacy assumption states that, in the main sample, the long‐term outcome is independent of the instruments, conditional on pre‐treatment controls and the short‐term outcomes. Intuitively, the surrogacy assumption requires that the short‐term outcomes fully capture the causal link from a variation in treatment (induced by a change of the instruments) to the long‐term outcome, either because the short‐term outcomes are themselves the causal factors or because they are correlated with the causal factors.

Our application obviously calls for several surrogates. Thus, choosing vocational instead of general upper secondary education will typically result in relatively high earnings early in a career (at age 28, for instance) but a lower level of educational attainment. A high level of education and high earnings at age 28 both predict high earnings at age 40, so we need both surrogates. Furthermore, earnings at age 40 vary with the specific type of education attained, not just the level of education, so we also need surrogates for type of education (field of study). The expected relation between earnings at age 40 and at age 28 will depend on whether the individual was still enrolled in education at age 28 and whether they were employed, so we shall also need surrogates for labor market status and their interactions with earnings at age 28 and other short‐term outcomes. As emphasized by Athey et al. (2019), when predicting a long‐term outcome with earlier observations of the same variable (and related variables), it may be useful to include such surrogates for several time periods (for instance, not just at age 28, but also at ages 25, 26, and 27).^{6}

We cannot test the surrogacy and comparability assumptions, but they seem to be reasonable in our application. Because we have many important and detailed short‐term surrogate outcomes measured by age 28–31, including educational attainment in terms of both level of education and field of study, and several years of observations on labor market status and earnings, as well as many pre‐treatment controls, it is reasonable to assume that, conditioning on all these variables, earnings at age 40 are independent of instrument‐induced variation in the choice of vocational versus general upper secondary education. We conduct a plausibility check of the surrogacy assumption, as suggested in Athey et al. (2019). Thus, we treat earnings at age 28 as a pseudo‐unobserved outcome, predicting it using lagged values of the surrogates, and then compare results using this predicted earnings variable against results using observed earnings at age 28. The two sets of results are very similar; see Section VI and Online Appendix C.3.

The assumption of comparability of samples is likely to hold approximately because the two samples consist of different cohorts of the same population, and the variables used are consistent over time; they are based on administrative registers for the full population. An issue is that earnings dynamics and returns to different types of education may change over time due to technical change, business cycles, and structural changes in the labor market. As the main sample consists of cohorts 1986–1989 and the “old sample” of cohorts 1974–1977, we cannot expect the comparability assumption to hold exactly. However, we show in robustness checks that varying the cohorts included in the old sample does not change results significantly; see Section VI and Online Appendix C.3.

We report bootstrapped standard errors (based on 500 replications) for the treatment effects that take account of the uncertainty in the estimation of the propensity scores, the means of the covariates **X** (at which the MTEs are evaluated), the treatment effect parameter weights, and the surrogate index.^{7}
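The principle behind such standard errors is that every estimation step is repeated within each bootstrap replication, so that first-step uncertainty carries through to the final estimate. The sketch below illustrates this for a deliberately simplified two-step statistic: a prediction model fitted on an old sample, followed by a difference in mean predicted outcomes between treated and untreated in a main sample. This simple difference in means stands in for the treatment effect estimators only for illustration; the simulated data, sample sizes, and function names are all hypothetical.

```python
import random
import statistics


def fit_ols(x, y):
    """Least-squares intercept and slope for a single regressor."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predicted_gap(old, main):
    """Step 1: fit the index on `old`. Step 2: compare predictions in `main`."""
    intercept, slope = fit_ols([s for s, _ in old], [y for _, y in old])
    treated = [intercept + slope * s for d, s in main if d == 1]
    untreated = [intercept + slope * s for d, s in main if d == 0]
    return statistics.mean(treated) - statistics.mean(untreated)


random.seed(1)
old = []   # (surrogate, long-term outcome) pairs, simulated
for _ in range(300):
    s = random.uniform(10.0, 50.0)
    old.append((s, 5.0 + 2.0 * s + random.gauss(0.0, 5.0)))
main = []  # (treatment dummy, surrogate) pairs, simulated
for _ in range(300):
    d = random.randint(0, 1)
    main.append((d, random.uniform(10.0, 50.0) + 3.0 * d))

point_estimate = predicted_gap(old, main)
replications = []
for _ in range(500):
    # Resample both samples with replacement and redo both steps.
    old_star = random.choices(old, k=len(old))
    main_star = random.choices(main, k=len(main))
    replications.append(predicted_gap(old_star, main_star))
bootstrap_se = statistics.stdev(replications)
print(round(point_estimate, 2), round(bootstrap_se, 2))
```

Refitting the prediction model only once, outside the loop, would ignore the sampling variability of the first step and would therefore tend to understate the standard error.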

## IV. Data

We use Danish administrative data that cover the whole population and have a panel structure. Unique personal identification numbers make it possible to link different registers. The data include information on earnings, employment status, enrollment in education, completed education, place of residence, and ninth grade test scores. The data also contain links between children and parents.

Information on test scores at the end of ninth grade is not available before 2002, nor is other information on student ability in primary or lower secondary schools. We focus on the first four cohorts with test score information; these are the cohorts born 1986–1989 who would typically complete ninth grade in the years 2002–2005 (that is, in the year they turn 16). We can measure outcomes up to age 28 for all of these cohorts (and up to age 31 for the 1986 cohort).

### A. Sample Selection

Our main sample consists of 161,432 individuals from the 1986–1989 cohorts who were enrolled in Danish lower secondary schools, who had completed an upper secondary education program by age 25, who were living in Denmark the year they left lower secondary school and also at age 25 and in the years in which we measure outcomes, and for whom we have information on distance to educational institutions. For more details of sample selection, see Online Appendix Table A1. Whereas 94 percent of the cohorts had enrolled in an upper secondary program by age 25, only about 82 percent of those enrolled had completed an upper secondary degree by age 25. Dropout rates are especially high in vocational programs. This is discussed further in Online Appendix C.4.

### B. Outcomes and Covariates

We consider two main outcomes. The first is earnings at age 28 (that is, in the year in which the student was 28 at the beginning of the year). Earnings are measured as annual wage income plus income from self‐employment (before taxes). They are deflated by the consumer price index (base year 2015) and measured in USD 1,000.^{8} Other income variables and the hourly wage rate are also measured in USD.

Our second main outcome is predicted earnings at age 40 as discussed in the Introduction and in Section III.C. We construct predicted earnings at age 40 using register data for older cohorts born 1974–1977. With earnings at age 40 as the dependent variable, we estimate four OLS models that are identical except that we condition on educational attainment, labor market status, and earnings by different ages (age 28, 29, 30, and 31, respectively). The parameter estimates of these models are then used to predict earnings at age 40 (for cohorts 1989, 1988, 1987, and 1986, respectively) in our main sample.

We include the following explanatory variables in these models. The most important variables are earnings, labor market status (employed, student, or inactive), and educational attainment (highest completed education) at age 28–31. Educational attainment is measured by 290 indicator variables, each representing a specific education defined in terms of both the level of education and the narrowly defined field of study.^{9} Second, we include the grade point average (GPA) from general upper secondary school for those who have completed such an education. Third, we include gender, and the (pre‐treatment) parental earnings and parental level of education (five categories). Fourth, we include interaction terms, most importantly between the level of education at age 28–31 (eight categories) and all other variables, and between gender and all other variables.^{10} Finally, to take account of earnings dynamics, we include three‐year lags of earnings, labor market status, and level of education and their interactions, as well as their interactions with gender. For instance, for the model conditioning on short‐term outcomes by age 28, these variables are included at age 25, 26, and 27, as well as at age 28. All in all, each model contains about 890 parameters. The number of observations is approximately 260,000 for each model, and the adjusted *R*‐squared is between 0.50 (age 31) and 0.43 (age 28).^{11}
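The two-step logic behind predicted earnings at age 40 can be illustrated with a deliberately simplified sketch. It assumes a single surrogate (earnings at age 28) and made-up numbers, whereas the models described above contain about 890 parameters; the function `fit_line` and all data values are hypothetical and serve only to show the estimate-then-predict structure.

```python
# Hypothetical older-cohort data (USD 1,000); the actual models use many more regressors.
older_earn28 = [20.0, 30.0, 40.0, 50.0, 60.0]
older_earn40 = [35.0, 44.0, 56.0, 65.0, 75.0]

def fit_line(x, y):
    """One-regressor OLS; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Step 1: estimate on the older cohorts, for whom earnings at age 40 are observed.
intercept, slope = fit_line(older_earn28, older_earn40)

# Step 2: apply the estimated coefficients to short-term outcomes in the main sample.
main_earn28 = [25.0, 45.0]  # hypothetical main-sample earnings at age 28
predicted_earn40 = [intercept + slope * e for e in main_earn28]
```

In the full procedure, one such model would be fitted per conditioning age (28 to 31) and applied to the matching cohort.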

The treatment variable is equal to one (zero) if a vocational (general) upper secondary education was completed by age 25. If a student completed both a general and a vocational program before age 25, they are categorized by their first completed program. We conduct the analysis separately for males and females, for two reasons. The first reason is that their earnings outcomes are very different. Thus, the level of earnings is much higher for males, and males who complete a vocational education have higher earnings at age 28 than males who complete a general education, whereas the opposite is the case for females; see the top of Table 1.

The second reason for conducting the analysis by gender is the large gender differences in the choice of specific types of education within the two main categories of vocational and general upper secondary education. This is especially pronounced for vocational programs, in which males are much more likely to choose technology, construction, craftsmanship, and mechanics, whereas females are more likely to choose the commercial field, food production, and service. Within general programs, males are more likely to choose the technical track.^{12}

To better understand the age dynamics of the earnings effects, we investigate additional outcomes at age 28: labor market status, educational attainment, working hours, hourly wage rate, and fertility (Table 1). The proportion enrolled in education at age 28 is larger for individuals with a general upper secondary degree, and they also have more years of education and are much more likely to have completed a post‐secondary degree, in particular, a university degree.^{13} Males with a vocational education are more likely to be employed at age 28 and less likely to be inactive (not in employment or education) than males with a general upper secondary education; the opposite holds for females. Females with a vocational education tend to be a more negatively selected group who choose shorter vocational programs than males, on average about half a year shorter among those with no post‐secondary degree. This may partly explain the large proportion of inactive females with a vocational education.^{14} Another explanation may be that those with a vocational education are much more likely to have children by age 28.

For wage earners (about 80 percent of the sample), the lower part of Table 1 shows working hours and the hourly wage rate at age 28. Males with a vocational education work about ten more hours per month than males with a general education; the hourly wage rate is approximately the same for the two groups. For females, in contrast, the number of hours does not differ between the two groups, whereas the wage rate is 12 percent higher for those with a general education.

In our analysis we control for many important pre‐treatment covariates: ninth grade test scores and teachers’ marks for the year’s work in different subjects, cohort and municipality dummies, family structure, ethnicity, and parental education, income, labor market status, and crime. Table 2 shows selected control variables by treatment status; for the full set of covariates, see Online Appendix Table A5. Students completing vocational programs have on average more disadvantaged backgrounds and lower test scores than students completing general programs, which may explain at least some of the differences in outcomes by treatment status in Table 1. The differences in mean test scores between the two groups are substantial, corresponding to about one standard deviation in the distribution of test scores.

### C. Instruments

As instruments, we use distances to vocational and general upper secondary educational institutions. These are measured as the distance via the road network from the student’s residential address on January 1 of the year they leave lower secondary school (typically at age 15) to the nearest educational institutions. To calculate these distances, we use the exact geographic coordinates for each of the identified educational institutions, and the coordinates for the southwest corner of the geographic quadrant of size 100 by 100 meters in which the student’s residence was located. We do not calculate distances for students living on small islands without a bridge to the mainland.

We construct separate distance measures to vocational and general schools. For each student, we might just use the distance to the nearest vocational institution and the distance to the nearest general institution. The nearest vocational school may not offer all seven main vocational tracks, however, and the nearest general school may not offer all four main tracks of general programs. Therefore, we use the weighted average distance to the seven main vocational education tracks and the weighted average distance to the four main general education tracks. These distance variables for each student are based on the distances to the nearest institutions offering each main track. The weights we use are based on the overall number of students choosing each main track.
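As an illustration of the weighted average distance described above, the following minimal sketch computes an enrollment-weighted mean over tracks for one student. The track labels, distances, and enrollment counts are hypothetical placeholders, not values from our data.

```python
def weighted_track_distance(nearest_km, enrollment):
    """Enrollment-weighted mean of the distances to the nearest school offering each track."""
    total = sum(enrollment[track] for track in nearest_km)
    return sum(nearest_km[track] * enrollment[track] for track in nearest_km) / total

# Hypothetical student: distance (km) to the nearest institution offering each general track.
general_km = {"track_a": 4.0, "track_b": 4.0, "track_c": 12.0, "track_d": 20.0}
# Hypothetical overall number of students choosing each track (used as weights).
general_enrollment = {"track_a": 500, "track_b": 200, "track_c": 200, "track_d": 100}

dist_general = weighted_track_distance(general_km, general_enrollment)
```

Because the weights are overall enrollment shares, a distant school offering only a rarely chosen track moves the measure much less than a distant school offering a popular track.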

Table 3 shows the percentage of observations by distance to vocational and general schools. Distances to general schools are typically shorter than distances to vocational schools. About 14 percent live less than 5 km from vocational schools, whereas about 40 percent live less than 5 km from general schools. Only 7.5 percent live more than 20 km from general schools, but 28 percent live more than 20 km from vocational schools. One reason for this pattern is that most vocational schools have an affiliated general school (that is, a mercantile or technically oriented general school) in the same location, whereas more traditional general schools are not located in connection with vocational schools. The distance to the nearest vocational school is consequently often at least as long as the distance to the nearest general school. The two distance variables have a strong positive correlation of 0.81. This is largely because both types of educational institutions are typically concentrated in urban areas, and distances to both types of institutions are short for individuals living in a larger city, whereas both are longer for individuals living in more sparsely populated areas.

Distance to general upper secondary education is better defined than distance to vocational education. At general schools, teaching will take place at the same institution throughout all three years of the program. For vocational education, we measure the distance to the institutions offering the seven basic courses (the first year of the vocational education), but, after the first year, some of the more specialized courses may be offered only at institutions further away, and apprenticeship positions or training places may be located quite a large distance from the educational institutions. One would therefore expect that distance to general schools is more important for the choice between general and vocational education than distance to vocational schools.

## V. Results

Before presenting the main results using MTE methods, we will briefly discuss OLS results.

### A. OLS Estimates

Table 4 shows, for each gender and earnings outcome, OLS estimates for the treatment variable (an indicator for completing vocational instead of general upper secondary education). These OLS models include the full set of controls: three cohort dummies, 92 municipality of residence dummies, and 62 covariates for family background and test scores and marks for the year’s work in different subjects in ninth grade (Online Appendix Table A5). For earnings at age 28, the estimates indicate that completion of vocational instead of general upper secondary education has positive effects for males and negative effects for females. Males’ earnings increase by about $7,200, whereas females’ earnings are reduced by $2,700. For predicted earnings at age 40, the OLS results indicate negative effects for both genders: reductions of about $5,600 for males and $8,800 for females.

The OLS results are very sensitive to the set of controls included in the model, especially the variables for academic ability at the end of lower secondary school (Online Appendix Table A6). The effects of vocational education are more favorable when academic ability controls are included, reflecting the endogeneity of educational choice. Thus, individuals choosing vocational education have on average lower ability and, therefore, lower potential earnings in both educational options.

### B. MTE Results

#### 1. Selection of type of upper secondary education

We now present the MTE results.^{15} We begin with the results for the first‐stage logit selection models of completing vocational instead of general upper secondary education, as shown in Table 5. We report average marginal derivatives for the two distance instruments. The models include the full set of controls. The models in Columns 1 and 3 (for males and females, respectively) are used in our main specification. Here, we use as instruments both the distance measures and their interactions with the math test score and a dummy for missing math score to allow for the possibility that the effect of distance on educational choice may depend on cognitive ability. It is common in applications of MTE methods to use such interactions as additional instruments.^{16} Columns 2 and 4 in Table 5 represent a parsimonious specification in which we exclude the interactions and use only the two distance variables as instruments. In Online Appendix C.1 we show that our treatment effect estimates are not changed in any significant way by using the parsimonious specification.

The average marginal derivatives of the distance variables are almost identical in the two specifications in Table 5. They have the expected signs, and the distance to general education is highly significant. The point estimates indicate that an increase in the distance to general education by 10 km increases the probability of completing a vocational program by about four percentage points for males and two percentage points for females. An increase in distance to vocational education by 10 km tends to reduce the probability of completing a vocational program by about 0.4 percentage points for males and 0.7 percentage points for females. The larger and more significant effects for distance to general education are as expected; see Section IV.C.
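To make the notion of an average marginal derivative concrete, the sketch below computes it for a logit model with one distance variable interacted with a math score. The coefficient values, the three sample rows, and the reduced set of regressors are assumptions chosen for illustration; they are not the estimates in Table 5.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def index(dist, math_score, b):
    """Logit index with a distance x math-score interaction (hypothetical form)."""
    return b["const"] + b["dist"] * dist + b["math"] * math_score + b["inter"] * dist * math_score

def mean_prob(rows, b, shift=0.0):
    """Average predicted probability when every distance is moved by `shift` km."""
    return sum(logistic(index(d + shift, m, b)) for d, m in rows) / len(rows)

def average_marginal_derivative(rows, b):
    """Sample mean of dP/d(dist) = (b_dist + b_inter * math) * p * (1 - p)."""
    total = 0.0
    for d, m in rows:
        p = logistic(index(d, m, b))
        total += (b["dist"] + b["inter"] * m) * p * (1.0 - p)
    return total / len(rows)

# Hypothetical (distance in km, standardized math score) pairs and coefficients.
rows = [(2.0, -1.0), (5.0, 0.0), (10.0, 1.0)]
coefs = {"const": -0.5, "dist": 0.02, "math": -0.8, "inter": 0.005}
amd = average_marginal_derivative(rows, coefs)
```

Multiplying `amd` by 10 gives the approximate change in the completion probability for a 10 km increase in distance, which is the scale used when reporting the point estimates above.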

The instruments are jointly highly significant according to the chi‐squared test statistics for the exclusion restrictions in the lower part of Table 5. In Columns 1 and 3, the cross terms between the distance variables and the math test score variables are highly significant (see the last chi‐squared tests in Table 5). We have considered distance‐squared and interactions between distance and other controls, but these terms were not significant. The estimated propensity score has common support over the interval from zero to one.^{17}

#### 2. MTE curves

We now turn to the main MTE results. Figure 1 shows the estimated MTE curves with 90 percent confidence intervals for our models with polynomials in the propensity score as control functions. For each model (given by outcome and gender), we choose the specification with a polynomial of order *n* if the *n*th‐order terms are statistically significant (for either *k*_{0} or *k*) and if higher‐order terms are not significant. Figure 1 shows how estimates of the MTE vary with the unobserved resistance to treatment. The MTEs are evaluated at the sample means of the covariates **X**. Each panel also shows the ATE, which is the integral of the MTE over the interval from zero to one. The MTE curves are decreasing for predicted earnings at age 40 for both genders and for earnings at age 28 for females. A decreasing MTE curve is consistent with selection on idiosyncratic gains: individuals with higher unobserved resistance to treatment (vocational education), who therefore tend to choose the other option (general education), have lower gains to treatment.
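The relation between an MTE curve and the ATE can be sketched as follows. For simplicity, the sketch assumes that the MTE at the covariate means is itself written as a polynomial in the unobserved resistance *u*; in the estimated models the polynomial enters through the control function, so the coefficients below are hypothetical and purely illustrative.

```python
def mte(u, coefs):
    """MTE at resistance u for polynomial coefficients c_0, c_1, ... (hypothetical)."""
    return sum(c * u ** j for j, c in enumerate(coefs))

def ate_from_mte(coefs):
    """Integral of the polynomial MTE over [0, 1]: the sum of c_j / (j + 1)."""
    return sum(c / (j + 1) for j, c in enumerate(coefs))

# A decreasing curve (USD 1,000): gains fall as resistance to treatment rises.
decreasing = [10.0, -30.0]
# A U-shaped curve with its minimum at u = 0.5.
u_shaped = [6.0, -24.0, 24.0]

ate_decreasing = ate_from_mte(decreasing)
ate_u_shaped = ate_from_mte(u_shaped)
```

In the decreasing example, the MTE is positive for low-resistance individuals and negative for high-resistance individuals, so a negative ATE can coexist with positive gains for those most inclined toward treatment.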

The MTE curve tends to be U‐shaped for earnings at age 28 for males. This is not necessarily inconsistent with individuals selecting into treatment based on gains. Thus, (segments of) the MTE curve may be increasing if individuals care about other outcomes in addition to the outcome considered in the analysis. In our application, individuals with high unobserved resistance to treatment (vocational education) might put more weight on long‐term earnings and therefore choose general education despite higher short‐term gains to vocational education at age 28, if they expect higher long‐term gains to general education.^{18} The MTE curves for more long‐term earnings at age 40 are decreasing for both genders, as noted above.^{19} Variation in the MTE in Figure 1 is substantial for both genders and both outcomes.^{20}

#### 3. Conventional treatment effect parameters

We now discuss the estimates of treatment effect parameters shown in Table 6. For earnings at age 28, the ATE is positive for males and negative for females. For predicted earnings at age 40, the ATE is negative for both genders. By completing vocational instead of general education, males gain on average $8,000 in earnings at age 28 but lose $8,700 at age 40. Females lose $6,700 in earnings at age 28 and $12,200 at age 40. The ATE estimates have the same signs as the OLS estimates in Table 4 but tend to be larger numerically.

For both outcomes and both genders, the ATT is higher (and the ATUT smaller) than the ATE, and the ATT is positive and significant, whereas the ATUT is negative or insignificant. This is consistent with selection on gains. On average, the individuals selecting into vocational education are those who benefit the most. The difference between ATT and ATUT is substantial and is clearly significant for both genders.^{21} For predicted earnings at age 40, the ATT and ATUT estimates are $3,700 and –$17,300 for males and $8,100 and –$18,100 for females.

The differences between the ATT and the ATE (and the ATUT) are caused by both observed and unobserved heterogeneity in treatment effects. Thus, the ATT is evaluated at the means of **X** in the group of treated (instead of the full sample used for ATE), which shifts the intercept of the MTE curve. Similarly, the ATUT is evaluated at the means of **X** among the untreated. The values of **X** are important for the MTE and consequently for the average treatment effect parameters if the estimates of (β_{1} – β_{0}) are different from zero; see Equation 4. In Table 6 we report the *p*‐value of a test for overall observed heterogeneity. We test whether (β_{1} – β_{0}) = 0 for all covariates. For all the estimated models, this test indicates significant observed heterogeneity.

Unobserved heterogeneity can also explain differences between the ATE, ATT, and ATUT parameters because the weights given to the MTEs at different values of the unobserved resistance to treatment differ. When estimating the ATT, more weight is assigned to individuals with low unobserved resistance to treatment (because they have a high probability of treatment), the ATUT assigns more weight to individuals with high unobserved resistance, and the ATE is based on equal weights. These differing weight distributions are important if the MTE curve is not flat. In Table 6, we report the *p*‐value for a test that the MTEs are constant across unobserved resistance to treatment; that is, a test for unobserved heterogeneity. In our models in which the last term in the MTE Equation 4 is represented by a polynomial in the propensity score, this is a test that the coefficients in this polynomial are all zero. We find highly significant unobserved heterogeneity.
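The role of the weights can be illustrated numerically. The sketch below uses the standard weights from the MTE literature, with the ATT weight at *u* proportional to the share of individuals whose propensity score exceeds *u* and the ATUT weight proportional to the complementary share. It holds the covariates fixed, and both the MTE curve and the four propensity scores are hypothetical.

```python
def mte(u):
    """Hypothetical decreasing MTE curve (USD 1,000)."""
    return 10.0 - 30.0 * u

def weighted_effects(pscores, grid_size=1000):
    """Return (ATE, ATT, ATUT) as weighted averages of the MTE over a grid on [0, 1]."""
    n = len(pscores)
    mean_p = sum(pscores) / n
    du = 1.0 / grid_size
    ate = att = atut = 0.0
    for i in range(grid_size):
        u = (i + 0.5) * du  # midpoint of grid cell i
        share_above = sum(p > u for p in pscores) / n
        ate += mte(u) * du  # equal weights
        att += mte(u) * share_above / mean_p * du  # more weight at low resistance
        atut += mte(u) * (1.0 - share_above) / (1.0 - mean_p) * du  # more weight at high resistance
    return ate, att, atut

ate, att, atut = weighted_effects([0.2, 0.4, 0.6, 0.8])
```

With a decreasing MTE curve, the sketch yields an ATT above the ATE and an ATUT below it, which is the ordering discussed in the text; with a flat curve, the three parameters would coincide.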

The two upper panels of Figure 2 illustrate for each gender how the difference between the ATT and the ATE for predicted earnings at age 40 can be explained by observed and unobserved heterogeneity. In each panel, the MTE curve is evaluated at the overall means of the covariates in the estimation sample (as in Figure 1), whereas the MTE(ATT) curve is evaluated at the covariate means of the treatment group. For both genders, the MTE(ATT) curve is shifted upward compared to the MTE curve. Thus, the observed characteristics of the treatment group (those completing vocational education) increase the ATT compared to the ATE. The two upper panels also show the weights that are used to calculate the ATT from the MTE(ATT) curve. Because the weights are higher for lower values of unobserved resistance to treatment and the MTE curves are decreasing, unobserved heterogeneity increases the ATT further compared to the ATE.

The form of the MTE curve is related to how potential outcomes vary by unobserved resistance to treatment. The two lower panels of Figure 2 illustrate this relation, again for predicted earnings at age 40. The MTE curve and the curves for potential outcomes are shown for the overall means of the covariates in the estimation sample. Potential earnings in case of vocational education (*Y*_{1}) and general education (*Y*_{0}) are measured on the right vertical axis. For males, the *Y*_{1} curve is decreasing, and the *Y*_{0} curve is increasing. The MTE curve is the difference between the *Y*_{1} and *Y*_{0} curves. Thus, the shape of both potential outcome curves contributes to a decreasing MTE curve. This is not the case for females, for whom the *Y*_{1} curve is almost constant at about $46,000, whereas the *Y*_{0} curve increases with unobserved resistance to treatment from about $31,000 to $77,000. The MTE curve therefore looks like a mirror image of the *Y*_{0} curve.

#### 4. Policy‐relevant treatment effects

A policy change that affects the benefits or costs of choosing vocational instead of general education will mainly affect individuals who are at the margin of indifference between vocational and general education. Different policies may affect the educational choice of different marginal groups. We estimate effects of five policy simulations that increase the probability of choosing vocational education. To do this, we use our MTE model to estimate policy‐relevant treatment effects (PRTEs); see Heckman and Vytlacil (2001, 2005, 2007); Carneiro, Heckman, and Vytlacil (2011); and Cornelissen et al. (2016). The PRTEs are computed as a weighted average over the MTE curve, where weights reflect the individuals who are shifted by the policy and where the MTEs are evaluated at the means of the control variables for policy compliers (Cornelissen et al. 2016, Equation 29). The PRTEs are effects per individual shifted into treatment.^{22}

The first two policies we consider shift the propensity score for all individuals. The first policy augments the propensity score by one percentage point for all observations (although it is not allowed to be larger than one). Instead of changing the propensity score directly, the second policy changes it indirectly through a change in instrument values. Thus, we consider an increase in the distance to general schools by 1 km for everybody. This results in an average increase in the propensity score by 0.4 percentage points for males and 0.2 percentage points for females.^{23}
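A stylized version of the first policy can be sketched as follows. The weight at *u* is the change in the share of individuals with a propensity score above *u*, divided by the change in the mean propensity score, so that the result is an effect per individual shifted into treatment. The sketch holds covariates fixed rather than evaluating the MTE at the complier means, and the MTE curve and propensity scores are hypothetical.

```python
def mte(u):
    """Hypothetical decreasing MTE curve (USD 1,000)."""
    return 10.0 - 30.0 * u

def prte(pscores, new_pscores, grid_size=1000):
    """Weighted average of the MTE over individuals shifted into treatment by the policy."""
    n = len(pscores)
    delta = sum(new_pscores) / n - sum(pscores) / n  # change in the treated share
    du = 1.0 / grid_size
    total = 0.0
    for i in range(grid_size):
        u = (i + 0.5) * du  # midpoint of grid cell i
        shifted = (sum(p > u for p in new_pscores) - sum(p > u for p in pscores)) / n
        total += mte(u) * shifted / delta * du
    return total

baseline = [0.2, 0.4, 0.6, 0.8]
# Policy 1: raise every propensity score by one percentage point, capped at one.
policy = [min(p + 0.01, 1.0) for p in baseline]
effect = prte(baseline, policy)
```

The second policy would fit the same function by replacing `policy` with propensity scores recomputed from the selection model after adding 1 km to the distance to general schools.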

The effects of these two policies are similar; see the middle part of Table 6. For females, the effects are small and statistically insignificant. For males, both policies have significant positive effects on earnings at age 28 (of $9,300 and $11,000), but have negative effects on predicted earnings at age 40 (of –$5,300 and –$6,400).

Different policies may affect low‐ and high‐ability students differently. For instance, a policy that introduces or modifies admission criteria to either vocational or general education based on ninth grade test scores will affect low‐ability students’ choice of education, but not the choice made by high‐ability students. The last PRTEs in Table 6 are for policies that increase the propensity score by 0.01, but only for those with low, medium, and high ninth grade math scores, respectively.^{24} For males, a policy increasing the propensity to complete vocational education for students with low math skills tends to have more positive effects on earnings than a policy that increases the propensity score for students with high math skills. This is most pronounced for earnings at age 40, for which the increase in the propensity score for low‐ability students results in a small insignificant PRTE, whereas the shift for high‐ability students results in a large negative PRTE of –$11,400. Effects for females are mostly insignificant, and there is no clear pattern.

The PRTE estimates indicate that, for most males at the margin of indifference between vocational and general education, it may be an advantage in terms of long‐term earnings to choose general instead of vocational education, but for males with low math skills and for females, there seems to be no important benefit or loss from a shift in educational choice for marginal students.

### C. Earnings Dynamics and Results for Additional Outcomes at Age 28

To explore mechanisms related to the earnings effects, we discuss results for additional labor market outcomes, educational attainment, and fertility at age 28. In the Online Appendix we report estimates for OLS models (Tables A7–A10) and for MTE models (Tables A11–A14). We discuss mainly the MTE results, starting with the ATE estimates. The overall pattern of the OLS estimates corresponds approximately to that of the ATE estimates.

#### 1. ATE estimates for additional outcomes at age 28

For males, the positive ATE on earnings at age 28 in Table 6 (of about 16 percent) is explained by a ten percentage point higher employment probability and by higher earnings among those employed due to, on average, a 6 percent higher hourly wage and 11 percent more working hours; see the ATE estimates in Online Appendix Tables A11 and A12.

For females, the negative ATE on earnings at age 28 in Table 6 is partly explained by a negative ATE on the hourly wage of about 7 percent. The ATE on employment is insignificant for females, which contrasts with the large positive employment effect for males. This gender difference may be explained by vocational education leading to earlier fertility timing related to fewer years of education. The estimate of the ATE on the probability of having children by age 28 is not very precise. The insignificant point estimate indicates that vocational education increases this probability by 6.5 percentage points (Online Appendix Table A13); OLS estimates are clearly significant and indicate an effect of 12 percentage points (Online Appendix Table A9).

For both genders, the ATE estimates for educational attainment at age 28 represent an important mechanism behind the negative ATEs on earnings at age 40 in Table 6. Thus, the ATE estimates for the probability of having attained a post‐secondary degree are −56 and −67 percentage points for males and females, respectively (Online Appendix Table A11).

#### 2. Heterogeneity in effects on additional outcomes at age 28

The important heterogeneity in effects on earnings at age 28 shown in Table 6, with more positive ATT than ATUT effects, is related to heterogeneity in the effects on employment, working hours, and the hourly wage rate at age 28 (Online Appendix Tables A11 and A12). For males, the ATT effect on the employment probability is 15 percentage points, whereas the ATUT estimate is only six percentage points. The ATT effects on hours and hourly wage are also large and positive (17 and 11 percent), whereas the ATUT estimates are small (7 and 2 percent). For females, the effects on employment are insignificant (but with a positive ATT and a negative ATUT point estimate). The ATT effect on hours is positive (11 percent), whereas the ATUT effect is insignificant, and the ATT and ATUT effects on the hourly wage rate are zero and −9 percent, respectively.^{25}

For earnings at age 40, the large difference between the positive ATT and the negative ATUT effects (shown in Table 6) can partly be explained by heterogeneity in the effects on the probability of attaining a university degree. For males, these ATT and ATUT effects are −19 and −37 percentage points, and for females, they are zero and −29 percentage points. Thus, the effect of vocational education on the probability of attaining a university degree is much more negative for those who complete general upper secondary school than for those who complete vocational education. The intuition is that only rather few in the latter group (with lower academic skills and inclination) would attain a university degree if they had chosen general upper secondary school.

For males, the positive ATT and ATE effects on employment at age 28 (15 and 10 percentage points) are rather close in size to the negative effects on being enrolled in education (12 and 8 percentage points). Thus, part of the positive ATT and ATE effects on earnings at age 28 may be explained by vocational education leading to less post‐secondary education and therefore earlier career entry. However, when we exclude from the sample those enrolled in education at age 28, and when we (in addition) exclude those enrolled at ages 26 and 27, the positive ATT and ATE effects on earnings at age 28 remain significant, although they become smaller (Online Appendix Table A14). For females, the estimated effects on employment are less precise, and excluding those enrolled in education does not alter the estimates of earnings effects much.

## VI. Robustness and Specification Checks

We conduct a series of robustness and specification checks, discussed in detail in Online Appendix C. First, we show that our main results in Table 6 are robust to using alternative specifications for the distance instruments. Second, we show that using total income instead of earnings as the outcome at age 28 does not change the results very much, and we discuss likely reasons for the differences.

Third, we run robustness and specification checks on the model used to predict earnings at age 40. Thus, we show that including data for additional older cohorts or excluding lags of the short‐term outcomes in this model does not significantly affect the ATE, ATT, or ATUT estimates for predicted earnings at age 40. We also find it is important that the model includes a broad spectrum of short‐term variables. Thus, a model that includes detailed educational attainment and gender as explanatory variables, but not other short‐term outcomes, such as earnings and labor market status, results in treatment effect estimates for earnings at age 40 that are significantly and substantially different from the baseline estimates in Table 6. Furthermore, we assess the validity of the surrogacy and comparability assumptions, as suggested by Athey et al. (2019), and our analysis indicates that the assumptions are plausible in this application.

Fourth, we show results for an alternative definition of treatment, namely *enrollment* in vocational instead of general education (conditional on enrollment in some upper secondary program). We find that the distance instruments are highly significant in the first stage and that the second‐stage ATT estimates become smaller than in the main analysis with completion as treatment. This is due to inclusion of dropouts, the majority of whom are from vocational programs. We also find that the ATUT and ATE estimates differ little from the main analysis.

We finally discuss specification checks related to the validity of the distance instruments. We show that the instruments do not significantly predict the selection into the estimation sample of completers (or enrollees). We also find that the distance instruments influence the type of education completed, conditional on enrollment. This suggests they are more likely to be valid for the main analysis with completion as treatment than for the analysis with enrollment as treatment; see Section III.A. Furthermore, we show that the distance instruments do not affect GPA in general upper secondary school, indicating that although they affect the type of education completed, they do not affect the quality of education. In placebo tests of the validity of the instruments, we find that the instruments do not predict completion of compulsory (lower secondary) school. As another check of instrument validity, we estimate the first‐stage models excluding controls for test scores.^{26} Large differences in the estimates of the coefficients on distance, compared to our main specification with test score controls, might indicate that including additional controls for student ability (which are not in our data) would also make large changes to the coefficients of the instruments. Although we find that these differences are small and not statistically significant, they might be suggestive of some endogeneity of the instruments, which should give rise to some caution when interpreting results.

## VII. Conclusion

We use MTE methods to estimate the effects on earnings of completing a vocational instead of a general upper secondary education. For both genders, we find more negative long‐term ATEs on earnings at age 40 (compared to the short‐term effects at age 28) of choosing vocational education, consistent with findings in earlier studies, including Hanushek et al. (2017), who find this pattern for both male employment and earnings. We discuss mechanisms underlying the pattern of earnings dynamics and investigate additional outcomes at age 28, including employment, educational attainment, wage rate, working hours, and fertility.

We find important heterogeneity in effects (in terms of both observables and unobservables) that is consistent with selection on gains. For earnings at age 28 and 40, the ATT is positive and significant, whereas the ATUT is negative or insignificant, and the difference between the ATT and the ATUT is substantial and clearly significant for both genders. For earnings at age 40, the ATT and ATUT estimates are $3,700 and –$17,300 for males and $8,100 and –$18,100 for females. Thus, in terms of earnings at age 40, individuals who complete vocational education benefit on average from their choice (because the ATT is positive). Similarly, individuals who complete general education also benefit on average from their choice (because the ATUT is negative).

To investigate whether students at the margin of indifference between vocational and general education would typically benefit in terms of earnings from choosing vocational (or general) programs, we estimate policy‐relevant treatment effects for policies that, in different ways, manipulate the propensity score for completing vocational instead of general education. These include a policy that increases the propensity score by one percentage point for all individuals and a policy that increases the distance to general schools by 1 km for all individuals. For females, the PRTEs are largely small and insignificant, indicating no important benefit or loss from a shift in educational choice. For males, the PRTEs on earnings at age 40 are negative. However, these effects tend to be small and insignificant for males with low math skills. Thus, for most males at the margin, it may be advantageous in terms of long‐term earnings to choose general instead of vocational education, but not for males with low math skills. However, conclusions based on the PRTEs should be made with caution. Different policies will affect different marginal groups, and we have considered only a few stylized policy simulations.

## Footnotes

↵1. Most students who wished to enroll in general programs after the ninth grade were admitted via a recommendation from their ninth grade school. A positive recommendation was based on an overall evaluation of the student, although this was strongly correlated with marks and test scores at the end of ninth grade. However, students who did not receive a recommendation could still enroll in general programs if they passed admission tests. Even students who dropped out of ninth grade or who did not sit for the ninth grade exam could eventually enroll in general programs after completing the optional tenth grade of lower secondary school. For the cohorts studied in this paper, about 10 percent of those who eventually enrolled in some type of upper secondary program did not sit for the ninth grade exams or had test scores below passing level. Of this low‐ability group, about 20 percent enrolled in general upper secondary programs.

↵2. Such simplification is standard in the literature. For instance, Hanushek et al. (2017); Brunello and Rocco (2017); Choi, Jeong, and Kim (2019); and Torun and Tumen (2019) also focus on vocational versus general upper secondary education. The larger literature on the effects of college typically considers college versus no college.

↵3. We have information on distance before completion of lower secondary school (at age 15). Carneiro, Lokshin, and Umapathi (2017) have data only for distance at the time when they measure outcomes (at age 25‐60). As they state, this means that the causality of the first‐stage relation might be reversed because individuals completing upper secondary school may move afterwards to more urban areas with shorter distances to schools.

↵4. It is assumed that the distribution of *V* is continuous.

↵5. Without this restriction, it would only be possible to estimate MTEs if the propensity scores in both treatment groups had full support for all values of **X**, which is not feasible in practice as discussed in Carneiro, Heckman, and Vytlacil (2011) and Cornelissen et al. (2016). To achieve the restriction on the MTE curve we assume separability: E(*U*_{j} | *V*, **X**) = E(*U*_{j} | *V*), *j* = 0, 1, as in Brinch, Mogstad, and Wiswall (2017), and Cornelissen et al. (2018). Alternatively, one can assume full independence between *Z* and (*U*_{0}, *U*_{1}, *V*), as in Aakvik, Heckman, and Vytlacil (2005); Carneiro, Heckman, and Vytlacil (2011); and Carneiro, Lokshin, and Umapathi (2017).

↵6. The MTE approach focuses on effect heterogeneity. It is therefore a limitation that predicted earnings at age 40 are constructed based on a model assuming constant effects of the surrogate outcomes at age 28 (given pre‐treatment controls). On the other hand, the large number of intermediate outcomes at age 28 and their interactions mean that we have plenty of variation in predicted earnings at age 40, which can capture important heterogeneity. For instance, compared to those who clearly prefer a general program, individuals at the margin between vocational and general education who choose general education will presumably obtain on average lower grades in general upper secondary school and choose shorter or less demanding post‐secondary programs. Conditional on the specific program chosen, they may earn less or have a lower probability of being employed at age 28. All of these short‐term outcomes and their interactions will tend to generate lower predicted earnings at age 40 for those at the margin compared to those who clearly prefer a general program. Our main results, which we discuss in Section V.B, are consistent with this presumption because the PRTEs of choosing vocational education are more positive than the ATUT for earnings at age 40 (and at age 28).

↵7. We use stratified bootstrap sampling to preserve the original number of observations of males and females in the main sample and the sample of older cohorts. We also report bootstrapped standard errors for MTE‐based estimates when the outcome is measured at age 28 and for OLS estimates for predicted earnings at age 40.
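A minimal sketch of stratified bootstrap resampling, assuming strata defined by gender and sample (main sample versus older cohorts), is given below. The function name and the toy strata labels are hypothetical; the sketch only illustrates how resampling within strata preserves the number of observations in each group.

```python
import random
from collections import Counter


def stratified_bootstrap_indices(strata, rng):
    """Resample row indices with replacement within each stratum.

    Each stratum keeps its original number of observations.
    """
    by_stratum = {}
    for i, label in enumerate(strata):
        by_stratum.setdefault(label, []).append(i)
    sample = []
    for members in by_stratum.values():
        # Draw as many indices as the stratum has members, with replacement
        sample.extend(rng.choices(members, k=len(members)))
    return sample


# Hypothetical strata: gender crossed with main sample vs. older cohorts
strata = (["male_main"] * 5 + ["female_main"] * 4
          + ["male_older"] * 3 + ["female_older"] * 2)
rng = random.Random(0)  # fixed seed so the draw is reproducible
idx = stratified_bootstrap_indices(strata, rng)
resampled_counts = Counter(strata[i] for i in idx)
```

In an application, the full estimation would be rerun on each resampled data set and the standard error taken as the standard deviation of the estimates across draws.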

↵8. We use the exchange rate at the end of 2018: $1 = DKK 6.5. To limit the effect of outliers, we trim observations if earnings at age 28 are negative or above DKK 1m ($153,800); this is less than 0.4 percent of the sample.
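The conversion and trimming rule in this footnote can be sketched as follows. The function names are hypothetical; only the exchange rate and the DKK 1m cap come from the text.

```python
DKK_PER_USD = 6.5  # end-of-2018 exchange rate stated in the footnote


def dkk_to_usd(amount_dkk, rate=DKK_PER_USD):
    """Convert an amount in Danish kroner to US dollars."""
    return amount_dkk / rate


def keep_observation(earnings_dkk, cap_dkk=1_000_000):
    """Return True if the observation survives trimming.

    Observations are dropped when earnings are negative or above the cap.
    """
    return 0 <= earnings_dkk <= cap_dkk


cap_usd = dkk_to_usd(1_000_000)
cap_usd_rounded = round(cap_usd, -2)  # nearest $100, as reported in the text
```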

↵9. These variables are based on the Danish version of the International Standard Classification of Education (ISCED).

↵10. We do not interact gender with the 290 indicators of specific educations because this does not improve the adjusted *R*‐squared and produces noisy predictions for some education programs with highly unbalanced gender composition. We have checked that including interactions between gender and these detailed indicators of education programs does not affect the treatment effect estimation results in any significant way.

↵11. For brevity we do not report details of these regressions, but they are available upon request. We did not include in the surrogate index all pre‐treatment controls used in the MTE estimations, rather only gender and parental education and earnings. We did try to include other pre‐treatment controls, for instance, municipality dummies, but they did not improve the adjusted *R*‐squared. It was not possible to include ninth grade test scores because these do not exist for the older cohorts. In view of the large number of important short‐term outcomes for which we control, it is probably not important to include all pre‐treatment controls. In Online Appendix C.3, we discuss a specification check supporting this presumption, as well as other specification checks related to the model for predicted earnings at age 40.

↵12. See Online Appendix Table A2. These gender differences related to the treatment variable are followed by similar differences in educational attainment by age 28–31 (measured in 2018). Among those who complete a post‐secondary education, males are more likely to choose technical educations and science, whereas females are more likely to choose humanities, teacher training programs, and health (Online Appendix Tables A3 and A4).

↵13. The difference in years of education between the education groups may seem small because the average length of post‐secondary programs is about four years, but vocational upper secondary programs are on average about one year longer than general programs. Also, for individuals who complete both a general and a vocational upper secondary degree, but no post‐secondary degree, years of education is based on their vocational degree.

↵14. In our data, women on maternity leave are categorized as employed or student if they were in one of these states when they took leave. For women on maternity leave with full pay (provided by the employer), this payment will be included in our measure of earnings, but maternity pay in the form of public transfers is not included. In Online Appendix C.2, we discuss a robustness check in which the outcome is total income including public transfers.

↵15. Results using standard IV methods (2SLS) are discussed in Online Appendix B.

↵16. Carneiro, Heckman, and Vytlacil (2011) and Nybom (2017) also interact distance instruments with cognitive ability. Carneiro, Lokshin, and Umapathi (2017) use interactions between distance and control variables as instruments. Cornelissen et al. (2018) use interactions between their primary instrument and covariates as additional instruments.

↵17. Online Appendix Figure A1 shows its frequency distribution by treatment status and gender.

↵18. In principle, this corresponds to the argument in Cornelissen et al. (2018), who estimate an increasing MTE curve. They find that children with higher unobserved resistance to attending childcare (the treatment) have higher returns to attending childcare (in terms of cognitive development) and that these children come from more disadvantaged family backgrounds in terms of characteristics not controlled for in the analysis. Thus, one possible explanation for the increasing MTE curve may be that childcare costs are a more important disadvantage in low‐income families.

↵19. Brinch, Mogstad, and Wiswall (2017) discuss U‐shaped MTE curves. They show that the U‐shape can be generated if the population consists of two subpopulations with constant MTE curves at different levels and with different distributions of the unobserved resistance to treatment.

↵20. The variation is of the same order of magnitude as in other studies. Carneiro, Heckman, and Vytlacil (2011) estimate the effect of college on log earnings and find that the MTE evaluated at the sample means of the covariates varies between −0.2 and 0.4. Carneiro, Lokshin, and Umapathi (2017) estimate the effect of upper secondary education on log hourly wages and find a variation between −0.2 and 0.5. This corresponds to a variation in *Y*_{1}/*Y*_{0} between 0.82 and 1.65 (where *Y*_{1} and *Y*_{0} are not‐log‐transformed potential earnings). We find the largest variation for female earnings at age 40, where *Y*_{1}/*Y*_{0} varies between 0.60 (46/77) and 1.48 (46/31). For male earnings at age 40, the variation is between 0.73 (61/83) and 1.06 (72/68). Values of potential earnings (in $1,000) indicated in parentheses can be seen in the lower part of the next figure, which is discussed in Section V.B.3.

↵21. The heterogeneity in effects on earnings at age 28 is reflected in heterogeneity in effects on employment probability, hours worked, and the hourly wage rate; see Section V.C.
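The ratio arithmetic in footnote 20 can be checked with a short sketch. Effects measured in log points are converted to ratios of potential outcomes by exponentiation, and the ratios for this study are computed directly from the potential earnings quoted in parentheses. The variable and function names are hypothetical.

```python
import math


def log_effect_to_ratio(log_effect):
    """Convert an effect on a log outcome into the ratio Y1/Y0."""
    return math.exp(log_effect)


# Range of log-wage effects quoted from Carneiro, Lokshin, and Umapathi (2017)
ratio_low = log_effect_to_ratio(-0.2)
ratio_high = log_effect_to_ratio(0.5)

# Potential earnings at age 40 in $1,000, as (Y1, Y0) pairs from the footnote
female_ratios = (46 / 77, 46 / 31)
male_ratios = (61 / 83, 72 / 68)

rounded = {
    "log_wage": (round(ratio_low, 2), round(ratio_high, 2)),
    "female": tuple(round(r, 2) for r in female_ratios),
    "male": tuple(round(r, 2) for r in male_ratios),
}
```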

↵22. As discussed above, post‐secondary education is an important mechanism behind earnings effects of general upper secondary education in particular. Although some higher education programs are rationed in Denmark, there are many programs without admission restrictions, including, for instance, most programs within natural science, engineering, economics, business, and teaching. Therefore, when we consider small changes in the probability of choosing vocational instead of general upper secondary education, rationing of post‐secondary programs is not an important limitation for the estimation of PRTEs.

↵23. An alternative policy that increases the distance to general schools to be equal to the distance to vocational schools whenever the former distance is shorter gives very similar PRTE estimates (results not shown).

↵24. We split the sample by the average of the math test score and the mark in math for the year’s work evaluated by the teacher (less than or equal to 7, between 7 and 9, and at least 9). The percentages with low, medium, and high scores are 23, 46, and 31 percent (approximately the same for males and females).

↵25. Heterogeneity in effects on fertility timing does not help explain the heterogeneity in earnings effects at age 28 for females. Thus, the ATT effect on the probability of having children by age 28 is large and significant, whereas the ATUT effect is smaller and insignificant (Online Appendix Table A13). Similar results were obtained using other measures of fertility by age 28, for instance, the number of children, or having a child below three years of age.

↵26. Nybom (2017) applies a similar check for a first‐stage selection model with distance as instrument.

- Received February 2021.
- Accepted March 2022.

This open access article is distributed under the terms of the CC‐BY‐NC‐ND license (https://creativecommons.org/licenses/by-nc-nd/4.0) and is freely available online at: https://jhr.uwpress.org.