## Abstract

How does class size in compulsory school affect people’s long-run education and earnings? We use maximum-class-size rules and Norwegian administrative registries, allowing us to observe outcomes up to age 48. We do not find any indication of beneficial effects of class-size reduction in compulsory school. For a one-person reduction in class size we reject effects on income as small as 0.12 percent in primary school and 0.15 percent in middle school. Population differences in parental background, school size, or competitive pressure do not appear to reconcile our findings with previous studies.

## I. Introduction

The relationship between class size and human capital is one of the most researched and debated questions in education, but evidence on this matter has been limited. The main challenge when estimating class-size effects is to correct for potential omitted variable bias that may arise because of decisions by parents, schools, and education authorities that may lead to sorting, as well as other confounding factors. The most credible studies either rely on social experiments, such as the Tennessee STAR study (for example, Krueger 2003), or exploit quasi-experimental setups that generate arguably exogenous variation in class size (for example, Angrist and Lavy 1999; Hoxby 2000; Leuven, Oosterbeek, and Rønning 2008).

Because of data limitations, the large majority of existing class size studies have focused on short-term outcomes, such as test scores. Ultimately, however, one would like to know whether class size also impacts outcomes in the longer run. One reason for focusing on long-term outcomes is that fade-out is known to be a concern in education contexts, especially at earlier ages (Chetty et al. 2011). But short-run outcomes, such as test scores, may also be a partial measure of the human capital affected by class size, and some have argued that noncognitive skills in particular are important for outcomes later in life (Heckman and Cunha 2007). A focus on test scores may therefore lead to substantial underestimates of actual gains. Finally, there is also substantial variation in the short-term effects that are found in the literature (Wößmann and West 2006), but little is known about whether this is mirrored by similar effects in the longer term.

An early study that found persistent longer-term effects of small classes is Krueger and Whitmore (2001). Using the STAR data, they show that those attending a small class in Grades K–3 were more likely to take college-entrance exams. Similarly, using the STAR data, Chetty et al. (2011) find that pupils who attended small classes are more likely to attend college, but they do not find effects on earnings at age 27. Recently, Fredriksson, Öckert, and Oosterbeek (2013, 2016), using Swedish data, were the first to report statistically significant effects on earnings and find that a one-pupil decrease leads to a 1.5 percentage point increase in earnings. In a recent study, Falch, Sandsør, and Strøm (2017), using Norwegian data, do not find any significant beneficial long-term effects from a middle school class-size reduction, consistent with the absence of short-run effects in Leuven, Oosterbeek, and Rønning (2008).

We present long-term impact estimates of average class size in compulsory school (Grades 1–9) for Norway, as well as separate class-size effects for primary school (Grades 1–6) and middle school (Grades 7–9). The data cover all cohorts graduating from middle school and go back to the late 1970s. Using Norway’s registry data, we can trace these cohorts’ education and earnings up to 2014, when the oldest individuals are 48 years old. The empirical approach is standard and exploits maximum-class-size rules that were effective up to the school year 2002–2003. The maximum class size in primary school was 28 (and 30 in middle school). Our data are informative about class size in middle school, as well as class size in primary school for those that did their primary education in so-called combined schools that have both a primary and middle school department. About one out of four primary schools and nearly two out of three middle schools in Norway are such combined schools.

We do not find beneficial effects of class size in primary school on earnings and completed schooling. Our results for class size in middle school are in line with the results for primary school. Although the effects are statistically insignificant, we have enough precision to reject beneficial effects on earnings as small as 0.12 percent for a one-pupil reduction in class size throughout primary school with 95 percent certainty. Long-run estimated impacts of class size in middle school on completed schooling are also small and statistically insignificant, and they also allow us to reject small beneficial effects. These results for middle school are consistent with the previous results of Leuven, Oosterbeek, and Rønning (2008), who estimated short-term effects of class size in middle school on test scores of middle school exit exams for two cohorts in the early 2000s. They did not find significant effects on test scores and could reject effects as small as 1.5 percent of a standard deviation for a one-student reduction in average class size for three years.

Our results contrast with the recent evidence from Sweden by Fredriksson, Öckert, and Oosterbeek (2013, 2016), who find large negative and statistically significant effects from average class size in primary schools (Grades 4–6) on earnings and, in some specifications, also for years of schooling. We investigate two possible explanations for these diverging results.

The first is differences in (compensating) parent behavior. Fredriksson, Öckert, and Oosterbeek (2016) show that more highly educated parents compensate for larger classes while less educated parents do not. They also find that lowering class size improves achievement at age 13 for children from parents with low (below median) income, while children from high (above median) income families do not appear to benefit from smaller classes. That this also could be the case for Norway is suggested by the results from the only Norwegian study on short-run effects of class size in primary school by Bonesrønning and Iversen (2013), who have data on one cohort of fourth graders. They find effects for the subgroup of students with parents who are educated at or below the upper secondary school level and for the subgroup of students from dissolved families. However, we find no evidence that smaller classes provide benefits in the long run for children from more disadvantaged backgrounds, whether it is by parental income, education, or migrant background, suggesting that parent behavior does not drive the different results.

The second potential explanation lies in differences in the school population. Compared to Sweden, schools in Norway are relatively small. If school size mediates class-size effects, then this is another potential source of effect heterogeneity. Another difference in the population stems from the Swedish sample of school districts with only one school. Such schools are perhaps more shielded from competition, and competitive pressure could attenuate the effect of class size. When we approximate the Swedish sample by focusing on single-school municipalities, none of the estimates provide evidence of class-size effects.

The analysis proceeds with a brief discussion of the institutional context in Norway before introducing the data sources. In Section III we present the empirical approach before presenting the results in Section IV. The paper finishes with a discussion of the findings and conclusion.

## II. Institutions and Data

Compulsory education in Norway covers nine years of schooling.^{1} Children begin their primary education in the year they turn seven and, after six years in primary school, transfer to middle schools that cover Grades 7–9. After compulsory schooling students have the possibility to enroll in general or vocational upper secondary schools, where the general track leads to university. Compulsory schooling is free of charge, and nearly all schools are public (less than 3% are private). Although schools in Norway have to conform to the same laws and curriculum, the responsibility for providing education lies with municipalities who are responsible for operating schools. Both primary schools and middle schools have catchment areas, which are relatively strictly enforced. Catchment areas of middle schools are typically larger than the catchment areas of primary schools and cater for all pupils in a predetermined number of primary schools.

The data in this study come from administrative registries collected by Statistics Norway that cover the complete population. From these registries we know from which middle school individuals graduate. We take the graduating cohorts from 1978 onwards and link them to various data sources. The tax registry provides information on earnings from work until 2013. Information on the highest observed education attainment comes from the national education registry. Finally, we also use the population registry to identify the parents of the individuals in order to obtain family background characteristics, such as parental age and education at age 16, as well as the municipality of residency throughout childhood. We follow people up to age 48, which allows us to report average class-size effects on schooling and earnings between age 27 and 42 following Fredriksson, Öckert, and Oosterbeek (2013), but we will also estimate age-specific class-size effects from age 18 to age 48.

### A. Class Size

Information on class size also comes from administrative data. Schools report the number of classes and the number of pupils separately for each grade. Using this information, average class size at the school grade level is calculated by dividing enrollment by the number of classes. Following Leuven, Oosterbeek, and Rønning (2008), we compute for each school *s* the average class size a cohort (not explicitly indexed) would have experienced in primary school (PS), $\overline{cs}_{s}^{PS}$, and in middle school (MS), $\overline{cs}_{s}^{MS}$. The reason for using average class size rather than using all class sizes separately for each grade is that class size is highly correlated from year to year, and separating class-size effects by grade is not feasible. Using class size from a single grade is also problematic since it may capture the effect of class size in previous or subsequent grades. These complications are avoided by taking the average of class size across grades, which delivers an interpretable and policy-relevant effect.
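The construction described above can be sketched in a few lines. This is only an illustration: the function names and the enrollment figures are hypothetical, and the sketch assumes an unweighted mean across the grades observed for a cohort, which the text does not state explicitly.

```python
from statistics import mean


def grade_class_size(enrollment: int, n_classes: int) -> float:
    """Average class size in one school-grade cell: enrollment / number of classes."""
    if n_classes <= 0:
        raise ValueError("n_classes must be positive")
    return enrollment / n_classes


def average_class_size(cells) -> float:
    """Unweighted mean of grade-level class sizes (an assumption of this sketch).

    `cells` holds one (enrollment, n_classes) pair per observed grade.
    """
    return mean(grade_class_size(e, c) for e, c in cells)


# Hypothetical cohort followed through Grades 7-9 of one middle school.
middle_school_cells = [(52, 2), (54, 2), (50, 2)]
avg_ms = average_class_size(middle_school_cells)  # mean of 26, 27 and 25
```

Averaging over grades sidesteps the strong year-to-year correlation in class size that makes grade-specific effects hard to separate.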

Class-size information is available back to the school year 1977–1978. This left truncation implies that we cannot construct complete class-size histories $\overline{cs}_{s}^{PS}$ and $\overline{cs}_{s}^{MS}$ for all school cohorts. For the graduating cohort in 1978, for example, only class size in Grade 9 (the final grade of middle school) is available, while information on class size in earlier grades is not. A complete class-size history for middle school is observed for the 1980 cohort and onwards, and for the 1986 cohort and onwards we observe both complete primary and middle school class-size histories. Our analysis, however, retains all cohorts to preserve as much data as possible, and average class size refers to average *observed* class size. We extensively investigated the role of truncation, and the results from these robustness analyses show that truncation does not affect our results.^{2}

Since we only observe students as they graduate from middle school, we cannot be sure they have all experienced the entire class-size history of their graduating school. Some of the students move or change schools during their compulsory schooling years, and some students change schools between primary school and middle school. This means that while we can construct average class size for school s at level *k* = PS, MS, at the individual level we need to account for exposure. We define *s*(*i*) as the school from which student *i* graduates at the end of compulsory schooling. If a student moves to school *s*(*i*) during level *k*, then school exposure will be less than one, and failing to account for exposure would result in an overestimate of the first stage, and an underestimate of the effect of class size. We therefore account for within- and between-municipality compulsory-school changes at both the primary and middle school level when we construct our class-size measures.

More precisely, for an individual in a school cohort *s*(*i*) her average class size at level *k* = PS, MS is approximately $\overline{cs}_{s(i)}^{k}$ times the fraction of grades that student *i* attended in school *s*(*i*) at level *k*. We refer to this fraction as “exposure” and denote it with $e_{i}^{k}$. We can then write student *i*’s class size at level *k* as

$$cs_{i}^{k} = e_{i}^{k}\,\overline{cs}_{s(i)}^{k} + \left(1 - e_{i}^{k}\right)\neg\overline{cs}_{i}^{k} \tag{1}$$

where $\neg\overline{cs}_{i}^{k}$ is the average class size the student experienced elsewhere (than *s*(*i*)).^{3}

From the administrative registry we know the number of primary school grades and middle school grades a student experienced in the graduating municipality. For example, if a student moves to the municipality of her graduating school before the start of the fourth grade, we set $e_{i}^{PS} = 1/2$, and $e_{i}^{MS} = 1$. If the child moves at the start of eighth grade, then $e_{i}^{PS} = 0$ and $e_{i}^{MS} = 2/3$, etc. This approach gives average exposure rates of about 0.91 in primary school and 0.98 in middle school for combined school students. We also show in Online Appendix Figure W10 that our instruments are unrelated to exposure.
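The exposure calculation in these examples can be illustrated as follows. The helper names are hypothetical, and the sketch assumes that exposure is the share of a level's grades attended from the move onwards, and that individual class size is an exposure-weighted average of class size in the graduating school and class size elsewhere.

```python
PRIMARY = range(1, 7)   # Grades 1-6
MIDDLE = range(7, 10)   # Grades 7-9


def exposure(first_grade_attended: int, level_grades: range) -> float:
    """Share of a level's grades spent in the graduating municipality."""
    attended = [g for g in level_grades if g >= first_grade_attended]
    return len(attended) / len(level_grades)


def exposure_weighted_class_size(e: float, cs_school: float, cs_elsewhere: float) -> float:
    """Exposure-weighted average of class size in the graduating school and elsewhere."""
    return e * cs_school + (1 - e) * cs_elsewhere


# Student who arrives before the start of Grade 4.
e_ps_grade4 = exposure(4, PRIMARY)   # 3 of 6 primary grades
e_ms_grade4 = exposure(4, MIDDLE)    # all 3 middle school grades

# Student who arrives at the start of Grade 8.
e_ps_grade8 = exposure(8, PRIMARY)   # no primary grades
e_ms_grade8 = exposure(8, MIDDLE)    # 2 of 3 middle school grades

# Hypothetical class sizes: 24 in the graduating school, 20 elsewhere.
cs_example = exposure_weighted_class_size(e_ps_grade4, 24.0, 20.0)
```

In the data, class size elsewhere is not observed, which is why the orthogonality between the instrument and exposure matters for the estimation strategy.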

Finally, we adjust these exposure rates to take within-municipality moves into account. We show in Appendix A that within-municipality moves are 1.5 times more prevalent than between-municipality school moves. Our analyses in the Appendix show that this gives an additional adjustment factor of 0.88 in primary and 0.98 in middle school, and mobility adjusted exposure rates of 0.82 and 0.96. The external data sources that we analyzed suggest that this is a conservative adjustment, both in the sense that we take an upper bound on within-municipality moving rates and in the sense that we assume that every within-municipality move also involves a school move.^{4}

Let $d_{i}$ equal one if student *i* attended primary school at school *s*(*i*) and zero otherwise. The registries do not record which primary school people attended. This means that because we do not directly observe $d_{i}$ we do not observe $cs_{i}^{PS}$ either. However, many middle schools in Norway have an integrated primary school. Children that receive their primary schooling in such a combined school typically complete the compulsory schooling up to Grade 9 there. This means that for this population we also have exposure at the primary school level.

One complication is that larger combined middle schools often also take in pupils from other primary schools in the municipality (the regulatory maximum travel distance is higher for middle school students).^{5} Consequently, $d_{i} = 0$ for these incoming pupils. In our main analysis we will therefore base our estimates on the subpopulation of students who graduated from combined middle schools that do not take up students from other primary schools. We operationalize this by restricting our sample to students who attended combined schools where takeup, the difference between enrollment at the end of primary school and enrollment at the start of middle school in the following school year, is no more than one pupil:

$$\text{takeup}_{s} = \text{enroll}_{s}^{\text{MS start}} - \text{enroll}_{s}^{\text{PS end}} \le 1 \tag{2}$$

This ensures that $d_{i} = 1$ for nearly all individuals in our data and that we therefore correctly observe $cs_{i}^{PS}$.

In one analysis we use the broader combined school sample. Here we need to account for the fact that students who are part of the takeup were not exposed to primary school in their graduating middle school. However, at the school cohort level we can estimate the probability of attending the combined primary school, $\hat{p}_{s(i)}$, by the ratio of the number of students at the end of primary school to the number of students at the start of middle school, which we can use to adjust exposure as follows:

$$cs_{i}^{PS} = \hat{p}_{s(i)}\, e_{i}^{PS}\, \overline{cs}_{s(i)}^{PS} \tag{3}$$

which is our final class-size measure.

Table 1 reports the average class sizes for all students (All), for those who attended combined primary and middle schools (Combined), and our reference population of students who attended combined schools with takeup ≤ 1 in grade 7 (Baseline). Average class size in middle school is about 25 and slightly less, 23, in the combined schools, which reflects their smaller size. Average class size in primary school is about 21 in our baseline sample. Since there is no takeup in our baseline sample, class size is essentially the same in primary and middle school. We will therefore estimate the effect of average class size in compulsory schooling (Grades 1–9) for this population, while we will separate these effects using the full combined school population.

While class sizes are very similar across the three populations, schools are smaller in our baseline sample of combined schools with no takeup. Average enrollment at the grade level is about 35 pupils, which means that they typically will have one to two classes. Schools in the expanded sample of all combined schools, now including those that take up students from other primary schools, are somewhat larger with about 50 pupils at the grade level and on average two classes. Finally, average enrollment at the grade level in the total population is about 90, or three to four classes, which highlights that combined schools tend to be smaller than schools that offer only middle school.

Table 1 also reports descriptive statistics on the background characteristics of the people in our data, as well as the outcomes we consider: educational attainment and labor earnings, both logarithmic and normalized relative to the total sample average. The main thing to note here is the similarity, both in terms of background and outcomes, between those who attended combined schools and the population at large. Parents have somewhat less schooling in the combined and baseline sample, but overall the differences are minor.

### B. Maximum Class-Size Rules

Until 2003 all schools were subject to maximum-class-size rules. In middle schools the maximum class size was 30, whereas in primary schools it was 30 up to the year 1985 when it was lowered to 28.^{6} Figure 1 illustrates the maximum-class-size rule. The left panel plots average class size in compulsory schooling against enrollment at the start of school (or first observed enrollment if first grade enrollment is missing). The solid line is predicted class size given enrollment. It has the familiar saw-tooth shape, with predicted class size increasing until a new classroom is opened after each multiple of the maximum class size when class size drops discontinuously. As can be seen from the figure, actual class size follows the pattern of predicted class size with discontinuous drops at multiples of 28. The right panel plots average class size in middle school against enrollment at the start of middle school.
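The saw-tooth pattern of predicted class size can be reproduced with a short function. This is a sketch in the spirit of Angrist and Lavy (1999), assuming that schools open the minimum number of classes allowed by the rule and split pupils evenly; the function name is hypothetical.

```python
def predicted_class_size(enrollment: int, max_size: int) -> float:
    """Predicted class size when a new class opens after each multiple of max_size."""
    if enrollment < 1:
        raise ValueError("enrollment must be at least 1")
    n_classes = (enrollment - 1) // max_size + 1
    return enrollment / n_classes


# Primary school rule of 28: one class of 28, then two classes once enrollment reaches 29.
at_cutoff = predicted_class_size(28, 28)
just_above = predicted_class_size(29, 28)

# Middle school rule of 30.
ms_at_cutoff = predicted_class_size(30, 30)
ms_just_above = predicted_class_size(31, 30)
```

Predicted class size rises one-for-one with enrollment up to the cap and then roughly halves at the first cutoff, with smaller drops at later multiples.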

Figure 2 shows the distribution of class size in primary and middle school. Conforming to the maximum-class-size rule, hardly any classes are observed beyond 28 in primary school and 30 in middle school. Modal class size is 22 in primary school and 26 in middle school, but we also observe many smaller classes because quite a few schools in Norway are small or of moderate size.

## III. Empirical Approach

The main challenge when estimating the causal relationship between class size and subsequent outcomes lies in addressing potential omitted-variable bias. Parents, but also teachers and/or schools, may make choices that result in a nonrandom allocation of students to classrooms in such a way that class size correlates with unobserved determinants of performance. In this case estimation approaches that rely on unconfoundedness assumptions, such as ordinary least squares (OLS), will be inconsistent.

Because of this concern our analysis exploits quasi-random variation in class size generated by the maximum-class-size rules discussed above, following the seminal paper by Angrist and Lavy (1999). The idea is that while class size discontinuously drops at the multiples of the maximum class size, nothing else will change as long as people’s position relative to these cutoffs is as if random. We exploit this so-called fuzzy regression discontinuity design using an instrumental variable (IV) strategy.

### A. Instrumental Variable Estimation of Class-Size Effects

The basic setup is as follows. Consider the following outcome equation

$$y_{i} = \beta_{0} + \beta_{1}\, cs_{i}^{*} + f(\text{enroll}_{i}) + u_{i} \tag{4}$$

where outcome *y* (education or earnings) depends on average class size (*cs**), and a function *f*(·) of enrollment. We abstract here from control variables for clarity but discuss our controls below. Class size can potentially correlate with unobservables, *E*[*u*|*cs*, enroll] ≠ 0, in which case OLS estimation of Equation 4 gives an inconsistent estimate of the class-size effect β_{1}. We tackle this by instrumenting class size. Let the corresponding first stage be

$$cs_{i}^{*} = \pi_{0} + \pi_{1}\, z_{i} + g(\text{enroll}_{i}) + e_{i} \tag{5}$$

where the instrument *z* satisfies *e, u* ⊥ *z* | enroll, and *π*_{1} ≠ 0. Here we also control for enrollment through *g*(·). Instrumenting *cs** with *z* in Equation 4 allows us to consistently estimate β_{1}.
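A stylized numerical illustration of why instrumenting helps is given below. It is not the paper's estimator: it abstracts from the enrollment controls, uses a single binary instrument, and relies on four made-up observations in which an unobserved confounder raises both class size and the outcome but is balanced across values of the instrument. All names and numbers are hypothetical.

```python
from statistics import mean

BETA = -0.1  # true class-size effect in this toy example


def ols_slope(y, x):
    """Bivariate OLS slope: cov(x, y) / var(x)."""
    xbar, ybar = mean(x), mean(y)
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den


def wald_iv(y, x, z):
    """IV with one binary instrument: reduced form divided by first stage."""
    y1 = mean(yi for yi, zi in zip(y, z) if zi == 1)
    y0 = mean(yi for yi, zi in zip(y, z) if zi == 0)
    x1 = mean(xi for xi, zi in zip(x, z) if zi == 1)
    x0 = mean(xi for xi, zi in zip(x, z) if zi == 0)
    return (y1 - y0) / (x1 - x0)


z = [0, 0, 1, 1]        # 1 = enrollment above the cutoff
a = [-1, 1, -1, 1]      # unobserved confounder, balanced within z
cs = [25 - 8 * zi + 2 * ai for zi, ai in zip(z, a)]
y = [10 + BETA * ci + 3 * ai for ci, ai in zip(cs, a)]

ols = ols_slope(y, cs)    # biased upwards by the confounder
iv = wald_iv(y, cs, z)    # recovers BETA because z is unrelated to a
```

In this example OLS gives a slope with the wrong sign, while the Wald ratio equals the true coefficient, because the instrument shifts class size without shifting the confounder.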

We follow Fredriksson, Öckert, and Oosterbeek (2013) and assign individuals to segments of enrollment with a window of ±15 pupils around each discontinuity. For schools at the first discontinuity we therefore take everybody with school enrollment between 16 and 45, for the second segment everybody with enrollment between 46 and 75, etc. Conditional on segment we then instrument class size with an indicator variable that is one above the segment’s discontinuity and zero below. The first-stage coefficient is allowed to vary across segments, the period during which the maximum-class-size rule was 30, the transition period, and the final period when the maximum-class-size rule was 28.
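The segment assignment and the instrument can be sketched as follows for the period with a maximum class size of 30. The function names are hypothetical, and the sketch assumes that the windows tile enrollment exactly (which holds when the window half-width is half the maximum class size); how windows are defined under the rule of 28 is not spelled out in the text and is not covered here.

```python
def assign_segment(enrollment: int, max_size: int = 30, half_width: int = 15):
    """Index of the enrollment segment (1, 2, ...), or None below the first window.

    With max_size=30 and half_width=15 the windows are 16-45, 46-75, and so on.
    """
    first_lower = max_size - half_width + 1
    if enrollment < first_lower:
        return None
    return (enrollment - first_lower) // max_size + 1


def above_cutoff(enrollment: int, max_size: int = 30, half_width: int = 15):
    """Instrument: 1 if enrollment exceeds the segment's discontinuity, else 0."""
    segment = assign_segment(enrollment, max_size, half_width)
    if segment is None:
        return None
    return int(enrollment > segment * max_size)


segments = [assign_segment(n) for n in (16, 45, 46, 75)]
instrument = [above_cutoff(n) for n in (30, 31, 60, 61)]
```

Within each segment the indicator switches on exactly where the rule forces an additional class to open.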

The baseline specification for *f*(·) and *g*(·) consists of segment dummies and a segment-specific linear spline in enrollment with the kink at each discontinuity. We furthermore allow these controls to vary across the periods when maximum class size was different in primary schooling and an intermediate transition period. These segment dummies and enrollment splines take the possibly confounding effects of the running variable, enrollment at the start of school (enroll), into account. In robustness checks we estimate more flexible specifications and show that this does not matter for our results. We further control for gender, month of birth, immigrant background, mother’s and father’s years of schooling, age at school start, and cohort dummies. All reported standard errors are clustered at the school level. In Tables W10–W12 in the Online Appendix we show that clustering at enrollment (see Lee and Card 2008) makes hardly any difference for the estimated standard errors and sometimes even decreases them.

### B. Separating Class Size in Grades 1–6 and Grades 7–9

Although we start out by estimating the effect of average class size in compulsory schooling in our baseline sample, it is also interesting to separately estimate the effect of class size in primary and middle school. In the broader population that attended any combined school there will be schools that take in children from other primary schools. This breaks the near perfect correlation between enrollment in primary and middle school grades. We can therefore use these schools to separately estimate the effect of class size in primary and middle school. To do so we estimate the following equation with two-stage least squares (2SLS):

$$y_{i} = \delta_{0} + \delta_{PS}\, cs_{i}^{PS} + \delta_{MS}\, cs_{i}^{MS} + f^{PS}(\text{enroll}_{i}^{PS}) + f^{MS}(\text{enroll}_{i}^{MS}) + \mathbf{x}_{i}'\gamma + u_{i} \tag{6}$$

The effects of interest are δ_{PS} and δ_{MS}, the coefficients on class size in primary school, $cs_{i}^{PS}$, and class size in middle school, $cs_{i}^{MS}$. All specifications control for segment dummies and segment splines, separately for primary and middle school. Control variables **x** include as above gender, month-of-birth, immigrant background, mother’s and father’s years of schooling, age at school start, cohort dummies, and time dummies (if necessary).

With two endogenous variables we need two first stages: one for average class size in primary school and one for middle school. The first stages are as follows:

$$cs_{i}^{PS} = \pi_{0}^{PS} + \pi_{1}^{PS}\,\text{above}_{i}^{PS} + \pi_{2}^{PS}\,\text{above}_{i}^{MS} + g^{PS}(\text{enroll}_{i}^{PS}, \text{enroll}_{i}^{MS}) + \mathbf{x}_{i}'\lambda^{PS} + e_{i}^{PS}$$

and

$$cs_{i}^{MS} = \pi_{0}^{MS} + \pi_{1}^{MS}\,\text{above}_{i}^{PS} + \pi_{2}^{MS}\,\text{above}_{i}^{MS} + g^{MS}(\text{enroll}_{i}^{PS}, \text{enroll}_{i}^{MS}) + \mathbf{x}_{i}'\lambda^{MS} + e_{i}^{MS}$$

where the instruments above^{PS} and above^{MS} are indicator variables for being above the predicted class-size discontinuity within a segment based on enrollment in primary and middle school. All standard errors are again clustered at the school level.

Finally, note that we estimate the effect of class size in primary (middle) school while keeping predicted class size in middle (primary) school constant. Primary class-size effect estimates in most existing studies may pick up correlated class-size effects in middle school or vice versa.

### C. Measurement Error in Class Size

It is well known that IV will consistently estimate β_{1} with classical measurement error in *cs**. As explained in Section II, while measurement error in class size due to school takeup is not a concern in our baseline sample, primary school attendance is not observed for some proportion of the students in the broader combined school sample and the population as a whole. We therefore correct for primary school exposure as in Equation 3. To provide a heuristic motivation for this correction, consider the basic case where we observe average class size in compulsory school *cs* for students who attended primary school in their combined school (*d* = 1), while we do not observe class size ¬*cs* for students who attended primary school elsewhere (*d* = 0). Actual class size is then

$$cs^{*} = d \cdot cs + (1 - d) \cdot \neg cs$$

Substituting this in Equation 4 gives

$$y = \beta_{0} + \beta_{1}\, d \cdot cs + \beta_{1}\,(1 - d) \cdot \neg cs + f(\text{enroll}) + u \tag{7}$$

and there are now two first stages

$$d \cdot cs = \pi_{0} + \pi_{1}\, z + g(\text{enroll}) + e \qquad \text{and} \qquad (1 - d) \cdot \neg cs = \psi_{0} + \psi_{1}\, z + h(\text{enroll}) + v$$

which then gives the following reduced form equation

$$y = \tilde{\beta}_{0} + \beta_{1}\,(\pi_{1} + \psi_{1})\, z + \tilde{f}(\text{enroll}) + \tilde{u} \tag{8}$$

Assuming that *z* ⊥ *d*, ¬*cs*, which implies that ψ_{1} = 0, our reduced form estimate gives $\beta_{1}\pi_{1}$.

If we could estimate the first stage of *cs**, it would give $\pi_{1}$, and IV would consistently estimate β_{1}, but since we only observe *cs* and not *cs** we cannot estimate this first stage. However, the takeup condition (Equation 2) in our baseline sample ensures that Pr(*d* = 1) ≈ 1, so that in our baseline sample the first stage of *cs* gives an estimate of π_{1}, and IV recovers β_{1}. In our broader sample this is not the case, and we need to address this. Note, however, that for each cohort we can estimate *p* = Pr(*d* = 1) by the ratio of the number of students at the end of primary school four years earlier to the number of students who graduated from middle school. If we call this estimate $\hat{p}$, we can use it to correct *cs* for exposure in the same vein as above, and construct an estimate of *cs**:

$$\widehat{cs}^{*} = \hat{p} \cdot cs$$

If we now compute the first stage

$$\hat{p} \cdot cs = \pi_{0} + \pi_{1}\, z + g(\text{enroll}) + e$$

we get the correct value. Note that we could have specified

$$\widehat{cs}^{*} = \hat{p} \cdot cs + (1 - \hat{p}) \cdot \neg cs$$

but if ψ_{1} = 0, then the second term on the right-hand side cancels out in the first stage. This is confirmed by a robustness check where we estimate ¬*cs* with the average class size in the other primary schools in the municipality. The intuition of the above applies to our more general exposure correction in Equation 3.
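The logic of the exposure correction can be checked with simple arithmetic. The numbers below are hypothetical, and the sketch assumes that the instrument is unrelated both to whether a student attended the combined primary school and to class size elsewhere, so that the instrument moves actual class size only through the share *p* of students who were in the combined school.

```python
def iv_estimate(reduced_form: float, first_stage: float) -> float:
    """IV estimate as the ratio of reduced-form to first-stage coefficients."""
    return reduced_form / first_stage


beta1 = -0.1      # true effect of actual class size (hypothetical)
pi1_actual = -6.0  # effect of the instrument on actual class size (hypothetical)
p = 0.75          # share of graduates who attended the combined primary school

# The outcome responds to the instrument only through actual class size.
reduced_form = beta1 * pi1_actual

# Observed school-level class size moves more than actual class size,
# because only a share p of students experienced it.
first_stage_observed = pi1_actual / p

naive = iv_estimate(reduced_form, first_stage_observed)           # equals beta1 * p
corrected = iv_estimate(reduced_form, p * first_stage_observed)   # equals beta1
```

Without the correction the first stage is overstated and the class-size effect is attenuated towards zero by the factor *p*; scaling observed class size by *p* removes the attenuation.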

We check whether takeup rates and outside class size are indeed orthogonal to our instrument. Figures 3 and 4 show no signs of correlation between the maximum-class-size rule and the share of students from other schools or the class size of other primary schools in the municipality, ¬*cs*.

While the above implies that our IV approach addresses the measurement error in class size, we observe class size for our sample of students that attended Grades 1–6 and Grades 7–9 in the same school. We therefore start our analysis on this baseline sample. This also provides a check in the sense that the estimated class-size effect on this sample should be very similar to the estimated effect on the extended sample of everyone who attended a combined school where class size is for some measured with error.

### D. Instrument Validity

One concern in regression discontinuity designs is that agents may place themselves systematically on the left or right side of the discontinuities. Urquiola and Verhoogen (2009) found a particularly stark example of this in Chile. As a first check, Figure 5 shows histograms of enrollment, normalized to zero at each segment’s discontinuity and pooled across segments. This allows for a visual inspection of whether bunching takes place. The figure also plots average class size as a function of normalized enrollment. This is done separately for the average compulsory school class size in our baseline sample (the leftmost graph at the top), those graduating from any combined school (the rightmost graph at the top and leftmost at the bottom), and for all individuals (the rightmost graph at the bottom).

The top left panel in Figure 5 shows normalized primary school enrollment for our baseline sample. Because schools in this population are relatively small, the data consist mostly of enrollment around the first two discontinuities. The distribution is relatively smooth and downward sloping, but there appears to be some indication of bunching right after the discontinuity, which is indicated by the solid vertical line.

We performed more in-depth analyses and formal tests of bunching (see the Online Appendix). We investigate bunching separately for the years up to 1987 when the maximum-class-size rule was 30 in Figure W1, and for 1988 onwards when it was 28 in Figure W2. We formally test for bunching following Cattaneo, Jansson, and Ma (2017, 2018), who propose a test that is robust to the bandwidth selection issues with discrete running variables that are common in the popular McCrary density test described in McCrary (2008). While there appear to be visible signs of bunching around some of the discontinuities in the first period (up to 1988), we fail to reject the null hypothesis of no bunching. In the second period (from 1988 onwards) we do not see any bunching, nor do the tests detect any.

While the bunching is relatively minor and only observed in the first period, we investigate manipulation by the school administrator and families.

#### 1. Manipulation by the school administrator

One potential source of bunching may arise from strategic behavior on the part of school administrators. Angrist et al. (2017) point out that school administrators in Israel have considerable discretion as to whether or not to approve applications from families who want to advance or delay the school starting age of their child. If school administrators use this discretion strategically, they are able to manipulate the maximum-class-size running variable to minimize the number of classes. This sort of manipulation can be a threat to the validity of the regression discontinuity design. To address this issue Angrist et al. (2017) use a donut strategy that excludes observations of enrollment close to the discontinuity.

In Norway the school starting age is not often postponed or advanced, but if this decision is taken, it is by the municipality and not the school administration, which probably reduces the likelihood of the type of strategic behavior observed in the Israeli case. However, manipulation of this type can potentially be difficult to uncover with standard balancing tests. A discontinuity at the maximum-class-size thresholds in the share of pupils that start compulsory school at the stipulated time would provide a clear indication that administrators are manipulating the running variable. Figures W3 and W4 show how the average probability of finishing at the stipulated time changes with primary school enrollment. We find no indications of manipulation of the school enrollment by discretionary enforcement of the school-starting-age rule.

#### 2. Manipulation by the family

Another possible form of running variable manipulation arises from families moving strategically. If well-informed families, at least partly, base their choice to move on the expected change in class size, this could potentially threaten our identification strategy. Manipulation of this type should show up in balancing checks if high ability type families are more likely to move strategically.

Section 1.3 in the Online Appendix provides an extensive analysis of moving. First, we do not see any indication of strategic moves before school start and before transitioning to middle school in Figure W5. For our instrument to be valid, it should be unrelated to the moving decision. Figure W6 shows that moving rates are continuous around the discontinuities we are exploiting. We also test whether the instrument affects mobility using reduced form regressions reported in Tables W1 and W2 and find no such evidence. Strategic moving would also show up in within-school changes in enrollment over time. We investigate whether being above the threshold leads to outflow of students in Figures W7 and W8. Again, there is no sign that families move out of schools with large classes.

#### 3. Balancing tests

The reason for worrying about bunching is that it may lead to a violation of the exogeneity of the instrument. To address potential confounding because of the bunching we implement our IV using a so-called donut strategy, where we exclude from all analyses the observations where the distance to the discontinuity is no more than one.^{7} This strengthens our first stage and addresses potential imbalance around the cutoff. The donut is indicated by the two vertical dashed lines in Figure 5. The middle and bottom panels show qualitatively the same picture for normalized middle school enrollment in the larger population that attended any combined school (middle panel) and for all individuals (bottom panel). The histograms are flatter because schools in these samples are larger.
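The donut restriction can be illustrated with a minimal sketch. It assumes that cutoffs sit at multiples of the maximum class size (30 by default, the cap mentioned in the text) and that "distance" is measured in pupils from the nearest such multiple. Both choices, along with the function names and the `enrollment` key, are illustrative assumptions and not the exact normalization used in the paper.

```python
def nearest_cutoff_distance(enrollment: int, max_class_size: int = 30) -> int:
    """Signed distance (in pupils) from enrollment to the nearest cutoff.

    Assumption: cutoffs are k * max_class_size for k = 1, 2, ...
    A value of 0 means the grade is exactly full; +1 means one pupil over.
    """
    # Integer rounding to the nearest multiple, with at least one cutoff.
    k = max(1, (enrollment + max_class_size // 2) // max_class_size)
    return enrollment - k * max_class_size


def in_donut(enrollment: int, max_class_size: int = 30, width: int = 1) -> bool:
    """True if enrollment lies within `width` pupils of a cutoff."""
    return abs(nearest_cutoff_distance(enrollment, max_class_size)) <= width


def apply_donut(rows, max_class_size: int = 30, width: int = 1):
    """Drop observations inside the donut; `rows` are dicts with an
    'enrollment' key (a hypothetical data layout)."""
    return [r for r in rows if not in_donut(r["enrollment"], max_class_size, width)]


if __name__ == "__main__":
    sample = [{"enrollment": e} for e in range(27, 35)]
    print([r["enrollment"] for r in apply_donut(sample)])
```

Widening `width` mimics the robustness checks that vary the donut size, at the cost of discarding more observations near the thresholds.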

A direct check of a violation of instrument exogeneity is to see whether the instrument correlates with predetermined variables that are strong predictors of the outcome variables under consideration. From the registry data we can link pupils’ background characteristics. We have information on the age and education for both parents and pupils’ gender, immigrant status, and year and month of birth. The first four columns of Table 2 regress the main outcome variables on these predetermined background characteristics in the population. The regressions further control for time and cohort dummies and include the reference specification for *f*(enroll_{i}). The table shows that the background characteristics are economically important predictors of the outcomes and are statistically significant; thus, we strongly reject the null hypothesis that these predetermined characteristics do not matter.

The last four columns of Table 2 replace the outcomes with our instruments for primary and middle school class size for everyone, those who attended a combined school, and our baseline population. If there is no confounding bunching or selection around the discontinuities, our instruments should be orthogonal to the pupils’ predetermined background characteristics. This is indeed what we observe: most of the regressors are not significantly different from zero, and the *p*-values on the joint tests that the coefficients are equal to zero are all highly insignificant.^{8} Moreover, all the coefficients are very small. This is also the case for the significant effect of mother’s years of schooling on the primary-school threshold.^{9} Finally, the exogeneity of our threshold crossing instruments is also confirmed by specification checks that show that our point estimates are insensitive to the inclusion of these background variables.

### IV. Results

#### A. The Average Effect of Class Size in Compulsory School

Before discussing the class-size effect estimates, first consider the first-stage estimates reported in Panel A of Table 3. Although our 2SLS estimates are based on first stages where the first-stage coefficient can vary across segments and time periods, for ease of interpretation, Table 3 reports the average first-stage coefficient. The instrument, being above the maximum-class-size threshold in a segment, is a strong predictor of class size in compulsory school. Crossing the discontinuity reduces average class size by more than seven pupils, and this large drop is highly significant, with an *F*-statistic above 240.
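To see why threshold crossing is such a strong predictor, it helps to write down the class size that a maximum-class-size rule implies mechanically. The sketch below uses the standard formulation of such rules (enrollment split evenly over the smallest number of classes that respects the cap). It is an illustration under that assumption, with a cap of 30 or 28 as mentioned in the text, and it does not reproduce the estimated first stage in Table 3, which pools segments and time periods and reflects actual rather than predicted class size.

```python
def predicted_class_size(enrollment: int, max_class_size: int = 30) -> float:
    """Average class size if enrollment is split evenly over the fewest
    classes allowed by the cap (assumed textbook form of the rule)."""
    if enrollment < 1:
        raise ValueError("enrollment must be at least 1")
    n_classes = -(-enrollment // max_class_size)  # ceiling division
    return enrollment / n_classes


def drop_at_cutoff(k: int, max_class_size: int = 30) -> float:
    """Mechanical fall in predicted class size when enrollment moves
    from k * cap to k * cap + 1 pupils."""
    full = k * max_class_size
    return predicted_class_size(full, max_class_size) - predicted_class_size(
        full + 1, max_class_size
    )


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(drop_at_cutoff(k), 2))
```

The rule-implied drop shrinks at higher cutoffs, which is one reason a pooled first-stage coefficient is an average over quite different jumps.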

Panel B reports the corresponding pooled reduced-form estimates for average schooling and earnings between age 27 and 42. We see no evidence that being above the threshold has any effect on average completed years of schooling, nor on average earnings.

Panel C reports the corresponding 2SLS class-size estimates. Although these are based on first stages where the first-stage coefficient can vary across segments and time periods, they are close to the IV estimate implied by the pooled first-stage and reduced-form estimates in Table 3. The first column reports the effect of class size on education. There is no evidence that class size in compulsory school matters for educational attainment. The point estimate implies that a one-pupil increase throughout compulsory school reduces average educational attainment by less than 1/100th of a year and is not statistically significant. The second column in Panel C reports the estimate for log-earnings. Consistent with the finding for education, there is no evidence that class size affects log-earnings. Even though the point estimate is statistically insignificant, it is precise enough to reject, at the 95 percent confidence level, negative effects of a one-pupil increase of class size throughout compulsory school as small as −0.0031. To investigate whether labor supply at the extensive margin matters, the third and final column reports the class-size effect for average earnings (in levels) normalized by the population average. This variable is set to zero for individuals without income from work, and we therefore do not exclude any individual from the estimation sample for this reason. It turns out that unemployment and labor force participation are not affected. The point estimate for earnings in levels is nearly identical to that for log-earnings, estimated with similar precision, and we thus find no evidence that class size in compulsory schooling matters, on average, for earnings between age 27 and 42.
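Two pieces of arithmetic underlie this paragraph and can be sketched briefly. The first is the ratio of a pooled reduced form to a pooled first stage (the Wald estimate), which is the "implied" IV estimate referred to above. The second is a back-of-the-envelope standard error implied by a point estimate and the bound that can be rejected, assuming the bound is the lower end of a symmetric normal-approximation 95 percent confidence interval. The function names are hypothetical, the Wald inputs in the usage example are placeholders and not values from Table 3, and the implied standard error is not a number reported in the paper.

```python
def wald_iv(reduced_form: float, first_stage: float) -> float:
    """IV estimate implied by a single instrument: reduced form / first stage."""
    if first_stage == 0:
        raise ValueError("first stage must be nonzero")
    return reduced_form / first_stage


def implied_standard_error(point_estimate: float, lower_bound: float,
                           z: float = 1.96) -> float:
    """Standard error implied by a symmetric interval's lower bound.

    Assumption: lower_bound = point_estimate - z * se.
    """
    return (point_estimate - lower_bound) / z


if __name__ == "__main__":
    # Placeholder inputs, for illustration only.
    print(wald_iv(reduced_form=-0.07, first_stage=-7.0))
    # Log-earnings: point estimate 0.0004 and rejected bound -0.0031 (from the text).
    print(implied_standard_error(0.0004, -0.0031))
```

Under these assumptions, the log-earnings estimate would have a standard error of roughly 0.0018 per pupil, which gives a sense of how tightly the effect is bounded around zero.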

Tables A3–A5 in the Appendix investigate the sensitivity of the estimates to the specification of the enrollment controls in the reference 2SLS estimation. Columns 1–6 in these tables control for segment dummies, and what varies is how enrollment is taken into account. The first column in these tables controls linearly for enrollment, and the second column adds a quadratic term. The third and fourth columns repeat these specifications but introduce interactions with segments. The fifth column allows for segment-specific linear splines, which is the reference specification used in the paper. The sixth column relaxes this specification even further by allowing for segment-specific quadratic splines. The seventh column drops the donut around the discontinuities. The eighth column limits the sample to include only schools with enrollment no more than five pupils away from the thresholds, and in the ninth column no more than ten pupils away. The final column shows the estimates without individual and family controls.^{10,11}

Specification 6 shows that the flexible specification in enrollment starts to capture part of the variation generated by the discontinuities, and we see a sharp drop in the first-stage *F*-statistics from about 400 to 40. The specification in Column 7 shows that the fuzziness inside the donut weakens the first stage.

The reduced-form and 2SLS results for earnings and log-earnings are very consistent across specifications, and only for schooling do the estimates change somewhat. The reduced-form schooling effects are statistically significant when the controls for enrollment do not allow for segment interactions, when the enrollment controls start to absorb the first stage, and when individual controls or the donut are omitted. There is no systematic pattern in terms of bias, and the main takeaway is that the statistical significance of the schooling estimates is fragile. In all cases the economic significance of these schooling effects is very small, just like the effects for earnings, and we therefore interpret the analyses in Tables A3–A5 as highlighting the robustness of our conclusions.

#### B. The Effect of Class Size by Age

After graduating from middle school, students usually enroll in an upper secondary school.^{12} If small classes increase the probability of a person having more education, they should also decrease income and labor market participation while being educated, and later in life, these students could be expected to collect higher earnings than their less-educated peers. The outcomes analyzed in the previous subsection are the average of outcomes measured when the individuals were 27–42 years of age. However, we can also estimate the age-by-age impact of class size. Examining the age profiles of class-size impacts on different long-term outcomes for a wider age range allows us to see whether dynamic adjustments are important. Figure 6 shows the average impact of a one-pupil increase in compulsory school class size for years of schooling and log-earnings separately for each age between 18 and 48. Since the sample size decreases as age increases, we typically lose some precision at higher ages.^{13}

The left panel of Figure 6 shows that the class-size impact on years of schooling is almost always small and negative and never significantly different from zero. The effect starts out at zero at the age of 18, which can be interpreted as meaning that class size does not impact the probability of graduating from high school at nominal duration. The class-size effect gradually grows more negative until the mid-thirties, before steadily shrinking towards zero as age increases. The effect size corresponds to the average estimate of −0.008 over ages 27–42 presented in Table 3. It should be kept in mind that the effect sizes are very small at all ages, and that the 95 percent confidence interval excludes effects more negative than −0.035 years of schooling (about 13 days).

The right panel of Figure 6 shows that the effect of class size on log-earnings is slightly positive when people are in their early twenties, before falling to zero when approaching thirty. From then onwards the effect size is close to zero and statistically insignificant. Notice that the class-size impact between the age of 27 and 42 is in line with the 0.0004 average impact size reported in Table 3.

While age profiles are of interest if we want to comment on the dynamic effects class size has on labor market behavior, for the most part there are few signs of the effects changing with age. The most striking conclusion we can draw from these figures is that they support the point estimates from Table 3 and that these estimates are stable across different ages.

#### C. The Average Effect of Class Size in Primary and Middle School

As explained in Section III.B above, many combined schools take in children from other primary schools, which breaks the near perfect correlation between (predicted) class size in primary and middle school. We use this feature of the broader population that attended a combined school to decompose the class-size effect estimate for compulsory schooling into that for primary school and that for middle school. Table 4 reports the results.

As before, our 2SLS estimates are based on first stages that interact PS and MS threshold crossing with segment dummies and regime dummies to take into account that maximum class size decreased from 30 to 28 in 1985. Also, for ease of interpretation, Panels A and B of Table 4 report the pooled first-stage coefficients. As expected, the primary-school threshold crossing indicator loads on class size in primary school, while the middle-school threshold crossing indicator predicts middle school class size best. This also shows that there is sufficient independent variation in our data to separate the effects of class size in primary and middle school.

For completeness, Panel C reports the (pooled) reduced form effects, while Panel D of Table 4 reports the 2SLS class-size effect estimates for the long-term outcomes that we consider: education, log(earnings), and average earnings. The first row reports class-size effects for the primary school grades (1–6). In the first column we see a small class-size effect estimate for years of schooling, which is also highly insignificant. In the second column we report the estimated class-size effect in primary school for log-earnings, which is also small and very similar to the average point estimate for average class size in compulsory schooling (Grades 1–9) in our baseline sample reported in Table 3. The third column reports the primary school class-size effect estimate for earnings, which includes zeros for those without work or outside the labor force. As above, this estimate is very close to that for log-earnings. The final row of Table 4 reports the estimates of the long-run impact of class size in the middle school grades (7–9). We again find no evidence for any class-size effects for years of schooling, log-earnings, or earnings.^{14}

To summarize, we find small effect estimates on long-run schooling and earnings for class size in compulsory schooling. Decomposing these effects into class-size impacts in primary and middle school gives even smaller and more precise point estimates, and we can reject beneficial effects on log-earnings of a one-student decrease in class size as small as 0.12 percent in primary school and 0.15 percent in middle school.

### V. Heterogeneity and Interpretation of the Results

Our results contrast with the evidence from Sweden (Fredriksson, Öckert, and Oosterbeek 2013, 2016) that shows large negative and statistically significant effects from average class size in primary schools (Grades 4–6) on earnings and, in some specifications, also for years of schooling. In their preferred specification (Fredriksson, Öckert, and Oosterbeek 2016, Table A3) a one-pupil increase in class size in primary school reduces average earnings between age 27 and 47 by 1.5 percent, while our point estimate of the same effect is very close to zero, and we can reject effects as small as 0.082 percent. A natural question to ask is whether we can understand this difference in the class-size effects between Norway and Sweden.

One explanation lies with parent behavior. Fredriksson, Öckert, and Oosterbeek (2016) find that lowering class size improves achievement at age 13 for children from parents with low (below median) income, while children from high-income (above median) families do not appear to benefit from smaller classes. Fredriksson, Öckert, and Oosterbeek (2016) also show that high-income families help more with homework if their children are in larger classes, while low-income families do not appear to adjust their homework help to class size. This suggests that compensating behavior by parents may explain why there is no evidence for a class-size effect for children from high-income families in Sweden.

If Norwegian parents are like Swedish ones, we could expect similar differential class-size effects. To investigate this hypothesis, Table 5 reports effect heterogeneity by parental background characteristics that are strong predictors of children’s outcomes and are arguably correlated with parental inputs. The lack of evidence for class-size effects in the overall population suggests that Norwegian parents compensate more quickly; we therefore investigate heterogeneity further in the lower tail of the parental background distribution. Also, since the average effects are approximately zero, there is little scope for interaction effects of a significant magnitude across broad groups because a negative effect must be offset by a positive one.

In line with Fredriksson, Öckert, and Oosterbeek (2016), the first three panels of Table 5 report class-size effects for children from families with low- (first quartile) and high-income (fourth quartile) parents. The first two panels report the class-size effects separately by father’s (Panel A1) and mother’s (Panel A2) income, while the third panel (Panel A3) reports the same results for aggregate family income. The next panel (Panel B) shows class-size effects separately for children of families where none of the parents have more than compulsory schooling (about 18 percent of our population) and children from more highly educated backgrounds. Panel C shows how the effect of class size differs between immigrants (7 percent of our population) and nonimmigrants, while Panel D of Table 5 shows how the effect of class size differs between boys and girls.

As can be seen from Table 5, there is no evidence that smaller classes provide benefits in the long run for children from more disadvantaged backgrounds, whether it is by parental income, education, migrant background, or sex. Children of fourth-quartile-income fathers and children of fourth-quartile-income mothers are the only groups with significant class-size effects, but the effects point in opposite directions, and there are no significant class-size effects by family income. In fact, none of the other reported class-size effect estimates are statistically significant at the 10 percent level, and none of the differences across demographic groups are close to being significantly different from zero. The absence of class-size effects in Norway is consistent with all Norwegian parents systematically compensating for the class size of their children by increasing their own investments.

A second explanation for the contrast between Norway and Sweden may lie in differences in the school population. In contrast to Sweden, Norway has strong regional policies that aim to sustain the population of relatively rural and remote areas. Consequently, many schools in Norway are relatively small. Average enrollment in the Swedish sample is 63 pupils, compared to 35 pupils in our baseline sample. This means that in our sample most schools will be in the segment around the first discontinuity, while in Sweden most schools will be in the second segment. If school size mediates class-size effects, then this is another potential source of effect heterogeneity. To investigate this possibility, Table 6 reports class-size effect estimates separately by discontinuity.^{15} For years of schooling the effects of class size are indeed slightly larger and more negative around the second discontinuity, but neither effect is statistically or economically significant. For log-earnings we do not find any indication that an increase in class size reduces average earnings between age 27 and 42. Our point estimates for the effects on education and earnings around the second discontinuity are still nowhere close, in either size or statistical significance, to the effects reported in Fredriksson, Öckert, and Oosterbeek (2016).

Finally, a further difference in the population stems from the Swedish sample being restricted to school districts with only one school.^{16} Such schools are perhaps more shielded from competition, while competitive pressure could attenuate the effect of class size. To investigate whether this difference can explain the diverging results we need to approximate the Swedish setup in our sample. In Norway municipalities function as the de facto school districts. This means that the only way to approximate the Swedish sample is by focusing on single-school municipalities. One caveat here is that while in Sweden these single-school districts may lie in a large municipality, this is not the case in Norway.

Panel B of Table 6 presents the estimation results from the sample of one-school municipalities. None of the estimates provide evidence of class-size effects, suggesting that differences in competitive pressure do not explain the diverging results between Norway and Sweden. Unfortunately, we cannot cut our data in such a way as to both increase school size and retain single-school districts, which would be necessary to completely match the Swedish population.

### VI. Conclusion

This paper presents long-run impact estimates of average class size in compulsory school for Norway. Many Norwegian middle schools also have a primary school department, allowing us to separately estimate the effects of average class size in Grades 1–6 (primary school) and Grades 7–9 (middle school). Thanks to exhaustive administrative registries, we have information on earnings and schooling for cohorts graduating from compulsory schools from 1978 to 2001, allowing us to observe wages up to age 48 for the oldest cohort.

We do not find any evidence of substantive beneficial effects of class size in either primary school or middle school, and our most precise estimates reject effects on income as small as 0.26 percent for a one-person reduction in class size throughout compulsory schooling (a nine-year class-size reduction). Decomposing these effects into class-size impacts in primary and middle school gives even smaller and more precise point estimates, and we can reject beneficial effects on earnings of a one-student decrease in class size as small as 0.12 percent in primary school and 0.15 percent in middle school. This finding stands in contrast with the findings for Sweden of Fredriksson, Öckert, and Oosterbeek (2013, 2016), who find substantial impacts on long-run wages and educational attainment. Chetty et al. (2011) found statistically insignificant estimates for the United States for wages of 25–27-year-olds. Their 95 percent confidence interval for the effect of a one-pupil class-size increase is [−0.007, 0.005] and is consistent with both the findings for Sweden and the current estimates for Norway. We investigated whether the differences between Norway and Sweden may be driven by school size and, possibly, competitive pressure/choice, but we are unable to reconcile the differences between Norway and Sweden.

Our findings emphasize that the effect of class size is not a structural parameter and are in line with substantial heterogeneity in short-run effects across school institutions and populations documented elsewhere (Wößmann and West 2006). Little is known about when and why class size reduction is effective. Further opening up the black box of education production is necessary to make progress on these issues.

## Appendix

### A. Residential Moves and Class Size

Using our data, we can compute exposure at the individual level, taking moves across municipalities into account. Table A1 reports the share of the sample of pupils who move to the graduating municipality in primary school and middle school, respectively, and the share of the stipulated time typically spent at that school. The shares of movers are similar in the baseline sample and the combined schools sample. About 9 percent of the students arrive at the graduating combined school during primary school, and about 3 percent of the students arrive at the combined school during middle school. On average, students who move during primary school and those who move during middle school both arrive about halfway into the track. Combining this gives average primary school exposure rates, adjusted for between-municipality moves, of 0.96 at the end of primary school and 0.93 at the end of middle school, and a 0.98 exposure rate to middle school.
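The exposure rates in this paragraph can be roughly reproduced with simple arithmetic. The sketch below is a reading of the text, not the authors' exact calculation. It assumes that movers miss on average half of the track they arrive in, and that pupils who arrive during middle school have no exposure to the graduating school's primary department. The input shares are the approximate 9 and 3 percent quoted above, and the function name is hypothetical.

```python
def exposure_after_moves(share_arriving: float,
                         avg_fraction_missed: float = 0.5) -> float:
    """Average exposure to a track when `share_arriving` of pupils arrive
    partway through and miss `avg_fraction_missed` of it on average."""
    return 1.0 - share_arriving * avg_fraction_missed


SHARE_ARRIVING_PRIMARY = 0.09  # approximate share from the text
SHARE_ARRIVING_MIDDLE = 0.03   # approximate share from the text

# Primary school exposure measured at the end of primary school.
primary_at_end_of_primary = exposure_after_moves(SHARE_ARRIVING_PRIMARY)

# Assumption: middle-school arrivals contribute zero primary exposure.
primary_at_end_of_middle = primary_at_end_of_primary - SHARE_ARRIVING_MIDDLE

# Middle school exposure.
middle_exposure = exposure_after_moves(SHARE_ARRIVING_MIDDLE)

if __name__ == "__main__":
    print(primary_at_end_of_primary, primary_at_end_of_middle, middle_exposure)
```

With these inputs the three values come out close to 0.955, 0.925, and 0.985, in line with the reported 0.96, 0.93, and 0.98 once the rounding of the underlying shares is taken into account.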

While in our data we observe between-municipality movers, we cannot identify students who change schools within a municipality. The only information we have on within-municipality mobility comes from recent national standardized testing data. The first of these tests was introduced in 2004, and tests these days are administered at the start of fourth, seventh, and eighth grade in compulsory school. The tests track students from the start of fourth grade to the end of sixth grade and from the start of seventh grade to the start of eighth grade and have information about the primary school of the students. The seventh-grade tests were first conducted in the school year 2007–2008, which means we only have information about school moves for the 2012–2013 graduating cohorts and onward.

The period of measurement in primary school is from the start of fourth grade to the end of sixth grade. This allows us to determine the number of within-municipality school changes relative to between-municipality school changes. Table A2 shows the share of within-municipality school changes relative to between-municipality school changes. If the share is larger than one, more students change school within municipalities than between municipalities. The baseline sample schools are typically located in smaller municipalities with fewer schools to move between, which is why it is natural to observe relatively more between-municipality school changes in this sample.

Denote the between-municipality move adjusted exposure measure reported in the final row of Table A1 by . We want to correct this with a factor such that the adjusted exposure is in expectation equal to the exposure that takes both between and within-municipality school moves into account :

This implies the following correction factor

We observe the denominator in our data, and the numerator can be written as

Furthermore

Combining these expressions gives

Table A1 shows that between-municipality movers are relatively uniformly distributed across grades and that their average exposure is about 0.5. Assuming that this is also the case for within-municipality movers implies that . From our data we observe also , and Table A2 gives us estimates for α. We are conservative and assume that within-municipality moves are 1.5 times more prevalent than between-municipality school moves; that is, α = 1.5. The primary-school exposure rate and correction factor are at the end of primary school. In our analysis we therefore need to make a final correction for not moving during middle school (that is, multiply with ). Combining this information gives the adjustment factors (Table A3).

## Footnotes

A first version was presented at the Educational Governance Workshop in Trondheim 25–26 April, 2013. The authors thank Hessel Oosterbeek, Björn Öckert, and three anonymous referees for feedback that improved the paper. All remaining errors are those of the authors. The data employed in the analysis are drawn from Norwegian administrative registers. Statistics Norway provides microdata for research projects. More information is available at: https://www.ssb.no/en/omssb/tjenester-og-verktoy/data-til-forskning. Inquiries about access to data from Statistics Norway should be addressed to mikrodata{at}ssb.no, and the authors are willing to assist.

Supplementary materials are freely available online at: http://uwpress.wisc.edu/journals/journals/jhr-supplementary.html

1. As of 1997, children start school the year they turn six, and compulsory schooling lasts ten years. The cohorts used in the analysis were not affected by this reform.

2. Truncation is mainly an issue for primary school class size. While in our baseline sample we observe complete middle school class-size histories for basically all pupils, primary class-size histories are truncated to some degree for about one-third of the sample (see Table W3 in the Online Appendix). While this truncation is typically relatively minor, we have investigated the sensitivity of our main results to truncation. Estimates based on graduating cohorts with complete histories reported in Table A6 and Table A7 are very close to those based on the full sample we report below. We also use the graduating cohorts with complete histories and apply various degrees of truncation to the class-size variable. When we compare the class-size estimates based on complete histories to those based on truncated histories, we see that these class-size effects are very stable and not affected by the truncation (Tables W4–W6 in the Online Appendix). This is probably because class size typically varies little from grade to grade, and we interpret the results from these robustness analyses as showing that truncation does not bias our estimates.

3. We show in Section III that this second term does not affect our class-size estimates as long as our instrumental variable is orthogonal to exposure, , and class size elsewhere, . We provide evidence in support of this below.

4. The last row of Table A2 shows that within-municipality moves are less prevalent than between-municipality moves in our main (baseline) sample. Using a correction factor of 0.8 instead of 1.5 would give a higher exposure rate of 0.87 in primary school. We, however, use our conservative adjustments, which means that our IV estimates are probably somewhat too large in the baseline sample.

5. This breaks the perfect correlation between class size in primary school and class size in middle school and will help us below in separately identifying class-size effects in primary and middle school.

6. The new rule was not supposed to affect the children who were already enrolled in primary school, but in practice, most schools enforced the new class-size rule for all grades in primary school.

7. We also performed robustness checks where we vary the donut size and show that this does not affect our results (Online Appendix Tables W7–W9).

8. Without the donuts, we do not find any indications that the regressors have predictive power on the placement of the schools on either side of the maximum-class-size threshold.

9. A graphical representation of these balancing tests is shown in Figure W9 in the Online Appendix.

10. Table A6 reports the class-size effect in compulsory schooling for the sample with full class-size histories.

11. Tables W7–W9 in the Online Appendix further vary the donut size from ±1 to ±5 and show that this does not affect our results.

12. Upper secondary school is optional, but almost all children in the Norwegian school system enroll for the first year. It is nevertheless common to drop out or spend more than the nominal duration before matriculating.

13. The outcomes are observed for each cohort at ages 18–27, while they are observed for only one cohort at age 48.

14. Table A7 reports the class-size effects in primary school and middle school for the sample with full class-size histories. Table A8 reports class-size effects across all school levels and samples.

15. Few schools have enrollment around the higher discontinuities, and these stratified estimates are relatively noisy. We therefore do not report them here.

16. Fredriksson, Öckert, and Oosterbeek (2013, 2016) need to restrict their sample in this way to tackle the problem of flexible catchment areas in Sweden.

- Received February 2017.
- Accepted June 2018.