Abstract
Many existing studies find that females perform better when they are taught by female teachers. However, there is little evidence on what the long-run impacts may be and through what mechanisms these impacts may emerge. We exploit panel data from middle schools in Seoul, South Korea, where students and teachers are randomly assigned to classrooms. We replicate the existing literature that examines contemporaneous effects and find that female students taught by a female versus a male teacher score higher on standardized tests compared to male students even five years later. We also find that having a female math teacher in seventh grade increases the likelihood that female students attend a STEM-focused high school, take higher-level math courses, and aspire to a STEM degree. These effects are driven by changes in students’ attitudes and choices.
I. Introduction
Many studies find positive and sizable impacts of female teachers on female students, suggesting that teacher–student gender matching could close the gender gap in various fields (see, for example, Dee 2007; Muralidharan and Sheth 2016; Lim and Meer 2017), while others do not (Holmlund and Sund 2008; Cho 2012; Antecol, Eren, and Ozbeklik 2015).1
The limited evidence on longer-run impacts suggests that effects can be substantial. Lavy and Sand (2015) find that primary school teachers’ bias towards boys can have impacts through high school.2 At the higher education level, Carrell, Page, and West (2010) find that female cadets at the United States Air Force Academy who are randomly assigned to female instructors for introductory science and math courses are more likely to persist in those fields, and Porter and Serra (2017) show that a brief presentation by female role models can significantly increase economics course-taking by female students.3
This paper is the first to examine the longer-term effects of teacher–student gender matches at the secondary level. Our data track students from Grades 7 to 12 and enable us to investigate how these effects change over time, as well as how they affect important outcomes like preparation for postsecondary study and choice of major. Further, we are able to examine some of the mechanisms potentially driving these outcomes.
The main issue plaguing identification of the impact of teacher–student gender matches is that, in most contexts, students are not randomly assigned to teachers. This leads to potential confounds: for example, female teachers may be systematically assigned to lower-achieving students, or female students who are more likely to succeed with female teachers may be assigned to those classrooms. In these cases, any teacher–student gender match effects may reflect sorting rather than, for example, role-model effects. While several of the studies listed above used environments featuring random assignment, there is little evidence on whether the teacher–student gender match has a long-run impact when it takes place at more formative ages; after all, it may be too late to substantially affect field of study by the time a student arrives in college. Further, there is little evidence on the mechanisms through which these effects operate.
We avoid the problem of nonrandom sorting using a unique Korean middle school practice: the random assignment of students into a classroom each year. We find that the presence of a female teacher substantially increases female students’ test scores compared to male students, and this effect persists at least through Grade 12. Our long-lasting gender gap effects are somewhat surprising, since the effects of educational interventions generally fade out fairly quickly (Jacob, Lefgren, and Sims 2010). We show that the mechanisms behind these persistent effects appear to be greater focus and participation by female students, as well as selection into higher-quality high schools.
This paper builds in several ways on the work of Lim and Meer (2017), which investigates short-term effects of teacher–student gender matching on standardized test scores. We examine longer-run effects and the mechanisms behind these effects using panel data covering Grades 7–12, as well as a follow-up survey shortly after high school graduation. We also show the effects of female teachers on female students’ STEM outcomes, such as taking higher-level math courses and attending a STEM-focused high school.
Our results shed light on the importance of classroom interactions on students’ outcomes. By extending the scope of the analysis to cover outcomes throughout secondary school and into postsecondary choices, we show that teacher–student gender interactions can have implications for lifelong academic and career decisions. In Section II, we discuss the institutional background and our data. Section III delves into the empirical approach, Section IV presents the results, and Section V concludes.
II. Data
A. Institutional Background
Several features of the South Korean educational system make it well suited to study the impact of teacher–student gender matches. First, elementary school graduates are assigned to a local middle school (spanning Grades 7–9) by lottery.4 Second, middle school students are randomly assigned to a physical homeroom classroom, through which subject teachers rotate to give lessons. Due to strong social norms and government policies, the goal is to produce homogeneous homeroom classrooms in terms of academic ability (Kang 2007). The most common approach is to rank students by the previous year’s academic performance and assign the top-ranked student to the first classroom, the second-ranked student to the second classroom, and so on; schools report adhering to these expectations (Lim and Meer 2017).
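The rank-ordered dealing of students into homerooms described above can be sketched as follows. This is an illustrative sketch, not the schools’ documented procedure: the function name, the student ids, and the wrap-around to the first classroom after the last one are assumptions (a school might instead reverse direction on each pass).

```python
def assign_homerooms(prior_scores, n_classrooms):
    """Deal students into homerooms in rank order of prior-year score.

    prior_scores: dict mapping a (hypothetical) student id to last year's score.
    Returns a dict mapping a 0-based classroom index to a list of student ids.
    """
    # Rank students from highest to lowest prior-year score; ties broken by id.
    ranked = sorted(prior_scores, key=lambda sid: (-prior_scores[sid], sid))
    rooms = {room: [] for room in range(n_classrooms)}
    for rank, sid in enumerate(ranked):
        # Assumption: after the last classroom, dealing wraps back to the first.
        rooms[rank % n_classrooms].append(sid)
    return rooms


# Toy data (hypothetical): six students, three homerooms.
scores = {"s1": 95, "s2": 90, "s3": 85, "s4": 80, "s5": 75, "s6": 70}
rooms = assign_homerooms(scores, n_classrooms=3)
room_means = {
    room: sum(scores[sid] for sid in ids) / len(ids) for room, ids in rooms.items()
}
print(rooms)
print(room_means)
```

Each homeroom receives students from across the ranking, so classroom averages are far closer to one another than they would be if classrooms were filled with consecutive blocks of the ranking.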
Subject teachers are assigned to classrooms in a manner unrelated to the characteristics of either the teachers or the classroom’s students. In general, the physical classrooms are split between the subject teachers by, for example, dividing odd- and even-numbered rooms; teacher characteristics are not correlated with student characteristics (Lim and Meer 2017). This quasi-random classroom and teacher assignment produces the random variation in teacher–student matching within a school.
Over the past decade, there has been increasing use of ability tracking in South Korea (Byun and Kim 2010). In our data, ability grouping is common for math and English, but infrequent for Korean. In the presence of ability grouping, the students move to the classroom with their ability group for that subject before returning to their original homeroom classroom. Figure 1 gives an example, with high-ability students in Classroom 2 moving to Classroom 1 and low-ability students in Classroom 1 moving to Classroom 2 for this subject. This might threaten our identification strategy, which rests on the random assignment of students to teachers. Including school-by-subject fixed effects might be insufficient to account for this potential source of bias. We therefore estimate specifications with school-by-subject-by-ability-group fixed effects, enabling us to compare similar sets of students. Our identifying assumption is that teachers and students are randomly assigned within the same school and ability group, as in Jackson (2014, 2018).5 In practice, our results are unaffected by the inclusion of finer fixed effects, suggesting that ability grouping does not lead to sorting correlated with teacher–student gender matches and would not bias our results.6

Figure 1. Example of Ability Group Formation
While compulsory education ends in ninth grade, nearly all students in South Korea go on to high school. There are two rounds of admissions to high school in Seoul. Admission is determined through an application process in the first round (Seoul Metropolitan Office of Education 2012). Students can apply to one of 35 magnet and private autonomous high schools, six art high schools, one athletic high school, and 74 vocational high schools. Schools select students on the basis of their academic performance and recommendations from principals and teachers. If selected, students must attend the school to which they are admitted.
The second round consists of a lottery for 19 autonomous public high schools, 19 science-focused high schools, four art-focused high schools, and 183 general academic high schools. Students can list preferences for one autonomous school, one science- or art-focused school, and up to four general schools. Students receive some preference for schools within their administrative district. The lottery clears entry into the autonomous public and specialized schools first and then into the general academic high schools. Students are guaranteed entry into a school. Conditional on entering tenth grade, 95.2 percent of students in 2011 graduated, and only 0.32 percent repeated a grade.
One year after high school entry, students choose among academic tracks, irrespective of the type of school. Most schools provide a math–science track and a humanities–social science track, with the exception of specialized science and foreign language schools.7 Students focus on advanced courses within a given track. Students are free to choose their academic track. Past test scores, student characteristics, and so on are not taken into account by the school in restricting a student’s choice. Changing tracks is rare; students apply to postsecondary institutions the following year, in Grade 12, and it is difficult to catch up on the requisite coursework. Notably, applying to a STEM department at a postsecondary institution requires scores in advanced math and science courses. Therefore, if teacher–student gender matches in middle school—a particularly formative time period (Berenbaum, Martin, and Ruble 2008)—have persistent effects into high school, they are likely to have lifelong impacts on academic and career choices.
B. Data Set
Our data set is the Seoul Education Longitudinal Study of 2010 (SELS2010), which surveyed Grade 7 students and their teachers in 2010. Data are available through Grade 12, with a follow-up survey shortly after high school graduation on postsecondary outcomes. Subject teachers in math, English language, and Korean language are linked to the students in Grades 7–10 and in Grade 12.8
Students in the Grade 7 panel were sampled using a stratified two-stage cluster sample design. First, 74 middle schools were randomly chosen from the population of 370 public or private middle schools in Seoul, excluding two middle schools that are operated by the central government and one athletic middle school. Sixty-two of the sampled schools are coeducational, seven are all-boys schools, and five are all-girls schools. Two classrooms were then drawn randomly within each sampled school; 4,544 of the 5,065 targeted students responded to the survey in 2010. Attrition reduced the sample to 4,347, 4,162, 3,541, 3,394, and 3,305 from 2011 through 2015, respectively. As discussed in Section III.B, we find no evidence of differential attrition from our sample based on teacher gender match. Students in the sample advanced to high school in 2013, with 3,017 going to academic high schools and 524 to vocational high schools. Of these, 2,809 and 496 students in academic and vocational high schools, respectively, remained in the sample through 2015. In May 2016, two months after high school graduation, students were surveyed about their postsecondary outcomes at that point; 2,195 students responded.
Our primary unit of observation is a teacher–student pairing, with multiple observations per student, allowing us to use within-student variation in teacher gender in a given year. Of 13,632 possible teacher–student matches in seventh grade, 3,364 observations cannot be linked due to teacher nonresponse. An additional 72 observations have missing values for test scores, student gender, or teacher gender. We compare the remaining 10,196 observations, representing 4,163 students and 497 teachers, with those that are dropped and find no significant differences between them in students’ predetermined characteristics. Of these, 1,892 students (45.4 percent) and 408 teachers (82.1 percent) are female. Eleven percent of these teacher–student pairs are male–male, 8 percent are male–female, 44 percent are female–male, and 37 percent are female–female.
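The sample accounting in this paragraph can be checked with a few lines of arithmetic. The only assumption here, not stated explicitly in the text, is that the 13,632 possible matches correspond to the 4,544 responding students times the three linked subjects (math, English, Korean).

```python
students_surveyed = 4544
subjects = 3  # math, English, Korean (assumed source of the 13,632 figure)

possible_pairs = students_surveyed * subjects
unlinked_pairs = 3364      # teacher nonresponse
missing_value_pairs = 72   # missing test score, student gender, or teacher gender
analysis_pairs = possible_pairs - unlinked_pairs - missing_value_pairs

female_student_pct = round(100 * 1892 / 4163, 1)
female_teacher_pct = round(100 * 408 / 497, 1)
pair_shares_pct = [11, 8, 44, 37]  # the four teacher-student gender pairings

print(possible_pairs, analysis_pairs)
print(female_student_pct, female_teacher_pct, sum(pair_shares_pct))
```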
Student data include standardized test scores for each subject, from tests administered after the first semester of the school year. We standardize these to have a mean of zero and standard deviation of one for ease of interpretation. We also have data on student background characteristics, course-taking in high schools, and track choice, as well as on teacher characteristics, including experience and education.
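For concreteness, the standardization step can be sketched as below. This is a minimal illustration with made-up raw scores; the function name is hypothetical, and the use of the population standard deviation (rather than the sample version) is an assumption, since the text does not say which is used.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Rescale scores to mean zero and (population) standard deviation one."""
    mu = mean(scores)
    sigma = pstdev(scores)
    return [(score - mu) / sigma for score in scores]


# Hypothetical raw scores for one subject.
raw = [62.0, 70.0, 78.0, 86.0, 94.0]
z = standardize(raw)
print([round(value, 3) for value in z])
```

After this step, a coefficient of 0.1 in a test score regression reads as one-tenth of a standard deviation of the score distribution.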
III. Empirical Approach
A. Tests of the Identifying Assumptions
While the institutional characteristics of our setting provide the random variation in teacher–student gender matches necessary for unbiased estimation, it is instructive to examine empirical evidence to validate the identification strategy.
First, we show that students are not assigned to physical homeroom classrooms on the basis of their ability and that subject teachers’ characteristics are not correlated with the students’ ability. We use resampling techniques to examine whether the actual classroom assignments of SELS students appear unusual relative to classrooms constructed artificially (for more details, see Lehmann and Romano 2005; Good 2006; Carrell and West 2010). We conduct this exercise for the assignment of ninth graders to physical homeroom classrooms. This allows us to use prior years’ test scores as a proxy for ability, thus demonstrating that students are not systematically assigned to classrooms with male or female teachers based on their previous performance.9 We randomly draw, without replacement, 10,000 synthetic classrooms from the sample of all SELS students within the school.
For each of these synthetic classrooms, we compute the total seventh grade test score of its students.10 We calculate an empirical p-value measuring the proportion of artificial classrooms with a value greater than the actual classroom in the data. We conduct this exercise using test scores from all three subjects. Under random assignment, the empirical p-values will be uniformly distributed, since any p-value will be observed with equal probability. That is, the empirical p-value shows whether the student characteristics of the real classroom are unusual relative to a truly random draw. We test whether the distribution of the empirical p-values is uniform with Kolmogorov–Smirnov and χ² goodness-of-fit tests.
The results, in Table 1, Panel A, show that none of 222 school-by-subject p-value distributions deviate significantly from the uniform distribution at the 5 percent level using the Kolmogorov–Smirnov test; eight (3.6 percent) do so using the χ² goodness-of-fit test. We do not find evidence of nonrandom assignment of students into classrooms by academic ability.
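The resampling exercise can be sketched in a few lines. This is a simplified illustration on simulated data, not the authors’ code: the school size, class size, number of classrooms, and function names are hypothetical, the number of draws is cut from 10,000 to 1,000 for speed, and only the Kolmogorov–Smirnov distance is computed (the comparison with a critical value is omitted).

```python
import random


def empirical_p_value(actual_total, school_scores, class_size, n_draws, rng):
    """Share of synthetic classrooms whose total score exceeds the actual one."""
    exceed = 0
    for _ in range(n_draws):
        synthetic = rng.sample(school_scores, class_size)  # without replacement
        if sum(synthetic) > actual_total:
            exceed += 1
    return exceed / n_draws


def ks_statistic_uniform(p_values):
    """Kolmogorov-Smirnov distance between the sample and the Uniform(0, 1) CDF."""
    xs = sorted(p_values)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


rng = random.Random(0)
# Hypothetical school: 300 students with simulated seventh grade scores.
school_scores = [rng.gauss(0.0, 1.0) for _ in range(300)]

# Simulate 20 "actual" classrooms that really are random draws of 30 students.
p_values = []
for _ in range(20):
    actual = rng.sample(school_scores, 30)
    p_values.append(empirical_p_value(sum(actual), school_scores, 30, 1000, rng))

d_stat = ks_statistic_uniform(p_values)
print(round(d_stat, 3))
```

If classrooms were instead formed by sorting on prior scores, the empirical p-values would pile up near zero and one, and the distance from the uniform distribution would be large.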
Table 1. Resampling Tests
We further check the random assignment of teachers into classrooms with respect to students’ academic ability by regressing the empirical p-values on the average characteristics of teachers visiting the classroom, controlling for school fixed effects to accommodate the random assignment within a school. Nonrandom assignment would cause teacher characteristics to be correlated with the empirical p-value, since teachers of a certain type would be assigned to unusual classrooms on the basis of some student characteristic (namely, test scores). As is shown in Table 1, Panel B, none of 24 coefficients are statistically significant at the 10 percent level.
Next, we test differences in the characteristics of teacher–student pairings. We present sample means from our data in Table 2. Panel A presents the characteristics for female and male students by teacher gender, demonstrating that students are not more likely to be assigned to a teacher of a particular gender on the basis of observables. Panel B compares teacher characteristics when matched with female and male students. Female teachers assigned to male students are more likely to have an administrative position, and male teachers assigned to female students are more likely to be from a teachers college. However, since the randomization in our sample is within schools, adjusting for school fixed effects eliminates the significance of these differences.
Table 2. Comparison of Mean Characteristics
As another way to examine whether there are systematic differences in how teachers are assigned to students, we regress teacher gender on students’ observable characteristics, controlling for school-by-subject-by-ability-group fixed effects. The results are shown in Table 3. Teacher gender is not correlated with students’ predetermined characteristics, consistent with random assignment of students to teachers.
Table 3. Likelihood of Having a Female Teacher
B. Specification
Our estimation strategy is straightforward. First, we examine contemporaneous effects of teacher–student gender matches by estimating a series of specifications with the following general form:
$$
y_{ijbgs} = \beta_0 + \beta_1 fs_i + \beta_2 ft_j + \beta_3 (fs_i \times ft_j) + X_i \delta + T_j \theta + \gamma_{bgs} + \varepsilon_{ijbgs} \tag{1}
$$
where $y_{ijbgs}$ is the test score of student i taught by teacher j for subject b in ability-group level g, if any, in seventh grade at school s. $fs_i$ and $ft_j$ are indicator variables equaling one if student i and subject teacher j are female, respectively. Our simplest specification includes only these variables. We add, in turn, an increasingly rich set of controls. These include $X_i$, a vector of students’ predetermined characteristics (income, the number of siblings, and indicators for living with both parents, having at least one parent with a bachelor’s degree or higher, and having both parents employed), and $T_j$, a vector of teacher characteristics (the teacher’s age and indicators for graduating from a teachers college, holding a master’s degree or higher, being a homeroom teacher, holding an administrative position, and having less than five years of experience); $\delta$ and $\theta$ are the corresponding coefficient vectors, $\beta_0$ is a constant, and $\varepsilon_{ijbgs}$ is the error term. We include school-by-subject-by-ability-group fixed effects $\gamma_{bgs}$ to compare students of the same ability in a subject within a school to ensure that ordinary least squares (OLS) produces unbiased estimates. Standard errors are clustered at the school level to account for correlations among students within the same schools. If the random assignment suggested by the institutional features truly holds, these specifications should not differ substantially when the additional controls are included. Our identification does not hinge on the inclusion of fixed effects, but rather is driven by the randomness in the assignment process.11
β1 is the difference in average academic achievement between female and male students when taught by a male teacher. β2 represents the average difference in performance for male students between being taught by a female teacher and being taught by a male teacher. β3 indicates the change in the gender gap between female and male students when switching from a male to a female teacher. β2 + β3 is the same-gender teacher effect for female students, that is, the effect of switching from a male teacher to a female teacher for female students.
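In the simplest specification, with no controls or fixed effects, these interpretations reduce to differences in the four student-by-teacher cell means. The sketch below illustrates that mapping on made-up data; the function name and the toy scores are hypothetical, and it is not a substitute for the full regression with fixed effects and clustered standard errors.

```python
from statistics import mean


def gender_match_coefficients(rows):
    """rows: list of (female_student, female_teacher, score) tuples.

    Returns (beta1, beta2, beta3) implied by the four cell means.
    """
    cell = {}
    for fs in (0, 1):
        for ft in (0, 1):
            cell[(fs, ft)] = mean(y for s, t, y in rows if s == fs and t == ft)
    # Female-male gap under a male teacher.
    beta1 = cell[(1, 0)] - cell[(0, 0)]
    # Effect of a female (vs. male) teacher on male students.
    beta2 = cell[(0, 1)] - cell[(0, 0)]
    # Change in the female-male gap when the teacher is female.
    beta3 = (cell[(1, 1)] - cell[(0, 1)]) - (cell[(1, 0)] - cell[(0, 0)])
    return beta1, beta2, beta3


# Toy standardized scores (hypothetical).
rows = [
    (0, 0, 0.0), (0, 0, 0.2),   # male student, male teacher
    (1, 0, 0.3), (1, 0, 0.5),   # female student, male teacher
    (0, 1, 0.0), (0, 1, 0.1),   # male student, female teacher
    (1, 1, 0.4), (1, 1, 0.6),   # female student, female teacher
]
beta1, beta2, beta3 = gender_match_coefficients(rows)
same_gender_effect_girls = beta2 + beta3
print(round(beta1, 3), round(beta2, 3), round(beta3, 3))
print(round(same_gender_effect_girls, 3))
```

Here β2 + β3 equals the difference between the two female-student cells, which is the same-gender teacher effect for female students.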
We modify Equation 1 slightly to examine the effects of teacher–student gender matches in seventh grade on standardized test scores over time:
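$$
y_{ijbgts} = \beta_0 + \beta_1 fs_i + \beta_2 ft_j + \beta_3 (fs_i \times ft_j) + X_i \delta + T_j \theta + \gamma_{bgs} + \varepsilon_{ijbgts} \tag{2}
$$
where $y_{ijbgts}$ is the test score in year t = 2–6 (namely, Grades 8–12), standardized within subject and year. The right-hand-side variables are the seventh grade measures defined for Equation 1, with $\delta$ and $\theta$ the coefficient vectors on the student and teacher controls. We focus on the effects of seventh grade teachers over time because we are most confident that random assignment holds for that grade. Students are entering a randomly assigned middle school, and their academic performance cannot yet have been affected by teachers at the middle school when assigned to seventh-grade classrooms. This also provides the longest time span to examine the persistence of these effects.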
Our primary interest is the coefficient β3 when t=2–6, representing gender gap effects one through five years after exposure to the seventh grade teacher. When we examine outcomes that do not differ across subject, such as choice of academic track, we collapse this specification to a single observation per student.
We examine whether attrition from the sample is likely to bias our results. If attrition is systematically related to the teacher–student gender interaction—that is, if low-performing female students with female teachers are more likely to exit the sample—our results would be the product of a selected sample. In Table 4, we test whether the proportion of female teachers in seventh grade impacts the likelihood of attriting from the sample. No meaningful patterns emerge to indicate that selective attrition drives our results. Further, recreating our primary results in Tables 5 and 6 using only students who remained in the sample all six years yields similar results.12
Table 4. Attrition
Table 5. Contemporaneous Effects in Seventh Grade
Table 6. Effects over Time
IV. Results
A. Contemporaneous Effects
We begin by briefly examining the effect of teacher–student gender matches in seventh grade on students’ standardized test scores in that year. Table 5 begins with the most parsimonious specification that includes no additional controls. Boys with a female teacher rather than a male teacher see a statistically insignificant decrease in performance of 0.06 standard deviations, but a girl with a female teacher relative to a male one has an increase of 0.08 standard deviations. The coefficient on the female student by female teacher indicator in Column 1 indicates that the performance gap between female and male students increases by 13.9 percent of a standard deviation (SE = 6.1 percent) when the teacher is female rather than male. In other words, the gender gap effect is composed of an opposite-gender teacher effect for switching from a male teacher to a female one for male students (−β2) and a same-gender teacher effect for the switch to a female teacher for female students (β2 + β3). This effect is substantial, considering the evidence that 10 additional days of schooling increases academic performance by 0.01 standard deviation (Carlsson et al. 2015). Adding controls across specifications does not meaningfully affect the estimate of this gender gap effect, providing additional evidence that random assignment holds in our data. Note, though, that the share of the gender gap effect attributable to changes in boys’ performance does change. Boys no longer see a negative impact of switching to a female teacher by Column 3, the specification that includes school-by-subject-by-ability-group controls. That is, the increase in performance gap between female and male students is driven by increases in female students’ performance.
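As a check on the arithmetic, the rounded Column 1 figures quoted above can be combined as follows. The first two lines only restate the point estimates in the text; the last line is a back-of-the-envelope conversion that assumes the Carlsson et al. (2015) figure scales linearly, which is an assumption made here for illustration and not a claim from that study.

```latex
\begin{align*}
\beta_2 &\approx -0.06, \qquad \beta_2 + \beta_3 \approx 0.08 \\
\beta_3 &= (\beta_2 + \beta_3) + (-\beta_2) \approx 0.08 + 0.06 = 0.14
  \quad (\text{reported: } 0.139) \\
\frac{0.139}{0.01} \times 10 \text{ days} &\approx 139 \text{ days of schooling}
  \quad (\text{linear extrapolation, assumed})
\end{align*}
```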
Even the inclusion of student and teacher fixed effects in Columns 6 and 7, thus identifying from within-student and within-teacher variation in the gender match, leaves the gender gap effect intact.13 Taken together, these results indicate that the teacher–student gender match is uncorrelated with observable and unobservable student and teacher characteristics; that is, the random assignment on which our identification strategy rests appears to hold. To assuage any lingering concerns about selection to female teachers within a school-by-subject-by-ability-group cell, we compute predicted test scores in seventh grade as fitted values from linear regressions of seventh grade test scores on all observable student and teacher characteristics.14 We replicate the models from Columns 1, 2, and 3 of Table 5 in Online Appendix Table 1 and show that there are no correlations between the predicted test scores and teacher–student gender matches when we control for school-by-subject-by-ability-group fixed effects.
B. Longer-Term Effects
Table 6 presents the impacts of the seventh grade teacher–student gender match for later years. Panel A presents results including school-by-subject-by-ability-group fixed effects, and Panel B also includes student and teacher fixed effects (corresponding to Columns 3 and 7, respectively, of Table 5). For brevity, we limit ourselves to the most parsimonious specification that we believe clearly satisfies the identifying assumptions (Panel A), but also present the most restrictive (Panel B). This latter set of results serves as an additional robustness check of our identification strategy.
Somewhat surprisingly, we find that the gender gap effects persist even five years after the initial teacher–student gender match. The effects vary slightly over time, but there are no significant differences between the contemporaneous effect and the effects in the following years.15 While education interventions are generally characterized by large fade-out effects (Jacob, Lefgren, and Sims 2010), previous studies on longer-run impacts of teacher gender suggest that the impacts of this particular phenomenon may be long-lasting (Carrell, Page, and West 2010; Lavy and Sand 2015).
STEM outcomes are of particular interest, since although the gender gap in math achievement in secondary education is small, women are substantially underrepresented in both STEM majors and careers.16 Column 1 of Table 7 presents the results for the effects of seventh grade subject teachers on students’ math–science track choice in 11th grade. These specifications include fixed effects for school by seventh grade ability group, which should adequately control for any possible nonrandom assignment to ability group in seventh grade. There are no significant gender gap effects for any subject; however, there is a statistically significant same-gender teacher effect for female students. Girls are 15.1 percentage points (SE = 7.5) more likely to choose the math–science track in high school when taught by a female versus a male math teacher in seventh grade (the combination of having a female math teacher and the additional effect of female math teachers on girls). Substantial gender gap effects emerge in advanced math course-taking in 11th grade (Column 2), with having a female seventh grade math teacher reducing the gender gap between female and male students by 9.7 percentage points (SE = 5.5). The same-gender teacher effect for female students is even bigger; female students are 15.7 percentage points (SE = 6.6) more likely to take at least one advanced math course when they were taught by a female math teacher in seventh grade versus a male teacher. Column 3 shows the impacts on self-reported interest in majoring in a STEM field; once again, having a female seventh grade math teacher substantially reduces the male–female gap in these aspirations. Similarly, Column 4 shows that the male–female gap in attending a STEM-focused high school is substantially reduced when a female student has a female math teacher in seventh grade. Especially given the baseline means, listed in the bottom line of the table, these effects are substantial.
However, these four outcomes are closely related, and multiple hypothesis testing correction is warranted. Utilizing the methods in Jones, Molitor, and Reif (2018), which employ permutation techniques to adjust p-values for multiple hypotheses, we find that the female student–female math teacher interaction is statistically insignificant when these corrections are applied.17
Table 7. Effects on STEM Outcomes
Finally, Column 5 of Table 7 reports the impact on the likelihood that a student reports that they have no set plans for postsecondary education. Female students are significantly less likely to report that they have no such plans if they had female teachers in any subject in seventh grade. This provides further evidence of an overall positive influence of female teachers on female students, which we turn to in the next section.
C. Evidence on Mechanisms
Altogether, these results show dramatic impacts of having female teachers on female students in the long run, with substantial increases in preparation for postsecondary STEM study. But the underlying mechanisms remain opaque. Female students may be treated differently by or react differently to female and male teachers. For example, Spencer, Steele and Quinn (1999) show that negative stereotypes of girls’ math ability undermine girls’ performance on math exams. A dearth of role models or general attitudes and expectations may discourage female students from entering these fields, making it difficult to increase participation (Leslie et al. 2015; Eble and Hu 2017). Below, we examine several possible drivers of our results.
We begin by examining whether female students taught by a female teacher are more likely to be taught by a female teacher in later years. In that case, the persistent effects of seventh grade teachers could, in fact, be driven by cumulative effects of later teachers. As seen in Table 8, our results are not driven by cumulative exposure to female teachers set in motion by better performance in seventh grade.18
Table 8. Effects on Future Teacher Gender
We next examine whether teacher–student gender matches in seventh grade affect ability group assignment in later years. Table 9 shows results for the likelihood of being in a high ability group in Grades 8–10 and Grade 12 in a given subject. While this potential mechanism has a reasonable basis (better performance in seventh grade leads to higher ability grouping in later years, with better performance in those years), we find no evidence that this is driving our results.
Table 9. Effects on Ability Grouping
Having addressed these possibilities, we turn to student attitudes and choices. Indeed, our results appear to be driven by changes in student attitudes, which result in female students who had female teachers in seventh grade being more likely to select higher-quality high schools. Column 5 of Table 7 shows that having female teachers in seventh grade increases the likelihood that female students declare that they have postsecondary plans. In Table 10, we show the impact of the female student–female teacher match on an index of self-reported student engagement constructed using principal component analysis.19 While female students have a lower engagement score than male students, having a female teacher in a particular subject in seventh grade eliminates this gap, and the effects persist into high school. The point estimates lose statistical significance by 11th grade, but remain relatively large.
Table 10. Effects on Student Engagement Index
Given the nature of the high school application process in Seoul, any increase in high school quality necessarily reflects student preferences to a great degree. In Table 11, we show the impact of the seventh grade teacher–student gender match on the quality of the student’s classroom peers, as estimated by previous-year standardized test scores, and on teacher value-added.20 Given the random assignment in middle schools, it is reassuring that there are no effects in eighth and ninth grades. However, in tenth grade (high school), female students who had a female teacher in seventh grade have both peers and teachers who are of significantly higher quality, as measured by test scores, and this effect is also present in 12th grade.21
Table 11. Effects on Peer and Teacher Quality in Later Years
To complement the results on peer and teacher quality, we examine other measures of high school quality in Table 12. Column 1 shows the likelihood that the student attends a prestigious first-round high school, namely, an application-only high school, excluding nonelite vocational schools. The impact on the gender gap of having more female teachers is statistically insignificant. In Column 2, there is a substantial but statistically insignificant effect on the likelihood of attending a high school in a different administrative district, which would reflect a willingness to incur greater costs to attend a preferred school. This effect is not entirely driven by students who attend first-round schools. Examining only those who attend second-round (lottery) high schools, the female student interaction with percent of seventh grade female teachers is 0.161 (SE = 0.106), once again statistically insignificant at conventional levels but fairly sizable. This suggests even those female students with more female teachers who do not attend an application school are requesting higher-quality schools in the second-round lottery. Column 3 shows the impacts on the percent of 11th grade students in a high school who achieved “Above Basic” performance in a particular subject on the National Assessment of Educational Achievement (NAEA) test, an indicator of achievement above the median grade-level expectation. Importantly, these scores are for students one cohort above those in our sample. Column 4 shows results for students performing at the “Below Basic” level, below 20 percent of grade-level expectations. Finally, Column 5 shows the effects on the percent of students who are reported to the School Violence Committee, to which infractions against students both on and off school premises must be reported; these include violent acts but also bullying and cyber-bullying. 
Female students with a greater proportion of female teachers in seventh grade attend safer schools, though we note that, at least based on official reports, schools in South Korea are extremely safe.
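As a rough check on the second-round lottery estimate reported above (0.161, SE = 0.106), the sketch below computes the implied test statistic and a two-sided p-value. It assumes a standard normal reference distribution, which is a simplification and not necessarily the inference procedure behind Table 12; the function name `two_sided_p` is illustrative.

```python
import math

def two_sided_p(estimate: float, std_error: float) -> tuple[float, float]:
    """Return (z, p) for H0: coefficient = 0 under a normal approximation."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Point estimate and standard error quoted in the text for second-round schools.
z, p = two_sided_p(0.161, 0.106)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Under this approximation the statistic is about 1.52, below the 1.645 cutoff for two-sided significance at the 10 percent level, which is consistent with describing the estimate as sizable but insignificant at conventional levels.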
Table 12. Effects on High School Quality
Taken together, these results indicate that the teacher–student gender interaction in middle school leads to greater engagement for female students with female teachers. Since the impact on attending application-only schools is fairly small, the primary channel for these results must be the choice set submitted by students for the second-round lottery. That is, exposure to female teachers in seventh grade leads female students to express a preference for higher-quality schools, even at a cost of greater travel time.
Finally, we turn to postsecondary outcomes in Table 13. Our sample size is smaller due to the nature of this wave of the survey, which was conducted two months after these students left high school. Columns 1–3 show the results of the seventh grade teacher interaction variables on the probability of attending a university, a community college, or being employed, respectively. No meaningful patterns emerge; the coefficients are, for the most part, statistically insignificant. In Column 4, we examine the likelihood that students applied to a science, technology, engineering, or medical sciences major conditional on attending a college or university. Column 5 narrows that definition by removing the traditional science fields. In both cases, female students with female math teachers in seventh grade were more likely to apply to those fields, though the effect is only statistically significant in Column 5 and is not distinguishable from a similarly positive interaction effect for having a female English teacher. For female students, the interaction effect with a female Korean teacher is large, negative, and statistically significant; having a female Korean teacher in seventh grade significantly reduces the likelihood that a female student will apply to a STEM major. Columns 6 and 7 condition on applying to a STEM or TEM field and examine the likelihood of being accepted; there are no significant impacts. The increase in applications coupled with little difference in acceptance rates suggests some uptick in female participation in these majors when they had a female seventh grade math teacher, although sample sizes are small and the estimates are fairly noisy.
Table 13. Seventh Grade Teacher Effects on Postsecondary Outcomes
Columns 8 and 9 use data from the 2014 Graduate Occupational Mobility Survey, which surveys recent college graduates 18 months after graduation. We match wage and employment information to major and examine whether majors chosen by students in our sample are more likely to lead to positive labor market outcomes. That is, these are not the actual outcomes for those in our data, who are just beginning their postsecondary education, but rather the projected outcomes. There are no impacts on the average wage, but female students with female math teachers in seventh grade choose majors that have a statistically significant 3.8 percentage point higher employment rate shortly after graduation.
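The projection logic in Columns 8 and 9 amounts to a lookup: each student's chosen major is assigned the average outcome recorded for that major among recent graduates. A minimal sketch follows, assuming a simple major-to-rate mapping. The major names, the rates, and the function `projected_employment` are hypothetical placeholders for illustration, not values from the survey or from the sample.

```python
from statistics import mean

# Hypothetical major-level employment rates (share employed after graduation).
# Placeholder numbers for illustration only.
employment_rate_by_major = {
    "engineering": 0.80,
    "education": 0.70,
    "humanities": 0.60,
}

def projected_employment(majors, rates):
    """Map each student's major to its major-level rate; None if unmatched."""
    return [rates.get(major) for major in majors]

# Hypothetical chosen majors for four students; "fine arts" has no match.
chosen = ["engineering", "humanities", "engineering", "fine arts"]
projected = projected_employment(chosen, employment_rate_by_major)

# Average projected rate over students whose major could be matched.
matched = [rate for rate in projected if rate is not None]
avg_projected = mean(matched)
```

The outcome variable built this way is a major-level average attached to each student, not the student's own realized employment.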
Taken together, our results suggest that there are substantial impacts on secondary-school performance for female students with female teachers, particularly in STEM subjects. These are driven by changes in attitudes, aspirations, and choices, particularly in terms of course-taking and high school quality. Given the nature of the South Korean education system, the large improvements in academic performance, increases in positive attitudes towards STEM fields, and increases in propensity to take STEM courses are necessary for female students to become more likely to major in STEM fields in college. However, our results are much more mixed for realized postsecondary outcomes, suggesting that these improvements are not sufficient to yield meaningful changes in actual outcomes.
V. Conclusions
Our study shows that pairing female students with female teachers at a formative age (middle school) increases standardized test scores well into high school, the likelihood of advanced course-taking and of making plans for postsecondary attendance, and, specifically for those with female math teachers, both the study of STEM and aspirations to pursue it in college. We show that persistent student engagement and increased ambition in choosing high schools of higher quality explain why there is little fade-out of these impacts. It is important to note that there are, at times, partially offsetting negative effects on academic outcomes for boys from a shift to female teachers. Overall, these findings shed light on the importance of teacher–student gender matches in future policies to close the gender gap, especially in STEM fields. However, results on postsecondary outcomes suggest that interventions in secondary school may not be sufficient to substantially increase participation at that level.
Footnotes
The authors are grateful for valuable comments from Mark Hoekstra, Jason Lindo, James West, and seminar participants at Baylor University, Brigham Young University, and the University of Texas. This paper uses confidential data that can be obtained by request from Seoul Education Research & Information Institute (SERII) at www.serii.re.kr/. The authors are willing to assist (jmeer@tamu.edu).
1. Another approach to closing the gender gap is single-sex schooling. Park, Behrman, and Choi (2013, 2018) and Lee, Niederle, and Kang (2014) use random assignment in South Korea. Jackson (2012) exploits the nature of rules-based school assignment in Trinidad and Tobago.
2. A related strand of literature shows positive impacts of same-race teachers, particularly for African-American boys (Gershenson, Holt, and Papageorge 2016; Gershenson et al. 2018).
3. Recent work by Kofoed and McGovney (2017) shows that same-gender and same-race role models can influence occupation choice at the United States Military Academy.
4. While students in some districts can submit a list of schools among which they will be randomly assigned, students in Seoul are not permitted to do so (Korea Legislation Research Institute 2011).
5. A corollary is that assignment to a female teacher in one subject should be uncorrelated with the student’s other teachers’ characteristics. The assignment mechanism makes any such correlation unlikely. Further, we find no evidence that a teacher–student gender match in one subject is related to the other teachers’ characteristics.
6. Ability groups are self-reported in our data; we include an indicator for schools that do not have ability groups.
7. Students on the math–science track take different standardized math tests in 11th and 12th grade. The SELS2010 uses vertical scaling to transform scores onto a common scale. As such, including academic track as a control variable has no impact on our findings for those outcomes. English and Korean language exams are the same across tracks.
8. Those links are not available in 11th grade.
9. Eighth grade physical homeroom assignment was not available in the data shared with us.
10. Our results are unchanged if we use eighth grade test scores.
11. Nevertheless, we did investigate whether the sample of students with variation in teacher gender was driving the results. There are 3,950 observations corresponding to students with such variation, about 40 percent of the total sample. Estimating our canonical specification (Column 3 of Table 5) on this sample yields a female-teacher-by-female-student interaction effect of 0.120 (SE = 0.058); on the balance of the sample, it is 0.128 (SE = 0.064). This provides further evidence that our more parsimonious approach yields a cleanly identified estimate.
12. Additional results in the Online Appendix show that cumulative attrition, as opposed to attrition in each year conditional on survival, is also unrelated to teacher–student gender interaction. Further, we interact the teacher–student gender match with the previous year’s test scores. If lower-performing female students who have female teachers are more likely to attrit—the primary concern regarding sample selection—the coefficient on this variable will be positive. This is not the case. The interaction is small and statistically insignificant. Low 12th grade test scores reduce the likelihood that a student is in the postsecondary sample. This is unsurprising given that students were surveyed after they left high school. We further examine attrition by the teacher–student gender interaction in each subject. Only one of the 18 point estimates for the interaction effect by subject is statistically significant at the 10 percent level.
13. These findings replicate those using other South Korean data to examine same-year effects (Lim and Meer 2017). That study finds a contemporaneous interaction effect of female students with female teachers of about 0.10. It uses a larger sample of about 11,000 students, but that sample consists of a single year of data for ninth graders in 2004 spread across all of South Korea.
14. Jackson (2014) notes that using predicted test scores rather than using individual covariates is a more efficient way to check for sorting because the predicted test scores can be viewed as a weighted mean of the individual covariates, with the weights being their importance in determining test scores, and tests of multiple covariates can lead to difficulty in interpretation.
15. In Panel A, the p-values are p = 0.51, p = 0.92, p = 0.76, p = 0.60, and p = 0.73 for the differences between the interaction coefficients in Columns 1–5 and that in Column 3 of Table 5, respectively. In Panel B, p-values are p = 0.72, p = 0.84, p = 0.67, p = 0.99, and p = 0.53 for the differences between the interaction coefficients in Columns 1–5 and Column 7 of Table 5.
16. In Online Appendix Table 8, we examine results corresponding to Column 3 of Table 5 and Panel A of Table 6, broken out by subject. Due to the large number of interactions, it is unsurprising that the estimates are noisier in this specification; however, they are most consistent for mathematics.
17. We also create an index for these STEM outcomes using principal component analysis to examine the impact of the teacher–student gender interactions. The results, reported in Online Appendix Table 9, show large and significant effects of having a female math teacher in seventh grade for female students.
18. We also examine whether there are spillover effects from having female teachers in other subjects. The results, presented in Online Appendix Table 2, show no meaningful patterns; the point estimates are fairly noisy. However, the same-subject interaction term is unaffected. For example, the estimate corresponding to Column 3 of Table 5 is 0.145 (SE = 0.055), compared to 0.147 in the original. We also experiment with replacing the teacher’s characteristics for a particular subject with those of a different subject’s teacher. Those results, also in Online Appendix Table 3, yield small and statistically insignificant point estimates for the teacher–student gender interaction effect. Taken together, this suggests that the impact of teacher–student gender interaction is limited to that particular subject.
19. A concern is that this common variation includes a student’s innate characteristics. For example, a student with higher intellectual ability would be more likely to be focused and to participate in class. Because our hypothesis is that the persistent effects occur through changes in students’ attitude and behavior, we need to remove these innate components from the index. To do so, we regress the index obtained through principal component analysis on student fixed effects, and then use the residual. We thus remove the student’s common characteristics across the three subjects.
20. We calculate value added by estimating teacher fixed effects from a regression of leave-out mean scores in grade g on leave-out mean scores in grade (g – 1) and school-by-subject-by-ability-group fixed effects.
21. Due to the data limitations discussed in Section II.B, we are unable to estimate these results for 11th grade.
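The residualization described in footnote 19 can be sketched in a few lines. When student fixed effects are the only regressors, the residual equals each subject's index value minus that student's mean across subjects, and the sketch relies on that equivalence. The index values and the function name `within_student_residuals` are hypothetical, and the actual estimation may differ in details such as sample restrictions.

```python
from statistics import mean

def within_student_residuals(index_by_student):
    """Subtract each student's mean index across subjects.

    Assumes student fixed effects are the only regressors, so the
    regression residual is the deviation from the student's own mean.
    """
    residuals = {}
    for student, by_subject in index_by_student.items():
        student_mean = mean(list(by_subject.values()))
        residuals[student] = {
            subject: value - student_mean
            for subject, value in by_subject.items()
        }
    return residuals

# Hypothetical index values for two students in three subjects.
index = {
    "s1": {"math": 1.0, "english": 0.5, "korean": 0.0},
    "s2": {"math": -0.5, "english": -0.5, "korean": 1.0},
}
resid = within_student_residuals(index)
```

By construction the residuals sum to zero within each student, so only subject-specific deviations remain.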
- Received February 2018.
- Accepted October 2018.






