Abstract
Gender disparities in academic performance may be driven in part by the interaction of teacher and student gender, but systematic sorting of students into classrooms makes it difficult to identify causal effects. We use the random assignment of students to Korean middle school classrooms and show that female students perform substantially better on standardized tests when assigned to female teachers; there is little effect on male students. We find evidence that teacher behavior drives the increase in female students’ achievement.
I. Introduction
Gender gaps in academic performance, with girls generally outperforming boys in language arts and boys generally outperforming girls in math, have persisted despite decades of effort to close them. Understanding the causes of these gaps is crucial, especially at younger ages, because they may lead to gender differences in later course-taking, occupational choices, and labor market outcomes (Lavy and Sand 2015).
One possible source of gender-based disparities is whether a student and a teacher share the same gender. These gender interactions may affect academic performance through changes in the behavior of both parties, that is, through student- or teacher-centered mechanisms. Role-model effects, an example of the former, predict that students will be more engaged in study when they are taught by a same-gender teacher (Dee 2007). As an example of the latter, a teacher might assign less difficult homework questions to girls if he or she believes that girls are less capable in math than boys (Jones and Dindia 2004).
The primary threat to identifying the causal effect of teacher–student gender matches is the nonrandom sorting of students that typifies classroom assignment in most contexts. For instance, students with a lower propensity to achieve academically may be more likely to be assigned to a teacher of a particular gender. Beginning with Dee (2007), the standard approach in this literature has been to use student interactions with multiple teachers across different subjects. By using estimates including student fixed effects, unobserved student characteristics that are correlated with student quality and teacher gender will not bias estimation. Dee uses the fact that the National Education Longitudinal Survey of 1988 surveys two teachers for every student to estimate within-student teacher-gender effects; he finds evidence of substantial positive impacts on academic achievement of being assigned to a teacher of the same gender. Moreover, Dee uses subjective evaluations of both teacher and student perceptions to show that students are less likely to be seen as disruptive when evaluated by a teacher of the same gender and more likely to report interest in that academic subject. Using a different approach, Muralidharan and Sheth (2016) exploit panel data from India, in particular, schools with only one classroom per grade, in which there can be no sorting of students. They find that female primary school students perform significantly better with female teachers, with no impact of teacher gender on male students.
On the other hand, several studies find no effect of teacher gender. Holmlund and Sund (2008) use Swedish secondary-school panel data and identify the impact of same gender teachers using teacher turnover. Once they control for subject-specific gender effects, they find no impact of gender-matching on student performance. Cho (2012) uses math and science test score data from 15 OECD countries and shows that there is no significant effect of teacher–student gender matching in eight of these countries, including the United States. Most recently, Paredes (2014) examines role model and teacher bias effects with data from Chile, finding small but statistically significant gender-matching effects for girls and no effects for boys, as well as suggestive evidence that role model effects drive the result.
However, this within-student estimation approach—even when including teacher fixed effects—is insufficient if students and teachers are systematically matched on characteristics correlated with gender. For instance, suppose female students who would benefit relatively more from having a female teacher are more likely to be assigned to female teachers who themselves are better role models for female students. In this case, a positive student–teacher gender interaction effect reflects sorting. As Dee notes, “The internal validity of such within-student comparisons could still be compromised by the nonrandom sorting by students with subject-specific propensities for achievement and by unobserved teacher and classroom traits correlated with gender.” He finds some such evidence in the NELS:88 with a number of indirect tests, particularly in the assignment of female math teachers. Other studies lacking random assignment must indirectly show that the identification strategy holds; for instance, Paredes (2014) uses the previous year’s test scores to control for achievement propensity.
For identification, we exploit a unique feature of secondary education in South Korea: the random assignment of students into a classroom, where students remain throughout the school day. We provide evidence for our identifying assumption in a number of ways. First, as-good-as-random assignment of students to classrooms is a strict policy in South Korea. We confirm that schools follow this policy by surveying a large number of them on the topic. We also show that assignment to classrooms within a school is uncorrelated with observable characteristics. Furthermore, students who are assigned to same- and opposite-gender teachers look similar in their observable characteristics. Finally, our results do not differ when additional controls, student fixed effects, or teacher fixed effects are included, as one would expect if assignment is truly random.
Our reliance on random assignment obviates potential sorting issues that have been a major concern in previous work. In this way, our approach is most similar to two previous papers. Antecol, Eren, and Ozbeklik (2015) exploit the random assignment of students in an experiment testing the efficacy of Teach for America, a program that trains and places high-achieving new teachers at disadvantaged schools. They find that female elementary school students with female teachers perform worse than those with male teachers. However, this negative effect disappears for female teachers with stronger math backgrounds. At the higher education level, Carrell, Page, and West (2010) use random assignment of cadets at the United States Air Force Academy to compulsory math and science courses, and show that female professors significantly reduce the gender gap in performance for female students.1
Our study is also unique because we provide more recent evidence from an age group similar to that studied in Dee (2007) and, importantly, because our empirical setting is a culture with somewhat different gender norms than many previously studied. South Korea is ranked 39th of 57 countries in its residents’ attitudes toward gender equality, much lower than the countries studied in the analyses above: 29th for Chile, 18th for the United States, and second for Sweden (Brandt 2011).2
Results show that female students’ performance is positively influenced by having a female teacher, but that there is little same-gender teacher effect for males. The overall increase in the female-male performance gap of about a tenth of a standard deviation is comparable in size to those found by Dee (2007) and Carrell, Page, and West (2010). Unlike our findings, though, both of those papers find that the impact is divided about evenly between reduced performance by males and increased performance by females. Our effect is similar in magnitude to an increase of one standard deviation in teacher quality (Chetty, Friedman, and Rockoff 2014).
The impacts are primarily concentrated in mathematics and English language scores, as compared to Korean language scores. We also provide some suggestive evidence that teacher-centered mechanisms are behind these impacts, with female students reporting that their female teachers are more likely to encourage them and to give them an equal opportunity to express themselves.
II. Data
We use cross-sectional data collected by the Korean Educational Development Institute (KEDI) in July 2004, at the end of the first semester of middle school in South Korea. The target schools, covering 6.8 percent of the relevant population in South Korea in 2004, were selected by proportionate stratified random sampling. Our initial sample consists of 197 schools; 777 Korean, English, and mathematics teachers linked to surveyed classrooms; 14,372 students; and 11,944 parents. Thirty-five of the schools had all-female students and 35 were all-male; 84 classrooms are single-sex within 127 co-ed schools.3 Restricted-use data provided by KEDI allows us to link students to classrooms.
In addition to an extensive set of questions, students’ responses were linked to their scores on the Student Achievement Test administered by the Seoul Metropolitan Office of Education (SMOE). Students in the sample were tested at the beginning of the second semester of ninth grade in three courses: Korean language, English language, and mathematics; 12,363 students’ test results were collected.4
The teacher questionnaire includes information on teachers’ classroom assignments, which we use to link students with their subject teachers. Beginning with 37,034 student–subject combinations with test score information, we first drop 6,033 observations without classroom or teacher information. Of these, 42 observations from 14 students have missing classroom information, and 5,991 observations from 224 classrooms do not have teacher information due to nonresponse by teachers, reducing the number of teachers in the sample to 777 and the number of students to 12,305.5 For our primary sample, we also drop 6,442 observations for students with multiple subject teachers, for which we could not make a student–teacher match representing just one student and one teacher. We also show results including these observations, which are unchanged from those excluding them. This results in 24,489 student–teacher pairings representing 11,659 students and 502 teachers. Among them, 33 percent of observations correspond to a female student with a female teacher, 16 percent are a female student with a male teacher, 32 percent are a male student with a female teacher, and the remaining 19 percent are male students with male teachers.
A. Student Assignment
Elementary school graduates in South Korea are randomly assigned to middle schools within their district.6 At the beginning of each academic year (March 1st), middle school students in South Korea are assigned a classroom where they remain throughout a school day, and where each subject teacher visits to present a lesson. Be it private or public, schools in South Korea use some form of random assignment to classrooms due to both strong social norms and government policies (Kang 2007). The most common approach is to order students by their academic performance in the previous year and assign them across classrooms. As an example, the top-ranked student would be assigned to the first classroom, the second-ranked student assigned to the second classroom, and so on.7 To confirm this point, we surveyed local Offices of Education on schools’ rules for classroom assignment for the 197 schools in our sample.8 All but one of the 180 responding schools with more than one classroom per grade reported that they used this method of classroom assignment, with the sole exception being a school that used alphabetical order of names to assign students.
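The rank-based assignment rule described above can be sketched in a few lines. This is a minimal illustration, assuming a simple round-robin deal in rank order; the function name `assign_round_robin` and the ten-student cohort are hypothetical, and individual schools may use variants of this rule that are not described here.

```python
def assign_round_robin(ranked_students, n_classrooms):
    """Deal students, ordered from highest to lowest prior-year rank,
    across classrooms in turn (assumed round-robin rule)."""
    classrooms = [[] for _ in range(n_classrooms)]
    for position, student in enumerate(ranked_students):
        classrooms[position % n_classrooms].append(student)
    return classrooms

# Hypothetical cohort of ten students, already sorted by prior-year performance.
ranked = [f"s{r}" for r in range(1, 11)]
rooms = assign_round_robin(ranked, 3)
for number, room in enumerate(rooms, start=1):
    print(number, room)
```

Because the rule depends only on rank, each classroom receives a similar mix of prior achievement, and nothing in the rule refers to the gender of the teachers who will later visit the classroom.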
B. Teacher Assignment
Even with random assignment of students to classroom units, the internal validity of our approach is threatened if teachers are systematically assigned to those classrooms in a way that is related to their gender. For example, female teachers might be assigned to classrooms that, by chance, have students with less-involved parents. There are no written government guidelines on teacher assignment; we interviewed a number of current teachers and principals to gain insight into the process. First, homeroom teachers are assigned, either by lottery or a committee, to a particular classroom. These teachers, who teach a subject themselves, are responsible for discipline, taking attendance at the start of the day, and overseeing study halls before and after school. Subject teachers’ classroom assignments are generally determined in an ad hoc way that is unrelated to student or teacher characteristics. For example, one subject teacher may take odd-numbered classrooms while the other takes even-numbered ones. We surveyed the schools in our sample on these policies as well, with 141 of 153 responding schools reporting that they assign subject teachers without considering student or teacher characteristics. The remaining 12 schools reported considering teachers’ characteristics, such as experience, in making the assignment. Our results are unchanged by excluding these schools, and we once again note that we conducted our survey 11 years after our data were collected. In Section III.A, we further examine whether random assignment holds in our data based on students’ and teachers’ observable characteristics.
III. Methodology
A. Tests of Random Assignment
1. Resampling techniques
While in the institutional setting we study it is clear that students are randomized across classrooms without respect to teacher gender, we also provide empirical evidence to support our identification strategy. We begin by following Carrell and West (2010), Lehmann and Romano (2005), and Good (2006) in using resampling techniques to test the randomness of teacher and student matching in terms of students’ observable characteristics. First, for each classroom within a school, we randomly draw 10,000 synthetic classrooms of the same size from the sample of all students in the school, without replacement. We do so for each of the three subjects (Korean, English, and mathematics) and for each of six variables (indicator variables for student being male, parents being married, father with BA degree or higher, mother with BA degree or higher, having housing ownership, and having Internet access at home). Then, for each subject and characteristic combination, and for each classroom within a school, we calculate the number of students with the characteristic within a classroom.9 We obtain an empirical p-value, namely, the proportion of the 10,000 resampled classrooms that have fewer students with the characteristic (for example, male) than the observed classroom.
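The resampling step for a single classroom and a single characteristic can be sketched as follows. This is an illustration only: the function name `empirical_p_value`, the fixed seed, and the hypothetical school of 120 students are assumptions, not taken from the data.

```python
import random

def empirical_p_value(school, classroom_size, observed_count,
                      n_draws=10_000, seed=0):
    """Share of synthetic classrooms with fewer students holding the
    characteristic than the observed classroom.

    school: list of 0/1 indicators, one per student in the school.
    """
    rng = random.Random(seed)
    fewer = 0
    for _ in range(n_draws):
        # Draw a synthetic classroom without replacement from the school.
        synthetic = rng.sample(school, classroom_size)
        if sum(synthetic) < observed_count:
            fewer += 1
    return fewer / n_draws

# Hypothetical school: 120 students, 60 of them male; the observed
# classroom has 30 students, 15 of whom are male.
school = [1] * 60 + [0] * 60
p = empirical_p_value(school, classroom_size=30, observed_count=15)
print(p)
```

A classroom whose composition matches the school average yields a p-value somewhat below one-half, because the comparison is strict ("fewer") and synthetic classrooms that tie with the observed count are not counted.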
Under random assignment, any p-value will be observed with equal probability; we therefore expect the empirical p-values to be uniformly distributed. We test whether the distribution of the empirical p-values for each subject and characteristic combination is uniform using Kolmogorov–Smirnov and χ2 goodness-of-fit tests. Table 1A presents the results of this exercise, aggregating results over school subject for brevity. Overall, we reject random assignment for 34 out of 1,942 (1.8 percent) school-by-subject-by-characteristic test statistics10 at the 5 percent level in the Kolmogorov–Smirnov test and 71 of 1,942 (3.7 percent) test statistics using the χ2 goodness-of-fit test. Therefore, we do not find evidence of nonrandom assignment of students into classrooms by observable characteristics.
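The uniformity check on a set of empirical p-values can be illustrated with a hand-computed Kolmogorov–Smirnov distance. The sketch below is an assumption-laden simplification: it uses the large-sample 5 percent critical value of roughly 1.36/√n, which is only approximate for the small number of classrooms in a typical school, and the two input lists are hypothetical.

```python
import math

def ks_statistic_uniform(p_values):
    """One-sample Kolmogorov-Smirnov distance from the Uniform(0, 1) CDF."""
    xs = sorted(p_values)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

def rejects_uniformity(p_values, coefficient=1.36):
    """Compare D with the approximate 5 percent critical value 1.36 / sqrt(n)."""
    n = len(p_values)
    return ks_statistic_uniform(p_values) > coefficient / math.sqrt(n)

# Hypothetical p-values: one set spread evenly, one bunched near zero.
evenly_spread = [(i + 0.5) / 20 for i in range(20)]
bunched_low = [p ** 3 for p in evenly_spread]

print(ks_statistic_uniform(evenly_spread), rejects_uniformity(evenly_spread))
print(ks_statistic_uniform(bunched_low), rejects_uniformity(bunched_low))
```

Evenly spread p-values stay close to the uniform distribution and are not rejected, while p-values bunched near zero, as would arise if classrooms were systematically unbalanced, produce a large distance and a rejection.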
We also check the random assignment of teachers with respect to students’ observable characteristics. For each characteristic, we regress the empirical p-values on a set of teacher characteristics, controlling for subject and school fixed effects. The results, in Table 1B, show that only one of 30 coefficients is statistically significant at the 5 percent level. Therefore, there is little evidence of nonrandom assignment of teachers into classrooms with respect to students’ observable characteristics.
2. Pearson’s χ2 tests
We next turn to testing random assignment with respect to observable characteristics by conducting a series of Pearson’s χ2 tests for independence of a variety of characteristics and the classroom to which they are assigned. Tested characteristics include student’s gender, parents’ marital status, parents’ education, as well as whether parents own their own home and whether student’s home has access to the Internet, as proxies for family resources. Parents’ education has seven categories, and the other variables are indicator variables.
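For a single school and a single characteristic, the Pearson statistic compares the observed classroom-by-category counts with the counts expected under independence. The sketch below computes it by hand for a hypothetical school with three classrooms; the function name `pearson_chi2` and the counts are illustrative assumptions.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom.

    table: list of rows (one per classroom), each a list of counts
    (one per category of the characteristic).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical school: three classrooms, columns are (female, male) counts.
table = [[16, 14], [15, 15], [14, 16]]
stat, dof = pearson_chi2(table)
print(stat, dof)
```

With two degrees of freedom, the 5 percent critical value of the χ2 distribution is about 5.99, so the nearly balanced classrooms in this example are far from a rejection.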
We perform 2,082 Pearson’s χ2 tests across six characteristics and 453 school–subject combinations.11 We find that 208 (10.0 percent), 115 (5.5 percent), and 38 (1.8 percent) of these p-values are lower than or equal to 10 percent, 5 percent, and 1 percent, respectively. These rejection rates are close to the nominal levels one would expect under independence, which provides further evidence for the random assignment mechanism described in Section II.A.
To check whether the rejections are concentrated in particular schools, we examine distributions of the number of rejections by school. Figure 1 shows the distributions for all subjects and each subject. Two schools have a total of six rejections in all subjects combined and one school has five rejections. Only one school has as many as three rejections in one subject, suggesting that no schools are failing to comply with the random assignment of students to classrooms. Further, omitting the three schools with five or six total rejections from our estimates does not affect the results.
3. Difference in mean characteristics
Another approach is to compare the groups of students taught by same- and opposite-gender teachers. If the students are randomly assigned to the teachers of the same and opposite gender, then the two groups should look similar in terms of observable characteristics.
Table 2 presents sample means from our data, with each observation as a student–teacher pair. Recall that the randomization in our sample is within schools, although even when looking across schools, the results are fairly well balanced. In Panel A, the characteristics for female and male students are presented separately by teacher gender, demonstrating that students are not more likely to be assigned to a teacher of the same gender based on observable characteristics. For male students, there is a statistically significant difference for home ownership, but it is economically small. Moreover, since random assignment was done within schools, adjusting for school fixed effects eliminates the significance of this difference. We also show the mean standardized test scores by group as a preview of our results. Female students perform substantially better than male students overall, but particularly when they have female teachers. Meanwhile, male students are not greatly affected by the gender of their teacher. In Panel B, we compare teachers’ characteristics when assigned female and male students. As in most schools around the world, female teachers are much more prevalent in our sample, but there are no significant differences in the types of teachers assigned to students of different gender. These results further show that students and teachers are randomly assigned to classrooms irrespective of gender matches.12
B. Specifications
To analyze the effect of teacher–student gender interaction, we estimate the following linear regression equation:
$$y_{ijsb} = \beta_1 fs_i + \beta_2 ft_j + \beta_3 (fs_i \times ft_j) + X_{ij}\gamma + \alpha_s + \alpha_b + \varepsilon_{ijsb} \qquad (1)$$
where $y_{ijsb}$ is the test score of student $i$ who was taught by teacher $j$ in school $s$ for subject $b$. The test scores are normalized in each subject to have mean zero and variance of one. Because the scores in Korean language, English language, and math are pooled together, we also include subject fixed effects $\alpha_b$. $fs_i$ and $ft_j$ are indicator variables having value of one when student $i$ and teacher $j$, respectively, are female. $X_{ij}$ is a vector of student and teacher characteristics, with coefficient vector $\gamma$. Student characteristics include indicators for married parents and parental education. Teacher characteristics include indicators for graduate degree and graduation from a teachers college, and indicators for teacher experience of two years or below, two to three years, three to four years, four to five years, and five years or more. $\alpha_s$ are school fixed effects, included since random assignment of students is done within schools, and $\varepsilon_{ijsb}$ is an error term.
We estimate the equations by ordinary least squares (OLS), which produces unbiased estimates given the random assignment of students and teachers to classrooms. Standard errors are clustered at the school level to accommodate correlations among students within the same schools. We obtain similar standard errors clustering at the classroom level or with two-way clustering at the student and teacher level.
β1 is the average difference in academic achievement for female compared to male students with male teachers, while β2 indicates the impact of a female versus male teacher on performance for male students. The total effect of having a female teacher for female students can be obtained by adding β2 to β3, with β3 as the differential effect on female students, as compared to male students, of having a female teacher. This last coefficient is the change in the gender gap between female and male students when switching from a male teacher to a female teacher.
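The interpretation of the three coefficients is easiest to see in the special case without controls or fixed effects, where the OLS coefficients of Equation 1 coincide with differences in the four student-by-teacher cell means. The sketch below illustrates that special case only; the function name `cell_mean_coefficients` and the eight scores are hypothetical, and it is not the estimation procedure used for the reported results.

```python
from statistics import mean

def cell_mean_coefficients(rows):
    """rows: list of (score, female_student, female_teacher) tuples."""
    cell = {
        (fs, ft): mean(y for y, s, t in rows if (s, t) == (fs, ft))
        for fs in (0, 1)
        for ft in (0, 1)
    }
    # Female versus male students, both with male teachers.
    beta1 = cell[1, 0] - cell[0, 0]
    # Female versus male teacher, for male students.
    beta2 = cell[0, 1] - cell[0, 0]
    # Additional effect of a female teacher on female students.
    beta3 = (cell[1, 1] - cell[1, 0]) - beta2
    return beta1, beta2, beta3

# Hypothetical standardized scores, two observations per cell.
rows = [
    (0.0, 0, 0), (0.2, 0, 0),   # male student, male teacher
    (0.0, 0, 1), (0.1, 0, 1),   # male student, female teacher
    (0.2, 1, 0), (0.3, 1, 0),   # female student, male teacher
    (0.3, 1, 1), (0.4, 1, 1),   # female student, female teacher
]
beta1, beta2, beta3 = cell_mean_coefficients(rows)
female_total = beta2 + beta3  # effect of a female teacher on female students
print(beta1, beta2, beta3, female_total)
```

In this toy example, the sum of the second and third coefficients equals the difference in mean scores between female students with female teachers and female students with male teachers, which is the total effect described above.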
IV. Results
A. Main Effects
Table 3 presents the coefficients from estimating variations of Equation 1. We begin in Column 1, with school and subject fixed effects. The coefficient on the female student variable indicates that female students perform better than male students by about 0.15 of a standard deviation on average across Korean language, English language, and math when paired with a male teacher. The change in the performance gender gap between females and males when switching from a male teacher to a female teacher, as indicated by the interaction effect between female student and female teacher, is 0.098 standard deviations. This total effect comprises a small and statistically insignificant decrease in male performance of 0.021 standard deviations and an increase in female performance of 0.076. This widening of the gender gap is substantial, representing more than one-third of a year of schooling based on the general rule of thumb that 1 percent of a standard deviation of performance is roughly equivalent to 10 days of schooling (Carlsson et al. 2015).
Including teacher background controls in Column 2 does not change the coefficients of interest much.13 We replace school and subject effects with school-by-subject fixed effects beginning with Column 3. In Column 4, we add student fixed effects to test for the presence of unobserved student characteristics correlated with the variables of interest. These also subsume classroom fixed effects and control for peer effects because students do not change classrooms. Their inclusion does not change the gender gap appreciably. In Column 5, we follow Fairlie, Hoffmann, and Oreopoulos (2014) and include classroom-by-subject fixed effects to account for the possibility of subject-specific classroom shocks. The results are unchanged. Finally, in Column 6, we add teacher fixed effects to test whether unobserved teacher characteristics are driving our results, despite random assignment. The teacher–student gender interaction coefficient remains the same size and is statistically significant at p = 0.056. Taken together with the evidence in Section III.A, the stability of this coefficient strongly suggests that the random assignment to classrooms in South Korea is indeed in place. As such, the interpretation of our results is free of the potential problems caused by sorting on unobservable characteristics.
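The arithmetic linking the Column 1 coefficients can be checked directly. The sketch below uses only the rounded values quoted in the text, so the implied female effect differs from the reported 0.076 in the third decimal place; the variable names are illustrative.

```python
beta2 = -0.021  # female teacher, male students (rounded value from the text)
beta3 = 0.098   # female student x female teacher interaction (rounded)

# Effect of a female teacher on female students: beta2 + beta3.
female_total = beta2 + beta3

# Rule of thumb cited from Carlsson et al. (2015):
# 1 percent of a standard deviation is roughly 10 days of schooling.
days_per_percent_of_sd = 10
gap_in_days = beta3 * 100 * days_per_percent_of_sd

print(round(female_total, 3), round(gap_in_days))
```

The change in the gender gap corresponds to roughly 98 days of schooling under the rule of thumb; expressing this as a share of a year requires a school-year length, which is not stated in the text.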
Our findings are comparable in magnitude to those in Dee (2007) and Carrell, Page, and West (2010). Dee’s estimate of the increase in the gap between female and male students when assigned to a female teacher is about 0.092, with opposing positive and negative effects of similar size for female and male students, respectively. While our effect is concentrated on improvements for female students, it is quite similar in magnitude for a similar length of exposure to that year’s teachers (about one semester). Carrell, Page, and West’s effect, for somewhat less than one semester of exposure to a female professor, is 0.097 standard deviations with a reduction in male performance of 0.050 standard deviations.
To show that our results are not affected by the 6,442 observations that were dropped due to students having multiple subject teachers, we include them in Column 7. This specification corresponds to that in Column 1, but the female teacher variable represents the fraction of the student’s subject teachers who are female. About 90 percent of these additional observations are groups of two teachers; nearly all the remaining ones have three teachers. The results are essentially unchanged, with the gender gap increasing by 0.10 standard deviations when all of a female student’s teachers are female themselves.14
B. Effects by Subject
The gender gap differs by subject, with female students generally performing substantially better than males in language arts but about even or slightly worse in science and mathematics (OECD 2015). Teachers’ impacts may be greater in mathematics, given negative stereotypes about female mathematical ability. For example, Spencer, Steele, and Quinn (1999) found that negative stereotypes regarding the mathematical ability of female students negatively affect their test scores.
To test whether our results vary by subject, we fully interact the specification in Column 1 of Table 3 with indicators for English and mathematics. The coefficients, in Panel A of Table 4, show the full set of interactions. We note that female students perform far better than male students in Korean (0.34 standard deviations) and English (0.20 standard deviations) and about evenly in math (-0.04 standard deviations), with the last of these differences being statistically insignificant. In Panel B, we combine the relevant coefficients to calculate the change in the gender gap between female and male performance when switching from a male to female teacher. For Korean language courses, the gender gap between girls and boys does not widen significantly, though it does for English and math. However, there are no statistically significant differences between these effects.
V. Evidence on Mechanisms
To investigate the mechanisms underlying the positive impact of female teachers on female students, we examine a series of student responses about classroom interactions, as well as questions about private tutoring asked of parents. There are numerous such questions in the KEDI data, but we chose to focus on those that may distinguish student- and teacher-centered mechanisms. The results in Table 5 correspond to the specification in Column 1 of Table 3. In Columns 1–4, the dependent variable is an indicator for whether the student agrees or agrees strongly with the following sentiments, respectively: the teacher provides students with equal opportunity to participate in class, the teacher encourages students to express themselves, I feel comfortable asking the teacher a question, and I ask many questions in this class. The first two questions are proxies for teacher-centered mechanisms—that is, they are about the teacher’s behavior. The next two questions are about the student’s behavior, as are the estimates in Columns 5–7. In Column 5, the dependent variable is a continuous measure of hours of study in that subject (excluding hours spent at tutoring). Column 6, asked of parents, reports the likelihood of receiving tutoring in the subject; note that over 60 percent of students receive tutoring. In Column 7, we examine the effect on the log of tutoring expenditures, conditional on reporting any. This variable, reported by parents as well, provides an indication of tutoring intensity, both in terms of time and personal attention. Finally, Column 8 is the impact on student’s self-report that the subject is his or her favorite. This can be influenced by both student- and teacher-centered mechanisms and provides a useful proxy for the student’s overall response to the teacher.
With male teachers, female students are significantly less likely to feel as if they have an equal opportunity to participate or are encouraged, but this negative outcome is eliminated when the teacher is female. On the other hand, while all students report greater comfort in asking questions when the teacher is female, there is no additional effect on female students. They also are somewhat less likely to report asking many questions. There is no effect on hours of study, nor on either tutoring outcome variable. Overall, female students are significantly more likely to report a subject as their favorite when the teacher is female.15
Finally, we examine whether the effects differ by the proportion of female students in the classroom. A greater number of female students means that a female teacher can give less attention to each individual student; on the other hand, a higher proportion of female students may enable the teacher to provide a more welcoming environment for girls. We begin by examining whether the impacts of the teacher–student gender match are greater at single-sex schools. While the interaction effect is somewhat larger (0.037 standard deviations), it is not statistically significant (p = 0.50); similar results are obtained when comparing single-sex to coeducational classrooms. Female students in classrooms with above-median numbers of females do perform better with female teachers (0.036 standard deviations), but once again the difference is not significant (p = 0.57). We also interact the proportion of female students in a coeducational classroom with the teacher–student gender interaction. Once again, the difference is fairly large but statistically insignificant. For example, a ten percentage point increase in female representation increases the interaction effect by 0.013 standard deviations, with a standard error of 0.023 standard deviations. Without making too much of these differences, they suggest that female teachers may be conducting their classrooms differently in a manner that has a positive impact on female students.
Taken together, the results on female students’ responses to female teachers and the somewhat larger effects for classrooms with more female students provide suggestive evidence that the increase in female student performance with female teachers is driven by teacher rather than student behavior.
VI. Conclusion
Understanding the effect of teacher–student gender interactions on students’ academic achievement is important not only for evaluating policies to close the gender gap in academic achievement, but also to enhance understanding of the education production function. However, it is difficult to estimate a student–teacher gender match effect free of selection bias because of the nonrandom sorting of students.
In this study, we estimate the impact of teacher–student gender matches on academic achievement using the random assignment of students in South Korea. We find that the performance gender gap between female and male students increases dramatically when switching from male to female teachers (0.098 standard deviations). Male students do not appear to benefit appreciably from a teacher of the same gender, but female students’ performance increases by about 8 percent of a standard deviation when they are taught by a female teacher. This effect is large and driven primarily by performance in English and mathematics courses. We also provide evidence that teacher behavior drives this increase in student achievement.
Our findings are consistent with the results of Dee (2007) and Carrell, Page, and West (2010). Combining these similarities, the random assignment nature of our approach, and the evidence on South Korea’s attitudes toward gender equality (Brandt 2011), we conclude that these interactions reflect genuine changes in the classroom environment that are not necessarily driven by the environment being studied.
Acknowledgments
The authors are grateful for valuable comments from Tom Dee, Mark Hoekstra, Jason Lindo, and James West. The data used in this article can be requested from the Korean Educational Development Institute (KEDI) at http://eng.kedi.re.kr/.
Footnotes
Jaegeum Lim is a legislative researcher at the National Assembly of the Republic of Korea. Jonathan Meer is a professor of economics at Texas A&M University.
* Supplementary materials are freely available online at: http://uwpress.wisc.edu/journals/journals/jhr-supplementary.html
1. Other evidence on gender-matching effects on student grades, course-taking, and persistence in colleges is mixed; see, for example, Canes and Rosen (1995), Bettinger and Long (2005), and Hoffmann and Oreopoulos (2009).
2. Our work also is related to the literature on the impact of single-sex schools. Park, Behrman, and Choi (2013) find significant positive impacts of single-sex schooling using random assignment in South Korea, while Jackson (2012) exploits the nature of rules-based school assignment in Trinidad and Tobago and finds little effect for most students.
3. As discussed below, excluding single-sex schools or single-sex classrooms does not change our results.
4. This exam is administered to ninth graders in Seoul every September; these students would have taken the test regardless. Students living outside of Seoul but in the KEDI sample took the same exam on the same day.
5. A concern is that teacher nonresponse somehow could be correlated with their impact on students of different genders. While we cannot completely exclude this possibility, students dropped from the sample due to teacher nonresponse have test scores similar to those of students remaining in the sample (p = 0.77). There were also no statistically significant differences in the other student characteristics we examined.
6. Since 1996, students in districts whose superintendents allow it are permitted to list several preferred schools. They are entered into a lottery for each school on their preferred list (Korea Legislation Research Institute 2011).
7. Kang (2007) uses this same random assignment feature and a different data set on the performance of Korean students to examine peer effects. As mentioned above, Park, Behrman, and Choi (2013) examine the effect of single-sex education on college-going behavior using data from Seoul, in which students are not allowed to list preferred schools. Lee et al. (2014) examine schools in the Seoul metropolitan area to study the effects of single-sex versus coeducational schooling on academic performance.
8. Note, of course, that schools were responding 11 years after the KEDI survey was conducted. In recent years, the Korean education system has shifted somewhat from its original strictly egalitarian approach, so it seems quite likely that these as-good-as-random practices were in place in 2004. See, for example, Byun and Kim (2010), who discuss increased use of ability tracking in South Korea over the past decade.
9. Carrell and West (2010) use the sums of SAT scores or academic composite to obtain empirical p-values. Similarly, we sum the indicator variables.
10. For some school by subject by characteristic combinations, test statistics cannot be calculated because of missing variables, only a single classroom from the school remaining in the sample, or the school itself being single-sex for the student gender characteristic.
11. Some combinations cannot be tested due to missing variables, only a single classroom from the school remaining in the sample, or the school itself being single-sex.
12. We also follow Carrell, Page, and West (2010) in examining whether the student characteristics in Table 1 predict teacher gender; they are jointly insignificant at p = 0.31.
13. Results are similar when including a variety of student background characteristics, but survey nonresponse reduces the sample substantially.
14. We also estimate the specification in Column 1 excluding 7,964 student-teacher observations at single-sex schools, and an additional 1,620 observations assigned to single-sex classrooms in coeducational schools. The female teacher-female student coefficient is 0.087 (SE = 0.035) for the former sample and 0.083 (SE = 0.035) for the latter. We also test whether our results differ for students in rural and urban areas and by parental education. No consistent patterns emerge, and none of the differences in the interaction variable are statistically significant.
15. We estimated versions of Table 5 with interactions by subject, as in Table 4. While effects tended to be larger in math and, to a lesser extent, English classes relative to Korean classes, none of the differences were statistically significant.
- Received December 2015.
- Accepted May 2016.