Abstract
We show that a student’s ordinal ability rank in a high-school cohort is an important determinant of engaging in risky behaviors. Using longitudinal data from representative U.S. high schools, we find a strong negative effect of rank on the likelihood of smoking, drinking, having unprotected sex, and engaging in physical fights. We further provide evidence that these results can be explained by sorting into peer groups and differences in career expectations. Students with a higher rank are less likely to be friends with other students who smoke and drink, while they have higher expectations towards their future educational attainment.
I. Introduction
Risky health behaviors of adolescents such as smoking, binge drinking, and unprotected sex are suspected to have immediate negative impacts on educational performance, as well as far-reaching consequences for a person's labor market prospects and health (Carrell, Hoekstra, and West 2011; Cawley and Ruhm 2011; Carpenter and Dobkin 2009). To protect adolescents from these negative consequences, it is important to know the determinants of risky behaviors. A major determinant identified in the literature is spillovers from peers: the more likely one's peers are to smoke, drink, or take drugs, the more likely a person is to engage in such behaviors.
In this paper, we explore an additional important channel through which peers affect the engagement in risky behaviors: a student's ordinal rank in a school cohort. Depending on the cohort composition, the same student may rank highly in the ability distribution of one cohort and have a low rank in another. Our paper makes three contributions. First, we show that between two otherwise-identical students, the student with the higher rank is significantly less likely to engage in risky behaviors, and we provide evidence that this relationship is causal. Second, we explore potential channels that could explain this result, among them selection into peer groups and the effect of ordinal rank on career expectations. Third, in comprehensive simulations we show the conditions under which we can obtain a meaningful estimate of the impact of ordinal rank even if we only observe a sample of the underlying ability distribution.
We use data from the National Longitudinal Study of Adolescent to Adult Health (AddHealth), a representative panel survey of U.S. middle- and high-school students that offers several key features for our analysis. First, multiple cohorts were sampled within each school, allowing us to apply a within-school/across-cohort design and observe each student's peer group in high school. Moreover, the survey contains a standardized cognitive ability test, which makes students' ability comparable across schools and cohorts and allows us to rank students according to their cognitive ability. Finally, the survey includes detailed information on risky behaviors, expectations, attitudes, and feelings, helping us to explore potential mechanisms behind the reduced-form results.
We measure a student's ability rank as his/her ordinal position in the ability distribution of his/her school cohort. Figure 1 shows a clear negative relationship between the within-cohort rank and engagement in risky behaviors after controlling for school fixed effects and a student's absolute cognitive ability. To give these correlations a causal interpretation, we exploit idiosyncratic variation in cohort composition within a school. Students with the same absolute ability face different ability distributions, thus having different ordinal ranks in different cohorts within the same school. To isolate the variation in the ordinal rank from variation in the average peer composition we exploit the fact that the rank is assigned individually to every student in a school cohort and estimate a model with school-by-cohort fixed effects, in which the rank effect is identified from differences in the variance and higher moments of the ability distribution across school cohorts. These absorb all confounding factors at the school-cohort level, such as teacher quality, differences in average ability, different shares of disruptive students, or other unobserved group shocks. Under the identifying assumption that being in one cohort or another is determined by a student's birth date and the cutoff date for school entry, the variation in the ordinal rank can be considered quasirandom, giving the impact on risky behavior a causal interpretation.
Figure 1. Partial Correlations: Ordinal Rank and Risky Behavior
Notes: Bin scatters using 20 bins illustrating the relationship between likelihood of engaging in risky behavior and the ordinal rank within a school cohort (0 = lowest rank, 1 = highest rank), conditional on absolute ability and school fixed effects.
Applying this research design, we find a negative effect of a student's ordinal rank on a large number of risky behaviors. The effects are large and statistically significant for smoking, drinking, risky sex, and engagement in physical fights. For a given level of absolute ability, a one-decile increase in the ordinal rank—reflecting an increase of ten rank positions in a cohort of 100 students, or around one within-school standard deviation in rank—reduces the probability of smoking by 1.2 percentage points (relative to a mean of 17.8 percent) and the probability of drinking by 1.4 percentage points (mean 18.3 percent). It also significantly reduces the likelihood of having sex without birth control by 0.6 percentage points (mean 8.7 percent) and engaging in physical fights by 1.4 percentage points (mean 19.5 percent). These effects are large, given that we compare students who attend the same school and have the same absolute level of ability and the same observable characteristics. The effects for marijuana use, stealing, and drug selling are also negative, albeit smaller and statistically insignificant.
These findings are consistent with two theoretical models predicting a negative effect of a student's ordinal rank on risky behaviors through different channels. First, our findings can be reconciled with a human capital model in the spirit of Grossman (1972), in which students trade off the short-run pleasure against the long-run costs of risky behaviors. A high rank may signal to students a high future income, thus increasing the opportunity cost of risky behaviors. Second, our findings are in line with a model of self selection into social categories within a school cohort or class (Akerlof and Kranton 2002). Cicala, Fryer, and Spenkuch (2017) present such a model with two groups: “nerds,” who attain a high status through high educational achievement, and “troublemakers,” who attain a high status through engaging in risky behaviors. Because a student with a low ordinal rank has a comparative advantage in being a “troublemaker” relative to being a “nerd,” he/she engages more in risky behaviors than if he/she had a higher rank.
Exploiting extensive survey information on friendship formation, expectations, and attitudes, we further investigate which of these theoretical mechanisms are supported by the data, finding evidence for both. To test whether a student's ordinal rank affects the sorting into peer groups, we analyze data on friendship networks within the school, showing that students with a higher rank are less popular and less likely to be friends with peers who smoke or drink. We also show that students with a higher rank have a higher perceived intelligence and higher expectations towards their educational career. This result provides evidence of the ordinal rank shaping expectations, which may in turn affect the engagement in risky behaviors, thus supporting the notion that the ordinal rank provides students with a noisy signal about their actual ability.
In a series of robustness checks, we carefully consider several threats to identification. One issue is reverse causality, whereby students may achieve a low rank because they have engaged in risky behaviors. To address this concern, we estimate a lagged dependent variable (LDV) model and also show that risky behaviors do not predict cognitive ability measured two years later. A further issue is individual-level confounders that may be correlated with the rank but also have a direct effect on risky behaviors. In extensive simulations, we show the conditions under which these confounders bias the estimates, as well as the direction in which the bias would work. For the majority of plausible confounders such as parental pressure or intrinsic motivation, we show that our estimates are biased towards zero. Finally, because we base the ranking on a random sample of every school cohort, the rank is measured with error. In simulations, we demonstrate that this measurement error results in a moderate attenuation bias.
This paper provides new insights into the determinants of risky behaviors among adolescents. In particular, it complements the literature on peer effects, which finds substantial spillovers of risky behaviors (Gaviria and Raphael 2001; Lundborg 2006; Clark and Loheac 2007; Soetevent and Kooreman 2007; Argys and Rees 2008; Eisenberg, Golberstein, and Whitlock 2014).1 Based on our identification strategy, we demonstrate that a student's ordinal rank is an additional—and equally important—channel through which a peer group affects behavior. The paper also relates to the work of Balsa, French, and Regan (2014), who identify relative material deprivation as a determinant of risky behavior, whereby students with a higher social status within their cohort are less likely to engage in risky behavior. Our paper complements their findings by showing that, after controlling for social status, the relative ability of a student is an equally strong determinant of risky behaviors.
Furthermore, our paper highlights the importance of ordinal rank in high school for noneducation outcomes. Recent research has found a significant impact of a student's ordinal rank in school on test scores (Azmat and Iriberri 2010; Murphy and Weinhardt 2014; Goulas and Megalokonomou 2015).2 This paper extends previous work in which we show that a student's ordinal rank significantly affects their decision to attend college (Elsner and Isphording 2017). The present paper departs from this work in several important ways. First, we show that the ordinal rank in high school matters for a large number of noneducational outcomes, which in turn are important determinants for health and success later in life. As a second contribution, we explore sorting into peer groups as an important channel through which the ordinal rank affects risky behaviors, showing that highly ranked students are less popular and less likely to be friends with students who themselves engage in risky behaviors. Finally, we make a methodological contribution by assessing the biases from measurement error and omitted variables inherent in the empirical analysis of the impact of ordinal rank. In extensive simulations, we demonstrate the conditions under which these biases matter for the estimation, as well as what sign and magnitude they have.
The remainder of the paper unfolds as follows. Section II describes the dataset and the construction of the main variables of interest. In Section III, we present the identification strategy and discuss potential threats to identification. In Section IV, we show the main results, present a series of robustness checks, and summarize the results of comprehensive simulation exercises aimed at quantifying the biases from measurement error and omitted variables. In Section V, we explore several channels that can help to explain why a student's ordinal rank affects risky behaviors. Finally, Section VI concludes.
II. Data and Descriptive Statistics
We base our empirical analysis on data from the National Longitudinal Study of Adolescent to Adult Health (AddHealth). The AddHealth dataset is particularly suited for our application because its main focus lies on the interaction between adolescents' education and health behavior. Moreover, it covers multiple cohorts within a school, allowing us to hold school characteristics constant through a within-school/across-cohort design. In the following, we describe the dataset and the sample, and present descriptive statistics for the main outcome variables.
A. The AddHealth Dataset
AddHealth is a panel survey of 144 representative middle and high schools in the United States. Students are followed from adolescence into adulthood in four waves. In our application, we use the first two waves of the survey, the first of which was collected in 1994–95, when students were on average 16 years old, while the second wave was collected in 1996. Within each school, up to six different cohorts were initially sampled in Wave I of the survey. Each cohort is observed in a different grade level—that is, Cohort 1 in Grade 7, Cohort 2 in Grade 8, etc. In Wave II, the same cohorts are one grade level higher.
The AddHealth data comprise multiple samples. The in-school sample comprises all students of sampled schools that were present on a fixed interview date. This sample provides brief information on health behaviors and educational achievement, but lacks more in-depth information on risky behaviors and skills, and students have not been followed over time. Therefore, for our main analysis we use the in-home sample of AddHealth, which includes comprehensive information on health conditions and behavior, family environments, cognitive ability, educational achievement, and friendship relationships with repeated observations over time.
The in-home sample comprises a random sample of 17 boys and 17 girls drawn from every grade level of each school—the so-called “core sample.”3 In addition, students from specific minorities were oversampled (Puerto Ricans, Chinese, Cubans, high-educated blacks, twins, siblings, and students with disabilities). A small number of schools was sampled completely.4
From the 14,398 students that we observe in both Wave I and II of AddHealth, we drop observations from all schools with 20 individuals or fewer (55 obs.) and all grades with five students or fewer (340 obs.). Furthermore, we delete from the in-home survey all observations with missing information in dependent and control variables (1,545 obs.).5
The final sample comprises 12,536 students in 132 schools and 461 school-cohort combinations. See the Online Appendix (Table 5) for the summary statistics for the main control variables.
B. The Ordinal Rank
Our regressor of interest is a student's ordinal ability rank in a high-school cohort, which measures how a student ranks in terms of cognitive ability relative to all other students in the same school cohort. We construct this rank based on a standardized measure of cognitive ability, which is available for all students in our sample. The in-home sample of AddHealth includes an abridged version of the Peabody Picture Vocabulary Test (PPVT), which measures logical reasoning and has been shown to strongly correlate with other intelligence tests, such as the Wechsler Intelligence Test or the Armed Forces Qualifying Test (AFQT) (Baker et al. 1993; Dunn and Dunn 2007). The test is age-specific and is carried out in 87 rounds. In each round, students are shown four pictures and given a word that they have to match to the picture that fits best. With every round, the test increases in difficulty. From this series of answers, standardized scores are computed, with more difficult tasks receiving a higher weight. The Peabody test has been shown to be a feasible and successful method for assessing basic cognitive abilities in large-scale surveys, with high retest reliability and stability of scores during childhood (Dunn and Dunn 2007).
With all students in the in-home sample, the PPVT was carried out face-to-face with the interviewer. The results were computed after the survey day and were neither disclosed to the students nor to their teachers or parents. Therefore, students had no incentive to achieve a particular rank position, and their performance is unlikely to be influenced by parental pressure, peer pressure, or teaching to the test.6
On the basis of the Peabody score, we rank all students within a school cohort, assigning Rank 1 to the student with the lowest score and Rank N (the total number of students in the cohort) to the student with the highest score. To ensure comparability across cohorts with differing size, we compute a student's relative rank position by standardizing the absolute rank to the cohort size,

$$ r_{isc} = \frac{n_{isc} - 1}{N_{sc} - 1}, \qquad (1) $$

where n_{isc} is the absolute rank of student i in school s and cohort c, and N_{sc} is the number of students in that school cohort, which results in a rank measure bounded between 0 (the lowest-scoring student) and 1 (the highest-scoring student).
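The rank construction can be illustrated with a short Python sketch. It assumes the normalization (absolute rank − 1)/(N − 1), which is the simplest formula consistent with the bounds stated above. The function name `percentile_ranks` is hypothetical, and the averaging of tied scores is an assumption, since the text does not describe how ties are handled.

```python
from typing import List


def percentile_ranks(scores: List[float]) -> List[float]:
    """Relative rank in [0, 1] for every student of one school cohort.

    Absolute rank 1 goes to the lowest score and rank N to the highest;
    the relative rank is (absolute rank - 1) / (N - 1).
    Tied scores share the average of their absolute ranks (an assumption).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two students to form a relative rank")
    order = sorted(range(n), key=lambda i: scores[i])
    abs_rank = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the block over all students who share the same score.
        while end + 1 < n and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # average 1-based rank of the tie block
        for k in range(pos, end + 1):
            abs_rank[order[k]] = avg_rank
        pos = end + 1
    return [(r - 1) / (n - 1) for r in abs_rank]


# Hypothetical test scores for a cohort of three students.
example = percentile_ranks([95, 110, 102])
```

In this toy cohort the lowest-scoring student receives 0, the highest-scoring student receives 1, and the middle student receives 0.5.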
In our research setting, we measure the cognitive ability of a student in a given school year and rank them relative to all other students in the same school cohort in the same school year. We subsequently estimate the impact of the ordinal rank on risky behaviors under the maintained assumption that a student's cognitive ability is formed during childhood and thus predetermined during adolescence. This assumption is supported by the findings of Cunha et al. (2006) and others, showing that cognitive skills are mostly formed before the age of ten and remain fairly stable thereafter.7
In each school cohort, we observe the ability distribution of a random sample of around 30 students, which we use to approximate the ability distribution of the entire school cohort.8 In a regression of risky behaviors on a student's rank, this approximation introduces measurement error because some students are assigned a higher rank than they would have in the population, while others are assigned a lower rank. In the Online Appendix (Table 6) we demonstrate that even without observing the full population, one can obtain a meaningful estimate for the average impact of the ordinal rank on various outcomes as long as the sample is drawn randomly from every school cohort.
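The sampling-induced measurement error can be made concrete with a small simulation. The sketch below is not the simulation reported in the Online Appendix. It assumes normally distributed ability, a cohort of 150 students, and a random sample of 30, all of which are illustrative choices, and the function names are hypothetical.

```python
import random
from typing import List, Tuple


def relative_rank(score: float, scores: List[float]) -> float:
    """(rank - 1) / (N - 1) for a student whose score is contained in `scores`."""
    return sum(1 for s in scores if s < score) / (len(scores) - 1)


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient, computed by hand."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate_rank_error(cohort_size: int = 150, sample_size: int = 30,
                        n_cohorts: int = 200, seed: int = 0) -> Tuple[float, float]:
    """Return (mean absolute rank error, correlation of sample and population rank)."""
    rng = random.Random(seed)
    pop_ranks: List[float] = []
    smp_ranks: List[float] = []
    for _ in range(n_cohorts):
        cohort = [rng.gauss(100, 15) for _ in range(cohort_size)]  # assumed ability scale
        sample = rng.sample(cohort, sample_size)  # random sample of the cohort
        for score in sample:
            pop_ranks.append(relative_rank(score, cohort))  # rank in the full cohort
            smp_ranks.append(relative_rank(score, sample))  # rank observed by the researcher
    mae = sum(abs(p - s) for p, s in zip(pop_ranks, smp_ranks)) / len(pop_ranks)
    return mae, pearson(pop_ranks, smp_ranks)


mean_abs_error, rank_correlation = simulate_rank_error()
```

Under these assumptions the sample-based rank tracks the population rank closely but not perfectly, which is the classical measurement-error setting that motivates the attenuation argument.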
The main advantage of using Peabody scores as a base for the ranking is their comparability across cohorts within a school. A potential alternative metric would be grades, which are more visible to the student than cognitive ability. However, grades have the disadvantage of not being standardized within and across schools. Rather, many teachers apply grading on a curve; that is, they grade exams according to an a priori determined distribution.9 With such a grading scheme, two students with the same grade point average (GPA) in the same school may considerably differ with respect to ability and other characteristics. In addition, grades in AddHealth are self-reported and have many missing observations, thus making them less suitable for our analysis compared with the Peabody test.10
One might be concerned whether students actually know their rank, given that school cohorts are large, and students do not observe their score on the ability test. However, while the precise rank in a large peer group may not be perfectly observable, it is plausible that students know more about their relative ability in their cohort than about their absolute ability. It should also be noted that our identification does not rely on small idiosyncratic differences between ranks. In our sample, the within-school standard deviation in ordinal rank conditional on absolute ability is 0.12, which indicates that two students in the same school with the same level of ability end up on average in very different percentiles if they are in different cohorts. While it may not be plausible that a student accurately assesses whether he/she has rank 70 or 71 in a group of 100 students, he/she presumably can distinguish between peers who are ranked higher or lower by ten ranks, which is about one within-school standard deviation in ranks in our sample. In later estimations, we provide evidence that students with a higher rank have a higher perceived intelligence compared to those in the same school with the same ability but a different rank, suggesting that students indeed have an idea about their position in the ability distribution of their school cohort.
C. Outcome Variables: Risky Behaviors
We consider as outcomes five types of risky behaviors: smoking, (binge) drinking, marijuana consumption, risky sex, and delinquent behavior (stealing, physical fights, and drug selling). All dependent variables are constructed as binary indicators for different intensities of risky behavior. The information on risky behaviors is available in Wave I and II of AddHealth. Because the questions are retrospective (for example, “During the past 30 days, on how many days did you smoke cigarettes?”), we use the answers from Wave II as outcome variables and regress them on the ordinal rank in Wave I. Wave II was collected in 1996, around 18 months after Wave I. We further use similarly constructed indicators from Wave I to control for trends in risky behaviors in some specifications.
All behaviors are self-reported. To reduce the risk of misreporting—whether due to peer pressure, the presence of an unknown interviewer or the fear of being reported to the school authorities—the answers to sensitive questions about drugs, sex, health behaviors, or criminal activity were elicited through computer-assisted self-interviews (CASI). The questions were played to the participant via headphones and the answers were anonymously typed into a laptop without being shown to the interviewer.11 Our main dependent variables are constructed as follows:
1. Smoking
The indicator for smoking is based on the question “During the past 30 days, on how many days did you smoke cigarettes?” In our main analysis, we focus on an indicator for regular smoking, which equals one if students report having smoked on at least ten out of the past 30 days. Further intensities considered in the Online Appendix are whether individuals have ever smoked and whether they have smoked intensively (ten daily cigarettes on at least ten out of the past 30 days).
2. Drinking
The indicator for drinking is based on the question “During the past 12 months, on how many days did you drink alcohol?” In our main analysis, we focus on regular alcohol consumption, which we define as drinking on at least two days per month during the last year. In the Online Appendix, we also consider whether individuals have ever consumed alcohol and whether they have been drunk on at least two days per month during the last year.
3. Marijuana use
The indicator for marijuana use is based on the question “During the past 30 days, on how many days did you consume marijuana?” We consider as a regular marijuana user someone who consumed at least once in that period. In the Online Appendix, we also consider as outcomes whether someone has ever consumed marijuana and whether someone consumed more regularly.
4. Sex
In our main analysis, we consider an indicator for sexual intercourse without any measure of birth control within the past six months, which equals one if neither the respondent nor his/her partner used contraceptives during their most recent intercourse. In the Online Appendix, we additionally consider binary indicators for having ever had sexual intercourse, as well as for having had any sexual intercourse (with or without contraception) within the past six months.
5. Delinquent behavior
Finally, we assess three categories of delinquent behavior. We construct binary indicators for stealing (including shoplifting and burglary), engagement in physical fights, and drug selling, each of which equals one if a person reported that he/she engaged in the respective behavior at least once in the last 12 months. For delinquent behaviors, we only observe the incidence but not the intensity.
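The thresholds behind these binary indicators can be summarized in a few lines of Python. This is a minimal sketch with hypothetical function and argument names. It assumes the survey answers are available as counts of days or incidents. The drinking indicator is left out because the text does not describe how the survey codes drinking frequency.

```python
def regular_smoker(days_smoked_past_30: int) -> int:
    """1 if the student smoked on at least ten of the past 30 days."""
    return int(days_smoked_past_30 >= 10)


def recent_marijuana_user(days_used_past_30: int) -> int:
    """1 if the student consumed marijuana at least once in the past 30 days."""
    return int(days_used_past_30 >= 1)


def delinquency_indicator(incidents_past_12_months: int) -> int:
    """1 if the behavior (stealing, fighting, or drug selling) occurred at least once."""
    return int(incidents_past_12_months >= 1)
```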
6. Descriptive statistics
Table 1 lists baseline probabilities of risky behaviors in Wave II of AddHealth for different subgroups. Smoking, drinking, and marijuana consumption are widespread in the observed population, with 18 percent of students regularly smoking or consuming alcohol and 16 percent having recently consumed marijuana. Rates for alcohol and marijuana consumption are larger for boys than girls. Consumption appears to be largely unrelated to parental socioeconomic status, although children of college-educated parents display marginally lower consumption rates. There are strong racial disparities in consumption: black students are much less likely to smoke and drink than their white counterparts, although the difference is smaller for marijuana consumption.
About 9 percent of all students report having had sexual intercourse without using contraceptives within the past six months. This number is larger for female students (10 percent) compared to male students (8 percent) and is strongly related to parental education (14 percent for children of high school dropouts compared to 6 percent for children of college graduates).
Average delinquency rates range from 23 percent for stealing to 7 percent for drug selling and are predominantly driven by male students. Racial disparities and differences by socioeconomic background are less pronounced than with the previous categories of risky behavior.
Finally, we report the probabilities for different grade levels. Students who attended Grade 7 in Wave I were on average 15 years old when we measure their risky behaviors in Wave II, while those in Grade 12 were on average 20 years old in Wave II. With the exception of delinquent behaviors, engagement in risky behaviors increases with age. The baseline probabilities for smoking, drinking, marijuana use, and sex—as well as the age gradients in these behaviors—are similar to those reported in Argys and Rees (2008), which were based on the National Longitudinal Survey of Youth 1997 (NLSY97).
III. Empirical Strategy
We aim to estimate the causal impact of a student's ordinal rank on risky behaviors. In this section, we explain how we identify the effect by exploiting differences in the ability distribution across cohorts within a school. We first describe the empirical model that allows us to isolate the identifying variation before discussing the identifying assumptions. We also briefly highlight some threats to identification, although we provide a more extensive discussion along with the results.12
A. Identification: Basic Idea
Figure 1 in the introduction has revealed a significant negative relationship between the within-cohort rank and engagement in risky behavior. However, a simple correlation cannot be interpreted as causal because it could be driven by selection into schools, differences in parental background, average peer quality, or other unobserved factors that may simultaneously affect a student's rank and his/her engagement in risky behavior. Identification of a causal effect thus requires exogenous variation in a student's ordinal rank.
Our research design is based on the idea that the same student would face a different ability distribution if he/she was in a different cohort and would thus have a different ordinal ability rank in the respective cohort. Given that we can only observe each student in one school cohort, we compare students in the same school who have the same level of absolute ability but a different ability rank due to being in different cohorts. With rank being assigned according to one's own ability as well as the ability distribution of one's school cohort, this variation in rank may come from differences in the average cohort ability (Figure 2, Panel A), the variance (Panel B), or higher moments of the ability distribution.
Figure 2. Variation in Mean and Variance of Ability
Notes: This figure illustrates two sources of variation used in the identification of the rank effect. A student with a given level of ability has a different rank in different school cohorts if the cohorts differ in their mean ability (Panel A), the dispersion of the ability distribution (Panel B), or both. This graph is adapted from Elsner and Isphording (2017) and Murphy and Weinhardt (2014).
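A small numerical example may help to fix ideas. The peer scores below are invented for illustration and are not taken from AddHealth. The sketch places one hypothetical student with a score of 105 into three cohorts of five peers: a baseline cohort, a cohort with a higher mean, and a cohort with the same mean as the baseline but a smaller dispersion.

```python
from typing import List


def relative_rank(own_score: float, peer_scores: List[float]) -> float:
    """Share of peers scoring below own_score, i.e. (rank - 1) / (N - 1) absent ties."""
    return sum(1 for s in peer_scores if s < own_score) / len(peer_scores)


own = 105  # the same absolute ability in every scenario

baseline = [80, 90, 100, 110, 120]        # mean 100
higher_mean = [90, 100, 110, 120, 130]    # mean 110, same spread
lower_spread = [96, 98, 100, 102, 104]    # mean 100, smaller spread

rank_baseline = relative_rank(own, baseline)          # 3 of 5 peers score lower
rank_higher_mean = relative_rank(own, higher_mean)    # 2 of 5 peers score lower
rank_lower_spread = relative_rank(own, lower_spread)  # all 5 peers score lower
```

The student's rank falls when the cohort mean rises (as in Panel A) and rises when the cohort is less dispersed around the same mean (as in Panel B), although the student's own ability never changes.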
To estimate a causal effect, we want to exploit the component of cohort-to-cohort variation in the shape of the ability distribution that is plausibly exogenous from a student's perspective. In principle, the ability distribution differs from cohort to cohort for a variety of reasons, many of which are beyond the influence of the student or his/her parents. In most schools, whether a student is in one cohort or another is determined by a cutoff date and the student's birth date: if a student is born before the cutoff, he/she goes to school one year earlier than someone born after the cutoff. This may create idiosyncratic variation in the cohort composition if children with higher ability are born before the cutoff in some years and after the cutoff in other years. In addition, the ability distribution within a school catchment area naturally fluctuates more than it would in the entire United States. If we looked at entire school entry cohorts in the United States, the cohort composition would not strongly differ from one year to another due to the law of large numbers. However, within a school catchment area, where a school entry cohort is 100–200 students, cohorts are too small for the law of large numbers to smooth out such differences, which is why the ability distribution may naturally fluctuate across cohorts.
To be sure, there may also be systematic factors explaining why the ability distribution may vary between cohorts within a school, such as selection into schools.
Our identification strategy seeks to isolate the idiosyncratic variation from these systematic—and potentially confounding—factors. The variation in the ordinal rank across cohorts within a school can be considered quasirandom under two assumptions: (i) being in one cohort or another is beyond the influence of the student, his/her parents or school administrators, and (ii) the difference in the ability distribution across cohorts within a school is not systematic—that is, it is not driven by unobserved factors that may also affect a student's engagement in risky behaviors. In the following, we present the empirical model and discuss the conditions under which these assumptions hold.
B. Empirical Model
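To estimate the impact of a student's ordinal rank on his/her engagement in risky behavior, we estimate versions of the following model:

$$ y_{isc} = \gamma\, r_{isc} + f(a_{isc}) + X_{isc}'\beta + \delta_{sc} + \varepsilon_{isc} \qquad (2) $$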
The dependent variable is a binary indicator that equals one if student i in school s and cohort c has engaged in a risky behavior. We regress this variable on a student's ability rank in a school cohort in Wave I and control for absolute cognitive ability in Wave I with a fourth-order polynomial.13 The vector Xisc includes individual control variables measured in Wave I, namely gender, age in months, height, race/ethnicity (white, black, Asian, Hispanic), indicators for highest parental education (less than high-school, high-school, some college), highest parental occupational status (not working, blue collar, white-collar low-skilled, white-collar high-skilled), a dummy for both parents being present in the household, and the size of a student's school cohort. The vector δsc represents a set of school and cohort fixed effects for which we choose several parameterizations. The error term εisc captures unobserved determinants of risky behaviors. To account for common shocks, we cluster the standard errors at the school level.
Our parameter of interest is γ, which relates a student's ordinal rank to his/her engagement in risky behavior. For γ to have a causal interpretation, we have to assume that the regressors are strictly exogenous, that is, uncorrelated with the error term, formally E(εisc │ risc, aisc, Xisc, δsc) = 0. The plausibility of this assumption depends on the choice of fixed effects. We consider three versions of the model in Equation 2.
1. A two-way fixed-effect model
This model includes separate fixed effects for schools and cohorts; that is, δsc = ρc + ρs. It compares students with the same ability who attend the same school but are in different cohorts. The school fixed effects capture static selection into schools, that is, the fact that students in different schools differ on average along many dimensions, such as parental background or ethnicity. In a two-way fixed-effect model, the parameter γ is identified through differences in all moments of the ability distribution across cohorts within a school. Despite being a useful starting point for the empirical analysis, the separate school and cohort fixed effects do not capture any school-cohort-specific confounders that may violate the strict exogeneity assumption. There are many candidates for such confounders. One is the average cognitive ability of a school cohort, which is mechanically related to a student's ordinal rank. A student with high-ability peers will mechanically have a lower rank than he/she would have in a school cohort with low-ability peers. Therefore, a two-way fixed-effect model without any school-cohort-level controls does not allow us to disentangle the ordinal rank effect from an average peer effect. A further confounder can be dynamic selection into schools, such that the cohort quality may increase or decrease over time. Furthermore, a student's peers may differ between cohorts along many other dimensions, such as race/ethnicity, parental background, or disruptive behaviors. Moreover, there are many school resources that may vary across cohorts within a school, such as teacher quality, and some school cohorts may be more exposed to health campaigns like “Safer Sex” or “Drink Responsibly,” which may affect their engagement in risky behaviors.
2. A two-way fixed-effect model with school-cohort-specific controls
In this specification, we additionally include school-cohort-specific control variables, that is, $\delta_{sc} = \rho_c + \rho_s + S'_{sc}\phi$. Some of the confounders mentioned above, such as average ability or parental background, are observable or can be proxied and can therefore be included as controls in the regression, summarized by $S_{sc}$. In the regressions, we control for the following school-cohort level variables: cohort size; average cognitive ability; share of girls; mean parental education; share of first- or second-generation immigrants; share of black, Asian, and Hispanic students; share of students who smoke, drink, use marijuana, engage in risky sex, steal, engage in physical fights, and engage in drug crimes in Wave I.
Other factors such as teacher quality or peers' disruptive behaviors are unobservable to us, thus preventing us from giving γ a causal interpretation.
3. A model with school-cohort fixed effects
The strictest version of the model includes school fixed effects that vary by cohort, $\delta_{sc}$. The school-cohort fixed effects absorb any average differences in observables and unobservables between cohorts within a school. Therefore, all confounders at the school-cohort level, such as specific educational inputs, average school cohort characteristics, or dynamic selection into schools, are absorbed by the fixed effects. Moreover, the fixed effects absorb all unobserved common shocks to a school cohort, which have been shown to bias conventional peer effects estimates (Angrist 2014; Feld and Zölitz 2017). This increases the plausibility that the regressors are strictly exogenous and that γ can be interpreted as causal.
One may wonder, though, where the identifying variation in a model including school-cohort fixed effects is coming from. Identification of γ is possible on top of school-cohort fixed effects because the ordinal rank varies within school cohorts. Therefore, even if we control for school-cohort effects and absolute ability, there is a great deal of variation in individual ranks in the sample. Formally, person $i$'s rank in school cohort $sc$ is a function of his/her ability as well as the abilities of all other $N - 1$ students in a group, $r_{isc} = g(a_i, a_1, a_2, \ldots, a_{i-1}, a_{i+1}, \ldots, a_N)$, or, more compactly, $r_{isc} = g[a_i, f_{sc}(a)]$, where $f_{sc}(a)$ is the probability density function (pdf) of ability in group $sc$. Controlling for mean ability centers the ability distributions $f_{sc}(a \mid \bar{a}_{sc})$ of all groups around the same mean. Controlling for further group characteristics (variance, skewness, or in the extreme case group fixed effects) will not reduce the within-group variation along the support of $a$. Compared to a model with separate school and cohort fixed effects, the source of variation here is different. By including school-cohort fixed effects, we compare students with the same absolute ability across all school cohorts, after having eliminated all mean differences between schools and between cohorts within a school. Identification of γ comes from differences in the shape of the ability distribution across school cohorts, $f_{sc}(a \mid \delta_{sc})$.
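The dependence of the rank on the shape of the cohort's ability distribution can be illustrated with a minimal sketch. The function name `ordinal_rank`, the percentile-rank formula (share of peers with strictly lower ability), and the two five-student cohorts are illustrative assumptions, not the paper's data or exact rank definition. Both cohorts have the same mean ability, yet the same student obtains a different rank in each.

```python
import numpy as np

def ordinal_rank(own_ability, cohort_abilities):
    """Percentile rank in [0, 1] of a student within a cohort.

    Assumption for illustration: the cohort list includes the student,
    and the rank is the share of the other N - 1 students with strictly
    lower ability.
    """
    cohort = np.asarray(cohort_abilities, dtype=float)
    below = np.sum(cohort < own_ability)  # peers with lower ability
    return below / (len(cohort) - 1)

# Two hypothetical cohorts with identical mean ability (100) but different shapes.
cohort_a = [85, 90, 100, 105, 120]
cohort_b = [70, 105, 106, 109, 110]

# The same student (ability 105) ranks high in one cohort and low in the other.
rank_a = ordinal_rank(105, cohort_a)
rank_b = ordinal_rank(105, cohort_b)
print(np.mean(cohort_a), np.mean(cohort_b), rank_a, rank_b)
```

Because the two cohort means coincide, the difference in rank here comes only from the shape of the distribution, which is the variation that remains after mean differences are absorbed.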
One may be further concerned that there is little variation left to identify the effect in a model with school-by-cohort fixed effects. However, we show in the Online Appendix (Table 6) that even in such a demanding specification there is considerable variation in the ordinal rank as well as the main outcomes. The most relevant variation is that of the ordinal rank conditional on absolute ability. Without any fixed effects, the standard deviation of the ordinal rank is 0.16, which means that, at a mean cohort size of 180 students and for a given level of absolute ability, the ability rank would vary on average by 0.16 × 180 = 28.8 absolute rank positions. Once we additionally control for separate school and cohort fixed effects, the within-school standard deviation of the rank becomes slightly smaller, at 0.12. In the most demanding specification with school-by-cohort fixed effects, the standard deviation is 0.115.
In the empirical analysis to follow, we present results for all three models. Due to its intuitive appeal, the second model is our preferred specification, although the model with school-cohort fixed effects provides cleaner estimates because it precludes any school-cohort-specific confounders.
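As a rough illustration of the third model, the sketch below absorbs school-cohort fixed effects through the within transformation (demeaning by group) and then runs OLS on simulated data. Everything here is an assumption made for illustration: the function names, the parameter values, the linear ability control (a simplification of the ability control in Equation 2), and the simulated cohorts, which differ in the spread of ability. It is not the estimation code used in the paper.

```python
import numpy as np

def demean_by_group(x, groups):
    """Within transformation: subtract each group's mean from x."""
    x = np.asarray(x, dtype=float)
    _, idx = np.unique(groups, return_inverse=True)
    group_means = np.bincount(idx, weights=x) / np.bincount(idx)
    return x - group_means[idx]

def within_estimate(y, rank, ability, groups):
    """OLS of y on rank and ability after absorbing group fixed effects."""
    X = np.column_stack([demean_by_group(rank, groups),
                         demean_by_group(ability, groups)])
    coef, *_ = np.linalg.lstsq(X, demean_by_group(y, groups), rcond=None)
    return coef  # [gamma_hat, beta_hat]

def simulate(n_groups=200, group_size=30, gamma=-0.1, beta=0.002,
             noise_sd=0.0, seed=0):
    """Simulated school cohorts whose ability distributions differ in spread."""
    rng = np.random.default_rng(seed)
    spread = rng.uniform(5, 25, size=(n_groups, 1))
    ability = 100 + spread * rng.normal(size=(n_groups, group_size))
    # Percentile rank within each cohort (0 = lowest, 1 = highest).
    rank = np.argsort(np.argsort(ability, axis=1), axis=1) / (group_size - 1)
    fixed_effect = rng.normal(size=(n_groups, 1))  # school-cohort shock
    noise = noise_sd * rng.normal(size=(n_groups, group_size))
    y = gamma * rank + beta * ability + fixed_effect + noise
    groups = np.repeat(np.arange(n_groups), group_size)
    return y.ravel(), rank.ravel(), ability.ravel(), groups

y, rank, ability, groups = simulate()
gamma_hat, beta_hat = within_estimate(y, rank, ability, groups)
print(gamma_hat, beta_hat)
```

With `noise_sd=0.0` the simulated outcome contains no idiosyncratic error, so the within estimator recovers the assumed coefficients up to floating-point error even though every cohort has its own intercept. Setting `noise_sd` above zero gives a noisier, more realistic exercise.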
C. Potential Threats to Identification
While the model with school-cohort fixed effects alleviates many concerns about identification, there may be more sources of bias in the estimation of γ. First, the estimates can be biased by reverse causality, as students may have a low rank because they engaged in risky behaviors before taking the test. A further issue is measurement error, which may bias the estimates because we do not observe the entire school cohort but rather a random sample thereof. Moreover, there may be individual-level omitted variables that are not absorbed by the school-by-cohort fixed effects. Students may also misreport their risky behavior, which is problematic if misreporting is correlated with rank. A further problem is redshirting, which occurs if some parents send their kids to school one year later than the legal starting age (Deming and Dynarski 2008). Finally, the ability test score may be a function of a student's prior rank, which would not permit a causal interpretation. In a series of robustness checks, we will later address all these issues and provide guidance about the likely direction and magnitude of the resulting biases. However, for the time being, we will interpret the results of the model with school-cohort fixed effects under the maintained assumption that the regressors are strictly exogenous, and thus we will interpret γ as causal.
IV. The Impact of Ordinal Rank on Risky Behaviors
In this section, we present estimates for the impact of ordinal rank on engagement in risky behaviors. We begin with the main results using different sets of fixed effects and summarize the results from robustness checks that address several threats to identification.
A. Baseline Results
Table 2 presents the results of OLS regressions of risky behavior in Wave II on the within-cohort percentile rank in Wave I. Each entry displays the result of a separate regression of Equation 2 and represents the marginal effect of an increase in rank on the risky behaviors indicated on the left, conditional on the controls and fixed effects in the respective column. The coefficients can be interpreted as percentage-point changes at the mean for each percentile change in the rank position. Overall, the results reveal a very robust pattern across specifications and outcomes. They confirm our hypothesis that students of higher rank less commonly engage in risky behavior. All coefficients have negative signs, although magnitude and statistical significance vary between outcome variables and specifications.
The first column displays the results of a linear model with individual controls and separate sets of school and cohort fixed effects, which control for static selection into schools and unobserved differences across schools, as well as differences between age cohorts. Identification in this specification is based on variation in the mean, as well as higher moments of the ability distribution across cohorts within schools. We find a negative and statistically significant relationship between ordinal rank and the likelihood of most risky behaviors, with the exception of negative but insignificant coefficients for marijuana use and drug selling.
Relative to the baseline probabilities, these are large effects. In the case of smoking, an increase by one decile in the local ability distribution (going up ten ranks in a cohort of 100 students, or going up by almost one within-school standard deviation) reduces the probability of regular smoking by about 1 percentage point, or by about 5.8 percent evaluated at a mean of 17.79 percent (−0.104 × 10/17.79 ≈ −0.058). The effects are of similar magnitude for drinking (6.1 percent of the mean), risky sex (6.5 percent of the mean), and engagement in physical fights (5.9 percent of the mean).
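The conversion from a coefficient to a share of the baseline mean can be written out as a short calculation. The helper name `relative_effect` is hypothetical; the inputs are the smoking coefficient (percentage points per percentile of rank) and the baseline mean reported in the text.

```python
def relative_effect(coef_per_percentile, percentiles, baseline_pct):
    """Effect of a rank change, expressed as a fraction of the baseline mean.

    coef_per_percentile: percentage-point change per one-percentile rank increase
    percentiles:         size of the rank change in percentiles (10 = one decile)
    baseline_pct:        mean of the outcome in percent
    """
    return coef_per_percentile * percentiles / baseline_pct

# Smoking: one-decile increase in rank, coefficient -0.104, mean 17.79 percent.
smoking = relative_effect(-0.104, 10, 17.79)
print(f"{smoking:.3f}")  # -0.058, i.e., a reduction of about 5.8 percent of the mean
```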
In Column 2, we add school-cohort characteristics, notably cohort size, the mean and variance of cognitive ability, as well as means of control variables (gender, race, parental education) and risky behaviors (smoking, drinking, marijuana, risky sexual intercourse, and delinquent behavior) to the two-way fixed-effect model. By controlling for mean risky behavior, we shut off average peer effects as a potential confounding factor. In addition, controlling for mean ability breaks the mechanical negative correlation between mean cohort ability and own rank. In this specification, the identifying variation relies on differences in the variance and higher moments of the ability distribution across cohorts within schools. The results are similar to those in Column 1, although the coefficient of drug selling is now marginally significant.
In Column 3, we estimate a model with school-by-cohort fixed effects as described in Section III.B. Compared to the specification in Column 2, this model absorbs all mean differences in unobservables across school cohorts that have not been captured by the controls included in Column 2. Although this specification is more demanding on the data, the negative sign for all risky behaviors prevails, and the magnitude remains almost unchanged. We are more comfortable interpreting these results as causal compared to those from a two-way fixed-effect model because the school-cohort fixed effects absorb any confounder at the school-cohort level. In the remainder of the paper we refer to the results in Column 3 as our baseline results.
Finally, in Column 4, we address the concern of reverse causality. Risky behavior could influence cognitive ability, on the basis of which we construct the ordinal rank, such that the causality would run from risky behavior to rank rather than the other way round. To alleviate this concern, we estimate a lagged dependent variable (LDV) model in which we control for the engagement in risky behavior in Wave I of AddHealth. The coefficients represent the marginal impact of an increase in the ordinal rank on risky behavior in Wave II conditional on earlier risky behavior, which is why the magnitude is not directly comparable to those in Columns 1–3. For all risky behaviors, the negative sign prevails, and the effects of the ordinal rank on drinking, physical fights, and drug selling remain statistically significant. All other coefficients are negative but statistically insignificant. For smoking and risky sex, we cannot say whether the coefficient becomes statistically insignificant due to reverse causality or because these behaviors are persistent. In the Online Appendix (Table 10), we provide further evidence against reverse causality by showing that risky behaviors do not predict a person's Peabody score in Wave III of the survey.14
In sum, we consistently find that students with a higher rank are less likely to engage in risky behavior. Most effects are statistically as well as economically significant. The estimates are large, given that average peer effects have been absorbed by the school-by-cohort fixed effects and that we control for a rich set of individual determinants of risky behaviors. In terms of magnitude, our results are comparable with most linear-in-means peer effect estimates, which lie in the range of 1.5–2-percentage-point increases in the likelihood of drinking and smoking for a 10-percentage-point increase in the share of peers who drink or smoke (Gaviria and Raphael 2001; Lundborg 2006; Clark and Lohéac 2007; Fletcher 2010). In our paper, effects of similar magnitude can be found for a one-decile increase in rank, which is less than a within-school standard deviation in ordinal rank.
Thus far we have only displayed the results for one level of intensity for each type of risky behavior. In the Online Appendix (Table 8), we present the results of the specification in Column 3 for different levels of intensity. For smoking and marijuana use, we find larger effects for low and intermediate intensity and smaller effects for high intensity. For drinking, we find small effects for low intensity and large effects for intermediate and high intensity. In addition, in the Online Appendix (Table 9), we report estimates of heterogeneous effects for different parts of the local and global ability distribution, as well as by gender, race, and parental education. In particular, we test whether the rank effect differs between the upper and the lower half of the rank distribution, but find no evidence for a nonlinear effect.
B. Robustness Checks
In a series of robustness checks and Monte Carlo simulations, we address several threats to identification. Here we present the main findings; detailed explanations and results can be found in the Online Appendix.
1. Measurement error
There are at least four sources of measurement error due to the sample design and the difficulty of measuring latent ability without error.
First, AddHealth oversampled minority students. Within each school cohort, several minority students were randomly sampled on top of the random sample of 17 boys and 17 girls. We show that the rank based on the sample with minorities is virtually the same as the rank based on the sample without.
Second, the fact that boys and girls were drawn from every school cohort in equal numbers may not reflect the actual gender shares in the population. In a robustness check we show that the estimates are robust when based on school cohorts with an even gender balance.
Third, rather than observing the full population of a school cohort, we only observe a random sample. By computing the rank based on this sample, we assign to some students a higher rank and to others a lower rank than they would have in the population. In Monte Carlo simulations, we show that this sampling error works like classical measurement error, introducing attenuation bias in the estimates.
Fourth, the Peabody test score, on the basis of which we calculate the ordinal rank, may be measured with error. Whether this form of measurement error actually introduces a bias depends on which rank matters for students. Simulations show that if students care about their rank in terms of mismeasured ability, then measurement error would not bias the estimates. If, on the other hand, what matters for students is the rank based on true ability, we show that we underestimate the true effect (see the Online Appendix).
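The third source above, sampling error in the rank, can be illustrated with a small Monte Carlo sketch. This is a stylized exercise under stated assumptions, not a reproduction of the simulations in the Online Appendix: the function names, the cohort size of 180, the sample size of 34, the value of `gamma`, and the noise-free outcome that depends only on the population rank are all chosen for illustration.

```python
import numpy as np

def percentile_rank(values):
    """Percentile rank in [0, 1]: 0 for the lowest value, 1 for the highest."""
    positions = np.argsort(np.argsort(values))
    return positions / (len(values) - 1)

def simulate_attenuation(n_cohorts=500, cohort_size=180, sample_size=34,
                         gamma=-0.1, seed=0):
    """Compare the rank in a random sample with the rank in the full cohort.

    Returns the ratio of the estimated slope (outcome on sample-based rank)
    to the assumed gamma, and the correlation between the two rank measures.
    """
    rng = np.random.default_rng(seed)
    true_rank, sample_rank = [], []
    for _ in range(n_cohorts):
        ability = rng.normal(size=cohort_size)
        population_rank = percentile_rank(ability)
        drawn = rng.choice(cohort_size, size=sample_size, replace=False)
        true_rank.append(population_rank[drawn])
        # The rank the researcher observes uses only the sampled students.
        sample_rank.append(percentile_rank(ability[drawn]))
    true_rank = np.concatenate(true_rank)
    sample_rank = np.concatenate(sample_rank)
    outcome = gamma * true_rank  # outcome driven by the population rank
    slope = np.polyfit(sample_rank, outcome, 1)[0]
    return slope / gamma, np.corrcoef(true_rank, sample_rank)[0, 1]

ratio, corr = simulate_attenuation()
print(f"estimated/true slope: {ratio:.3f}, rank correlation: {corr:.3f}")
```

In this stylized setting the sample-based rank is highly but not perfectly correlated with the population rank, and the slope ratio falls below one, which is the attenuation pattern described in the text.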
2. Omitted variable bias
In simulations, we address two potential sources of omitted variable bias, one due to a correlation between an omitted variable and the Peabody score, and one due to a correlation between an omitted variable and the rank. The results are shown in the Online Appendix (E.4). We provide arguments why most factors that are correlated with rank conditional on own ability represent causal mechanisms rather than confounders. We also show that an omitted variable that is correlated with absolute ability introduces no bias because ability is controlled for in the regression.
3. Further robustness checks and discussions
In the Online Appendix, we also discuss problems of attrition as well as the misreporting of risky behaviors. We also provide evidence that students' abilities are not affected by their prior rank.
V. Exploring Potential Mechanisms
The results confirm our initial hypothesis that students with a lower ordinal rank more commonly engage in risky behavior. This finding can be reconciled with at least two theoretical models. In a standard human capital model, such as Becker (1962) or Grossman (1972), a person chooses the optimal amount of risky behavior by trading off the short-run gains, that is, the pleasure from smoking, drinking, or sex, against the long-run costs of these behaviors, such as health problems, lower income, or unwanted childbearing. A person with a higher ability has a higher expected income and consequently a higher opportunity cost of engaging in risky behaviors. The standard model implicitly assumes that a person knows his/her ability. However, if a person does not know his/her absolute ability, the ordinal rank in his/her peer group may provide him/her with an imperfect signal about his/her absolute ability. A person who is actually smart but happens to have a low ordinal rank may choose to engage more in risky behavior because the low rank conveys low expected earnings and a low perceived opportunity cost of risky behaviors.
A second explanation could be that the effect is driven by status concerns in combination with sorting into peer groups. This follows the idea of Akerlof and Kranton (2002), whose model explains how students with different characteristics sort into social categories based on their social distance to each category. Moreover, in a recent paper, Cicala, Fryer, and Spenkuch (2017) develop a model in which a peer group is divided into two subgroups: “nerds,” students who achieve social recognition by being successful in school, and “troublemakers,” students who achieve social recognition by being disruptive in school and engaging in risky behaviors. Students sort into these groups depending on their comparative advantage within the peer group and behave as such to conform to their subgroup. In turn, the comparative advantage depends on a student's ordinal rank. The same student who tries to succeed in a class where he/she has a high rank may become a “troublemaker” in a class where he/she has a low rank.
In the following, we provide evidence in support of both mechanisms. While there may be further theories that could explain the effect, our data do not provide sufficient information to test them.
A. Popularity and Sorting into Peer Groups
We first test whether we find evidence for the rank affecting sorting into peer groups as predicted in the model of Cicala, Fryer, and Spenkuch (2017). Their prediction is that low-ranked students are more likely to be friends with students engaging in risky behaviors themselves.
To test this hypothesis, we exploit sociometric information on friendships. All students participating in AddHealth were asked to nominate up to five male and five female best friends from the school roster. On the basis of this information, we construct indicators of popularity and peer engagement in risky behavior.15
Table 3 displays the results of OLS regressions of various friendship variables on the ordinal rank as well as all control variables from Equation 2. We distinguish between in-nominations—how often a student is nominated as a friend by fellow students and the characteristics of those students—and out-nominations, namely how many fellow students a student nominates. The first row displays the effect of a student's ordinal rank on friendship nominations. A high rank significantly reduces the number of in-nominations, while having a small and statistically insignificant effect on out-nominations. This result suggests that highly ranked students are less popular in their school. For a one within-school standard deviation increase in the ordinal rank, a student would receive 1.8 fewer friendship nominations, which is large given that the mean number of in-nominations is 4.2.16 We also look at the average GPA as well as the likelihood of drinking and smoking among nominated friends. A student's ordinal rank has a large negative impact on the likelihood of being nominated by fellow students who smoke (smoking intensity is measured on a scale 0–6, mean = 1.07). This finding is in line with the predictions by Cicala, Fryer, and Spenkuch (2017), showing that highly ranked students receive lower social recognition from other students who engage in risky behaviors.
B. Other Mechanisms
Besides sorting into peer groups, a number of alternative mechanisms can explain the negative relationship between rank and risky behaviors. We provide suggestive evidence of the importance of these mechanisms by estimating Equation 2 with school-by-cohort fixed effects, using proxies for each mechanism as outcomes. The results are displayed in Table 4. Each coefficient measures the impact of the ordinal rank on the outcome displayed on the left. The digit “1” represents a dummy variable that equals one if the student agreed to the statement in parentheses, and zero otherwise.
1. Distorted beliefs
While students may not know their absolute level of cognitive ability, they most likely have some idea of how their ability compares to that of people with whom they regularly interact. Therefore, the ordinal rank can provide students with a signal about their actual ability. Two students with the same absolute ability may assess their ability differently if they have different ordinal ranks. Put simply, students with a high rank may think that they are smarter than is actually the case, whereby they have a higher expected income and thus a higher opportunity cost of engaging in risky behavior. We first test whether students with a higher rank have a higher self-perception, based on a survey question on perceived intelligence. Indeed, a higher rank is strongly related to a higher self-perception of one's own ability. Going up by one decile in the rank distribution increases the probability of believing oneself to be more intelligent than average by 2.3 percentage points.
We further test whether students with a higher rank have higher expectations about their future, using various questions about career expectations in Wave I of AddHealth. As shown in Table 4, students of higher rank are significantly more likely to expect that they will attend college and that they will have a college degree by age 30. These results confirm the predictions of a human capital model, whereby a student's ordinal rank shapes his/her expectations, thereby distorting the tradeoff between the short-term pleasure and long-term costs of engaging in risky behavior.
2. Support from others
Besides providing a noisy signal to oneself, the ordinal rank may also provide a signal to others. For example, Kinsler and Pavan (2014) show that parents of young children adjust their parental support depending on the relative ability of their child in preschool. However, not only parents but also friends and teachers might base their support on the ability rank of a student. We test this channel using questions on whether the student thinks that his/her parents, friends, or teachers care about him/her, but we find no evidence for it.
3. Self-esteem
The ability rank could affect individual self-esteem, for example, through increasing self-confidence in one's own ability. Self-esteem has been shown to significantly affect adolescent risky sexual behavior (Favara 2013). We assess the effect of a student's ordinal rank on three items of a common self-esteem questionnaire, again finding no significant relationship between the ordinal rank and these indicators.
VI. Conclusions
In this paper, we show that a student's ordinal rank in a high-school cohort is an important determinant of risky behaviors. Using data from AddHealth and applying a within-school/across-cohort research design, we show that highly ranked students are significantly less likely to smoke, drink, have unprotected sex, and engage in physical fights. These effects are robust to controlling for average peer effects, dynamic selection into schools, and school-cohort specific unobserved factors.
On the basis of rich survey information, we show that these effects can be reconciled with two theoretical models. We find that students of higher rank have significantly higher career expectations and thus higher perceived opportunity costs of risky behavior. This result is in line with a human capital model in which the ordinal rank provides students with an imperfect signal about their actual ability, thereby influencing the tradeoff between the long-run costs and short-run gains from risky behaviors. We also find evidence that the rank affects sorting into peer groups. Students with a higher ordinal rank are less likely to be friends with those who smoke and drink, as well as being generally less popular. This is consistent with a model in which the rank determines a student's comparative advantage of being in a social category, that is, being either a “nerd” or a “troublemaker” (Cicala, Fryer, and Spenkuch 2017).
These results highlight the importance of a student's ordinal rank in high school as a determinant for outcomes later in life. Parents should be concerned by these findings because they can have an important influence on the ordinal rank of their child via their school choice. Our results suggest that choosing the best possible school is not always optimal because a child with a low rank in the best school may be more inclined to engage in risky behavior than he/she would be in the second-best school. While our results show that a child's rank matters, one caveat deserves emphasis: our results are based on estimates within schools, with school inputs being held constant. Choosing a better school may result in a lower rank, although the costs of a low rank may be outweighed by the benefits of better teachers and a better learning environment.
Our results also provide insights for policymakers. Given that risky behaviors impose a significant cost for society, it is important to know their determinants to design interventions that prevent adolescents from engaging in them. Given that an ordinal ranking is present as soon as students slightly differ in their ability, it is not possible to prevent students from engaging in risky behaviors by having particularly homogeneous or heterogeneous classrooms. A more effective measure would be to specifically target low-ranked students and inform them about the long-run consequences of risky behaviors.
Acknowledgments
The authors thank three anonymous referees, as well as Laura Argys, Sanni Nørgaard Breining, Arnaud Chevalier, Damon Clark, Deborah Cobb-Clark, Tine Mundbjerg Eriksen, Jason Fletcher, Hani Mansour, Dan Rees, Derek Stemple, Felix Weinhardt, and Ulf Zölitz, as well as audiences at IZA, the Workshop Health. Skills. Education in Essen, EEA Mannheim, the workshop of the Copenhagen Education Network, U Aarhus, SOLE 2016, CU Denver, U Leicester and EALE 2016 for their helpful comments. This article uses confidential data from AddHealth. Researchers may obtain the data from the Carolina Population Center. The authors are willing to provide guidance in how to acquire the data.
Footnotes
Benjamin Elsner is Assistant Professor of Economics at University College Dublin, a Research Fellow at the Institute of Labor Economics (IZA) and a Research Fellow at CReAM, University College London. Ingo E. Isphording is a Senior Research Associate at the Institute of Labor Economics (IZA) in Bonn, Germany.
↵1 Besides the studies cited here, all of which document peer effects for multiple behaviors, a wealth of studies focuses on single behaviors, including Kremer and Levy (2008) and Fletcher (2012) on drinking, Card and Giuliano (2013) on intercourse, Krauth (2007) and Fletcher (2010) on smoking, and Lin (2014) on delinquent behavior.
↵2 Moreover, as shown by Tincani (2015), rank concerns among students are an important determinant of peer effects in student achievement. If the ability distribution of a classroom has a low variance, students have a greater incentive to work harder to achieve a higher rank. Additional evidence on the importance of ordinal rank is given by Gill et al. (2017), who show in a laboratory experiment that the ordinal rank affects effort provision, especially at the bottom and top of the rank distribution, as well as Kuziemko et al. (2014), who use experimental and observational data to show that people exert effort to avoid being ranked in last place.
↵3 Given the importance of random sampling for our analysis, we inquired with the data provider about the randomization. The sample was drawn by a NORC statistician before the survey based on a school roster. The rosters contained all of the students in the schools unless the parents requested their names to be removed. The schools were not involved in the selection of the sample. We would like to thank Joyce Tabor from the Carolina Population Center for providing us with this information. More detailed information is provided in Tourangeau and Shin (1999).
↵4 The sampling design is described in greater detail in Harris et al. (2009) and Harris (2009). In the Online Appendix, we show that excluding oversampled minorities in the construction of the rank measure does not alter our results.
↵5 For most variables, the share of missing values is below 2 percent. The ability measure is missing for 618 students (4 percent). We ran an auxiliary regression of missing ability status on socioeconomic background variables. Results indicate that Asian students and children of parents without a high-school degree are more likely not to participate in the ability test. Regressions using imputed ability based on observable predetermined characteristics (age, height, gender, race, and parental background) yield similar patterns as the main results.
↵6 In a section on robustness checks, we will later discuss and quantify the omitted variable bias that arises if these factors affect the rank as well as risky behaviors.
↵7 In a robustness check, we will later relax this assumption.
↵8 Initially, AddHealth sampled around 40 students per school cohort: 17 boys and 17 girls sampled at random plus several students who were randomly drawn from minority groups. The actual average number in the sample is around 30, because some school cohorts are smaller than 40, and because some students dropped out of the sample due to missing information.
↵9 See Dubey and Geanakoplos (2010) for a theory explaining the work incentives for students under grading on a curve and the incentive for schools to implement it. Piopiunik and Schlotter (2012) provide empirical evidence of grading on a curve in German primary schools.
↵10 In fact, there are grades available from administrative transcripts in AddHealth. These are measured with little error, but are only available for a small subsample.
↵11 Despite CASI reducing misreporting, there may still be misreporting systematically related to rank. In the Online Appendix, we discuss the direction and magnitude of the bias resulting from misreporting.
↵12 This section draws from Elsner and Isphording (2017), where we use a similar setup to study the long-term impact of ordinal rank on educational attainment later in life.
↵13 As we show in the Online Appendix, results are robust to changing the order of the polynomial as well as controlling for ability nonparametrically.
↵14 A negative effect of drinking and marijuana usage on cognitive skills has—among others—been shown by Meier et al. (2012), although the evidence in their paper has been challenged by Rogeberg (2013).
↵15 The friendship information was elicited in the “in-school” sample of AddHealth, thus covering the entire student population. The same sample comprises basic information on smoking and drinking for every student. We construct the peer engagement variables based on this information, which is why the intensity of risky behaviors may not be comparable with the one used in our main estimates.
↵16 To take into account that friendship nominations are count data, we use Poisson panel regressions in these cases.
- Received July 2016.
- Accepted December 2016.