Abstract
We assess the extent to which the large sibling correlations in substance use are causal. Our primary approach is based on a joint dynamic model of the behavior of older and younger siblings that allows for family specific effects, individual specific heterogeneity, and state dependence. We use the model to simulate the dynamic response of substance use to the behavior of the older sibling. Overall, we find that substance use is affected by the example of older siblings but only a small fraction of the sibling correlation is causal.
I. Introduction
Teenage smoking, substance abuse, involvement in crime, and engagement in risky sexual activity fluctuate but remain at high levels.1 Understanding the factors that lead adolescents to engage in these behaviors is a high research priority.
This paper examines whether substance use of one child directly influences the behavior of a younger sibling. Several studies have found significant correlations between risky behavioral patterns among siblings.2 Table 2 uses data from the National Longitudinal Survey of Youth 1997 (NLSY97) to show that the probability an adolescent has smoked, used alcohol, smoked marijuana, or used hard drugs in the past year is dramatically higher if an older sibling engaged in the corresponding behavior when at the same age, even after one includes a basic set of control variables. Findings of this nature are consistent with the possibility that substance use and other risky behaviors are contagious among siblings in a household. However, siblings share many influences, including common family backgrounds, neighborhoods, schools, and genes. These common influences potentially could account for most or even all of the correlations. And it is difficult to successfully control for the range of shared characteristics that affect siblings. As a result, there are very few convincing attempts to distinguish direct sibling influences from the plethora of unobserved factors that might contribute to the high correlation in delinquent behavior among siblings.
We address the problem of unobserved shared influences using two related empirical strategies. Both require panel data on sibling pairs and both make the maintained assumption that younger siblings do not influence older siblings. Several studies in the psychology literature support this assumption as a first approximation, including Buhrmester (1992) and Rodgers and Rowe (1988). To the extent that it is false, our estimates are likely to understate the influence of the older sibling on the younger one.
Our primary approach is based on a series of dynamic models of the behavior of the older and younger siblings. The models allow for state dependence and unobserved heterogeneity at the individual and sibling pair levels. They consist of a dynamic system of discrete choice equations in which the behavior of each sibling depends on exogenous variables, past behavior, and a person-specific error component. The behavior of the younger sibling also depends on the past behavior and, in some specifications, the current behavior of the older sibling. We use the model estimates to simulate the dynamic response of the younger sibling to the behavior of his or her older sibling. We use both binary choice models and ordered choice models. For our main specifications we use a simulation procedure to correct for bias arising from the fact that we do not observe use of hard drugs until after age 15 or use of other substances until after age 14. A secondary contribution of the paper is to propose the use of single equation or multiple equation dynamic ordered response models with large numbers of categories but restrictions on the category-specific model parameters as a way to allow for nonlinear state dependence in the presence of unobserved heterogeneity.
The results using the dynamic model show positive effects on smoking, drinking, and marijuana use. The bias corrected estimates indicate that smoking among older siblings in the period before we first observe the younger sibling increases smoking among younger siblings by 10 percent of the baseline probability, although this value is not statistically significant. The corresponding values are 21.7 percent for drinking and 23.6 percent for marijuana use, and these values are highly statistically significant. However, the effects are considerably smaller in later periods. We obtain substantially larger estimates when the younger sibling is affected by both the current and the lagged behavior of the older sibling although endogeneity is more of a concern for this specification. Estimates for hard drugs are also positive but fall short of statistical significance. The bias corrected ordered probit results are qualitatively consistent with the probit results and indicate that the sibling effect increases with the frequency of the older sibling’s substance use. However, they are imprecise.
We supplement the analysis of the joint dynamic model with probit regressions that use a correlated random effects (CRE) design in the spirit of Mundlak (1978) and Chamberlain (1984). These regressions relate the behavior of the younger sibling at time t to the behavior of the older sibling before that date, using the sum of the older sibling’s behaviors before and after time t as a control variable. The coefficient on the early behavior is the sibling effect. The coefficient on the sum of the past and future behaviors identifies the part of the link in sibling behavior that is due to common unobserved influences. The main advantage of the CRE approach is simplicity. Its main weakness is that state dependence (e.g. habit formation) and nonstationarity with respect to age could lead the past behavior and the future behavior of the older sibling to have different relationships with the younger sibling’s error component. This would bias the CRE estimate of the older sibling’s influence although the direction of the bias is not clear. The CRE results indicate that smoking by the older sibling raises the probability that the younger sibling smokes by about 15.6 percent of the sample mean for smoking. In the case of drinking alcohol the effect is about 9.1 percent of the sample mean.
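The construction of the two older-sibling regressors in the CRE design can be sketched as follows. This is a minimal illustration, not the estimation code used in the paper: the function name `cre_regressors` is hypothetical, behavior before time t is summarized by a simple count, and period t itself is left out of the control. All three are assumptions.

```python
import numpy as np

def cre_regressors(y_old, t):
    """Build the two older-sibling regressors for the younger sibling's equation at period t.

    y_old : (n_pairs, n_periods) array of 0/1 indicators for the older sibling.
    Returns (past, total), where `past` counts use before t and `total` counts
    use before and after t (period t itself is excluded, by assumption).
    """
    y_old = np.asarray(y_old)
    past = y_old[:, :t].sum(axis=1)        # behavior before period t
    future = y_old[:, t + 1:].sum(axis=1)  # behavior after period t
    return past, past + future

# Two hypothetical older siblings observed for four periods.
y_old = np.array([[1, 0, 1, 1],
                  [0, 0, 0, 1]])
past, total = cre_regressors(y_old, t=2)
```

In a probit for the younger sibling's behavior at t that includes both regressors, the coefficient on `past` would play the role of the sibling effect, and the coefficient on `total` would absorb the common unobserved influences.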
Overall, we conclude that there is a modest positive sibling effect on substance use. However, simulations from the dynamic probit model indicate that sibling effects account for only a small fraction of the strong sibling correlation in substance use, although point estimates are noisy.
The results suggest that parenting behaviors and anti-substance use programs aimed at an adolescent would have beneficial spillovers on younger siblings. The appropriate parenting and public responses would depend on the mechanisms. Unfortunately, we do not get very far in distinguishing the mechanisms that underlie the sibling link that we uncover. Doing so will likely require better data and a structural model.
Although our focus is on sibling influences, the qualitative findings may be of some interest to the rapidly growing literature on peer influences among adolescents. Estimates of peer effects may be biased upward by the fact that adolescents select friends who share similar interests while children cannot choose their siblings. On the other hand, the problem of common genes and family factors is less severe for friends and acquaintances than for siblings. Furthermore, some of the strategies that have been employed recently in studies of peer effects, such as variation arising from quasirandom assignment of roommates, are not feasible for siblings.3 Perhaps for this reason, there is little quantitative evidence on peer influences among siblings. This knowledge gap provides the motivation for our study, despite the limitations of our identification strategies.
The paper continues in Section II, which provides a brief review of the economics and psychology literature concerning social influences on adolescent substance use, with a focus on sibling effects. In Sections III and IV, we discuss the NLSY97 data and document the strong correlation in substance use across siblings. In Section V, we present the joint dynamic probit model of sibling links in substance use. In Section VI, we describe results based on different specifications of the model and discuss checks on specification bias. In Section VII, we consider five-category joint dynamic ordered probit models. In Section VIII, we present results using the CRE approach. We close with conclusions and a research agenda.
II. Literature Review
We begin with a brief survey of the literature on family influences on risky behaviors, particularly substance use, drawing across the social sciences.4
Developmental psychologists and sociologists were first to investigate the importance of social environment on adolescent development and behavior. Although some perceive peer group influence as the single most important factor shaping a child’s behavior (Harris 1998), a number of psychologists continue to emphasize the primacy of the family in shaping a child’s attitudes and behaviors (Jessor and Jessor 1977; Kandel 1980; Barnes 1990).
Within the family, the psychology literature suggests two main mechanisms through which siblings may influence each other. The first one is that a sibling, most likely the younger one, may see his older sibling as a role model to observe, imitate, and use in shaping his notions about what types of behaviors are suitable (Widmer 1997; Buhrmester 1992; Rodgers and Rowe 1988; Bikhchandani et al. 1992). In the context of risky behaviors, Patterson (1984) argues that siblings are more likely to learn these types of behaviors from each other when they have conflict-ridden and aggressive relationships, because these promote antisocial behavior.
The second mechanism, which we refer to as the “opportunity hypothesis,” suggests that siblings influence each other’s behaviors by providing opportunities (friends and settings) for substance use and sexual intercourse. In contrast to Patterson’s hypothesis, this mechanism is more likely to occur with siblings who have better and warmer relationships, share friends, and hence engage in risky behavior together. For the purpose of our study, it is important to note that most of the literature surveyed here argues that the pattern of influence runs from the older to the younger child (Buhrmester 1992; Rodgers and Rowe 1988).
Although the economics literature does not focus specifically on sibling interactions, it also offers some rationales for conformity in behavior, which can be applied to siblings. For example, in his social distance model, Akerlof (1997) represents social interaction as a mutually beneficial trade between agents. Agents occupy a location in the social space, which is partly inherited. The model creates incentives for agents to interact with those that are close in the social space, thus possibly explaining their tendency to conform to the behavioral norms of those who share their inherited social location.
In addition to the theoretical work reviewed above, a large number of studies have investigated social influence on youth behavior empirically. However, most of these papers provide evidence of large correlations between siblings in a variety of behaviors, without necessarily devising a strategy for distinguishing causality from the effect of common unobserved factors. For example, Duncan et al. (2001) examine sibling correlations in measures of delinquency for a sample of adolescents in grades 7 through 12 using Add Health data. The sample includes genetically differentiated siblings within a family, peers, grade mates, and neighbors, thus allowing the authors to compare correlations in the same behavior across different types of relationships. The correlations are highest for siblings, especially for twins, thus suggesting a large scope for family influences. Using the same data set, Slomkowski et al. (2005) find that both genetic and environmental factors contribute to similarities between siblings’ smoking behavior. On the environmental side, parental behavior has been shown to be a source of imitation, although studies, such as Conger and Reuter (1996) and Windle (2000), show that it is less potent than sibling influences. Averett et al. (2011) do not examine sibling correlations but do show that having an older sibling increases substance use. Controlling for measures of parental supervision has little effect on the strength of this effect.
Researchers have also used data about the quality of the relationship between siblings to study sibling influences. For example, using the Arizona Sibling Study, Rowe and Gulley (1992) find that correlations in substance use and delinquent behavior are higher when interactions are warmer, less conflict-ridden, and more frequent, and when siblings have more mutual friends and are of the same gender. Although these results do not directly test for the presence of a direct sibling influence, they are consistent with one, as suggested by the opportunity hypothesis described above. Overall, however, results based on this type of data are mixed and often contradictory.5
Although some of the studies mentioned above control for a large array of family and parental characteristics and some find interactions that are consistent with a sibling effect, the sibling effects they estimate could reflect the impact of unobserved common factors. A few studies, mostly in economics, attempt to identify a sibling causal effect by using instrumental variables strategies. One of them, Oettinger (2000), uses the NLSY79 to estimate linear probability models of the effect of an older sibling’s high school graduation on the probability that the younger sibling graduates, and vice versa. He finds that sibling influence runs mostly from the older to the younger sibling, but his identification strategy relies on exclusion restrictions that seem questionable.6
Ouyang (2004) develops a dynamic model of the older and younger siblings’ behaviors. Her model allows for state dependence and for the older sibling’s behavior to contemporaneously affect that of the younger sibling. She estimates the model with NLSY97 data on cigarettes, marijuana, and alcohol consumption and finds strong evidence of a sibling effect. In contrast with our approach, however, she does not allow for individual specific unobserved heterogeneity and proxies family specific heterogeneity with the older sibling’s smoking history.
Finally, Harris and Lopez-Valcarcel (2008) propose an interesting theoretical model in which siblings learn about whether smoking is desirable by observing their siblings’ decisions. They allow the decision not to smoke to have a different effect than the decision to smoke cigarettes. Using data on smoking behavior of family members from supplements to the CPS, they estimate a multivariate probit model in which the number of one’s siblings who smoke appears on the right-hand side. They find a powerful sibling influence as well as some evidence that the positive effect of smoking is stronger than the deterrent effect of not smoking. However, their estimates imply that the variance of the error component that affects the behavior of all siblings is zero. That is, conditional on a limited set of observables, sibling effects account for the entire sibling correlation in smoking. We suspect that their finding of powerful sibling effects may be due in part to problems in separately identifying the common factors that influence smoking from the sibling influence.7
In sum, there are good theoretical reasons for believing that substance use and other behaviors of adolescents are causally influenced by siblings. However, the strong similarity in the behavior of siblings may be due to genes and shared environments, as well as a direct influence of one sibling on another. To date, little is known about the relative contribution of these mechanisms, let alone the precise nature of sibling interactions.
III. The NLSY97 Data
The empirical analysis uses the first eight rounds of the National Longitudinal Survey of Youth 1997 (NLSY97), a panel study of men and women who were between 12 and 16 years of age at the end of 1996. In the first round, the NLSY97 surveyed 8,984 individuals originating from 6,819 households in the United States. Because the sample design selected all household residents in the appropriate age range, the NLSY97 original cohort includes 1,892 households with more than one respondent. Using information about the relationship between the different respondents of the same household, we created a sample of pairs of biological siblings.
For every year since 1997, the NLSY97 contains extensive information about a wide range of risky behaviors. We focus on smoking cigarettes, drinking alcohol, using marijuana, and using cocaine and/or other hard drugs.8 Respondents enter their answers to the substance-use questions directly into a computer, and the interviewer cannot observe the responses. The main outcome we analyze is whether the individual reports having engaged at all in the particular behavior since the last interview date. For example, for smoking, the variable takes the value 1 if the respondent reports having smoked since the last interview, and 0 otherwise. For each behavior, we construct this variable from two NLSY97 variables. The first and most important one is a dummy variable indicating whether the respondent has engaged in the behavior since the last date of interview. When it is available (usually in the first few survey rounds), we also use a second dummy variable, which indicates whether the respondent has ever engaged in the specified behavior. This second variable allows us to check the consistency of some of the answers to the first question and to fill in some of the missing observations. These questions were not asked in every year. Web Appendix A reports the exact name, reference numbers, and survey years of the variables we used. The substance use information is first available in 1998 (1999 for cocaine and hard drug use), and we select those observations that are part of uninterrupted sequences of nonmissing answers. Because individuals do not answer questions about all behaviors in every round, the analysis sample is slightly different for each behavior. In the case of cigarette smoking, for example, the analysis sample is composed of 1,650 pairs of siblings, for whom we have between one and six rounds of observations.
We also estimate models that use reports of the number of days the person engaged in the behavior in the previous month to construct an indicator for high consumption and an indicator for low consumption. We present results based on five consumption categories.
The younger siblings are between 15 and 19 years old when they enter our analysis sample, while the older siblings are between 16 and 20. The average age of the younger sibling is 16.04, while the average age of the older sibling is 18.06. We use all pairs with adjacent birth orders (i.e., the first-born with the second-born and the second-born with the third-born if we have the three oldest siblings in our sample). A total of 1,456 pairs come from two-sibling families, while 176, 12, and 6 come from three-, four-, and five-sibling families, respectively.9 Our sample is 24 percent Black and 23 percent Hispanic. The high minority proportions stem from the fact that we use supplemental and military samples along with the cross-sectional sample. Unless we indicate otherwise, descriptive statistics and multivariate analyses we report are unweighted, and we do not account for nonrandom attrition.10
In all of our empirical work, we control for a set of individual and environmental characteristics. These consist of race, gender, AFQT percentile score, education completed by age 19, number of siblings, birth order dummies, mother’s education, and a dummy for whether the child lived with both biological parents at age 12. We also include three dummy variables describing aspects of the individual’s environment up to age 12. These consist of an indicator for whether the respondent ever heard gun shots or saw someone get shot at with a gun, an indicator for whether her house was broken into, and a third indicator for whether she ever was a frequent victim of bullying.11 As a sensitivity check, we experimented with using the child’s report of the percentage of his peers who engage in the behavior as an additional control, although the behavior of the child may influence his choice of peers. In some models, the explanatory variables include an indicator for whether the two siblings have the same residence and measures of parenting styles and the degree to which the child is influenced by parents and siblings. We use these variables as controls and as determinants of the strength of the direct sibling influence.12
We provide further details about variable construction and sample selection in Web Appendix A. Appendix Table 1 reports the age distribution of the sample. Appendix Table A2 reports unweighted and weighted descriptive statistics for the explanatory variables used in our analysis.
IV. Sibling Correlations in Substance Use
To set the stage, we document the strong relationship in substance use among siblings. Table 1 reports the mean values of the substance use measures for males, females, and the combined sample. The values are high for many of the behaviors. For example, 61 percent of the males and 58 percent of the females report drinking alcohol during the previous year. Twenty-five percent of the males and 19 percent of the females report using marijuana. The figure is about 40 percent for cigarette smoking. About 6 percent of the sample reports having used hard drugs in the previous year. The unweighted means are similar to the weighted means. (See Web Appendix Table 1.) The fractions who used the substance one or more days in the past month are lower, not surprisingly. Web Appendix Table 2 shows that incidence of the behaviors tends to increase with age until about age 20. The fractions of older siblings who engage in the behavior in all years and who engage in the behavior in some years are 0.21 and 0.42 (respectively) for smoking, 0.33 and 0.54 for drinking, 0.06 and 0.43 for marijuana, and 0.005 and 0.19 for hard drugs. The “some years” group plays a key role in distinguishing between family correlations and sibling effects.
Table 1. Sample Means for Substance Use
Table 2. Estimates of the Coefficient on the Older Sibling’s Behavior in a Linear Probability Model of Younger Sibling’s Behavior at the Same Age
In Table 2, we use a regression to summarize the relationship between substance use of the sibling pairs when they were the same age. Specifically, we report OLS estimates of ρ from the regression
$$y^2_{a,t} = \rho\, y^1_{a,t-j} + D^2_t \Gamma + X^2 B + e_{a,t} \qquad (1)$$
where $y^2_{a,t}$ and $y^1_{a,t-j}$ are the behaviors of the younger and older siblings at age a, respectively, j is the siblings’ age gap, $D^2_t$ is a set of age dummies for the younger sibling, and $X^2$ is a vector of controls that refer to the younger sibling and that are listed in Section III. We also report estimates with controls excluded. Throughout the paper, the superscripts 1 and 2 indicate whether a variable refers to the older sibling or the younger sibling, respectively.
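The linear probability regression in Equation 1 can be sketched with ordinary least squares. This is a minimal illustration on simulated data, not the NLSY97 sample. The function name `lpm_rho`, the shared family factor, and the noisy control `proxy` are all hypothetical, and the age dummies are omitted for brevity.

```python
import numpy as np

def lpm_rho(y_young, y_old, controls=None):
    """OLS coefficient on the older sibling's behavior in a linear probability model."""
    y_young = np.asarray(y_young, dtype=float)
    cols = [np.ones(len(y_young)), np.asarray(y_old, dtype=float)]
    if controls is not None:
        cols.append(np.asarray(controls, dtype=float))
    X = np.column_stack(cols)                      # constant, y_old, optional controls
    coef, *_ = np.linalg.lstsq(X, y_young, rcond=None)
    return coef[1]                                 # coefficient on y_old

# Hypothetical data: siblings share a family factor but do not influence each other.
rng = np.random.default_rng(0)
n = 20000
family = rng.standard_normal(n)
y_old = (family + rng.standard_normal(n) > 0).astype(int)
y_young = (family + rng.standard_normal(n) > 0).astype(int)
proxy = family + rng.standard_normal(n)            # noisy observed control

rho_raw = lpm_rho(y_young, y_old)                  # no controls
rho_ctrl = lpm_rho(y_young, y_old, proxy)          # with the noisy control
```

In this simulated setting the coefficient stays positive after adding the control even though there is no sibling effect by construction, which illustrates why a large estimate of ρ cannot by itself be read as causal.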
The results are striking. Consider smoking cigarettes. If the older sibling smoked, the probability that the younger sibling smoked shifts by 0.226. This value is very large relative to the sample mean of about 0.4. With controls, the shift in the probability remains large at 0.17. In the case of marijuana, if the older sibling used marijuana at a given age, the probability that the younger sibling used marijuana at that age increases by 0.157, which is very large relative to the sample mean of about 0.22. Having an older sibling who uses hard drugs shifts the probability for the younger sibling by 0.092. This shift is larger than the unconditional mean of 0.06. In all cases, adding control variables weakens the relationship to some degree but a strong relationship remains.
We also present separate results for brother pairs and sister pairs. The relationship across siblings tends to be larger for sister pairs. Below we explore whether the size of the peer effect depends on the gender composition of the pair.
In the remainder of the paper, we address the key but difficult question of whether the sibling correlations are due, at least in part, to a causal effect of the older sibling’s behavior.
V. A Joint Dynamic Probit Model of Substance Use and Sibling Influences
In this section we present the joint dynamic model of substance use that underlies much of the econometric analysis. We begin with the behavioral model and then turn to the econometric specification and estimation issues.
Consider a sibling pair. The variables $a^1_t$ and $a^2_t$ are the ages of the older and younger siblings in year t. We leave family subscripts implicit throughout the paper. We focus here on the case in which y is a binary choice.
Let $y^{1*}_t$ be the difference between the perceived benefit and the cost of $y^1_t = 1$, including the opportunity costs of foregoing other goods.13 $y^{1*}_t$ is determined by
$$y^{1*}_t = \gamma^1 y^1_{t-1} + X^1\beta^1 + D^1_t\Gamma^1 + \alpha^1\varepsilon + \delta^1 v^1 + u^1_t. \qquad (2)$$
The older sibling chooses $y^1_t = 1$ if $y^{1*}_t > 0$ and chooses $y^1_t = 0$ otherwise. Thus
$$y^1_t = \mathbf{1}\bigl(\gamma^1 y^1_{t-1} + X^1\beta^1 + D^1_t\Gamma^1 + \alpha^1\varepsilon + \delta^1 v^1 + u^1_t > 0\bigr). \qquad (3)$$
The net benefit of $y^1_t = 1$ depends on a set of covariates $X^1$, the vector $D^1_t$ of age dummies indicating whether the older sibling is aged a in year t, the family-specific component ε, and the person-specific component for the older sibling, $v^1$. The term $u^1_t$ is a transitory error component affecting the older sibling in period t.
The benefit of $y^1_t = 1$ also depends on $y^1_{t-1}$, the lagged choice of the older sibling. The dependence parameter $\gamma^1$ reflects two mechanisms. The first is habit formation and informational effects. The second is the effect of the information the parent has about the child, as well as the positive or negative influence of the parents’ reaction, on the net benefit of $y^1_t$ to the older sibling. In principle, the net benefit of $y^1_t$ could depend on the younger sibling’s behavior through a direct peer influence from the younger to the older sibling or because the parental response to the behavior of the younger child affects the older sibling. We assume that both of these effects are 0 and leave the younger sibling’s behavior out of Equation 3.
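The younger sibling faces a similar problem, except that the net benefit $y^{2*}_t$ of his behavior also depends directly on the action of his older sibling in t and t − 1. He chooses $y^2_t = 1$ if $y^{2*}_t > 0$ and $y^2_t = 0$ otherwise. Thus
$$y^2_t = \mathbf{1}\bigl(\gamma^2 y^2_{t-1} + \pi^2 y^1_t + \lambda^2 y^1_{t-1} + \theta^2 a^1_{t-1} + X^2\beta^2 + D^2_t\Gamma^2 + \alpha^2\varepsilon + \delta^2 v^2 + u^2_t > 0\bigr) \qquad (4)$$
where $v^2$ is a component specific to the younger sibling and $u^2_t$ is a transitory error component for the younger sibling at time t. The variable $a^1_{t-1}$ is the older sibling’s age in the previous period, $X^2$ is a set of observed covariates, and $D^2_t$ is a vector of age dummies for the younger sibling.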
The main parameters of interest, π2 and λ2, capture the direct influence of the older sibling on the younger sibling. We do not attempt to identify the specific mechanisms that underlie them. However, a brief discussion of what these mechanisms might be is in order. The coefficient λ2 on the lagged behavior of the older sibling may reflect information provision about how to get and use the substances of interest, as well as about the consequences of substance use. It also will pick up an effect on the preferences and expectations of the younger sibling about norms of behavior.14 In addition, it will reflect how the parent’s reaction to signals they receive about substance use of the older sibling in t – 1 affects the costs and benefits to substance use by the younger sibling in t.
The coefficient π2 on the contemporaneous behavior of the older sibling may reflect any and all of these mechanisms. However, it also may pick up a direct effect of $y^1_t$ on opportunities for the younger sibling to get and use substances, as stressed by the opportunity hypothesis. This might be particularly important for younger siblings once they are old enough to be seriously at risk to use substances but too young to easily get access to them on their own. Because we assume that the older sibling influences the behavior of the younger sibling but not vice versa, we are implicitly assuming that the younger child observes $y^1_t$ before choosing $y^2_t$.
Below we place restrictions on the distributions of the transitory errors $u^1_t$ and $u^2_t$ over time and across siblings. Without loss of generality, v1 and the corresponding component v2 for the younger sibling are assumed to be uncorrelated.
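To complete the model, we assume that substance use is 0 for all individuals if $a_t \le A_0$, where $A_0$ is the minimum age of initiation. Let $\tau^1$ and $\tau^2$ be the years in which the older sibling and the younger sibling reach age $A_0$. At age $A_0$, the behavior of the older sibling is determined by
$$y^1_{\tau^1} = \mathbf{1}\bigl(X^1\beta^1 + D^1_{\tau^1}\Gamma^1 + \alpha^1\varepsilon + \delta^1 v^1 + u^1_{\tau^1} > 0\bigr) \qquad (5)$$
where $\tau^1$ is the year in which the older sibling reaches $A_0$. On the other hand, the younger sibling’s behavior at $A_0$ depends on the behavior of the older sibling and is given by
$$y^2_{\tau^2} = \mathbf{1}\bigl(\pi^2 y^1_{\tau^2} + \lambda^2 y^1_{\tau^2-1} + \theta^2 a^1_{\tau^2-1} + X^2\beta^2 + D^2_{\tau^2}\Gamma^2 + \alpha^2\varepsilon + \delta^2 v^2 + u^2_{\tau^2} > 0\bigr). \qquad (6)$$
Thus, the behavioral model consists of Equations 3–6.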
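The behavioral model can be illustrated with a short simulation. The sketch below is a simplified stand-in, not the paper's specification: covariates and age dummies are collapsed into a single constant, all factor loadings are set to 1, and every parameter value, the age gap `gap`, and the function name `simulate_pairs` are hypothetical. Each sibling's use is zero before his or her first simulated period, and the younger sibling's index includes the older sibling's current and lagged behavior.

```python
import numpy as np

def simulate_pairs(n_pairs=5000, n_periods=6, gap=2,
                   gamma1=0.8, gamma2=0.8, pi2=0.0, lam2=0.3,
                   sd_eps=0.7, sd_v1=0.5, sd_v2=0.5, const=-0.8, seed=0):
    """Simulate binary substance use for older (y1) and younger (y2) siblings.

    The older sibling starts `gap` periods before the younger one. All
    parameter values are illustrative assumptions, not estimates.
    """
    if gap < 1:
        raise ValueError("gap must be at least 1 so the older sibling has a lag")
    rng = np.random.default_rng(seed)
    eps = sd_eps * rng.standard_normal(n_pairs)   # family-specific component
    v1 = sd_v1 * rng.standard_normal(n_pairs)     # older sibling's component
    v2 = sd_v2 * rng.standard_normal(n_pairs)     # younger sibling's component
    u1 = rng.standard_normal((n_pairs, n_periods + gap))  # transitory errors
    u2 = rng.standard_normal((n_pairs, n_periods))

    # Older sibling: own state dependence only.
    y1 = np.zeros((n_pairs, n_periods + gap), dtype=int)
    for t in range(n_periods + gap):
        lag = y1[:, t - 1] if t > 0 else 0
        y1[:, t] = (const + gamma1 * lag + eps + v1 + u1[:, t]) > 0

    # Younger sibling: own lag plus older sibling's current and lagged behavior.
    y2 = np.zeros((n_pairs, n_periods), dtype=int)
    for s in range(n_periods):
        t = s + gap                                # calendar period
        own_lag = y2[:, s - 1] if s > 0 else 0
        index = (const + gamma2 * own_lag + pi2 * y1[:, t]
                 + lam2 * y1[:, t - 1] + eps + v2 + u2[:, s])
        y2[:, s] = index > 0
    return y1, y2

# Same random draws with and without a lagged sibling effect.
y1_a, y2_a = simulate_pairs(lam2=0.0, seed=1)
y1_b, y2_b = simulate_pairs(lam2=0.5, seed=1)
effect = y2_b.mean() - y2_a.mean()   # change in the younger sibling's mean use
```

Because both runs use the same draws, the older sibling's path is identical across them and `effect` isolates the response of the younger sibling to the sibling-influence parameter. With `lam2=0.0` the siblings' behaviors remain correlated through the shared family component alone.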
The econometric model also contains Equations 3 and 4. But we do not observe behavior at the minimum age of initiation. Consequently, in the econometric model we replace Equations 5 and 6 with approximate equations for the behavior of the older and younger siblings in the year we first observe them. The choice of $y^1_{t^1_0}$ in year $t^1_0$, the first year we observe the older sibling, is determined by
$$y^1_{t^1_0} = \mathbf{1}\bigl(X^1\beta^1_0 + D^1_{t^1_0}\Gamma^1_0 + \alpha^1_0\varepsilon + \delta^1_0 v^1 + u^1_{t^1_0} > 0\bigr). \qquad (7)$$
The corresponding equation for the younger sibling is
$$y^2_{t^2_0} = \mathbf{1}\bigl(\pi^2_0 y^1_{t^2_0} + \lambda^2_0 y^1_{t^2_0-1} + \theta^2_0 a^1_{t^2_0-1} + X^2\beta^2_0 + D^2_{t^2_0}\Gamma^2_0 + \alpha^2_0\varepsilon + \delta^2_0 v^2 + u^2_{t^2_0} > 0\bigr) \qquad (8)$$
where $t^2_0$ is the first year we observe the behavior of the younger sibling.15 From now on, the subscript 0 refers to parameters for the first year of observation. In the above equation, $\pi^2_0$ and $\lambda^2_0$ are the sibling-influence parameters. Each is the sum of two components. The first component of $\pi^2_0$ is the direct effect of $y^1_{t^2_0}$ on $y^2_{t^2_0}$ holding the prior behavior of the younger sibling constant. These components of $\pi^2_0$ and $\lambda^2_0$ are directly analogous to $\pi^2$ and $\lambda^2$, but they will be larger if the influence of older sibling’s behavior is stronger at younger ages. (We allow the sibling influence parameters to depend on age in Section VI.B.) The second components of $\pi^2_0$ and $\lambda^2_0$ capture the continuing influence of the behavior of the older sibling in prior periods on $y^2_{t^2_0}$. This lagged influence will be part of $\pi^2_0$ and $\lambda^2_0$ because $y^1_{t^2_0}$ and $y^1_{t^2_0-1}$ are correlated with lags of the older sibling’s behavior, which influence $y^2_{t^2_0}$ through the state dependence term $\gamma^2 y^2_{t^2_0-1}$. The term $\gamma^2 y^2_{t^2_0-1}$ appears in Equation 4 (when $t = t^2_0$), but it is excluded from Equation 8 because we do not observe it.
The coefficients on the age controls and X also are allowed to differ between the equations for the first observations and the equations for the later periods, both because they may depend on age and because they will pick up lagged influences in Equations 7 and 8. They also differ between the older and younger siblings.
The model generalizes easily to allowing the sibling influence parameters to depend on other observables. Below we experiment with allowing the sibling influence parameters to depend on the gender mix of the sibling pair, on the gap in age between the younger and older sibling, and on whether they share the same residence in models with $\pi^2_0$ and $\pi^2$ constrained to zero. The psychology literature suggests that these parameters could depend on “family process” variables characterizing the relationships between the parent and child and between the siblings. We did not explore this possibility using the joint dynamic probit model because results using the correlated random effects regressions suggest lack of power to identify the interaction effects. (See Section VIII.B.)
We assume the sibling pair specific error component ε is $N(0, \sigma^2_\varepsilon)$. The person specific error components v1 and v2 are $N(0, \sigma^2_{v^1})$ and $N(0, \sigma^2_{v^2})$, respectively. They are independent across siblings. The error components $u^1_t$ and $u^2_t$ are N(0, 1). They are independent across siblings and years. One might expect temporal variation in factors such as stresses within the family (e.g., parental unemployment, marital conflict, parental substance abuse) or variation in access to drugs or alcohol in a neighborhood or in a school to lead $u^1_t$ and $u^2_t$ to covary conditional on ε and thus violate this assumption. We experienced numerical difficulties in estimating a version of the model with a common iid sibling pair specific shock. We strongly suspect that, in order to identify the model with contemporaneous sibling influences without strong restrictions on the error structure, one needs to observe powerful time-varying exogenous determinants of substance use that influence one’s own behavior but not one’s sibling. Unfortunately, we do not observe such variables. We believe that the estimates of the contemporaneous peer influence parameters $\pi^2_0$ and $\pi^2$ will be biased upward if common transitory shocks are important.
In our main specification, we restrict the factor loadings α on the family effect ε and the factor loadings δ on the individual effects v1 and v2 to equal 1 in all equations. Note, however, that we allow the variances of v1 and v2 to differ.16 Our results are not sensitive to replacing the assumption that ε is normally distributed with a three-point discrete distribution. We ran into numerical problems when we attempted to increase the number of points beyond 3.17 In Section VI.G.2, we consider an alternative, less restrictive error specification. We estimate the models by maximum likelihood.18
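To make the error structure concrete, the following minimal sketch computes the cross-sibling correlation of the composite errors implied by the assumptions above (a shared family component, independent person components, unit-variance transitory shocks, and factor loadings of 1) and checks it by simulation. The function name and the standard deviation values are hypothetical and chosen only for illustration; they are not the paper's estimates.

```python
import numpy as np

def composite_error_correlation(sd_eps, sd_v1, sd_v2):
    """Correlation between siblings' composite errors eps + v_j + u_j.

    Assumes loadings of 1 on eps and v_j, Var(u_j) = 1, and mutual
    independence of eps, v1, v2, u1, u2, as in the main specification.
    """
    var1 = sd_eps**2 + sd_v1**2 + 1.0
    var2 = sd_eps**2 + sd_v2**2 + 1.0
    return sd_eps**2 / np.sqrt(var1 * var2)

# Monte Carlo check with illustrative (hypothetical) standard deviations.
rng = np.random.default_rng(0)
n = 200_000
sd_eps, sd_v1, sd_v2 = 0.8, 1.0, 0.9
eps = rng.normal(0.0, sd_eps, n)                       # family effect
e1 = eps + rng.normal(0.0, sd_v1, n) + rng.normal(size=n)  # older sibling
e2 = eps + rng.normal(0.0, sd_v2, n) + rng.normal(size=n)  # younger sibling
simulated = np.corrcoef(e1, e2)[0, 1]
analytic = composite_error_correlation(sd_eps, sd_v1, sd_v2)
print(f"analytic = {analytic:.3f}, simulated = {simulated:.3f}")
```

Under these assumptions, all of the correlation in the composite errors across siblings comes from the family component ε, which is why a larger family variance translates directly into a stronger sibling correlation even without any sibling influence.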
A. Assessing the Potential for Bias from Misspecification of the Equations for
and 
We use the specifications 7 and 8 for
and
for empirical tractability, but they are restrictive given the behavioral model 3–6. First consider Model 7. The nonlinearity of (5) and (3) in combination with state dependence implies that the excluded lagged term
is a complicated nonlinear function of the age of the older sibling, X1, v1, ε, and lagged values of
. The linear, additively separable form we use for the latent index is at best an approximation. Also, we exclude the lagged values of
entirely from Model 7, although this is not a major concern given our assumption that the
and the
terms are uncorrelated across time and siblings.
Similar comments apply to Equation 8 for
. The lag
that appears in Equation 4 but is excluded from Equation 8 is a complicated nonlinear function of the age of the older sibling, the age of the younger sibling,
X2, ε, v2, and lagged values of
. We have excluded the lags of
and use a linear, additively separable form for the other variables. Furthermore, in the presence of both sibling influences and state dependence,
also will depend on X1, v1, and prior values of
. We have excluded these as well.
Our use of separate equations for the first period of observation of a dynamic discrete choice model with random effects may bring to mind the approach to dealing with the endogeneity of the initial observations suggested by Heckman (1981) and Wooldridge (2005) in a single equation context. However, in those papers, the parameters of the equation for the first period of observation are treated as nuisance parameters. Inference about the main parameters of interest is based on the later periods. In contrast, we treat sibling influence parameters
and
as parameters of interest that capture the sum of the direct response of
to
and
and to persistent effects of prior substance use by the older sibling that are correlated with
and
. We do so because it is important to examine the early behavior. However, as we have just discussed, and as is clear from the analyses of Heckman (1981) and Wooldridge (2005), the specification of Equations 7 and 8 imposes linearity and exclusion restrictions that are not consistent with the full behavioral model. This leads to the potential for bias. The hope is that the joint model will handle most of the endogeneity of
in the equation for
arising from the family-specific error component. In practice, the specification may be too restrictive, leading to bias in the sibling influence parameters. The fact that the age of the younger sibling at
varies across sibling pairs is a further worry, because it implies that the parameters governing the first observation on the older and younger sibling may vary, further complicating the error structure.
To address these concerns, for our main specifications we use a four-step simulation procedure to estimate the bias in
, and
.
1. Simulate data on the substance use of the older sibling and the younger sibling from the minimum age of initiation A0 (age 13) forward for the older and younger sibling under the assumption that there is no sibling effect. The simulation model consists of Equations 5 and 6 for the behavior of the older sibling and the younger sibling at age A0 and Equations 3 and 4 for subsequent periods. We set
to
, and the age intercept
to the estimated age intercept corresponding to the youngest age observed for older siblings (age 15). We set
to
, and
to the estimated age intercept corresponding to the youngest age observed for younger siblings (age 15). Similarly, we set the error variances and the coefficients in Equations 3 and 4 to their estimated values. Crucially, we set the sibling influence parameters
, and λ2 to 0. To minimize random variation associated with the simulation while preserving the demographic distribution of the data, we simulate 100 histories for each sibling pair used in the actual estimation.
2. From the simulated data on each sibling pair, select observations for ages that are observed in the NLSY97 and used in estimating the model. This subsample of the simulated data corresponds to the NLSY97 estimation sample but it is generated from a model in which there is no sibling effect.
3. Estimate the joint dynamic probit model on the simulated data. The values of
for the simulated data are our estimates of the bias from misspecification of the equations
.
4. Subtract the bias estimates from the MLE estimates to obtain bias corrected estimates of
, and λ2:
We treat the bias corrected estimates as our preferred estimates.19 The procedure is analogous when we work with our baseline specification, which excludes contemporaneous sibling effects.
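Steps 1 and 2 of the procedure can be sketched as follows. This is a deliberately simplified illustration, not the paper's simulation code: it indexes both siblings by their own age, omits the covariates X and age-specific intercepts, draws one history per pair rather than 100, and uses hypothetical parameter values and function names. Step 3 (re-estimating the joint dynamic probit on the simulated data) and step 4 are not shown.

```python
import numpy as np

def simulate_no_sibling_effect(n_pairs, n_ages, first_obs, gamma1, gamma2,
                               intercept, sd_eps, sd_v1, sd_v2, seed=0):
    """Steps 1-2: simulate histories with the sibling effect set to 0,
    then keep only the ages that would be observed in the survey.

    Illustrative latent index (a simplification of the paper's model):
      y_j*[a] = intercept + gamma_j * y_j[a-1] + eps + v_j + u_j[a]
    with y_j[a] = 1 if y_j*[a] > 0, and no use before the initial age.
    """
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sd_eps, n_pairs)   # family effect, shared by the pair
    v1 = rng.normal(0.0, sd_v1, n_pairs)     # older sibling's individual effect
    v2 = rng.normal(0.0, sd_v2, n_pairs)     # younger sibling's individual effect
    y1 = np.zeros((n_pairs, n_ages), dtype=int)
    y2 = np.zeros((n_pairs, n_ages), dtype=int)
    lag1 = np.zeros(n_pairs)
    lag2 = np.zeros(n_pairs)
    for a in range(n_ages):
        u1 = rng.normal(size=n_pairs)        # transitory shocks, N(0, 1)
        u2 = rng.normal(size=n_pairs)
        y1[:, a] = intercept + gamma1 * lag1 + eps + v1 + u1 > 0
        y2[:, a] = intercept + gamma2 * lag2 + eps + v2 + u2 > 0
        lag1, lag2 = y1[:, a], y2[:, a]
    # Step 2: drop the early ages that the survey does not observe.
    return y1[:, first_obs:], y2[:, first_obs:]

y1, y2 = simulate_no_sibling_effect(
    n_pairs=5000, n_ages=6, first_obs=2, gamma1=0.9, gamma2=0.9,
    intercept=-0.5, sd_eps=0.8, sd_v1=1.0, sd_v2=0.9)
# Siblings remain correlated through eps even though no sibling effect exists.
print(np.corrcoef(y1.mean(axis=1), y2.mean(axis=1))[0, 1])
```

Because the simulated data contain no sibling effect by construction, any nonzero sibling influence estimate obtained from them in step 3 can be attributed to misspecification of the first-period equations rather than to a true effect.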
VI. Results for the Joint Dynamic Probit Model
We now turn to estimates of the joint dynamic probit model. In Section VI.A, we start with our baseline model specification, which assumes that sibling influences operate through
but not
. That is, it restricts the contemporaneous sibling influence parameters
and π2 to 0. It also excludes age interactions. In Section VI.B, we allow the model parameters to depend on age. In Section VI.C, we examine whether the sibling influence parameters depend on the gender mix, on the age difference between the siblings, or on their coresidence. In Section VI.D, we add contemporaneous sibling influences. In Section VI.E, we simulate the effect of an exogenous switch in the behavior of the older sibling from 0 to 1 in period
on the time paths of substance use of the older and younger siblings. In Section VI.F, we use the baseline model to address the fundamental question of how much of the sibling correlation in substance use is causal. Finally, in Section VI.G, we discuss various checks for specification bias and robustness to alternative model specifications.
A. Baseline Results
Table 3 reports the estimates of the main parameters of interest for our baseline specification.20 Column 1 reports the results for smoking cigarettes. The estimates of the state dependence parameters are 0.947 (0.068) for the younger sibling and 0.906 (0.062) for the older sibling. Thus, lagged behavior matters. Dynamic simulations reported below indicate that smoking today raises the probability that the older sibling smokes by 0.508 (0.051) next year and by 0.040 (0.010) two years out relative to the baseline.21
Table 3. Estimates of Dynamic Probit Model
The value of
is 0.746 (0.051). This indicates that a substantial common error component drives the smoking behavior of sibling pairs. We also find an important individual specific error component:
and
are 1.034 and 0.837, respectively. Consequently, temporal correlation in cigarette smoking comes from the influence of the family specific and individual specific error components, as well as from true state dependence.
Next we turn to the sibling influence parameters
and λ2. These are the coefficients on
in the equations for
in the first period of observation
and the subsequent periods respectively. A priori, we would expect both to be positive. We also would expect
to exceed λ2 because we do not condition on
in (8).
is 0.213 (0.102), which is significant at the 5 percent level. Comparing this value to the state dependence term indicates that having the older sibling smoke shifts the latent variable for smoking by about one fifth the amount that smoking in the past does. The coefficient
for subsequent years is 0.054 (0.069), which is positive but not significant.
The fact that we find a stronger sibling effect in the equation for the first period of observation than in the equation for subsequent periods could reflect the fact that
captures influences over more than one period and that the effect is larger at younger ages. However, as we discussed above, the simplified specification of Equations 7 and 8 in the econometric model may lead to bias. Using the simulation procedure outlined in Section V.A, we estimate the bias in
to be 0.063, which is 29.6 percent of the estimate. The bias in λ2 is small and negative in sign (–0.013). We subtract these values from
and
to arrive at the bias corrected estimates reported in the rows labelled “Bias-corrected estimate”. The bias adjusted estimate of
is 0.149, but has a p-value of only 0.14.
Column 2 reports results for drinking. We find strong evidence of state dependence, although the own lag coefficients are substantially smaller than for cigarette smoking.
One must keep in mind that the coefficients on the lagged dependent variables should be judged relative to the standard deviation of the composite error, which is smaller for drinking than for smoking. Nevertheless, the dynamic simulations reported in Web Appendix Table 9 indicate that state dependence is indeed a bit weaker for drinking.
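As a rough illustration of this scaling, the sketch below divides a lag coefficient by the standard deviation of the composite error, assuming unit factor loadings and a unit-variance transitory shock. The function name is hypothetical, and the inputs are placeholders rather than values taken from Table 3.

```python
import math

def scaled_coefficient(coef, sd_family, sd_individual):
    """Coefficient in units of the composite error's standard deviation.

    The composite error is eps + v + u with Var(u) = 1 and independent
    components, so its variance is sd_family**2 + sd_individual**2 + 1.
    """
    return coef / math.sqrt(sd_family**2 + sd_individual**2 + 1.0)

# Same raw coefficient, different error dispersion (placeholder numbers).
print(scaled_coefficient(0.9, sd_family=0.8, sd_individual=1.0))
print(scaled_coefficient(0.9, sd_family=0.5, sd_individual=0.6))
```

A given raw coefficient implies a larger shift in the probability of use when the composite error is less dispersed, which is why raw coefficients for different substances are not directly comparable.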
The sibling influence parameter
is 0.405 and is highly significant. The estimate of λ2 is close to zero and insignificant. The results suggest that siblings have a substantial influence at early ages but not later. This makes some intuitive sense, but we expected less of a difference between
and
. Using the simulation procedure we estimate the bias in these parameters to be 0.060 and -0.003. The bias corrected estimate of
is 0.345.
Column 3 reports estimates for marijuana. They are very similar to the results for drinking. We find strong evidence of a sibling effect on
. However, the point estimate of λ2 is actually negative although it is not significant. We also find substantial state dependence and an important role for both family and individual heterogeneity. The bias correction reduces
by about 12 percent, from 0.290 to 0.255. It slightly increases
.
Column 4 reports results for the use of hard drugs. Qualitatively, the results are similar to the results for drinking and marijuana use. The point estimates suggest a considerable sibling influence but they are not statistically significant. Correcting for bias makes little difference.
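The bias corrections quoted in this subsection follow from simple subtraction, which the short sketch below reproduces from the rounded figures given in the text. The helper name is hypothetical. With rounded inputs the smoking first-period estimate comes out at 0.150 rather than the reported 0.149, presumably because the reported value is computed from unrounded estimates.

```python
def bias_corrected(mle, bias):
    """Step 4 of the simulation procedure: subtract the estimated bias."""
    return mle - bias

# (MLE estimate, simulated bias) as reported in the text for Table 3.
reported = {
    "smoking, first period": (0.213, 0.063),
    "smoking, later periods": (0.054, -0.013),
    "drinking, first period": (0.405, 0.060),
}
for label, (mle, bias) in reported.items():
    corrected = bias_corrected(mle, bias)
    print(f"{label}: corrected = {corrected:.3f}, bias/estimate = {bias / mle:.1%}")
```

Note that a negative bias estimate, as for the later-period smoking coefficient, raises the corrected estimate above the MLE value.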
B. Allowing Model Parameters to Depend on Age
Columns 5–8 of Table 3 report results after we replace
with the function
and we replace λ2 with the function
.22 The interaction terms are normed so that the main effects of
on the younger sibling refer to the effects at age 15. We also add linear interactions between
and elements of X1 in Equations 7 and 3 and add linear interactions between
and elements of X2 in Equations 8 and 4. One would expect age interactions to be particularly important in (7) and (8).
For all four substances, the point estimate of the main effect of
is larger than the corresponding values in the basic model. The effects decline somewhat with age, although the interaction terms are not statistically significant. However, it is still the case that the sibling effects are larger for
than in the subsequent periods. We have not computed bias corrections using the simulation procedure in Section V.A, but suspect that they would be similar to those we obtain in the model without age interactions. For the most part, the state-dependence parameters and sibling-effect parameters are not very sensitive to the addition of the interaction terms and so, going forward, we focus on the models without the interaction terms.
C. Allowing Sibling Effects to Depend on the Gender Mix, Age Gap and Co-residence
As we noted earlier, the psychology literature (and common sense) might lead one to expect the strength of the peer influence to depend on the gender mix of the siblings. Web Appendix Table 6 reports estimates for a specification that allows
in Equation 8 and λ2 in Equation 4 to be different for brother pairs, sister pairs, and mixed sex pairs. We set the contemporaneous sibling influence terms
and π2 to 0. For smoking and marijuana use, the sibling influence parameters are substantially larger for sister pairs. However, the standard errors are relatively large.23
We also estimated models in which we allow the sibling influence to depend upon whether the siblings are more than two years apart. On the one hand, siblings who are close in age may spend more time together and have a closer bond. On the other hand, the difference between the younger and the older siblings in the degree of access to alcohol, marijuana, and other drugs may increase with the age gap, thus increasing the impact of the older sibling even when the age of the older sibling and the younger sibling are held constant. Furthermore, with a wider age gap, the assumption that older siblings influence younger siblings but not vice versa is more likely to be true.24 For smoking, drinking, and marijuana use, the point estimates indicate that sibling effects are larger for sibling pairs with an age separation of more than two years, but the parameter estimates are imprecise. (See Web Appendix Table 7.)
Finally, we estimated a specification of our model in which we allowed the sibling effect to depend on whether the siblings reside in the same household. Indeed, as our sample ages, it is likely that one or both siblings move out of the parental home. This would lead to reduced contacts between siblings, which presumably would reduce the strength of the sibling influence. On the other hand, the older sibling’s access to substances might increase, which could strengthen the effect. We report the results of this specification in Web Appendix Table 8. Unfortunately, the results are imprecise.25
D. Model Estimates with Both Contemporaneous and Lagged Sibling Effects
Columns 1–4 of Table 4 present the estimates of the joint dynamic probit model allowing for both contemporaneous and lagged sibling effects. They can be compared to Columns 1–4 of Table 3. Overall, the sibling effects are stronger when we allow the older sibling’s behavior to affect the younger sibling both in the same period and in the next one, although the concern about potential upward bias from correlated transitory shocks bears repeating. Correcting for bias from misspecification of the equations for
and
using the simulation procedure reduces the estimates, as was the case in Table 3 for the model without contemporaneous effects. However, the reductions tend to be smaller.
Table 4. Estimates of Dynamic Probit Model with Past and Contemporaneous Sibling Effects
Whether the contemporaneous or the lagged sibling effect is stronger depends on the particular outcome, however. For smoking (Column 1), the bias corrected lagged sibling effect
is 0.11 and insignificant while the bias corrected contemporaneous effect
is more than twice as large and significantly different from zero. In contrast, for drinking (Column 2), most of the sibling effect seems to operate with a lag, while for marijuana (Column 3), the older sibling’s behavior has a significant influence on the younger sibling’s behavior both contemporaneously and with a lag. The sibling effects are larger when
than at the later ages, with
consistently larger than λ2 and
larger than π2 in all cases except hard drugs (where sampling error is particularly large). Adding the contemporaneous sibling effects leads to some changes in the point estimates of
and λ2, but does not systematically increase or decrease them. The estimates of the unobserved heterogeneity and state-dependence parameters are not very sensitive to the inclusion of the contemporaneous sibling effect in the model.
E. The Dynamic Response to the Older Sibling’s Substance Use
1. Model with lagged sibling effects
The estimates of the parameters of the dynamic probit model refer to effects on the latent variable index rather than to effects on the probability of substance use. Furthermore, they do not provide a quantitative sense of how persistent the effects are. To address these issues, we use the bias corrected estimates of the baseline model to simulate the effect of an exogenous switch in the behavior of the older sibling from 0 to 1 in period
on the time paths of substance use of both the older and younger siblings.26 In Section VI.E.2, we simulate the models with both contemporaneous and lagged sibling effects.
Figure 1A presents the results for smoking using the model in the first column of Table 3. The vertical axis measures the change in behavior relative to the baseline probability. The horizontal axis measures the time period relative to
, so 0 corresponds to
. Web Appendix Tables 9A and 9B report point estimates and standard errors. These are based on a parametric bootstrap method.27
Figure 1. Effect of Shifting the Older Sibling’s Probability of Substance Use From 0 to 1 on the Older and Younger Siblings’ Probabilities of Behavior, Relative to Baseline (Based on Dynamic Probit Model)
Note: The solid line and the broken line represent the effects on the probabilities of behavior, relative to baseline, of the older sibling and of the younger sibling, respectively. Error bars show the 90% confidence intervals. The x-axis measures the number of periods after the exogenous change in the older sibling’s behavior. Baseline probabilities for smoking in the first and last period displayed on the graphs are, respectively, 0.4091 (0.0116) and 0.4269 (0.0135) for the older sibling, and 0.3311 (0.0118) and 0.3860 (0.0309) for the younger sibling. For drinking, they are 0.5699 (0.0104) and 0.7227 (0.0113) for the older sibling and 0.4486 (0.0128) and 0.5860 (0.0290) for the younger sibling. For smoking marijuana, they are 0.2482 (0.0101) and 0.1968 (0.0098) for the older sibling and 0.2067 (0.0098) and 0.2814 (0.0343) for the younger sibling. We display the confidence interval bars for the responses of the older sibling and younger sibling slightly to the left and slightly to the right of the number of periods after the exogenous change in the older sibling’s behavior.
We begin with the older sibling’s response. The solid line in the graph reports the effect of exogenously switching
from 0 to 1 on the time path of the average value of
, relative to the baseline average for
. The vertical bars represent 90 percent confidence bands.28 To be more specific, for each older sibling, we first set
to 1 in
, simulate forward, and take the average of
for the values of t reported at the top of each column of Web Appendix Table 9A and on the horizontal axis of Figure 1A. We repeat the procedure with
set to 0 in
, take the difference in the two averages for each value of t, and then divide by the baseline value in the top row of Web Appendix Table 9. One can see that the exogenous change in smoking behavior from 0 to 1 in
raises the probability of smoking one year later by 0.508 (0.047) of the baseline value (0.409). The effect is 13.6 percent of the baseline value two years later and essentially dies out after four periods.
The broken line in the graph displays the effect on the time path of
, relative to the baseline average of
, of a one-time exogenous shift in the smoking behavior of the older sibling from 0 to 1 in
, with the distribution of the future behavior of the older siblings unaffected. Smoking among older siblings increases smoking among younger siblings in
by 0.10 (0.077) of the baseline value.29 This is 19.7 percent of the effect of the older sibling’s behavior in
on his own behavior in the next period. The value is 0.025 (0.017) in the second period. The effect on the probability that the younger sibling smokes relative to baseline is essentially zero after three years.30
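The simulation logic described above (fix the older sibling's behavior at 1, simulate forward, repeat with it fixed at 0, difference the averages, and scale by a baseline) can be sketched as below. This is a stripped-down illustration under several assumptions that are not from the paper: no covariates or age effects, hypothetical parameter values and names, a period-specific baseline in the denominator, and a younger sibling response computed only for the first period, where the equation contains no own lag. The same random draws are reused across the two scenarios so that the difference reflects only the intervention.

```python
import numpy as np

def relative_responses(gamma1, lam_first, intercept1, intercept2,
                       sd_eps, sd_v1, sd_v2, horizon=4, n=100_000, seed=0):
    """Effect of switching the older sibling's use in period 0 from 0 to 1.

    Returns (older, younger): `older` holds the change in the older
    sibling's probability of use in periods 1..horizon, and `younger` the
    change in the younger sibling's probability in period 1, each divided
    by the corresponding baseline probability.
    """
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sd_eps, n)            # family effect
    het1 = eps + rng.normal(0.0, sd_v1, n)      # older sibling heterogeneity
    het2 = eps + rng.normal(0.0, sd_v2, n)      # younger sibling heterogeneity
    u1 = rng.normal(size=(horizon + 1, n))      # older sibling's shocks
    u2 = rng.normal(size=n)                     # younger sibling's shock

    def older_path(y1_period0):
        """Mean use by the older sibling in periods 1..horizon."""
        lag, means = y1_period0, []
        for t in range(1, horizon + 1):
            lag = (intercept1 + gamma1 * lag + het1 + u1[t] > 0).astype(float)
            means.append(lag.mean())
        return np.array(means)

    def younger_first(y1_period0):
        """Mean use by the younger sibling in period 1 (no own lag)."""
        return (intercept2 + lam_first * y1_period0 + het2 + u2 > 0).mean()

    # Baseline: period-0 behavior drawn from the model itself.
    y1_base = (intercept1 + het1 + u1[0] > 0).astype(float)
    ones, zeros = np.ones(n), np.zeros(n)
    older = (older_path(ones) - older_path(zeros)) / older_path(y1_base)
    younger = (younger_first(ones) - younger_first(zeros)) / younger_first(y1_base)
    return older, younger

older, younger = relative_responses(
    gamma1=0.9, lam_first=0.15, intercept1=-0.4, intercept2=-0.6,
    sd_eps=0.8, sd_v1=1.0, sd_v2=0.9)
print(np.round(older, 3), round(float(younger), 3))
```

In this sketch the older sibling's response decays because state dependence is the only channel carrying the shock forward, which mirrors the qualitative pattern of a large first-year effect that dies out within a few periods.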
Figure 1B displays simulations for drinking using the bias corrected point estimates in Table 3, Column 2. For the older sibling, drinking last year raises the probability of drinking this year by 0.296 (0.031) of the baseline value, which is 0.570. After three periods, the effect is only 0.011 (0.003) of the baseline value. An exogenous change in the drinking behavior of the older sibling in
increases drinking among younger siblings by 0.217 (0.057) of the baseline probability (0.449). The effect on the younger sibling is essentially zero after three periods.
Figure 1C shows that marijuana use by the older sibling in
increases the probability that the older sibling uses marijuana one year later by 0.649 (0.078) of the baseline probability of 0.248. The effect on the older sibling’s behavior is 0.033 (0.010) three years later and close to 0 after that. The bias corrected estimates imply that a one-time exogenous shift in the smoking behavior of the older sibling from 0 to 1 in
increases the probability that the younger sibling uses marijuana in
by 0.236 (0.091) of the baseline value. The effect on the younger sibling is 0.010 after two periods.
Web Appendix Table 10 reports the results of dynamic simulations based on the model with age interactions. We are particularly interested in how the responses depend on the age of the younger sibling at the time of the shock. Because response parameters depend not only on age but also on whether the shock occurs in
or a later period, we first contrast the response using the sample of younger siblings who were 15 at
with the response of those who were 17 at
. In the case of cigarettes, the results show that the effect on behavior at
is substantially larger for 15-year-olds than 17-year-olds: 0.163 versus 0.109. This is also true for alcohol and marijuana. We have not attempted to use simulation to estimate the bias in the sibling influence parameters in the model with age interactions. We suspect the sibling effect estimates for cigarettes are biased upward by perhaps 30 percent and that the bias is small for alcohol and marijuana. For all three outcomes, the effects are essentially zero after three periods.31
Overall, the effects of substance use by the older sibling in one period on the younger sibling are substantial but they die out fairly quickly. It is important to note that most of our parameter estimates indicate that the peer influence is biggest in the equation for the first observation on the younger sibling. For this reason, when we simulate the average effect of exogenously shifting the behavior of the older sibling from no substance use in all periods to substance use in all periods, we find only modest effects on the behavior of the younger sibling for
.
2. Model with both contemporaneous and lagged sibling effects
To assess the dynamic impact of the contemporaneous and lagged sibling effects, we used the bias corrected estimates to perform three different counterfactual simulations. First, we exogenously switch the behavior of the older sibling from 0 to 1 in period
. Second, we exogenously switch the behavior of the older sibling from 0 to 1 in period
. Finally, we switch the behavior of the older sibling from 0 to 1 in both
.
Table 5 reports the results of this exercise for smoking, drinking, and smoking marijuana. Looking at smoking first, we see that an exogenous shift in smoking behavior among older siblings from 0 to 1 at
leads to an increase in the probability of the younger sibling smoking by 0.072 at
and this effect dies out quickly. In contrast, shifting the older sibling’s smoking at
raises the probability that the younger sibling smokes by 0.198 immediately, by 0.112 in the next period, and by 0.033 two periods later. This persistence is higher than what we found in the model with only a lagged effect, in part because λ2, the sibling effect of
when t > tmin, is larger than in the specification with only the lagged effect (0.122 versus 0.068). Finally, shifting the older sibling’s smoking from 0 to 1 in both
and
increases the probability that the younger sibling smokes in
by 0.272 and by 0.131 at
, which is approximately equal to the sum of the effects of shifting
and of shifting
separately.
Table 5. Bias Corrected Effect of Shifting the Older Sibling’s Probability of Behavior from 0 to 1 on the Younger Sibling’s Probabilities of Behavior Relative to Baseline (Based on Dynamic Probit Model with both Contemporaneous and Lagged Sibling Effects)
The second panel reports the simulation results for drinking. In line with the estimates discussed above, for drinking, the sibling effect mostly operates with a lag. This can be seen in the table, which indicates that shifting the older sibling’s drinking behavior from 0 to 1 in
raises the probability that the younger sibling drinks at
by 0.224, whereas shifting the older sibling’s drinking behavior from 0 to 1 at
only raises the probability that the younger sibling drinks in the same period by 0.056.
Finally, the results for marijuana reveal a substantial effect of the older sibling’s behavior both contemporaneously and in the next period. Shifting the older sibling’s behavior from 0 to 1 in
raises the probability that the younger sibling smokes marijuana in
by 0.196. Shifting the older sibling’s behavior in
raises this probability by 0.357 of the baseline probability (0.21). These effects fade very quickly, however, and are essentially zero after two periods.
F. The Relative Contribution of Sibling Effects and Common Influences to Sibling Correlations in Substance Use
The fact that our estimates imply that younger siblings’ behavior is relatively insensitive to whether or not the older sibling consumes the substance in all periods suggests that only a small part of the large sibling links in substance use reported in Table 2 is causal. To quantify this, we simulated data from our model using the bias corrected parameter estimates for the baseline model (Table 3, Columns 1–4) and then used the simulated data to estimate the parameter ρ in the descriptive Regression 1. Recall that (1) relates the behavior of the younger sibling at age a to the behavior of the older sibling at the same age. Next, we performed a similar simulation, this time setting
and λ2 to zero and all the other parameters to their estimated values.
Results of this exercise are reported in Table 6. For ease of comparison, we present ρ̂ based on the actual data in Column 1. Columns 3 and 4 report estimates of ρ based on data from a simulation in which the peer influence parameters are set to their bias corrected values and to zero, respectively. Column 5 reports the difference between Columns 2 and 4 divided by Column 2. This is the fraction of the sibling link ρ that is due to peer influence. In the case of smoking, the point estimate is 0.082 (0.065). The corresponding fractions are 0.046 (0.034) for alcohol and 0.001 (0.087) for marijuana. The relatively large standard error estimates reflect the difficulty of estimating a ratio, particularly when ρ is small.32
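The decomposition just described amounts to comparing the sibling link estimated on data simulated with and without the sibling effect. A minimal sketch of that calculation follows. It is an assumption-laden simplification: the slope is computed from a regression with an intercept but none of the controls in Regression 1, the function names are hypothetical, and the ρ values in the usage lines are made-up placeholders rather than entries from Table 6.

```python
import numpy as np

def ols_slope(y_younger, y_older):
    """Slope from regressing the younger sibling's outcome on the older
    sibling's outcome (with an intercept, no other controls)."""
    y_younger = np.asarray(y_younger, dtype=float)
    y_older = np.asarray(y_older, dtype=float)
    return np.cov(y_older, y_younger)[0, 1] / np.var(y_older, ddof=1)

def causal_fraction(rho_with_effect, rho_without_effect):
    """Share of the sibling link attributable to the sibling effect:
    (rho with effect - rho without effect) / rho with effect."""
    return (rho_with_effect - rho_without_effect) / rho_with_effect

# Placeholder values: rho from simulations with and without sibling effects.
rho_with, rho_without = 0.25, 0.23
print(f"fraction due to sibling influence: {causal_fraction(rho_with, rho_without):.3f}")
```

Because the fraction is a ratio with ρ in the denominator, small sampling changes in ρ move it considerably when ρ itself is small, which is consistent with the wide standard errors noted in the text.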
Table 6. Proportion of the Correlation Between Siblings’ Behavior Explained by the Direct Effect (Based on the Joint Dynamic Probit Model)
G. Checks for Specification Bias and Robustness to Alternative Model Specifications
1. Bias if the younger sibling influences the older sibling
If the younger sibling positively influences the behavior of the older sibling, then we are likely to underestimate the sibling effect, although in the presence of state dependence the implications of reverse causality are not entirely transparent. An influence of the younger sibling on the older sibling will tend to increase the strength of the link between future values of y1 and past values of y2. This will raise the importance of the family effect relative to the sibling influence parameter as the econometric explanation for the strong correlation between y1 and y2 in econometric models that assume that the sibling influence goes in only one direction. We expect this will lead to underestimation of the direct sibling influence.
Simulations support this intuition. We simulated data from the joint dynamic probit model for smoking reported in Table 3, Column 1 after adding a term that allows the younger sibling to positively influence the older sibling. We set the coefficient to a positive value. All other parameters were set to the estimates of the dynamic probit model for smoking reported in the table. We then used the simulated data to estimate the dynamic probit model with the parameter governing influence of the younger sibling on the older sibling set to 0. As expected, the estimates of
and λ2 decline when data come from a model in which the younger sibling influences the older sibling.
2. Robustness to an alternative error specification
Web Appendix Table 11 reports estimates using an alternative error specification, which allows the factor loadings associated with ε, v1 and v2 to differ between the younger and older siblings and to differ between the first observation and the subsequent periods. To be more specific, under this alternative specification, the family effect ε enters (7), (8), (3) and (4) with the factor loadings 1,
, and α2 respectively, v1 enters (7) and (3) with the factor loadings 1 and δ1 respectively, and v2 enters (8) and (4) with the factor loadings 1 and δ2 respectively. We restrict the variance of v1 and v2 to be the same across siblings. We have not used the simulation procedure to correct for bias. For hard drugs, we have difficulty identifying the separate roles of family heterogeneity and individual heterogeneity when we use the less restricted version, and the estimates of the sibling influence parameters λ2 and
tend to be noisier. The results for alcohol and marijuana are similar to those in Table 3 and show strong evidence of a sibling influence. In the case of cigarettes, the sibling coefficient in the first period falls while the sibling coefficient for subsequent periods rises to 0.12, although neither is statistically significant.33
3. Robustness to allowing for gateway drugs
There is considerable policy interest in the idea that cigarettes may be a gateway to marijuana, alcohol a gateway to marijuana, marijuana a gateway to hard drugs, and so on. Policies to control marijuana are justified in part by a concern that it leads to hard drug use. The idea of gateway drugs seems plausible although the patterns of causal influence among substances are not well established.34
We experimented with extended versions of the joint dynamic probit framework that jointly model pairs of drugs. Our motivation is primarily to check the robustness of our findings rather than to test the gateway drug hypothesis. Consider the case in which cigarettes are a gateway drug for marijuana. The model consists of equations for cigarette smoking that have the same form as the model above. After the first period, marijuana use depends on lagged cigarette use as well as lagged marijuana use. An implication of the model is that cigarette smoking by the older sibling can influence marijuana use by the younger sibling through its effect on marijuana use by the older sibling. We estimated models with smoking as the gateway drug for marijuana, drinking for marijuana, smoking for hard drugs, drinking for hard drugs, and marijuana for hard drugs. We present the model in Web Appendix B and model estimates in Web Appendix Tables 13 (main error specification) and 14 (alternative error specification described in Section VI.G.2). We have attempted to correct for bias using the simulation procedure.
The results are exploratory but may be summarized as follows. First, the joint models indicate state dependence for both the gateway substance and the paired substance that is fully consistent with the results when we examine each substance in isolation. Second, family heterogeneity and individual heterogeneity are important and contribute to the correlation in substance use across siblings and across substances for each sibling. Third, the effect of the gateway drug on future consumption of the paired drug is never significantly positive although we could not rule out modest positive effects. Fourth, and most importantly for present purposes, the estimates of the sibling influence parameters are generally a bit larger than the corresponding estimates obtained when we model the substances separately.
VII. A Joint Dynamic Ordered Probit Model with Many Categories
The degree of state dependence and the strength of the peer influence are likely to depend on the amount of substance use. In this section, we propose and implement a simple way to allow for flexible forms of nonlinearity in the dynamic behavior of substance use and sibling influence while continuing to allow for sibling pair effects and individual effects. Expanding from two categories (0,1) to an arbitrary number of categories, the equations for the latent variables
and
become





with thresholds $q_1, q_2, \ldots, q_{M-1}$, where $M$ is the number of categories and category $m = 1$ corresponds to zero consumption. The equation for
is identical to (5). The equation for
is
(9)
The values of
are determined by the indicator function
for m= 1, by
for 2 ≤ m ≤ M − 1, and
for m=M. The variable
is determined by
in the same fashion. For simplicity of presentation and in anticipation of the empirical work, we have excluded the contemporaneous variables
, from the equations for
. The state dependence parameters
and
should increase with m if the positive influences of habit, social connections, and information on the propensity to engage in substance use are increasing in the quantity consumed in the previous period. If the size of the sibling influence is positive and increasing in the intensity of the older sibling’s behavior, then the sibling influence parameters
and
will be increasing in m.
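The mapping from a latent variable to consumption categories and category indicators can be sketched as follows. This is a minimal illustration, not the authors' code: the threshold values are hypothetical, and the convention that a latent value exactly equal to a threshold falls in the lower category is an assumption.

```python
import numpy as np

def to_category(y_star, thresholds):
    """Map latent values to categories 1..M given increasing thresholds q_1..q_{M-1}.

    Category 1 (zero consumption) is y* <= q_1; category M is y* > q_{M-1}.
    """
    q = np.asarray(thresholds, dtype=float)
    # side="left" counts the thresholds strictly below y*, so y* == q_m stays in category m
    return np.searchsorted(q, np.asarray(y_star, dtype=float), side="left") + 1

def category_dummies(categories, M):
    """Indicator columns for m = 2..M (category 1 is the omitted reference)."""
    cats = np.asarray(categories)
    return (cats[:, None] == np.arange(2, M + 1)[None, :]).astype(int)

# Hypothetical thresholds for M = 5 categories (illustrative values only)
q = [0.0, 0.8, 1.5, 2.1]
y_star = np.array([-0.3, 0.0, 0.5, 1.7, 3.0])
cats = to_category(y_star, q)
dummies = category_dummies(cats, M=5)
print(cats)
print(dummies)
```

Each lagged category indicator would then receive its own state dependence or sibling influence coefficient, which is what allows the effect of past use to vary with its intensity.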
By including a large number of groups, one can accommodate an arbitrary nonlinear relationship between substance use today and own past substance use, as well as past substance use by the older sibling. Of course, with a large number of categories, freely estimating the thresholds q along with the γ and λ parameters would be hopeless without a very large sample. However, one can restrict these parameters to lie on a flexible but relatively parsimonious function, such as a linear spline with a number of break points less than M.
Note that Fortin and Lemieux (1998) proposed the use of an ordered probit model with a large number of categories as a way to allow for arbitrary nonlinearity in the link between a continuous variable (the wage in their case) and a latent variable. The latent variable is an index of regressors plus an error term. Essentially, the ordered probit model provides a model of the probability that the continuous variable falls in a particular interval. Our model is very different in that it involves a system of equations with dynamics and unobserved heterogeneity. Basically, we are combining the idea of specifying the category-specific parameters parsimoniously with the idea of using the ordered categorical response model with a large number of categories as the specification for the link function relating y to the observed and unobserved variables that determine the latent variable y*. Our approach seems well-suited to the estimation of nonlinear dynamic panel data models with unobserved heterogeneity.
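The statement that an ordered probit models the probability of falling in a particular interval can be illustrated with a short calculation. The sketch below assumes a standard normal error conditional on the index (which would include any heterogeneity terms); the index value and thresholds are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probs(index, thresholds):
    """P(category m) = Phi(q_m - index) - Phi(q_{m-1} - index), with q_0 = -inf and q_M = +inf."""
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    return [norm_cdf(hi - index) - norm_cdf(lo - index)
            for lo, hi in zip(cuts[:-1], cuts[1:])]

# Hypothetical index and thresholds for a five-category model
probs = category_probs(index=0.4, thresholds=[0.0, 0.8, 1.5, 2.1])
print([round(p, 4) for p in probs])
```

A higher index shifts probability mass from the low categories toward the high ones, while the thresholds determine how wide each interval is on the latent scale.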
We have estimated models for days of use of cigarettes, alcohol, and marijuana last month using five categories (M = 5). For all three substances, the categories are 0, 1–7, 8–14, 15–22, and 23–30 days of use.35 We restrict the γ and λ parameters to lie on a linear spline with two breakpoints in slope but leave the thresholds $q_m$, $m = 1, \ldots, M - 1$, free. The changes in slope occur at 8 and 18 days.36 We use the simulation procedure described in Section V.A to quantify the bias that might arise from treatment of the first period. The only difference is that here we use the ordered probit specification rather than the probit specification.37
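The spline restriction can be sketched as follows. The breakpoints at 8 and 18 days come from the text; everything else is an assumption made for illustration, in particular evaluating the spline at category midpoints, normalizing it to zero at zero days, and the coefficient values themselves.

```python
import numpy as np

KNOTS = (8.0, 18.0)                              # slope changes, in days per month (from the text)
MIDPOINTS = np.array([4.0, 11.0, 18.5, 26.5])    # assumed representative days for m = 2..5

def spline_basis(days, knots=KNOTS):
    """Columns: d, max(d - 8, 0), max(d - 18, 0); the spline equals 0 at d = 0."""
    d = np.asarray(days, dtype=float)
    return np.column_stack([d] + [np.maximum(d - k, 0.0) for k in knots])

def implied_params(b, days=MIDPOINTS):
    """Category-specific parameters implied by spline coefficients b = (b1, b2, b3)."""
    return spline_basis(days) @ np.asarray(b, dtype=float)

# Hypothetical spline coefficients, not estimates from the paper
gamma = implied_params([0.10, -0.04, -0.03])
print(np.round(gamma, 3))
```

With three spline coefficients generating four category-specific parameters, the restriction saves one parameter per coefficient vector here; the saving grows quickly as the number of categories increases.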
The estimates of the parameters of the spline functions and the variance parameters are in Web Appendix Table 15. We also report the bias corrected estimates of the sibling influence parameters, which form the basis of the discussion that follows. The spline function estimates are used to compute the implied values of the state dependence parameters (γ) and sibling influence parameters (λ) for m= 2 to 5 (Table 7). In keeping with the binary probit results, we find that both family heterogeneity and individual heterogeneity are important for all three outcomes. We also find substantial state dependence for both the older sibling and the younger sibling. The 24 state dependence parameters are all positive. They also are increasing in m in all cases, with the exception that
is less than
for drinking.
Table 7. State Dependence and Sibling Effect Parameters Implied from the Estimates of the Joint Dynamic Ordered Probit Model with Five Consumption Categories
The bias corrected sibling influence parameters
are all positive with the exception of category 5 for marijuana. The size of
increases in m through category 4 in the cases of smoking and drinking and through category 3 in the case of marijuana, but the effects are smaller for the highest category for all three outcomes. Many of the point estimates of
are substantial relative to the corresponding values of
, the state dependence for the older sibling. However, the standard errors of the
are large, which is the price that is paid for the five-category model. After the bias correction, the estimates are statistically significant at the 10 percent level or better in only 3 of the 12 cases. The sibling effect in later periods is smaller, and the point estimates are negative in several cases. The ordered probit findings are broadly consistent with the binary case, but the bias correction makes more of a difference, particularly for drinking.
Figure 2 graphs simulations of the effect of exogenously shifting the older sibling’s substance use in
from 0 to one of the two highest categories (with equal probability) on substance use in subsequent periods.38 The effects are relative to the baseline probabilities. Panels A–D refer to cigarettes. One period later, the shift raises the low consumption (1–7 days) probability for the older sibling by 0.24 and the high consumption (23–30 days) probability by 0.55 relative to the baseline averages. The effects become very close to 0 after four periods. The shift in the older sibling’s behavior increases the probability that the younger sibling is in the low consumption category by 0.15 (0.059) relative to baseline. It increases the probability that the younger sibling is in the high consumption category by 0.33 (0.145) relative to baseline, and the effects increase monotonically across the consumption categories. The sibling effects are very small after two years.
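The logic of such a simulation can be sketched in a deliberately simplified form. The code below uses a binary (not five-category) dynamic probit for a sibling pair, imposes the older sibling's initial behavior at 1 or at 0, simulates forward with common shocks, and divides the difference in mean use by the baseline mean. All parameter values and function names are hypothetical, and the sketch omits covariates, age effects, and the bias correction.

```python
import numpy as np

def simulate(n_pairs, T, gamma1, gamma2, lam2, const, y1_start, seed=0):
    """Mean use by period for older (y1) and younger (y2) siblings.

    y1_start=None keeps the model-implied initial behavior of the older sibling
    (baseline); y1_start=0 or 1 imposes it exogenously for every pair.
    """
    rng = np.random.default_rng(seed)        # same seed => same shocks in every scenario
    eps = rng.normal(size=n_pairs)           # family effect shared by the pair
    v1 = 0.5 * rng.normal(size=n_pairs)      # older sibling's individual effect
    v2 = 0.5 * rng.normal(size=n_pairs)      # younger sibling's individual effect
    y1 = (const + eps + v1 + rng.normal(size=n_pairs) > 0).astype(float)
    if y1_start is not None:
        y1 = np.full(n_pairs, float(y1_start))
    y2 = np.zeros(n_pairs)                   # younger sibling starts as a non-user
    mean1, mean2 = [], []
    for _ in range(T):
        u1 = rng.normal(size=n_pairs)
        u2 = rng.normal(size=n_pairs)
        # Both updates use last period's behavior; the younger sibling responds
        # to the older sibling's lagged use through lam2.
        y2_next = (const + gamma2 * y2 + lam2 * y1 + eps + v2 + u2 > 0).astype(float)
        y1_next = (const + gamma1 * y1 + eps + v1 + u1 > 0).astype(float)
        y1, y2 = y1_next, y2_next
        mean1.append(y1.mean())
        mean2.append(y2.mean())
    return np.array(mean1), np.array(mean2)

# Hypothetical parameter values, chosen only to make the example run
pars = dict(n_pairs=50_000, T=4, gamma1=1.0, gamma2=1.0, lam2=0.3, const=-1.0)
base1, base2 = simulate(**pars, y1_start=None)
on1, on2 = simulate(**pars, y1_start=1)
off1, off2 = simulate(**pars, y1_start=0)

rel_older = (on1 - off1) / base1      # effect relative to baseline, by period
rel_younger = (on2 - off2) / base2
print(np.round(rel_older, 3))
print(np.round(rel_younger, 3))
```

Reusing the same random draws in all three scenarios means that differences across scenarios reflect only the imposed shift, not simulation noise.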
Figure 2. Effect of Shifting the Older Sibling’s Probability of Substance Use from 0 to One of the Two Highest Categories (with Equal Probability) on the Older and Younger Siblings’ Probabilities of Consumption, Relative to Baseline (Based on Ordered Probit Model with 5 Categories)
Note: The solid line and the broken line represent the effects on the probabilities of behavior, relative to baseline, of the older sibling and of the younger sibling, respectively. Error bars show the 90% confidence intervals. The x-axis measures the number of periods after the exogenous change in the older sibling’s behavior. Baseline probabilities for smoking on 1–7 days a month in the first and last period shown in the graphs are: 0.0860 (0.004) and 0.0892 (0.004) for the older sibling and 0.0946 (0.004) and 0.0927 (0.005) for the younger sibling. In the same order, for smoking on 8–14 days: 0.0203 (0.001), 0.0216 (0.001), 0.0207 (0.001), and 0.0227 (0.002). For smoking on 15–22 days: 0.0320 (0.003), 0.0348 (0.003), 0.0310 (0.003), 0.0364 (0.004). For smoking on 23–30 days: 0.1914 (0.009), 0.2358 (0.010), 0.1153 (0.007), 0.2444 (0.035). For drinking on 1–7 days: 0.3421 (0.010), 0.4106 (0.010), 0.2959 (0.011), 0.3901 (0.019). For drinking on 8–14 days: 0.0569 (0.003), 0.0979 (0.004), 0.0345 (0.002), 0.0908 (0.012). For drinking on 15–22 days: 0.0335 (0.003), 0.0707 (0.005), 0.0159 (0.002), 0.0654 (0.012). For drinking on 23–30 days: 0.0140 (0.002), 0.0395 (0.004), 0.0045 (0.001), 0.0354 (0.011). For marijuana on 1–7 days: 0.0881 (0.006), 0.0805 (0.006), 0.0874 (0.005), 0.1044 (0.012). For marijuana on 8–14 days: 0.0887 (0.006), 0.0811 (0.006), 0.0123 (0.002), 0.0176 (0.005). For marijuana on 15–22 days: 0.0209 (0.002), 0.0198 (0.002), 0.0177 (0.002), 0.0287 (0.006). For marijuana on 23–30 days: 0.0400 (0.004), 0.0407 (0.004), 0.0248 (0.003), 0.0675 (0.027). We display the confidence interval bars for the responses of the older sibling and younger sibling slightly to the left and slightly to the right of the number of years after the exogenous change in the older sibling’s behavior.
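One way to build confidence bands of this kind is to draw parameter vectors from the estimated sampling distribution, recompute the effect for each draw, and form normal-theory bands from the spread. The sketch below assumes the bands in Figure 2 are constructed in the same way as described for the dynamic probit simulations in the footnotes, which the text shown here does not confirm. The `effect` function is a toy stand-in for a full simulation, and the parameter values are hypothetical.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def effect(theta):
    """Toy stand-in for a simulated effect: change in a probit probability
    when a binary regressor with coefficient theta[1] switches from 0 to 1."""
    return norm_cdf(theta[0] + theta[1]) - norm_cdf(theta[0])

def band_90(theta_hat, cov_hat, effect_fn, n_draws=150, seed=0):
    """Point effect, standard error, and 90 percent band from parameter draws."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(theta_hat, cov_hat, size=n_draws)
    effects = np.array([effect_fn(th) for th in draws])
    se = effects.std(ddof=1)               # spread of the effect across parameter draws
    point = effect_fn(np.asarray(theta_hat))
    z = 1.645                              # two-sided 90 percent normal critical value
    return point, se, (point - z * se, point + z * se)

# Hypothetical point estimates and variance matrix
theta_hat = [-1.0, 0.3]
cov_hat = [[0.01, 0.0], [0.0, 0.02]]
point, se, (lower, upper) = band_90(theta_hat, cov_hat, effect)
print(round(point, 4), round(se, 4), round(lower, 4), round(upper, 4))
```

The band is centered on the effect at the point estimates rather than on the mean of the draws, and the use of `ddof=1` for the standard deviation is a choice made for this sketch.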
In the case of alcohol (Figure 2 E–H), the shift increases the older sibling’s drinking relative to baseline by 0.272 (0.034) in the low category, 0.745 (0.11) in the 8–14 category, and 1.23 (0.22) in the high category. The corresponding effects on the younger sibling are 0.067 (0.177) in the low category, 0.22 (0.415) in the 8–14 day category, and 0.51 (0.873) in the high category. These effects are substantial, but are also imprecise.
In the case of marijuana (Figure 2 I–L), the corresponding effects are 0.77, 1.07, and 1.44 for the older sibling and 0.10 (0.164), 0.146 (0.223) and 0.252 (0.33) for the younger sibling. The fact that the effects are much larger in the highest category makes sense. However, the effects decay to essentially 0 after three years.
Overall, the results for the five-category ordered probit model are qualitatively consistent with those for the binary probit specification but are less precise. The point estimates indicate that the sibling effect is increasing in the older sibling’s consumption, but they are statistically significant only for smoking. The results provide strong evidence that the probability of substance use and the quantity of substance use today depend positively on one’s consumption level in prior periods.
VIII. Using Correlated Random Effects Regression to Estimate the Sibling Effect
We supplement the investigation of sibling effects based on the joint dynamic probit model with a simple regression approach. Correlated random effects (CRE) estimates of the direct effect of older siblings on younger siblings’ behavior are based on the following linear least squares projection equation:
(10)
The intuition is that if
and
have the same relationship with the error term that determines
, then the coefficient on
identifies the part of the link in the behavior of siblings that is due to common unobserved influences, leaving β2 as the sibling effect. In Web Appendix C, we discuss in detail the assumptions on the above model that are required for β2 to equal the sibling influence parameter λ2 in the case in which y is continuous. The basic argument carries over to the case in which y is a binary choice. The assumptions include no state dependence, covariance stationarity of
, and the symmetry restriction
. Furthermore, the effects of ε or v1 on
must not vary with the age of the older sibling
and the influence of ε on
must not vary with the age of the younger sibling
. The estimates of the dynamic probit model show substantial state dependence, and the presence of the terms
and
is enough to lead to age dependence in the influence of the error components on
and
in a nonlinear binary choice model such as (3) and (4).39
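The identification argument can be checked in a small simulation. The sketch below assumes a continuous outcome, no state dependence, stationarity, and a symmetric error structure for the older sibling's lagged and lead behavior, and it assumes that the projection regresses the younger sibling's behavior on the sum of the older sibling's lagged and lead behaviors and on the lagged behavior alone, as the surrounding text describes. Variable names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, lam = 200_000, 0.3                      # lam: true sibling effect (hypothetical)

eps = rng.normal(size=n)                   # family effect shared by both siblings
v1 = 0.5 * rng.normal(size=n)              # older sibling's individual effect
v2 = 0.5 * rng.normal(size=n)              # younger sibling's individual effect
y1_lag = eps + v1 + rng.normal(size=n)     # older sibling, t - 1
y1_lead = eps + v1 + rng.normal(size=n)    # older sibling, t + 1 (same error structure)
y2 = lam * y1_lag + eps + v2 + rng.normal(size=n)   # younger sibling, t

# Naive regression on the lag alone mixes the sibling effect with the common influences
X_naive = np.column_stack([np.ones(n), y1_lag])
b_naive = np.linalg.lstsq(X_naive, y2, rcond=None)[0]

# Projection with the lag-plus-lead sum absorbing the common influences
X_cre = np.column_stack([np.ones(n), y1_lag + y1_lead, y1_lag])
b0, b1, b2 = np.linalg.lstsq(X_cre, y2, rcond=None)[0]
print(round(b_naive[1], 3), round(b1, 3), round(b2, 3))
```

Because the lag and the lead are equally correlated with the family effect in this design, the coefficient on their sum picks up the common influences and the coefficient on the lag isolates the sibling effect. Introducing state dependence or age-varying loadings into the simulated data would break that symmetry.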
We also consider the case in which both contemporaneous and lagged behaviors of the older sibling influence the younger child with coefficients λ20 and λ2, respectively. Consider the following projection equation:
(11)
In this case, we need to make two additional assumptions for β2 and β3 to capture λ2 and λ20. These assumptions are that
and
are independent across siblings at all leads and lags and that
is serially uncorrelated. If any of these assumptions fail, then in general β2 ≠ λ2 and β3 ≠ λ20 in (11).
If only the assumption of no serial correlation fails, one still can estimate an average of λ2 and λ20 by using the regression
(12)
to test for sibling effects, as we do below. Finally, if one uses (10) when (11) is correct, then the coefficient on
will pick up part of the effect of
, but one will still detect sibling influences.
A. Results Using the CRE Approach
Table 8 presents estimates of sibling effects using the correlated random effects model discussed above. Each column refers to a different outcome. The top panel presents estimates of our main specification, which we refer to as Model 1. Model 1 is a variant of (10) for the case in which
is binary and the control variables X and age dummies are added:
Table 8. Marginal Effects of the Older Sibling’s Behavior on the Younger Sibling’s Behavior (Correlated Random Effects Model)
(13)
In the middle panel, we replace
in the equation above with
. We refer to this specification as Model 2. In the bottom panel, we allow for the possibility of a contemporaneous influence. We replace
with
(Model 3). The peer influence coefficient on
in Model 3 is likely to be positively biased if transitory environmental factors are correlated across siblings. It also may be positively biased as the result of an interview effect when the sibling interviews occur on the same day.40
We report marginal effects of the row variables on the probability that
based on MLE probit estimates of β1, β2, and the other parameters in the model. Standard errors are clustered at the household level.41
Column 1 refers to smoking. The results for Model 1 indicate that
raises the smoking probability by 0.062 (0.026). This estimate is statistically significant and is equal to 15.6 percent of the mean probability. The marginal effect of
is 0.085 (0.018), so about three-fifths of the link between the older sibling’s past smoking and the younger sibling’s current smoking is due to common influences and two-fifths is due to the sibling effect. The results for Model 2 and Model 3 suggest an even stronger causal sibling effect on smoking.
For drinking, the estimates of Model 1 indicate that
raises the probability of drinking by about 0.054 (0.024), which is 9.1 percent of the mean probability. The link due to common influences is 0.118 (0.018). The evidence for a causal effect in the case of marijuana is weak. The estimates are positive, but are statistically significant only in the case of Model 3, which allows for a contemporaneous influence of the older sibling on the younger one.
The point estimates for use of hard drugs are positive and substantial relative to the sample mean. For example, the marginal effect of
is 0.011 (0.019) for Model 1 while the sample mean is 0.062. However, the effect is not statistically significant. We obtain even larger estimates using Model 2 and Model 3. Overall, the results for hard drugs suggest a positive causal effect but are too noisy to support strong conclusions.42
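The shares quoted for Model 1 follow from simple arithmetic on the reported marginal effects, as the short calculation below shows. The numbers are taken from the text; applying the same split to drinking is an illustration and not a figure reported in the text, and the function name is hypothetical.

```python
def shares(sibling_effect, common_link):
    """Split the total link into the sibling-effect share and the common-influence share."""
    total = sibling_effect + common_link
    return sibling_effect / total, common_link / total

# Model 1 marginal effects reported in the text: (sibling effect, common-influence link)
estimates = {
    "smoking": (0.062, 0.085),
    "drinking": (0.054, 0.118),
}

for outcome, (sibling, common) in estimates.items():
    s, c = shares(sibling, common)
    print(f"{outcome}: sibling share {s:.2f}, common share {c:.2f}")

# Hard drugs: Model 1 marginal effect relative to the sample mean given in the text
hard_drug_ratio = 0.011 / 0.062
print(f"hard drugs: {hard_drug_ratio:.1%} of the sample mean")
```

For smoking the split is roughly 0.42 and 0.58, which matches the two-fifths and three-fifths description in the text.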
B. Family Process Variables
In view of the child psychology literature’s emphasis on the importance of family process variables for child outcomes, one might expect the nature of the child’s relationship with his or her parents and siblings to affect the size of the peer effect. We investigate this issue in the NLSY97 with data about the child’s relationship to family members. In particular, we use measures of parental supportiveness, parental monitoring, parenting style (uninvolved, permissive, authoritarian, or authoritative), and whether both biological parents are present. We also use a variable indicating whether a sibling is the first person the youth turns to for advice and another one indicating whether the youth turns to someone other than the parents for advice.43 We incorporated these variables one at a time into the CRE specification by adding the interaction between the family process variable and the older sibling’s lagged behavior as well as with the sum of the sibling’s lagged and lead behaviors. We also included the family variable itself as a control variable. The family process variables have a strong association with substance use, particularly the parenting style measure. This is in line with the finding of Averett et al. (2011) that children with greater parental supervision are less likely to engage in these behaviors.44 The coefficients on the interaction terms with
often have the sign that we expect but they are usually not statistically significant (not reported).45 See Section IX.B of Altonji et al. (2013) for more details.
IX. Conclusion
Parents frequently implore their older children to set a good example for younger brothers and sisters. Social scientists, particularly psychologists, have long been interested in the influences that siblings have on each other. Many studies, including ours, have found strong sibling correlations in a variety of behaviors, including substance use, that are robust to the inclusion of a rich set of controls. The difficult question is whether these correlations reflect causal influences or result from shared genes and environment. To identify causal effects, we use panel data and the key assumption that older siblings influence younger siblings but not vice versa. We rely primarily on a joint dynamic probit model and a joint dynamic ordered probit model that allow for state dependence and nonstationarity. Our use of multiple-equation dynamic ordered response models with a large number of categories but restrictions on the category-specific model parameters, as a way to allow for nonlinear state dependence in the presence of unobserved heterogeneity, may have other applications.
The point estimates of the sibling effect for smoking, drinking, and marijuana are almost always positive but the estimates are also noisy. They are statistically significant for drinking and marijuana using the dynamic probit model and for cigarette smoking using the ordered probit model. We also find a positive effect for hard drugs but the coefficients are not statistically significant. For the most part, the effects are largest in the equation for the first observation for the younger sibling. Although in many cases we find fairly large effects of past behavior on the latent variable that determines substance use, the effect on the younger sibling of a one-time shift in the behavior of the older sibling dies out fairly quickly. Simulations using the dynamic probit model indicate that only a small fraction of the large sibling correlation in substance use is causal although the estimates of the fractions are noisy.
We also report probit regressions motivated by the correlated random effects literature, subject to a number of caveats. These results indicate that smoking, drinking, and, more tentatively, marijuana use by the older sibling increase the probability that the younger sibling engages in these behaviors. Combining the evidence from the different estimation approaches and different substances, we conclude that there is a modest positive sibling effect on substance use but that only a small part of the sibling correlation in substance use is causal.
There is a substantial research agenda. First, the analysis should be repeated with additional data sets containing panel data on substance use for large samples. These are steep data requirements. Add Health, which has been used in some previous studies of sibling links in risky behavior, is a natural possibility. Information on genetic markers that influence substance use could be incorporated, building on some of the work discussed in Fletcher and Lehrer (2011). However, the time between interviews makes Add Health less than ideal. Second, other behaviors, including positive behaviors such as volunteering and study time, could be examined. Third, as evidence accumulates on the dynamic interrelationship among the use of different substances, it would be desirable to revisit our analysis of sibling effects in models with multiple substances. The question of “gateway” drugs is salient in policy discussions of drug law reform, but given the limited information in the data it is hard to quantify the linkages without strong a priori information about which linkages are most likely. Finally, while we do identify a causal effect, we do not make much progress in identifying the mechanisms underlying it. Statistical power is a serious problem, at least in the NLSY97, but a more structured approach in which the researcher constrains the way in which parental characteristics and home environment measures alter the strength of the sibling effect on multiple behaviors may be worth trying.
Acknowledgments
They thank Stan Panis for advice on how to adapt his program aML to estimate the joint dynamic probit and dynamic ordered probit models presented in the paper. They acknowledge that they are among a large set of researchers who have benefited from Panis’ generous support. Their research has been supported by the Economic Growth Center, Yale University. They claim responsibility for the remaining shortcomings of the paper. The data used in this article can be obtained beginning December 2017 through November 2020 from Sarah Cattan, Institute for Fiscal Studies, 7 Ridgmount Street, London WC1E 7AE, UK, sjcattan{at}gmail.com.
Appendix
Footnotes
* Supplementary materials are freely available online at: http://uwpress.wisc.edu/journals/journals/jhr-supplementary.html
↵1. See Levitt and Lochner (2001) on teenage homicide, Gruber and Zinman (2001) on smoking, Pacula et al. (2001) on marijuana usage, and Grossman et al. (2004) on teenage sex.
↵2. For example, Amuedo-Dorantes and Mach (2002) find that having a sibling who abuses illegal drugs significantly increases the likelihood that an adolescent also will take drugs. Duncan et al. (2005) compare correlations of various measures of achievement and delinquency across siblings, peers, neighbors, and schoolmates and find that these correlations are substantially stronger among siblings than among other groups.
↵3. See Sacerdote (2001), Marmaros and Sacerdote (2002), Duncan et al. (2005), and Stinebrickner and Stinebrickner (2006). One could examine whether the sibling influence is larger for siblings who share a bedroom. With data on the number of bedrooms and the number of male and female children by age, one could create a proxy even if information on sharing a bedroom is unavailable. We do not have the necessary data to perform this analysis.
↵4. Cawley and Ruhm (2011) provide a comprehensive survey of the economics literature on risky health behaviors.
↵5. See, for example, Slomkowski et al. (2001). Several papers also have looked at sibling influences on smoking patterns although results for this activity also are mixed (Otten et al. 2007; Bricker et al. 2005; Slomkowski et al. 2005).
↵6. Oettinger (2000) uses the gender of the older sibling, measures of the family’s “intactness” during his or her childhood, and local and national unemployment rates at age 18 as instrumental variables.
↵7. Consider a family with two siblings. Their model contains exogenous variables and is nonlinear but in a simple regression model of the older sibling’s behavior on the younger sibling’s behavior, one cannot separately identify the causal effect of the younger sibling’s behavior from the correlation in error components that determine the two.
↵8. In preliminary work, we also examined selling or helping to sell drugs, gang membership, and sexual behavior. We did not find strong evidence of a sibling effect for these variables though we did find some suggestive evidence of a positive sibling effect for “selling drugs.”
↵9. 403 of the families who contribute sibling pairs have children who were excluded from NLSY97 because they were older than 16 at the end of 1996. 359 of the families had children who were younger than 12 at the end of 1996. 167 had both children who were older than 16 and children who were younger than 12. No data were collected on these children.
↵10. One could use inverse probability weighting to account for effects of attrition at the sibling pair level in the correlated random effects analysis but we are not entirely clear about how to construct the attrition weights for sibling pairs. One possibility would be to estimate the probability that data for a given observation on a sibling pair are available conditional on the age of the youngest sibling in the base year, the age gap, and base year characteristics. We are not sure how to correct for attrition when estimating the joint dynamic discrete choice model given that our models use data from multiple waves of the survey and that the data needed depends on the equation of the model.
↵11. Because the bullying measure reflects a possibly traumatic childhood experience, we think of it as measuring, albeit very imperfectly, aspects of the individual’s mental health and social adjustment.
↵12. In earlier drafts of this paper, we stated that we could not construct a variable indicating whether siblings lived together, based on initial explorations of the data and consultations with Steven McClasky from the NLSY97 data team. Indeed, it is not possible to directly identify whether members of a respondent’s household (as recorded on the household roster) are also survey respondents. However, we later discovered that one can circumvent this limitation for the purpose of constructing an indicator of siblings’ co-residence. We first identified the sibling in the sibling pair in the 1997 household roster, by combining information on household members’ month and year of birth and their relationship to the main respondent. We then used the household roster’s person identifier to track this sibling through time and record whether, in each subsequent survey round, he or she still lived with the main respondent.
↵13. The budget constraint, which we leave implicit, is static. The decision function implicitly allows for the possibility that agents account for the action’s costs and benefits that play out over time. They also may consider the effects of their actions on the utility of others, including parents and siblings. The costs include punishment by the parents, school authorities, and criminal sanctions. However, we do not allow the benefit of an action to the older child to depend upon the characteristics or choices of the younger child. Furthermore, older siblings do not consider the influence of their behavior on the younger sibling’s choices. We also assume that agents are myopic in the sense that they do not account for the effects of the choice of y today on the costs and benefits of choosing y in the future. What if agents were, instead, aware of the fact that actions today influenced the marginal costs and benefits of future substance use? Forward-looking rational addiction models following Becker and Murphy (1988) imply that expectations about the costs and benefits of future substance use affect substance use today. This suggests that information that the older sibling has about future values of
would affect behavior in t. The effects that our time-invariant control variables pick up would include effects operating through the current utility of substance use as well as the agents’ response in t to the fact that these variables influence the costs and benefits of substance use in future periods. (Similar comments apply to the random effects ε, v1 and v2). If the younger sibling believes that the behavior of the older sibling in future periods, say t + 1, will affect the costs and benefits of his or her substance use in t + 1, then the younger child’s beliefs about
will influence his or her choice of
. Here, we do not account for these possibilities.
↵14. Adolescents may form beliefs about how they should behave at a certain age, say 16, based on how their older siblings behaved at that age. To address this empirically, one could add additional lags of
to the model and allow the coefficient on
to depend on the age of the older sibling in t - j and the age of the younger sibling in year t. This is not feasible given the available data. However, we do experiment with allowing
to depend on the age gap between the siblings.
↵15. The value of
varies from 1998 to 2000 in (7) while
ranges from 15 to 20. The value of
varies from 1999 to 2001 while
varies from 15 to 19.
↵16. Note that we restrict the variance of the idiosyncratic error components to be 1 in the equations for the initial observation and the later years for both the younger and older sibling equations. This is implicitly a normalization under the assumptions
and
because we allow the coefficients of all observed variables to differ across these equations for both the older and younger siblings. The restrictions say that the relative importance of the family effect, the sibling effect, and the transitory components is the same in the first period and in later periods. In Section VI.G.2 we leave the factor loadings unrestricted.
↵17. The estimates using the three-point distribution for ε are reported in Web Appendix Table 3. They may be compared to the results in Table 3 below.
↵18. For computational ease, each pair coming from the same household is assumed to receive an independent draw of the common component ε. Thus we are implicitly allowing for the possibility that the common household environment is sibling pair-specific. Our reported standard errors for the joint dynamic probit and ordered probit models (see below) do not account for the possible error correlation across pairs that come from the same household. Relatively few households supply more than one pair of observations, so any bias in the standard errors is likely to be small (see Section III).
↵19. The true bias in the sibling parameter estimates may differ from the estimated bias if the difference between the other parameters in the simulation model and the estimates we use is large and the biases interact. One could imagine using an indirect inference approach to estimating the parameters of the behavior model. Our current econometric model could serve as the auxiliary model. However, this would be a computational nightmare, because it would seem to require estimating the joint dynamic probit model each time the criterion function for indirect inference is evaluated.
↵20. Web Appendix Table 4 reports the estimates of the coefficients on all the control variables and age dummies for this baseline specification and each of the four outcomes.
↵21. It is possible that the probability of smoking, for example, depends not only on behavior last year but on behavior in prior years. However, the intertemporal correlations in behavior based on simulated data from the model match the autocorrelations in the data quite well. This gives us confidence that our modeling choice is not too highly restrictive, although it does not rule out the possibility of more complex state dependence. Because the focus of our paper is on sibling influences on the behavior of teenagers, we have limited the analysis to the first eight rounds of the NLSY97, which makes it difficult to allow for additional lags in behavior. There would be value in a separate paper that uses dynamic ordered probit models and allows for richer state dependence (for example, by allowing the state dependence coefficients to vary with age or with unobserved heterogeneity). One could do so by extending the sample to later ages using more recent rounds of the NLSY97.
↵22. Web Appendix Table 5 reports the estimates of the coefficients on all the control variables and age dummies for the specification with age interactions and each of the four outcomes.
↵23. Estimates allowing the effects for mixed pairs to depend on whether the female is oldest are also imprecise.
↵24. This discussion mirrors the different predictions of opportunity versus role model views of sibling influence that we touched upon in the literature review.
↵25. Just adding the coresidence indicator as a control variable in the baseline model (without the interactions with the sibling effect variables) has almost no effect on the estimates. This suggests that the sibling influence parameters are not picking up the effect of siblings living together.
↵26. In all but ten cases,
, so we use the actual age of the older sibling in creating the age dummies
. For the 10 cases in which
, we set the age of the older sibling in year
to the actual age plus the value of
for the pair and construct dummies for subsequent years accordingly. We obtain the mean baseline path as follows. Using the sample distribution of X1 and estimated parameters based on our main error specification, we first simulate
from
using (7) and (3). With simulated values of
and the estimated model parameters for the younger siblings, we simulate
from
to
using (8) and (4). All error terms are drawn from the distributions implied by the model estimates. We obtain the effect of an exogenous shift in behavior of the older sibling from 0 to 1 in period
by conducting a similar simulation with
set to 0 for all pairs rather than the value implied by (7) and a simulation with
set to 1 for all pairs. For each sibling pair i, we performed each of the three simulations 100 times. We then averaged over the 100 simulations for all the pairs.
↵27. We draw 150 values of the parameter vector for the joint dynamic probit model from a multivariate normal distribution whose mean and variance matrix are set to the point estimate of the parameter vector and its estimated variance matrix. For each draw of the parameter vector we perform 100 simulations and take the average, as described in the previous footnote. The standard errors are the standard deviations across the 150 averages. The 90 percent confidence bands are computed from the point estimate and standard error estimates under a normality assumption. The bias estimates used to adjust
and
are based on the point estimates and are constant across the 150 draws of the parameter vector.
↵28. To be more specific, for each older sibling, we first set
to 1 in
, simulate forward, and take the average of
for the values of t reported at the top of each column of Web Appendix Table 9A and on the horizontal axis of Figure 1a. We repeat the procedure with
set to 0 in
, take the difference in the two averages for each value of t, and then divide by the baseline value in the top row of Web Appendix Table 9.
↵29. The estimate is 0.141 when the bias correction is not made. The bias correction makes very little difference in the case of drinking and marijuana.
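The simulation-based standard errors described in footnote 27 can be illustrated with a minimal sketch. It is not the authors' code: `simulate_mean_effect` is a hypothetical stand-in that uses a toy static probit with two parameters (a constant and a sibling effect) in place of the joint dynamic probit model, and the number of pairs, the seed, and the use of the sample standard deviation (`ddof=1`) are assumptions. Only the outer structure (150 parameter draws, 100 simulations per pair, standard deviation across draws, 90 percent band under normality) follows the footnote.

```python
import numpy as np

def simulate_mean_effect(theta, n_pairs=200, n_sims=100, rng=None):
    """Hypothetical stand-in for the model simulation.

    Returns the average change in the younger sibling's outcome when the
    older sibling's behavior is switched from 0 to 1, averaged over
    n_sims simulations for each of n_pairs sibling pairs.
    """
    rng = np.random.default_rng() if rng is None else rng
    const, sib_effect = theta
    # One latent error per pair and simulation, shared by both scenarios.
    eps = rng.standard_normal((n_sims, n_pairs))
    y_if_1 = const + sib_effect + eps > 0  # older sibling's behavior set to 1
    y_if_0 = const + eps > 0               # older sibling's behavior set to 0
    return float(y_if_1.mean() - y_if_0.mean())

def parametric_bootstrap(theta_hat, vcov, n_draws=150, seed=0):
    """Point estimate, standard error, and 90 percent band for the effect."""
    rng = np.random.default_rng(seed)
    # Draw parameter vectors from N(theta_hat, vcov).
    draws = rng.multivariate_normal(theta_hat, vcov, size=n_draws)
    effects = np.array([simulate_mean_effect(d, rng=rng) for d in draws])
    point = simulate_mean_effect(theta_hat, rng=rng)
    se = float(effects.std(ddof=1))  # spread across the parameter draws
    z90 = 1.6448536269514722         # 95th percentile of the standard normal
    return point, se, (point - z90 * se, point + z90 * se)
```

Calling `parametric_bootstrap(np.array([-0.5, 0.2]), np.diag([0.01, 0.0025]))` returns the simulated effect at the point estimate, its standard error, and the band. The parameter values in that call are made up for illustration.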
↵30. In Web Appendix Tables 9A and 9B, we report the baseline simulation for
. The bias-corrected estimates are in Table 9B. In the rows for the younger sibling labelled W/ Feedback, we report the path of the difference in
relative to the baseline simulation for younger siblings when
is set to 1 in
and when it is set to 0 in
, respectively, and the shift in
is allowed to affect future values of
in accordance with the model. The effect of the shift on
is the same in
(by construction). It is a bit larger in subsequent periods because of the persistence in the behavior of the older sibling when we allow for feedback. However, the values are quite similar to the effects of a one-time shift in the older sibling’s behavior, which are reported in the rows W/out feedback and graphed in Figure 1.
↵31. We also computed the effects on substance use from age 17 on of a one-time shift in the older sibling’s substance use from 0 to 1 in period
, when the younger siblings were 16 (not reported). These effects are small.
↵32. The corresponding fraction is 0.167 (0.215) for using hard drugs. We focus on the results that exclude controls from (1) since the correlation among the observed characteristics of siblings is part of the common influence in sibling behavior. The bottom panel of the table reports results with controls included. The part of
due to peer effects (Column 3–5) is similar to the values in the top panel but this difference is a larger fraction of the value of Column 3.
↵33. When we use the model parameters for our alternative error specification to perform the simulations, we obtain similar results to those in the figure in the case of marijuana and drinking. (See Web Appendix Table 12.) However, the effect of smoking by the older sibling on the younger sibling is essentially zero although the standard error is large.
↵34. See Deza (2015) for references and evidence of a modest causal effect of prior consumption of soft drugs on consumption of harder drugs using methods somewhat similar to ours to account for unobserved heterogeneity. She does not model siblings’ behavior.
↵35. We chose M = 5 because of sample-size limitations and the likelihood of diminishing returns. We did not experiment with this number. As is common in surveys, responses tend to cluster at 5, 10, 15, 20, etc. If one were to use a large M, one would need to extend the model to account for this because the tendency to cluster would account for a larger fraction of the difference in the response probabilities across clusters.
↵36. Specifically, suppose that splines with break points at m = c1 and m = c2 provide a good approximation. Then the
parameters can be written as a function of 3 parameters
:
Similarly, the other γ and λ parameters can be written as:
We chose c1 = 8 and c2 = 18. One also could restrict the threshold parameters qm (m = 1, ..., M − 1) to lie on a flexible function, but we did not need to do so with M = 5.
↵37. Also, in the ordered probit case we generate 175 simulated observations for each NLSY97 sibling pair rather than 100 simulated observations.
↵38. Point estimates and standard errors for the dynamic ordered probit model with the main error specification are reported in Web Appendix Tables 15 and 16. We use the bias-corrected estimates of the sibling parameters in the simulations. The results from the simulations are also reported in Web Appendix Table 17.
↵39. Following Chamberlain (1984) one could relax the assumption of no age dependence to some extent by replacing
with
and still identify β2, but we stick with the simpler CRE specification and use it primarily as a more descriptive supplement to the main analysis using the joint dynamic probit model.
↵40. The answers to the substance use questions are entered directly by the respondent into a computer, with the computer turned away from the interviewer. Only the respondent can observe the responses. However, sometimes siblings are interviewed on the same day. To investigate whether this affects the response pattern, we estimated the linear probability model
, where SAMEDAYt is 1 if the siblings were interviewed on the same day in year t and is 0 otherwise. The values of b1 and b2 are 0.192 (0.019) and 0.085 (0.022) for smoking, 0.266 (0.018) and 0.013 (0.023) for drinking, 0.163 (0.018) and 0.058 (0.024) for marijuana, 0.065 (0.025) and 0.092 (0.036) for hard drugs, and 0.050 (0.020) and 0.008 (0.030) for selling drugs. The results are mixed but overall the evidence indicates reports are more strongly linked when the interview occurs on the same day. The estimates of b3 are negative and significant in all cases.
↵41. The sample sizes differ substantially across models due to the requirement for additional leads and lags in the case of Model 2 and, to a minor extent, the loss of observations due to missing data on
in the case of Model 3. In Web Appendix Table 18, we report the marginal effects of the control variables for Model 1. The estimates for variables that are correlated across siblings are reduced by about 10 percent in absolute value by the presence of
and the age dummies for the older siblings. We also experimented with a number of additional controls, including self-reports of the percentage of peers who engage in the behavior. These did not have much effect on the correlated random effects estimates or the joint dynamic probit estimates of the sibling influence parameters.
↵42. We also tried a fixed effect approach. Specifically, we estimated
(14)
treating ε + v2 as a fixed effect. In terms of the model in Web Appendix C, the advantage of the fixed effect estimator is that it requires assumptions (A1) and (A2), but not (A3). On the other hand, it requires (A5), while
may be correlated in the case of the CRE procedure subject to (A1)–(A4). This is a substantial disadvantage. A second disadvantage is that the fixed effect estimator requires multiple observations on the younger sibling, which reduces power. When we include fixed effects, we use a linear probability model rather than a probit specification. As reported in Web Appendix Table 19, the estimates of the coefficient on
are 0.028 (0.014) for smoking and 0.045 (0.014) for drinking. Both coefficients are significant at the 0.05 level, but are smaller than the estimates based on (13). We also obtain a small positive coefficient for marijuana that is larger than the CRE estimate, but is significant at only the 0.25 level. The coefficient for use of hard drugs is also positive and close to the CRE values but not statistically significant. Thus the results are qualitatively consistent with our findings based upon (13), but the point estimates tend to be smaller. We do not know why this is the case, although the nature of the variation in the behavior of the older sibling that the two estimators use to identify the sibling effect is different. The difference in the magnitude across estimation strategies is robust to selecting the sample for (14) to match the sample for (13) and to using a linear probability specification for the CRE model in place of the probit specification.
↵43. Details about the construction of these measures for the analysis are available in Web Appendix A.
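As a rough illustration of the fixed effect approach mentioned in footnote 42, the sketch below computes a within (demeaned) estimate of a single slope in a linear probability model. It is a generic textbook estimator, not the authors' specification (14): the function name `within_estimator`, the single regressor, and the absence of further controls are all assumptions made for brevity.

```python
import numpy as np

def within_estimator(y, x, group):
    """Within (fixed effects) estimate of b in y = a_group + b * x + e.

    y     : outcome of the younger sibling (0/1 in a linear probability model)
    x     : behavior of the older sibling
    group : identifier of the sibling pair, one fixed effect per pair
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    _, idx = np.unique(np.asarray(group).ravel(), return_inverse=True)
    counts = np.bincount(idx)
    # Subtract each pair's own mean, which removes the pair-specific effect.
    y_dm = y - (np.bincount(idx, weights=y) / counts)[idx]
    x_dm = x - (np.bincount(idx, weights=x) / counts)[idx]
    # OLS slope on the demeaned data; needs within-pair variation in x.
    return float(x_dm @ y_dm / (x_dm @ x_dm))
```

Because the pair means are subtracted, only pairs in which the older sibling's behavior changes over time contribute to the estimate. This is one way to see why the footnote notes that the estimator requires multiple observations on the younger sibling and has less power.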
↵44. We also estimated a version of the joint dynamic probit model in which we control for parenting style, intensity of parental monitoring, and parental supportiveness but do not allow the sibling effects to depend on them. For the older sibling, the main effect of the parenting style variable is large, negative, and highly significant in both the equations for the initial period and the equations for later periods. For the younger sibling, the effects are also negative but substantially smaller in most cases. The other variables are not significant in most cases. Controlling for the parenting variables does not change the estimates of the sibling effects. However, it does reduce the variance of the family effect by 8 percent in the case of smoking, 2 percent in the case of alcohol, and 5 percent in the case of marijuana. The reduction is large, almost 24 percent, in the case of hard drugs.
↵45. In keeping with the discussion in the literature, we expected negative effects of parents being more supportive and involved through authority and monitoring. We did not have a clear prior about the sign of the main effect of turning to a sibling for advice or of living with both biological parents.
- Received July 2014.
- Accepted September 2015.