## Abstract

The underrepresentation of blacks in the healthcare professions may have direct implications for the health outcomes of minority patients, underscoring the importance of understanding movement through the educational pipeline into professional healthcare careers by race. We jointly model individuals’ postsecondary decisions including enrollment, college type, degree completion, and choosing a healthcare occupation requiring an advanced degree. We estimate the parameters of the model with maximum likelihood using data from the NLS-72. Our results emphasize the importance of pre-collegiate factors and of jointly examining the full chain of educational decisions in understanding the sources of racial disparities in professional healthcare occupations.

## I. Introduction

Over the last half century, the representation of blacks in the pool of health professionals with graduate education (for example, physicians, dentists, psychologists, etc.), as well as other careers requiring postbaccalaureate training, has grown episodically rather than continuously.^{1,2} Immediately after the passage of the Civil Rights Act, barely 2 percent of all medical students were black. Just a decade later in 1975, more than 7 percent of first-year medical students were black. In the subsequent quarter century, however, gains in black representation among health professionals have slowed with blacks currently comprising slightly more than 8 percent of first-year medical students (Association of American Medical Colleges 2005). While recent cohorts entering medical school are unquestionably more racially diverse than those entering the profession three decades ago, blacks still receive advanced training in the health sciences at rates far below their population share of about 15.4 percent (U.S. Census Bureau 2007).

The question addressed in this paper concerns how individual characteristics and achievement observed at the precollegiate level affect the chain of decisions leading to training as a healthcare professional by race. We evaluate how the representation of healthcare professionals, by race, would differ if observed between-group differences, such as gaps in parental education, were eliminated. We trace individuals’ decisions about college enrollment, college degree completion, advanced degree completion, and choice of a health occupation that requires an advanced degree in the context of a unified economic model that allows for the correlation of unobservable determinants of each of these outcomes. For example, if individuals who are more likely to complete baccalaureate degrees are also more likely to complete advanced degrees in the health sciences for unobservable reasons, then simple estimates of the determinants of the decision to become a health professional would be biased. By jointly modeling these decisions, we are able to examine the extent to which the overall “leakage” from the pipeline into a professional healthcare occupation stems from precollegiate factors, differences in collegiate attainment, or gaps at the transition from undergraduate to graduate study in the health sciences. Finally, our parameter estimates enable us to focus on how changes in the precollegiate characteristics of students over time might be expected to narrow the racial gaps in professional degree attainment in the health sciences. While a number of economists have examined the underrepresentation of black students in higher education more generally and there have been descriptive pieces on the participation of black students in medical education, this is the first paper to examine the factors that influence the racial gap in the choice to become a healthcare professional.

The underrepresentation of blacks in the health professions is a concern for reasons of social equality, but also because members of the black community may have unique healthcare needs that may be better addressed and more successfully treated by black healthcare professionals who are knowledgeable about cultural aspects of health and care. Blacks have significantly more health problems than other groups, including high rates of diabetes, heart disease, prostate cancer, HIV/AIDS, breast cancer, and infant mortality (U.S. Department of Health and Human Services 2001). Many of these health disparities can be explained partially by demographic factors, lack of health insurance, and decreased access to care or inferior care. If black healthcare professionals possess some comparative advantage in treating black patients, the underrepresentation of black healthcare professionals would have a direct effect on health outcomes in the black community and the attendant racial gaps in health. Race-concordant care (for example, a black patient visiting a black physician) may be associated with greater trust by the patient and better communication between individuals and healthcare professionals regarding seriousness of illness and proper implementation of prescribed treatment (Rosenheck 1995; Cooper et al. 2003). 
Research suggests that better communication between race-concordant patient-physician pairs is associated with greater patient involvement in decision-making and higher overall patient satisfaction, which is associated with improved continuity of care, timely and accurate diagnoses, adherence to effective programs, and health outcomes.^{3} Moreover, as the Association of American Medical Colleges argued in an amicus brief (2002) in the Supreme Court case regarding the use of affirmative action in University of Michigan graduate admissions, racial diversity among students in medical education is a direct input to the training of all physicians, producing physicians who are “culturally competent” and “who are better prepared to serve a varied patient population.”^{4}

The paper begins by considering the historical context of the underrepresentation of black Americans in the health professions and college completion more generally. Section III presents a theoretical model of the individual decisions described above and then generalizes that basic model to allow for variation in the type of postsecondary institutions individuals choose to attend for their baccalaureate training. The data are described in Section IV, results are discussed in Section V, and model fit is examined in Section VI. Section VII concludes.

## II. Historical Context

There is no question that segregated universities and labor market discrimination limited the incentives and opportunities for black Americans to pursue advanced training in the healthcare fields in the first part of the twentieth century. Particularly in the South, opportunities for collegiate study often were limited to a modest set of institutions specifically for blacks, now known as “historically black colleges and universities” (HBCUs), which include both private and public institutions. The public HBCUs were originally part of explicitly segregated state systems of education and were often underfunded relative to the institutions for whites. Starting with the desegregation cases including *Brown* (1954) and continuing through the Civil Rights Act of 1964, the structure of collegiate opportunities available to black Americans changed dramatically, and the associated changes in the labor market provided new incentives for blacks to attend college and enter the health professions (Freeman 1976). To illustrate, while about 90 percent of black undergraduates were enrolled at one of the historically black colleges and universities prior to *Brown*, this percentage dropped to less than 50 percent by 1970, and the expansion of enrollment of black students at predominately white institutions continued in the subsequent decades (Drewry and Doermann 2001).^{5} At the graduate level, two institutions—Howard University in Washington, D.C., and Meharry Medical College in Nashville—trained the overwhelming majority of black physicians through the first part of the twentieth century. Blackwell (1981) estimates that, in 1967, approximately 83 percent of the 6,000 practicing black physicians received training at one of these two schools.

The combination of expected returns in the health professions and the new recruiting efforts of medical schools brought a dramatic increase in the representation of blacks in medical schools from the late 1960s through the early 1970s. The number of black students enrolled as first-year medical students jumped from 266 in 1968 to 1,106 in 1974, rising from 2.7 percent of the entering class to 7.5 percent (Figure 1) (Association of American Medical Colleges 2005). The latter half of the 1970s and the 1980s brought some stagnation in the representation of black students in medical schools before the share rose again in the late 1980s.^{6}

Overall, the difference in the college enrollment rate between black and white students narrowed in the 1970s, reflecting both changes in opportunities brought about by the Civil Rights movement and broader changes in socioeconomic conditions including increased odds of parental high school attainment (Kane 1994). Still, while the achievement gap between black and white students at the time of expected college entry has narrowed somewhat over the last three decades, progress has been slow and uneven.^{7} One implication, which follows in our empirical analysis, is that differences in the representation of blacks and whites at the postbaccalaureate stage can be traced to gaps generated much earlier in the educational pipeline.

## III. The Basic Model

Given our primary interest in examining where in the pipeline the underrepresentation of blacks in the healthcare professions emerges, the model of individual behavior combines choices and outcomes at the various decision points along that pipeline (college enrollment, college degree completion, and entry into a health occupation with an advanced degree). Because each stage of decision-making influences the next, we allow past choices/outcomes to influence future choices/outcomes in the pipeline and allow unobservables in all stages to be correlated. This approach allows us to identify the differential effects of various individual characteristics on progression through each stage of the educational/career pipeline toward becoming a health professional with an advanced degree.

Assume that each individual, indexed by *i*, has some unobserved propensity to choose each of these outcomes, where these propensities are functions of individual and family characteristics denoted by *X*_{i}. In practice, *X*_{i} contains information on gender, race, academic ability, parental education, and urbanicity of the location in which the individual attended high school. For each individual *i*, let *y*^{*}_{1i} be the latent value of enrolling in college, *y*^{*}_{3i} be the latent value of completing a four-year college degree conditional on enrolling, and *y*^{*}_{4i} be the latent value of becoming a health professional with an advanced degree conditional on completing a four-year college degree.^{8} Each of these choices can be expressed as a function of observable individual-specific characteristics in *X*_{i}, linear and quadratic terms for the latent values of choices made earlier in the educational and career pipeline, and an unobservable component denoted by *u*_{ji}, *j* = 1, 3, 4:^{9}

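(1) *y*^{*}_{1i} = *X*_{1i}β_{1} + *u*_{1i}

(2) *y*^{*}_{3i} = *X*_{3i}β_{3} + α_{31}*y*^{*}_{1i} + α_{32}(*y*^{*}_{1i})^{2} + *u*_{3i}

(3) *y*^{*}_{4i} = *X*_{4i}β_{4} + α_{41}*y*^{*}_{1i} + α_{42}(*y*^{*}_{1i})^{2} + α_{43}*y*^{*}_{3i} + α_{44}(*y*^{*}_{3i})^{2} + *u*_{4i}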

It may be that an individual’s propensity to complete a college degree is a function of her propensity to enroll in college. Individuals with high values of *y*^{*}_{1i} may have unobserved ability that positively affects enrollment and completion (α_{31} > 0), or individuals with high values of *y*^{*}_{1i} may be pushing unrealistically to attend college and thus have lower probabilities of completion (α_{31} < 0). Furthermore, the dominant effect may be nonlinear. For example, the linear and quadratic terms may differ in sign because colleges provide extra services for risky students. Alternatively, they may differ in sign because, beyond some point, the skills necessary for success in high school are different from those necessary for success in college, or the effect may grow with *y*^{*}_{1i} because high levels of skill are more important for completion of college than for entry into college. Thus, we model the effect of *y*^{*}_{1i} on *y*^{*}_{3i} as a quadratic function.^{10} Additionally, an individual’s propensity to become a health professional with an advanced degree may depend on her latent value of enrolling in college and completing a degree. Thus, we allow *y*^{*}_{4i} to be a function of both *y*^{*}_{1i} and *y*^{*}_{3i}.^{11} Finally, define the vector of unobservables for individual *i* as *u*_{i} = (*u*_{1i}, *u*_{3i}, *u*_{4i})′ and allow these unobservable factors to be correlated across individual *i*’s three choices by assuming *u*_{i} ∼ *iidN*(0,Ω).
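The recursive structure described above can be made concrete with a minimal simulation sketch. It assumes that the α coefficients enter in the order the text lists them (a linear and then a quadratic term for each earlier latent value); the helper names `cholesky` and `draw_latents`, the covariate vector, and every parameter value are hypothetical illustrations, not estimates from the paper.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = sum(low[r][k] * low[c][k] for k in range(c))
            if r == c:
                low[r][c] = math.sqrt(a[r][r] - s)
            else:
                low[r][c] = (a[r][c] - s) / low[c][c]
    return low


def draw_latents(x, beta, alpha, chol, rng):
    """Draw (y1*, y3*, y4*) for one individual with covariate vector x.

    beta maps stage (1, 3, 4) to a coefficient list; alpha maps the
    two-digit subscripts (31, 32, 41, ...) to scalars. Both are made up here.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(3)]
    # Correlated unobservables u = L z, so that Var(u) = L L' = Omega.
    u1, u3, u4 = [sum(chol[r][k] * z[k] for k in range(r + 1)) for r in range(3)]
    xb = {j: sum(xi * bi for xi, bi in zip(x, beta[j])) for j in (1, 3, 4)}
    y1 = xb[1] + u1
    y3 = xb[3] + alpha[31] * y1 + alpha[32] * y1 ** 2 + u3
    y4 = (xb[4] + alpha[41] * y1 + alpha[42] * y1 ** 2
          + alpha[43] * y3 + alpha[44] * y3 ** 2 + u4)
    return y1, y3, y4


# Hypothetical inputs: a constant plus one covariate.
omega = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
beta = {1: [0.1, 0.2], 3: [0.0, 0.1], 4: [-1.0, 0.0]}
alpha = {31: 0.5, 32: 0.1, 41: 0.2, 42: 0.0, 43: 0.3, 44: 0.05}
latents = draw_latents([1.0, 2.0], beta, alpha, cholesky(omega), random.Random(0))
```

Because the later propensities are built from the earlier ones, a shock to *u*_{1i} moves all three latent values, which is the sense in which the stages of the pipeline are modeled jointly.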

An individual’s propensities to enter college, complete a baccalaureate degree, and select a health occupation that requires an advanced degree are all unobserved in the data. Instead, we observe binary outcomes indicating whether or not individual *i* actually made these choices. Thus, let *y*_{1i}, *y*_{3i}, and *y*_{4i} represent entry into college, completion of college, and entry into a health profession requiring an advanced degree, respectively. Mathematically, *y*_{ki} = 1 if *y*^{*}_{ki} > 0 and *y*_{ki} = 0 otherwise, for *k* = 1, 3, and 4, respectively. The definitions of these binary outcome variables are used to specify individual *i*’s probabilities of making various choices that are possible in the data, where the four possible outcomes and their associated probabilities, conditional on observable individual characteristics, are:

1. Do not enroll in college (*P*_{1} = Pr[*y*_{1i} = 0|*X*_{1i}]);
2. Enroll in college but do not graduate with a baccalaureate degree (*P*_{2} = Pr[*y*_{1i} = 1, *y*_{3i} = 0|*X*_{1i}, *X*_{3i}]);
3. Enroll in college, graduate with a baccalaureate degree, but do not choose a health profession that requires an advanced degree (*P*_{3} = Pr[*y*_{1i} = 1, *y*_{3i} = 1, *y*_{4i} = 0|*X*_{1i}, *X*_{3i}, *X*_{4i}]);
4. Enroll in college, graduate with a baccalaureate degree, and choose a health profession that requires an advanced degree (*P*_{4} = Pr[*y*_{1i} = 1, *y*_{3i} = 1, *y*_{4i} = 1|*X*_{1i}, *X*_{3i}, *X*_{4i}]).
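As a small illustration of how the three binary indicators collapse into the four mutually exclusive outcomes listed above, the sketch below classifies individuals and tabulates sample shares. The function names `outcome` and `outcome_shares` and the toy records are hypothetical; nothing here comes from the NLS-72 data.

```python
from collections import Counter


def outcome(y1, y3, y4):
    """Map the indicators (y1, y3, y4) to outcome 1, 2, 3, or 4.

    Later indicators are ignored once an earlier stage is not passed,
    mirroring the sequential nature of the pipeline.
    """
    if y1 == 0:
        return 1  # did not enroll in college
    if y3 == 0:
        return 2  # enrolled but did not complete a baccalaureate degree
    return 4 if y4 == 1 else 3  # graduate: health professional (4) or not (3)


def outcome_shares(records):
    """Share of a sample falling in each of the four outcomes."""
    counts = Counter(outcome(*rec) for rec in records)
    n = len(records)
    return {k: counts.get(k, 0) / n for k in (1, 2, 3, 4)}


# Toy sample of (y1, y3, y4) triples, one per outcome.
shares = outcome_shares([(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)])
```

Because the four outcomes are exhaustive and mutually exclusive, the shares (and the corresponding model probabilities) sum to one.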

Each of these four probabilities is a function of the model parameters to be estimated, θ, which include β_{1}, β_{3}, β_{4}, α_{31}, α_{32}, α_{41}, α_{42}, α_{43}, α_{44}, and Ω, and is conditional on observed individual characteristics in *X*_{i}. The assumed joint normality of the unobservables (*u*_{ki}) in Equations 1, 2, and 3 enables each of the four probabilities listed above to be expressed in terms of univariate, bivariate, and trivariate normal distribution and density functions. The detailed expressions for these four choice probabilities are presented in Appendix A1.

The model parameters are estimated by maximum likelihood estimation (MLE), which involves specifying the log-likelihood function, to which each individual in the sample makes a contribution. An individual’s log-likelihood contribution is the log probability of observing the choices made by that individual in the data, that is, the log of whichever of *P*_{1} through *P*_{4} corresponds to the outcome observed for that individual. Summing over all individuals’ log-likelihood contributions, the value of the parameters in θ that maximizes the resulting sample log-likelihood is the maximum likelihood estimator of θ.^{12,13}
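A minimal sketch of the log-likelihood contribution follows. It assumes the four choice probabilities have already been computed for each individual at a candidate θ (the paper's Appendix A1 expressions are not reproduced here); the function names and the example probabilities are hypothetical.

```python
import math


def loglik_contribution(y1, y3, y4, probs):
    """Log probability of the observed outcome for one individual.

    probs = (P1, P2, P3, P4), each strictly positive. The indicator
    products select exactly one of the four log probabilities.
    """
    p1, p2, p3, p4 = probs
    return ((1 - y1) * math.log(p1)
            + y1 * (1 - y3) * math.log(p2)
            + y1 * y3 * (1 - y4) * math.log(p3)
            + y1 * y3 * y4 * math.log(p4))


def sample_loglik(data):
    """Sum of contributions over (y1, y3, y4, probs) tuples."""
    return sum(loglik_contribution(y1, y3, y4, probs)
               for (y1, y3, y4, probs) in data)


# Made-up probabilities for two individuals.
example = [
    (0, 0, 0, (0.40, 0.30, 0.25, 0.05)),  # did not enroll
    (1, 1, 1, (0.20, 0.30, 0.40, 0.10)),  # health professional
]
total = sample_loglik(example)
```

In an actual estimation routine, `probs` would be recomputed at each trial value of θ and a numerical optimizer would search for the θ that makes `sample_loglik` as large as possible.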

### A. Decomposing the Effect of a Change in Individual Characteristics

The model described above provides a framework for making predictions about how changes in the explanatory variables affect the probability that an individual becomes a health professional. Such simulations allow for the analysis of the extent to which changes in background characteristics, such as a narrowing in the black-white gap in parental education, would affect the relative representation by race in the health professions. Significantly, the predicted effect of changes in characteristics like parental educational attainment can be decomposed into the component effects in each stage of the educational pipeline that we specify in the model. This is particularly useful for determining where in the pipeline black representation is predicted to be affected by such a change (that is, college entrance, college completion, or transition to health professional). For example, if *j* indexes the different individual characteristics in which we are interested, the partial derivative of *P*_{4} with respect to *X*_{1ij} tells us the effect of increasing characteristic *j* on the probability of becoming a health professional due to its effect on the propensity to enroll in college. The partial derivative of *P*_{4} with respect to *X*_{3ij} tells us the effect of increasing characteristic *j* on the probability of becoming a health professional conditional on enrolling in college due to its effect on college completion. Finally, the partial derivative of *P*_{4} with respect to *X*_{4ij} provides the effect of increasing characteristic *j* on the probability of becoming a health professional conditional on college completion.
Thus, if characteristic *j* is parental educational attainment, the three derivatives described here indicate how increased parental attainment would change an individual’s probability of becoming a health professional at three important stages of the process: college enrollment, college completion, and postbaccalaureate career and degree decisions.
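The stage-by-stage decomposition can be illustrated with numerical derivatives. The sketch below deliberately replaces the paper's *P*_{4} with a much simpler stand-in: it assumes independent unobservables and no α terms, so that the probability is a product of three univariate normal probabilities. The functions `p4` and `stage_effects` and all coefficient values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p4(x1, x3, x4, b1, b3, b4):
    """Simplified stand-in for P4 (assumption: independent stages, no alpha terms).

    x1, x3, x4 are the values of the same characteristic as it enters the
    enrollment, completion, and health-profession stages; each b is
    (intercept, slope).
    """
    return (norm_cdf(b1[0] + b1[1] * x1)
            * norm_cdf(b3[0] + b3[1] * x3)
            * norm_cdf(b4[0] + b4[1] * x4))


def stage_effects(x, b1, b3, b4, h=1e-5):
    """Central-difference derivative of P4 with respect to each stage's copy of x."""
    effects = []
    for stage in range(3):
        up = [x, x, x]
        down = [x, x, x]
        up[stage] += h
        down[stage] -= h
        effects.append((p4(*up, b1, b3, b4) - p4(*down, b1, b3, b4)) / (2 * h))
    return effects


# Made-up coefficients for a single characteristic.
b1, b3, b4 = (0.0, 0.4), (0.2, 0.3), (-1.5, 0.2)
enroll_eff, complete_eff, health_eff = stage_effects(1.0, b1, b3, b4)
```

When the characteristic enters all three stages, the three pieces add up to the total effect of changing it everywhere at once, which is what makes the decomposition informative about where in the pipeline a change matters most.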

### B. Altering the Model to Permit Variation in College Type and Quality

One issue that we abstract from in the basic theoretical model presented above is that college-bound individuals select and attend institutions of varying characteristics. If college characteristics including resources and peers affect individuals’ college completion rates, propensity to obtain an advanced degree, or propensity to choose a healthcare occupation, then allowing variation in college choice may be an important addition to the model specification.^{14} In this section, we generalize the model so that colleges chosen at the baccalaureate level are permitted to differ along two dimensions: institutional quality (proxied by institutional selectivity) and whether the institution is a historically black college or university (HBCU).^{15} We cannot simply add college quality and an HBCU indicator to the explanatory variables in Equations 1, 2, and 3 because individuals *choose* these attributes through their application and enrollment decisions, making both variables endogenous. Instead, enrollment at colleges of varying quality and enrollment at an HBCU are modeled as additional latent choice variables.

Assume that *y*^{*}_{1i} is a latent variable measuring the quality of the non-HBCU undergraduate college individual *i* can attend. Because an individual’s enrollment choice is also a function of college admission decisions, *y*^{*}_{1i} also captures whether individual *i* has the qualifications to be admitted to a non-HBCU college of a particular quality,

(4)

Next assume that *y*^{*}_{2i} is a latent variable measuring the value of attending an HBCU.^{16} It may be that an individual’s propensity to choose an HBCU is a function of the quality of the non-HBCU colleges to which they could obtain admission. Thus, we allow *y*^{*}_{2i} to be a function of *y*^{*}_{1i},^{17} as well as of observable individual characteristics *X*_{2i} and an unobservable component, *u*_{2i},

(5)

As in the basic model, α_{21} is not identified given that *X*_{1i} ⊆ *X*_{2i}. We set α_{21} = 1 and think of β_{2} as the degree to which *X*_{2i} affects *y*^{*}_{2i} in excess of *y*^{*}_{1i}, the value of attending college. Assume that individual *i* attends an HBCU if and only if the value from doing so is positive, or *y*^{*}_{2i} > 0.

Finally, we need to specify the quality of non-HBCU colleges attended by individuals in the sample and how this additional variation changes the basic model. Define college quality threshold values τ_{k}, *k* = 0,1,…,*K*, that decompose the support of *y*^{*}_{1i} into regions consistent with the data. Individual *i* attends a non-HBCU of quality level *k* if and only if he/she does not attend an HBCU and if the quality of the non-HBCU attended falls between thresholds τ_{k} and τ_{k+1}. Mathematically, we observe the set of *K* possible non-HBCU college choices given by

(6)

We can define *k* = 0 as the case of not attending college and allow college quality to be increasing in *y*^{*}_{1i} and *k*. Without loss of generality, we can also define τ_{0} = −∞, τ_{1} = 0, and τ_{K+1} = ∞. It is worth noting that Equation 6 is an ordered discrete choice structure.
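The ordered threshold rule can be sketched as a lookup against a sorted list of cutoffs. The example assumes five non-HBCU quality levels, made-up finite thresholds τ_{1} through τ_{5} with τ_{1} = 0, and a convention that a latent value exactly on a threshold falls in the lower level; the function name `college_choice` and all numbers are hypothetical.

```python
import bisect


def college_choice(y1_star, y2_star, taus):
    """Classify an individual's college choice from the two latent values.

    taus holds the finite thresholds tau_1 < ... < tau_K (tau_1 = 0);
    tau_0 = -inf and tau_{K+1} = +inf are implicit. Returns a pair
    (category, level), where level is None for HBCU attendance.
    """
    if y2_star > 0:
        return ("hbcu", None)  # value of attending an HBCU is positive
    # Number of thresholds strictly below y1*: level k when tau_k < y1* <= tau_{k+1}.
    k = bisect.bisect_left(taus, y1_star)
    return ("none", 0) if k == 0 else ("non_hbcu", k)


taus = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical cutoffs for K = 5 levels
choice = college_choice(0.7, -1.0, taus)
```

Treating quality as an ordered outcome means a single latent index, rather than a separate equation per selectivity level, determines which type of non-HBCU college is attended.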

The latent value of completing a four-year college degree conditional on enrolling, *y*^{*}_{3i}, and the latent value of becoming a health professional with an advanced degree conditional on completing a four-year college degree, *y*^{*}_{4i}, are defined as in Equations 2 and 3 in the basic model. These decisions are functions of observable individual-specific characteristics in *X*_{i}, linear and quadratic terms for the latent values of choices made earlier in the educational and career pipeline, and an unobservable component;

(7)

(8)

We assume that the vector of unobservable components of the latent variable equations above is *u*_{i} = (*u*_{1i}, *u*_{3i}, *u*_{4i})′ and that *u*_{i} ∼ *iidN*(0,Ω), with diagonal elements of Ω equal to one for identification purposes. Also, as in the basic model, all four variables are latent, but the outcomes we observe include multinomial choice variables.

There are now seven possible outcomes we might observe in the data for each individual. These possible outcomes, along with their associated conditional probabilities of occurring in the data, are:

1. Do not enroll in college (*P*_{1} = Pr[*y*_{1i} = 0, *y*_{2i} = 0|*X*_{1i}, *X*_{2i}]);
2. Enroll in a non-HBCU of type *k* but do not graduate with a baccalaureate degree (*P*_{2k} = Pr[*y*_{1i} = *k*, *y*_{2i} = 0, *y*_{3i} = 0|*X*_{1i}, *X*_{2i}, *X*_{3i}]);
3. Enroll in an HBCU but do not graduate with a baccalaureate degree (*P*_{3} = Pr[*y*_{2i} = 1, *y*_{3i} = 0|*X*_{1i}, *X*_{2i}, *X*_{3i}]);
4. Enroll in a non-HBCU of type *k*, graduate with a baccalaureate degree, but do not choose a health profession that requires an advanced degree (*P*_{4k} = Pr[*y*_{1i} = *k*, *y*_{2i} = 0, *y*_{3i} = 1, *y*_{4i} = 0|*X*_{1i}, *X*_{2i}, *X*_{3i}, *X*_{4i}]);
5. Enroll in an HBCU, graduate with a baccalaureate degree, but do not choose a health profession that requires an advanced degree (*P*_{5} = Pr[*y*_{2i} = 1, *y*_{3i} = 1, *y*_{4i} = 0|*X*_{1i}, *X*_{2i}, *X*_{3i}, *X*_{4i}]);
6. Enroll in a non-HBCU of type *k*, graduate with a baccalaureate degree, and choose a health profession that requires an advanced degree (*P*_{6k} = Pr[*y*_{1i} = *k*, *y*_{3i} = 1, *y*_{4i} = 1|*X*_{1i}, *X*_{2i}, *X*_{3i}, *X*_{4i}]);
7. Enroll in an HBCU, graduate with a baccalaureate degree, and choose a health profession that requires an advanced degree (*P*_{7} = Pr[*y*_{2i} = 1, *y*_{3i} = 1, *y*_{4i} = 1|*X*_{1i}, *X*_{2i}, *X*_{3i}, *X*_{4i}]).

The explicit forms of these probabilities are provided in Appendix A2.^{18}

Again, the log of the probability corresponding to individual *i*’s observed outcome forms that individual’s log-likelihood contribution, and we maximize the sum of these contributions over the parameters in θ to get consistent, asymptotically normal estimates of θ. These parameter estimates will also be used to decompose the effect of changing individual characteristics on choices made at various stages in the pipeline.

## IV. Data

The primary data we employ are from the National Longitudinal Study of the High School Class of 1972 (NLS-72). The National Center for Education Statistics (NCES) of the U.S. Department of Education designed and conducted this study and refers to it as “probably the richest archive ever assembled on a single generation of Americans” (NCES 1994). Participants in the study were high school seniors in the spring of 1972, and follow-up surveys of these respondents were conducted in 1973, 1974, 1976, 1979, and 1986. The database contains information from high school records as well as postsecondary transcripts (collected in 1984). Because the original 18-year-old respondents were last interviewed when they were approximately 32 years old, we believe this panel data set is sufficiently long to allow individuals to acquire postbaccalaureate training and choose an occupation in a health profession. This data set is supplemented with information on college and university selectivity rankings from *Barron’s Profiles of American Colleges* (1994). We collapse the scale of ten selectivity rankings in *Barron’s* into five categories such that higher level institutions are associated with higher quality and better reputation. Postsecondary institutions that are historically black colleges and universities (HBCUs) are coded as a separate category and not assigned a selectivity ranking. Additionally, attendance at a two-year, nonvocational, postsecondary institution is considered college entry if the individual eventually completed a four-year baccalaureate degree.

In this research, health professional status is defined as being in a health occupation and possessing a postbaccalaureate degree. This definition means that the health professionals identified in the NLS-72 include primarily physicians, therapists, and dentists; modest numbers of registered nurses, pharmacists, psychologists, and optometrists; and a few veterinarians, biological scientists, dieticians, health technicians, podiatrists, chiropractors, and health service technicians. Although the medical literature focuses heavily on the effect of race concordance between physicians and patients on patient outcomes, there also exist empirical studies of racial and ethnic concordance between patients and mental health providers, substance abuse counselors, and medical students (McGinnis et al. 2006; Halliday-Boykins et al. 2005; Sterling et al. 2001). Because the medical literature addresses patient-provider concordance more broadly than just between physicians and their patients, we feel comfortable with our more broad categorization of healthcare professionals modeled in the previous section. Additionally, this broader definition of health professional is appropriate given that one of the interesting policy angles of our paper concerns funding for and recruiting of minorities into graduate study in the health sciences.

Summary statistics for the sample of high school graduates, college entrants, college graduates, and health professionals with advanced degrees are provided in Table 1. In the sample of approximately 13,000 high school graduates, 72 and 73 percent of respondents’ fathers and mothers, respectively, have at least a high school education, while 19 and 11 percent of respondents’ fathers and mothers, respectively, have a baccalaureate or advanced degree. Consistent with early 1970s data from the October Current Population Survey analyzed in Kane (1994), 51 percent of our sample of high school graduates enroll in some type of nonvocational postsecondary institution. Table 1 also indicates the types of postsecondary institutions chosen. For example, 9.3 percent of high school graduates begin their college career at a two-year college, while 4.6 percent start at highly selective (Level 5) four-year institutions. Most college-bound high school graduates enter college at an institution of moderate selectivity, or Level 3. Reading Table 1 from left to right, our sample changes in predictable ways as we follow these respondents through the educational pipeline from high school graduation through college entrance and completion and, finally, to becoming a health professional with an advanced degree. The sample becomes more male and less racially diverse, and socioeconomic status (proxied by parental educational attainment) increases.^{19} The students who successfully complete each additional stage are also of higher academic ability, as proxied by student SAT score and less likely to be from rural and farming communities.^{20} Table 1 also indicates that nearly 60 percent of college entrants graduate with a baccalaureate degree, and 5 percent of those degree recipients go on to obtain advanced degrees and select a health occupation.

Because our primary interest in this paper is in racial differences, Table 2 identifies between-group differences in the samples of whites and nonwhites at various stages of the educational pipeline. The data are consistent with known differences in demographics and socioeconomic status between whites and minorities. At the first observable point in the pipeline in the NLS-72, we see that white high school graduates are much more likely to have better-educated parents than nonwhite high school graduates; 72.2 percent of white fathers have at least a high school diploma compared to only 45.2 percent of nonwhite fathers. Differences in precollegiate academic ability, proxied by SAT score, are also substantial. White high school graduates are fairly evenly distributed across the four SAT quartiles, while 60 percent of nonwhite graduates fall in the lowest quartile of SAT scores in the sample. Likewise, nonwhite high school graduates are much less likely (6.0 percent) to score in the top SAT quartile than white high school graduates (28.9 percent). These observed differences in academic preparation are consistent with well-documented test score gaps between whites and minorities.^{21}

Table 2 also indicates that the racial gaps that exist upon high school graduation are still present and, in some cases, exacerbated further in the educational pipeline. The between-group differences in parental educational attainment actually grow more pronounced when we look at college entrants compared to high school graduates, as do differences in the representation in the highest SAT quartile. While the between-group difference in representation in the top SAT quartile widens through the college graduation stage, this gap narrows dramatically among those who choose to enter health professions.

## V. Results

### A. Basic Model

The parameter estimates from the basic model are presented in Table 3. Given the various binary variables and interaction terms, the base category in these results is a nonrural, white female with an average SAT score and both parents lacking a high school diploma. Relative to this base group, for example, males with otherwise similar characteristics are less likely to enter college (−0.022) and complete a degree conditional on entering college (−0.009) but are more likely to become health professionals conditional on completing college (0.087). To examine the effect of being black on college entry and completion, which is interacted with both the parental education and urbanicity variables in the college entry and completion equations, Figures 2 and 3 graphically display the various combinations of parameters for both males and females. Conditional on SAT score, Figure 2 shows that blacks are more likely than whites to enroll in college regardless of their urbanicity and parents’ educational attainment, and these effects are more pronounced for black females than for black males. Figure 3 shows that black students are also more likely to complete a college degree than white students, except for those blacks in nonrural areas with college-educated fathers. The result that shifts the unconditional deficit in black college enrollment to greater enrollment probability for blacks conditional on parental background and a student’s high school achievement is well-established in the prior empirical work. The seminal work by Manski and Wise (1983) shows that, conditional on observable characteristics, blacks from both the North and the South are substantially more likely to enroll in college than their white counterparts, while blacks from the South are also appreciably more likely to persist in college. Kane (1999) finds a similar advantage in enrollment using data from the NELS for students expected to graduate from high school in 1992.

Because the college degree completion and health professional equations contain *y*^{*} values from earlier stages in the educational pipeline and because the vector of individual attributes, *X*_{i}, is similar or identical across the various equations, a note is necessary about the interpretation of the estimated β parameters. The estimated βs are combinations of two effects, direct and indirect. For example, β_{3} in Equation 2 consists of the direct effect of *X*_{3i} on degree completion as well as an indirect effect through the propensity to enroll in college, because α_{31} is not identified and is set to zero in estimation. Thus, the estimated value of β_{3} in Equation 2 is a combination of the true β_{3} and α_{31}.
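The composite nature of the estimated coefficient can be seen with a short substitution. The sketch below assumes, for illustration only, that the enrollment propensity is linear in *X*_{1i} and that it enters the degree completion equation in linear and quadratic form; the symbols *y*^{*}_{1i}, *y*^{*}_{3i}, *u*_{1i}, and *u*_{3i} and the exact functional form are assumptions made for this sketch.

$$
\begin{aligned}
y^{*}_{1i} &= X_{1i}\beta_{1} + u_{1i},\\
y^{*}_{3i} &= X_{3i}\beta_{3} + \alpha_{31}\,y^{*}_{1i} + \alpha_{32}\,(y^{*}_{1i})^{2} + u_{3i}\\
&= X_{3i}\beta_{3} + \alpha_{31}X_{1i}\beta_{1} + \alpha_{32}\,(y^{*}_{1i})^{2} + \left(u_{3i} + \alpha_{31}u_{1i}\right).
\end{aligned}
$$

If *X*_{1i} and *X*_{3i} contain the same variables, the linear part collapses to *X*_{3i}(β_{3} + α_{31}β_{1}), so under these assumptions only the sum is recoverable from the data, and setting α_{31} to zero means the reported coefficient absorbs both pieces.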

An individual’s SAT score is positively associated with enrolling and completing college, as is having a parent with a college degree, although this latter effect on completing college is actually negative for blacks in nonrural areas. Because our sample respondents were born in approximately 1954, their parents’ generation had high school completion rates that were approximately half of what they are today (Goldin 2003). Thus, it is not particularly surprising that even paternal high school completion influences respondents’ college enrollment rates by nearly the same magnitude as paternal college completion. Whites who attended high school in a rural or farming area are less likely to enroll in and more likely to graduate from college than whites in nonrural areas. While the role of urbanicity does not appear to have a differential effect on college entrance for black high school graduates, blacks who attended high school in rural areas are much more likely to graduate from college than blacks in nonrural areas, conditional on entering college and other observable characteristics (see Figure 3).

The effect of covariates on the likelihood of choosing occupations requiring advanced degrees in the health professions is shown in the third panel of Table 3.^{22} The probability of following this path increases with an individual’s SAT score, while parental education has a mixed effect on a college graduate’s decision to become a health professional. The effect of being black on the probability of pursuing a health profession is positive but considerably higher and only statistically significant for black males. This result, as the discussion in the next section demonstrates, does not persist when the type of college in which an individual enrolls is incorporated as a determinant of degree completion. Growing up in a rural area decreases the likelihood of choosing to become a health professional.

We also include a measure of individual *i*’s propensity to enroll in college in the degree completion equation in both linear and quadratic form. Only the parameter on the quadratic term (α_{32}) is identified; thus we set α_{31} equal to 0 and estimate α_{32}. The positive estimated value of α_{32} indicates that an individual’s propensity to complete a four-year college degree increases in their propensity to enroll. Although this result indicates that high school graduates with the strongest propensity to attend college are also more likely to complete a degree, the estimate is not statistically significant. Additionally, we include in the health professional equation linear and quadratic terms for the latent values of choices made earlier in the educational and career pipeline. As above, only the parameters on the quadratic terms are identified. The estimates indicate that an individual’s propensity to become a health professional with an advanced degree eventually decreases in their propensity to enroll in college, and increases in their propensity for completing a four-year degree. This latter result indicates that the strongest college students, in terms of likelihood of completion, are the most likely to go on to become health professionals.

The lower panel of Table 3 displays the estimated covariances between unobservable factors in each of these three stages.^{23} Surprisingly, those individuals who are more likely to enter college for unobservable reasons are *less* likely to complete a four-year degree for unobservable reasons.^{24} The correlation in unobservables works in the anticipated direction for the other choices. Unobservables that make it more likely that a person completes college are positively related to those unobservables that encourage a person to become a health professional with an advanced degree.

### B. Model with Variation in College Type and Quality

The parameter estimates from the structural model that includes college quality and historically black institutions are presented in Table 4. Recall that our impetus for adding variation in college attributes is that variation in the types of colleges individuals attend may influence college completion rates, propensity to obtain an advanced degree, or propensity to choose a healthcare occupation. Many of the qualitative conclusions regarding the determinants of college entry are the same as in the basic model discussed above, but there are some noticeable differences in other stages of decision making.

Conditional on college entry *and* the attributes of the college chosen, as well as the other covariates, blacks are now even more likely than whites to complete a four-year degree regardless of their urbanicity and parents’ educational attainment. In the basic model, Hispanics and Asians were conditionally less likely than whites to complete a college degree. In the quality-adjusted model, both groups are conditionally *more* likely to complete. The effect of coming from a rural area on college degree completion for nonblacks changes sign between the basic and quality-adjusted models, indicating that growing up in a rural area and college quality are negatively correlated. In the quality-adjusted model, individuals from rural areas who enter college are less likely to graduate from college, conditional on other factors, although this effect is mitigated for blacks from rural areas regardless of parental educational attainment.

The parameter estimates in Table 4 also enable us to examine the determinants of choosing an HBCU institution. Black high school graduates are, not surprisingly, more likely than whites to choose (and be chosen by) a historically black college or university regardless of urbanicity and parental education, and this effect is somewhat stronger for black females than black males (−0.312). Individual SAT score is negatively associated with choosing an HBCU, although this parameter estimate is statistically insignificant. We also included a measure of the quality of the non-HBCU institution individual *i* could attend in the HBCU equation in both linear and quadratic form. Only the parameter on the quadratic term (α_{22}) is identified; thus we set α_{21} equal to 1 and estimate α_{22}. The negative estimated value of α_{22} indicates that an individual’s propensity to choose an HBCU eventually decreases in the quality of non-HBCU alternatives available. This result indicates that high school graduates with the ability to garner admissions offers from top-tier non-HBCU institutions are less likely to select a historically black institution. We include the same non-HBCU quality measure in the degree completion equation. After setting the parameter on the linear term equal to zero, the negative estimated parameter on the quadratic term, α_{32}, indicates that the slope of an individual’s ability to complete a four-year degree decreases in the quality of the non-HBCU institution individual *i* could attend. This result is consistent with previous research on the quality of the match between individuals and colleges that finds the optimal college quality for an individual is slightly above the individual’s own ability (Manski and Wise 1983).
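The phrase “eventually decreases” follows from the shape of a quadratic with a unit linear coefficient and a negative quadratic coefficient. As a sketch, write *q*_{i} for the non-HBCU quality measure (a hypothetical symbol introduced here for illustration) and consider only the part of the HBCU index that depends on it:

$$
g(q_i) = \alpha_{21}\,q_i + \alpha_{22}\,q_i^{2}, \qquad \alpha_{21} = 1,\quad \alpha_{22} < 0,
$$

$$
\frac{\partial g}{\partial q_i} = 1 + 2\alpha_{22}\,q_i < 0 \iff q_i > -\frac{1}{2\alpha_{22}}.
$$

Under these assumptions the index rises with quality at low values of *q*_{i} and falls once *q*_{i} exceeds −1/(2α_{22}), a threshold that is positive because α_{22} is negative.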

The final column of parameter estimates in Table 4 refers to individuals’ propensities to become a healthcare professional with an advanced degree conditional on all previous choices and outcomes in earlier stages of the educational pipeline.^{25} Modeling the variation in college choice and the decision to attend an HBCU in the specification presented in Table 4 leads to a shift in the sign on the parameter estimate on the race indicator for black to negative and statistically significant, indicating that black college graduates are less likely than observationally equivalent whites to go into the health professions with advanced degrees. The change in this parameter’s sign between the two model specifications follows from the change in the correlation of the error terms across equations when collegiate choice is included explicitly in the model, with covariances between unobservable factors shown in the bottom panels of Tables 3 and 4. To illustrate, the covariance between *u*_{1} and *u*_{4}, ρ_{14}, changes from negative to positive when college quality is embedded in the enrollment decision. Finally, the inclusion of HBCU status in the second specification yields statistically significant covariances between *u*_{2} with both *u*_{3} and *u*_{4}. In essence, the changes in these covariance terms drive changes in the expected value of *u*_{4} conditional on observables, which explains the decrease in the black coefficient in the health professional stage of the second specification.^{26} Perhaps more intuitively, attendance at an HBCU provides a strong pathway to medical school and, as such, what initially appeared as an effect associated with race is more properly connected to the effects associated with HBCU participation. 
Necessarily, we are cautious in ascribing a strong causal interpretation to this shift in the magnitude of the estimated effect as our data make it difficult to distinguish between the institutional effects of attendance at an HBCU and the selection effects that distinguish those students attending HBCUs.

Black students who attend HBCU institutions are appreciably more likely to enter the health professions than observationally similar students who attend non-HBCU institutions. Our result, from a formal econometric specification, is consistent with other evidence such as Drewry and Doermann (2001), who examine the undergraduate origins of black first-year students in U.S. medical schools. Drewry and Doermann (2001) note that, while black students made up about 5.8 percent of first-year medical students in 1978, black students attending private historically black colleges and universities made up a disproportionate 16 percent of these black first-year medical students, more than double their representation among baccalaureate degree recipients, leading to the conclusion that “the private black colleges are particularly productive for healthcare professionals” (p. 192). Historically black institutions such as Xavier in New Orleans are frequently cited for their large pre-med programs; in May of 2001, “73 Xavier graduates were headed to medical schools, and dozens more were entering graduate school in health related fields” (Stewart 2001).

#### 1. Marginal Effects of Individual Characteristics on Choice Probabilities

To understand how the parameter estimates from the quality-adjusted model in Table 4 affect the probabilities of entering college, enrolling in a college with certain characteristics, completing college, and becoming a healthcare professional with an advanced degree, we calculate marginal effects of each of the covariates. The marginal effects presented in the first three columns of Table 5 and in Table 6 are conditional on successfully completing all previous stages in the educational pipeline as well as on other observable characteristics. The final column of Table 5 presents the total (or unconditional) marginal effects associated with becoming a health professional with an advanced degree. For example, the total (or unconditional) marginal effect of binary variable *X*_{k} on the probability of education/career outcome *j* is the change in the unconditional probability of outcome *j* when *X*_{k} switches from zero to one; conditioning on previous stages changes the marginal effect to the corresponding change in the probability of outcome *j* given that all earlier stages have been completed.^{27} It is worth noting that, although our model is more complicated than a simple binary discrete choice problem, the same basic intuition about interpreting marginal effects applies.^{28}
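The distinction between conditional and total marginal effects can be illustrated with a deliberately simplified sketch. It assumes three sequential probit stages with independent errors, so that each conditional probability is a single standard normal distribution function; the model in this paper allows correlated unobservables and is therefore more involved. The coefficients are hypothetical values chosen for illustration, not estimates from Table 4.

```python
from math import erf, sqrt


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Hypothetical (intercept, coefficient on binary x_k) pairs for three stages:
# enter college, complete a degree, become a health professional.
STAGES = [(0.10, 0.30), (0.00, 0.25), (-1.60, -0.20)]


def stage_probs(x_k: int) -> list:
    """Probability of passing each stage, given that earlier stages were passed.

    Assumes independent errors across stages (a simplification).
    """
    return [norm_cdf(const + coef * x_k) for const, coef in STAGES]


def conditional_effect(stage: int) -> float:
    """Change in the stage's conditional probability when x_k goes from 0 to 1."""
    return stage_probs(1)[stage] - stage_probs(0)[stage]


def joint_prob(x_k: int) -> float:
    """Unconditional probability of passing every stage."""
    p = 1.0
    for q in stage_probs(x_k):
        p *= q
    return p


def total_effect() -> float:
    """Change in the unconditional probability of the final outcome."""
    return joint_prob(1) - joint_prob(0)


if __name__ == "__main__":
    for j in range(len(STAGES)):
        print(f"conditional effect, stage {j + 1}: {conditional_effect(j):+.4f}")
    print(f"total effect on final outcome: {total_effect():+.4f}")
```

With these made-up coefficients the binary covariate raises the entry and completion probabilities but lowers the final-stage probability, so the total effect is much smaller in magnitude than the final-stage conditional effect.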

A primary question in this analysis is how race affects the probability of different outcomes in the collegiate pipeline. We present the estimated effects in Table 5 relative to outcomes predicted for nonrural white females with average SAT scores and both parents lacking a high school diploma. For example, the second row of marginal effects, labeled “Black,” indicates how the probability of each outcome would be expected to differ for a black female relative to a white female, assuming a nonrural high school location, average SAT scores, and less-educated parents. Relative to a white woman with these same characteristics, a black woman is appreciably more likely to enroll in college (29.56 percentage points) and to complete an undergraduate degree (30.74 percentage points). Yet, conditional on college enrollment, college type, and degree completion, there is a decline in progress into the health professions of approximately three percentage points for these black females relative to white females. This pattern remains for all combinations of urbanicity and parental educational attainment. Given that the overall share of college graduates who become health professionals is 5 percent, this is a sizeable effect. When we consider the unconditional total effect of race in the final column of Table 5, the effect is small in magnitude and indistinguishable from zero because the large positive effects at the college entry and completion stages for blacks are offset by the negative effect in the health professional stage. Similar statements can be made about going from a white male to a black male by combining the marginal effects in the second and third rows of Table 5. For the Hispanic and Asian group membership, there is a positive marginal effect on college entry and undergraduate degree receipt, while membership in these groups is not linked to the health professional outcome in a statistically significant way. 
However, the unconditional total effects of Hispanic and Asian group membership on becoming a health professional are both positive and statistically significant. The magnitude of the effects (both equal to 0.005) appears to be small, but these effects are actually quite substantial given that the proportion of high school graduates who become health professionals is also quite small (0.015).
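The claim that small absolute effects are substantial rests on comparing each marginal effect with the baseline share of people who reach the outcome. A minimal sketch of that arithmetic, using only the figures quoted in the text (the function name is hypothetical):

```python
def relative_effect(marginal_effect: float, baseline_share: float) -> float:
    """Marginal effect expressed as a fraction of the baseline probability."""
    return marginal_effect / baseline_share


# Total effect of Hispanic or Asian group membership (0.005) relative to the
# share of high school graduates who become health professionals (0.015).
hispanic_asian_total = relative_effect(0.005, 0.015)

# Conditional decline of roughly three percentage points for black females,
# relative to the 5 percent of college graduates who become health professionals.
black_female_conditional = relative_effect(0.03, 0.05)

if __name__ == "__main__":
    print(f"{hispanic_asian_total:.2f}")
    print(f"{black_female_conditional:.2f}")
```

On these figures the total effect for Hispanic and Asian group membership is about one third of the baseline share, and the conditional decline for black females is about three fifths of the share of college graduates who become health professionals.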

The marginal effect of a 100-point increase in individual SAT score is, not surprisingly, associated with a higher probability of college entry and degree completion. The probability of college enrollment increases by 9.68 percentage points when SAT increases by 100 points. Conditional on college entry and observables, Table 6 indicates that a 100-point SAT score increase is associated with a roughly three percentage point increase in the probability of attending a more selective four-year institution at moderately to most-selective colleges. Higher SAT scores are also associated with an increased probability (2.77 percentage points) of becoming a health professional with an advanced degree.

Tables 5 and 6 also indicate the effect of parental education on children’s educational and career outcomes. Among nonblacks, having a father with a high school education is associated with a 7.57 percentage point increase in college entry, higher probabilities of attending a more selective institution (conditional on college entry), a 7.36 percentage point increase in the probability of degree receipt (conditional on college entry and college type), and a 2.62 percentage point increase in the probability of becoming a health professional (conditional on college entry and type and degree receipt). High school completion by nonblack mothers is similarly associated with a child’s probability of progressing through the educational pipeline. There is also a positive association between nonblack parental baccalaureate degree completion and a child’s probability of progressing through the pipeline. Nonblack college-educated fathers (relative to high school educated fathers) are associated with a higher probability of college entry by 7.94 percentage points, of going to a more selective college by 5 to 15 percentage points, of degree completion by 11.08 percentage points, and of becoming a health professional by 1.21 percentage points. Nonblack college-educated mothers have similar marginal effects.

The marginal effects of parental educational attainment differ somewhat by race. For blacks, the marginal effects of their father’s high school degree completion on college entry and degree completion are 10.85 and 11.96 percentage points compared to 7.57 and 7.36 for nonblacks. While a black individual with a father who also completes a college degree has a 6.62 percentage point higher probability of entering college, paternal college completion actually has a negative association with a child’s probability of completing a college degree (0.55 percentage point decline). There is no real difference between blacks and nonblacks in the marginal effect of paternal college completion on the probability of becoming a healthcare professional with an advanced degree. It is interesting to note that the parameter estimate on father’s high school completion in the health professional stage is negative in Table 4 (−0.190) and positive (0.0262) in Table 5. This result stems from selection and correlation in the unobservable determinants of the decisions to enter college, complete a degree, and become a health professional, thereby demonstrating the importance of jointly modeling these decisions in the way that we do. Although the children of high school educated fathers are *less* likely to become health professionals, conditioning on college entry, college selectivity, graduation, and unobservables indicates that college graduates with high school educated fathers are substantially *more* likely to become health professionals than their peers with fathers who did not complete high school.

Finally, moving from a nonrural to a rural location is not statistically associated with the college entry and degree completion probabilities of either blacks or nonblacks but does have a small negative and statistically significant marginal effect on the probability of becoming a health professional. From Table 6, originating from a rural area also has no discernible marginal effect on the probability of going to a more selective four-year institution for nonblacks but has a negative effect for blacks.

## VI. Specification Tests

The quality-adjusted model presented at the end of Section III specifies the probabilities of observing a variety of different educational and career outcomes. Because we model the decision to enter college, the type of college chosen (HBCU or non-HBCU in one of five selectivity categories or selectivity unknown or two-year college), degree completion, and choosing a health profession that requires an advanced degree, there are 22 different educational/career paths available to each individual.^{29} We use the parameter estimates in Table 4 to compute predicted probabilities that individuals choose each educational/career path and compare the predicted behavior with actual outcomes. Table 7 presents predicted and actual proportions of individuals choosing each educational/career path. Although predicted behavior appears to be very similar to actual behavior in many cases, we also divide the sample into quintiles based on predicted probabilities to facilitate the construction of more formal specification test statistics.

We perform χ^{2} goodness-of-fit tests to more rigorously examine how well the model fits the choices and outcomes that we actually observe in the NLS-72 data. The null hypothesis for this statistical test is that the proportions predicted by the model equal the actual proportions in the data; thus, test statistics that fall below the critical value indicate that the model fits the data well. χ^{2} goodness-of-fit statistics for each outcome, by quintile and overall, are presented in Table 8.^{30} Overall, the model fails this specification test. However, a closer examination of the disaggregation by outcome and quintile reveals that the model does a poor job primarily in those outcomes that involve college entrance with no degree completion, particularly at lower quality institutions.
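The logic of comparing predicted with actual outcomes by quintile can be sketched with a standard Pearson statistic. The counts below are hypothetical, and the degrees of freedom (number of cells minus one) are an assumption made for illustration; the exact construction behind Table 8 may differ.

```python
def pearson_chi2(observed, expected):
    """Pearson goodness-of-fit statistic: sum over cells of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of individuals on one educational/career path,
# grouped into quintiles of predicted probability.
observed = [12, 25, 41, 60, 88]
expected = [10.0, 27.5, 40.0, 63.0, 85.5]

# 5 percent critical value of a chi-square distribution with 4 degrees of
# freedom (assumed here: five cells minus one).
CRITICAL_5PCT_DF4 = 9.488

stat = pearson_chi2(observed, expected)
fits = stat < CRITICAL_5PCT_DF4  # below the critical value: do not reject fit

if __name__ == "__main__":
    print(f"chi-square statistic: {stat:.3f}, fits at 5 percent: {fits}")
```

A statistic below the critical value means the null hypothesis of equal predicted and actual proportions is not rejected for that outcome.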

## VII. Discussion and Conclusion

The dramatic underrepresentation of blacks in the health professions is a cause for policy concern because it may capture group differences in educational achievement and opportunities as well as potentially affecting the quality of health provision in the United States. For the cohort that we follow that graduated from high school in 1972, the representation of blacks declined from 11 percent at the point of high school graduation, to 9 percent at college entry, to 7.2 percent at college graduation, and to 4.1 percent at the stage of entry to the health professions. On net, differences in the representation of blacks and whites at the postbaccalaureate stage of entry to a health profession can be traced to gaps generated much earlier in the educational pipeline. Still, our model, which accounts for college entry, type of college, measured by both institutional selectivity and status as an HBCU, and college completion demonstrates significant race-specific effects at different transitions. While black students are more likely to enroll in college and complete college conditional on family circumstances and high school achievement, there is substantial underrepresentation of blacks in the transition from baccalaureate degree receipt to participation in a health profession requiring an advanced degree.

There is little evidence to suggest that changes over the last three decades in student achievement or parental circumstances have been sufficiently large in absolute terms and relative to other groups to predict substantial changes in the representation of blacks among those with advanced degrees in the health professions. The underrepresentation of blacks in the health professions is part of the more general social and economic problems generating substantial group differences, entrenched before the college years, which is documented in our results and work by other social scientists (see, for example, Jencks and Phillips 1998). Focusing only on the ratio of black to white healthcare professionals by age using data from the 2000 decennial Census, changes over the last two decades in the representation of blacks in the health professions have been modest. What modest gains are apparent in the black-white ratio among those age 35 relative to those age 45 are driven by an erosion in the number of whites choosing the health professions rather than a sustained increase in blacks choosing healthcare professions.

While we emphasize that much of the overall gap in the representation of blacks can be traced to outcomes at the precollegiate and collegiate levels, the question of why we have not observed greater increases in the representation of blacks in healthcare professions remains primary. The value to entering the healthcare professions is necessarily relative to other outside options. One hypothesis for the failure to achieve greater gains in postbaccalaureate healthcare programs is that outside options for black college graduates improved far more rapidly than opportunities in the health professions. As such, demand from professions like law and business, where the gap in wages between black and white professionals narrowed rapidly in the 1970s and 1980s, drew many high achieving blacks to MBA programs and law schools. To illustrate, the number of blacks enrolled in law school increased from 3,744 in 1971–72 to 9,529 in 2006–2007, representing an increase of roughly 155 percent (American Bar Association 2007). That demand for advanced study in the health professions has not increased markedly among blacks is borne out in data showing major undergraduate fields of study in 1977 and 1997 by race (see Table A3). If life sciences study at the undergraduate level is an indication of future advanced study in the health professions, black participation in these fields has fallen off over the last two decades at a rate somewhat greater than that observed for whites.

Our evidence suggests that further efforts to understand the pathway from undergraduate degree receipt to entry in advanced degree health programs by race and type of undergraduate experience may be a constructive direction for future research. Still, we caution that, even with a compelling public policy interest to increase the representation of blacks in the health professions, efforts to target students at the margin between college completion and entry to a graduate program in the health professions may well generate substantial distortions in the educational marketplace in the absence of a full understanding of the causes of race-specific differences in the collegiate pipeline.

## Appendix A

Define Φ(·) as the standard normal distribution function, ϕ(·) as the standard normal density function, *B*(·,·;ρ) as the standard bivariate normal distribution function (with correlation ρ), *b*(·,·;ρ) as the standard bivariate normal density function (with correlation ρ), *t*(·,·,·;Ω) as the standard trivariate normal density with covariance matrix Ω,^{31} and .

### 1 Choice Probabilities in the Basic Model

Recall from Section III that there are four choice probabilities in the basic model. To aid in the exposition of the functional form of these probabilities, define three indexes:

The conditional probability of not going to college is^{32}

(9)

the conditional probability of going to college but not finishing is

(10)

the conditional probability of finishing college but not becoming a health professional with an advanced degree is

(11)

and the conditional probability of becoming a health professional with an advanced degree is

(12)

### 2 Choice Probabilities in Model with College Quality and HBCUs

Recall from Section III that there are seven choice probabilities in the model that allows for variation in college characteristics. To aid in the exposition of the functional form of these probabilities, define four indexes:

The conditional probability of not going to college is

(13)

the conditional probability of going to a non-HBCU college of type *k* but not finishing is

(14)

the conditional probability of going to an HBCU institution but not finishing is

(15)

the conditional probability of going to a non-HBCU college of type *k*, finishing, but not becoming a health professional with an advanced degree is

(16)

the conditional probability of going to an HBCU institution, finishing, but not becoming a health professional with an advanced degree is

(17)

the conditional probability of going to a non-HBCU college of type *k*, finishing, and becoming a health professional with an advanced degree is

(18)

the conditional probability of going to an HBCU institution, finishing, and becoming a health professional with an advanced degree is

(19)

There are some observations where we observe the individual enrolling in a four-year non-HBCU, but are not able to observe the quality of the institution. The relevant likelihood contributions change from Equation 14 to

(20)

from Equation 16 to

(21)

and from Equation 18 to

(22)

Note that *k* = 1 corresponds to enrolling in a two-year college and so is not consistent with such an observation.

Equations 13 through 22 are the probabilities for the ten possible events that can occur in the data. The log likelihood contribution for *i* when the quality of non-HBCU institutions is observed is

and the adjustments required when the quality of non-HBCU institutions is not observed involve changing the appropriate term to its replacement. As in the basic model, we maximize over θ to get consistent, asymptotically normal estimates of θ.
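For mutually exclusive events, a log likelihood contribution generally reduces to the log of the probability assigned to the event actually observed. A minimal sketch of that generic structure follows; the event names and probabilities are hypothetical, and the code is an illustration of the general form rather than the authors' expression.

```python
from math import log


def log_lik_contribution(indicators, probs):
    """Sum over events of d_e * log(P_e), where exactly one d_e equals 1."""
    if sum(indicators.values()) != 1:
        raise ValueError("exactly one event must be observed per individual")
    return sum(d * log(probs[e]) for e, d in indicators.items() if d)


def sample_log_lik(sample):
    """Sample log likelihood: sum of individual contributions."""
    return sum(log_lik_contribution(d, p) for d, p in sample)


# Hypothetical model probabilities for one individual at a trial value of theta.
probs_i = {"no_college": 0.40, "enter_no_degree": 0.35,
           "degree_not_health": 0.23, "health_professional": 0.02}
# Observed outcome for this individual: entered college, no degree.
d_i = {"no_college": 0, "enter_no_degree": 1,
       "degree_not_health": 0, "health_professional": 0}

contribution = log_lik_contribution(d_i, probs_i)
```

In estimation, the probabilities would be recomputed at each trial value of θ and the sample sum maximized numerically.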

## Footnotes

Ivora Hinton is the coordinator, data analyses and interpretation, at the University of Virginia’s School of Nursing. Jessica Howell is an assistant professor of economics at California State University, Sacramento. Elizabeth Merwin is associate dean at the University of Virginia’s School of Nursing. Steven N. Stern is a professor of economics at the University of Virginia. Sarah Turner is a professor of economics at the University of Virginia. Ishan Williams is an assistant professor at the University of Virginia’s School of Nursing. Melvin Wilson is a professor of psychology at the University of Virginia. The authors thank Michelle Bucci and Elizabeth Katz for their excellent research assistance. The data used in this article are available from the authors from August 2010 to July 2013.

↵1. Health professionals, along with their representation in the data analyzed, include physicians (33.2 percent), therapists (17.4 percent), dentists (14.3 percent), registered nurses (6.1 percent), pharmacists (5.1 percent), psychologists (4.6 percent), optometrists (4.1 percent), veterinarians (3.6 percent), biological scientists (3.6 percent), dieticians (2.0 percent), health technicians (1.5 percent), podiatrists (1.0 percent), and chiropractors (0.5 percent). Additionally, because we define health professional status as being in a health occupation and possessing a postbaccalaureate degree, there are a very small number of other health services technicians that are categorized as health professionals.

↵2. “Black” is used in the data set as the category for racial identification and will be used in this paper as the more inclusive term representing African Americans and other black individuals.

↵3. In the economics literature, Stinson and Thurston (2002) investigate the extent to which observed racial matching reflects patient preferences for medical services delivered by same-race physicians or physician choices in location and practice settings. While physician choices on such dimensions as location reduce the magnitude of racial concordance, Stinson and Thurston find that such effects are persistent in the data. In the medical literature, a number of studies, including Kaplan, Greenfield, and Ware (1989), Giron et al. (1998), Stewart (1995), and Ware and Davies (1983), present evidence on patient-physician racial concordance, and there are also empirical studies of racial and ethnic concordance between patients and mental health providers, substance abuse counselors, and medical students (McGinnis et al. 2006; Halliday-Boykins et al. 2005; Sterling et al. 2001). Based on this literature and a summary of empirical evidence on concordance by the U.S. Department of Health and Human Services (2006), we define health care professionals in this paper more broadly than just physicians.

↵4. The Court ruled against the undergraduate admissions policy at the University of Michigan in *Gratz v. Bollinger et al.* and supported the “narrowly tailored” use of race by the University of Michigan law school in *Grutter v. Bollinger et al.*

↵5. The Johnson administration’s call for “affirmative efforts to provide opportunities for black Americans,” combined with campus activism, led many leading colleges and universities to undertake active efforts to recruit black students to both graduate and undergraduate programs (Bowen and Bok 1998). Indeed, there were dramatic changes in the representation of black students at leading colleges and universities, with black representation in Ivy League institutions rising from 2.3 percent in 1967 to 6.3 percent in 1976 (Karen 1991). In addition, many medical schools explicitly endorsed the objective of increasing minority representation in the health professions, and the Association of American Medical Colleges (AAMC) endorsed this position in 1968.

↵6. The mid-1970s brought judicial scrutiny to efforts to increase the representation of minority students in medical schools through preferential admissions. A case involving the application of Allan Bakke to medical school at the University of California, Davis entered the legal system in 1974 and led to a landmark Supreme Court ruling in 1978. In a quite narrow ruling, the court held that admissions policies could not use a quota system or “set aside” places for minority students but that student race could be considered among other factors in circumstances where racial diversity could be thought to yield educational benefits (Bowen and Bok 1998).

↵7. Krueger, Rothstein, and Turner (2006) note that the black-white gap in the performance of 17-year-old students on the National Assessment of Educational Progress narrowed from over one standard deviation in 1970 to about three quarters of a standard deviation in reading (and a larger gap in math), though nearly all of the convergence occurred before 1990.

↵8. The nonconsecutive subscript numbering makes it easier to compare results from the basic model with the more complicated model that is presented later.

↵9. Note that *X*_{ji} ⊆ *X*_{i} and ; however, we do not have to assume that *X*_{ji} ∩ *X*_{ki} = Ø for *j* ≠ *k* (that is, the explanatory variables for each set can have common elements). Also, because we have assumed that there are no endogenous variables in *X*_{i}, we do not need the typical identification conditions that are usually satisfied by having, for each equation, at least one variable belonging to *X*_{i} having a zero restriction on the associated coefficient and not having zero restrictions in the other two equations for that variable.

↵10. We experimented to some degree with which latent variables to include in which equations. Using a Lagrange Multiplier test, we rejected null hypotheses limiting the inclusion of early choice latent variables in later choice equations. Thus, we present results throughout allowing for a full set of effects.

↵11. Note that α_{31} is not identified if *X*_{1i} ⊆ *X*_{3i}, which is the case given that we have a somewhat limited set of individual attributes in our data set. Similarly, α_{41} and α_{43} are not identified if *X*_{1i} ⊆ *X*_{4i} and *X*_{3i} ⊆ *X*_{4i}. Since we cannot separately identify these α’s from the β’s, we set α_{31} = α_{41} = α_{43} = 0 in estimation.

↵12. The asymptotic covariance matrix of the MLE can be estimated in the usual way as .

↵13. These decisions could have been alternatively modeled in a discrete choice dynamic programming framework, which involves specifying values of being an advanced degree health professional, of getting an advanced degree in a health field, of not getting an advanced degree in health, of finishing college, and of attending college. Such an approach would allow us to decompose the value of going to college and finishing college into a utility term and the value of later higher earnings and utility from having more education. Given the question posed in this paper, it is not clear that the benefit is worth all of the extra modeling. We feel that, as is frequently the case, a model like ours is a good first step in understanding the relevant issues prior to the investment in modeling associated with a discrete choice dynamic programming model. Additionally, a discrete choice dynamic programming framework might allow us to make some policy statements we otherwise would not be able to make. However, most of our results point to the importance of precollege events, which would not be part of the dynamic programming model, and our results show that blacks are less likely to become advanced degree health professionals, but they do not point to the reason why.

↵14. Bowen and Bok (1999) demonstrate that graduate degree completion in general, and completion of an M.D. in particular, is much higher among graduates of selective colleges and universities than among the overall pool of college graduates. Among graduates of the selective *College and Beyond* institutions, 56 percent of both blacks and whites went on to receive M.A., professional, or Ph.D. degrees; nationally, the share of college graduates completing advanced study is much lower, with 34 percent of blacks and 38 percent of whites receiving advanced degrees (Figure 4.2, Bowen and Bok).

↵15. An institution’s status as historically black may be especially important for our research question regarding black representation in the health professions. According to the Association of American Medical Colleges, the top three undergraduate institutions that send black students to medical school (in percentage terms) are Xavier, Howard, and Spelman, which are all HBCUs (http://www.aamc.org/data/facts/2005/mblack.htm).

↵16. While there is some variation in institutional selectivity (our measure of quality) among HBCUs, we observe very few individuals enrolling in the highest quality HBCUs, and it is not econometrically feasible to model quality variation in HBCUs.

↵17. Note that measures the quality of non-HBCU school one can attend, while measures the net value of an HBCU relative to a non-HBCU. By allowing to affect , we permit the quality of the non-HBCU one can attend to affect the *relative* value of attending an HBCU. Because is measuring something inherently different than , it leads to affecting but not vice versa.

↵18. For some individuals in our data, we observe that they enroll in a non-HBCU four-year college, but the identity of the institution is unknown. Appendix A2 also includes the way in which choice probabilities *P*_{2}, *P*_{4}, and *P*_{6} are affected by this missing information.

↵19. Due to substantial missing parental income data in the NLS-72, we use only parental educational attainment. Households with missing parental education information, consisting of 501 observations, were dropped from the sample.

↵20. Not all high school students take the SAT test; some opt for the ACT test or no college entrance exam at all. In addition, the NLS-72 survey respondents took a standardized test with sections on vocabulary, picture numbers (associative memory), reading, letter groups, mathematics, and mosaic comparisons. Using the scaled math scores and scaled reading scores, we employed regression analysis to generate imputed values for SAT score and then treated them as data.
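The imputation step described in note 20 can be sketched as follows. This is a minimal illustration, assuming an ordinary least squares regression of observed SAT scores on the scaled math and reading scores with an intercept; the exact specification used in the paper is not stated, and the function name `impute_sat` and all numbers below are hypothetical and are not NLS-72 data.

```python
import numpy as np

def impute_sat(math_score, reading_score, sat):
    """Fill missing SAT scores (NaN) with OLS predictions.

    Assumed specification (not confirmed by the source):
    SAT = b0 + b1 * math + b2 * reading + error.
    """
    math_score = np.asarray(math_score, dtype=float)
    reading_score = np.asarray(reading_score, dtype=float)
    sat = np.asarray(sat, dtype=float)
    observed = ~np.isnan(sat)
    # Design matrix with an intercept, the math score, and the reading score.
    X = np.column_stack([np.ones_like(math_score), math_score, reading_score])
    # Fit on respondents with an observed SAT score only.
    coef, *_ = np.linalg.lstsq(X[observed], sat[observed], rcond=None)
    imputed = sat.copy()
    # Observed scores are kept; only missing entries receive fitted values.
    imputed[~observed] = X[~observed] @ coef
    return imputed, coef

# Toy values invented for illustration only.
math_score = [40, 50, 60, 55, 45, 65]
reading_score = [42, 48, 61, 50, 47, 58]
sat = [800, 950, 1200, np.nan, 880, np.nan]
imputed, coef = impute_sat(math_score, reading_score, sat)
```

Treating the fitted values as data, as the note describes, ignores the sampling error in the imputation itself, so standard errors on the SAT coefficient may be somewhat understated under this kind of procedure.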

↵21. See Johnson and Neal (1998).

↵22. The omission of various interaction terms from the health care professional equation is due to small sample sizes among black health care professionals.

↵23. Covariance terms are identified by correlation in generalized residuals à la Gourieroux et al. (1987).

↵24. This result also appears when estimating the basic model with data from the National Education Longitudinal Study (NELS), which tracks the postsecondary choices of the high school class of 1992. Note that NELS is not suitable for estimating the complete model that includes the decision to enter a health profession with an advanced degree because NELS respondents are not followed through their career and graduate educational choices.

↵25. The omission of various interaction terms from the health care professional equation is due to small sample sizes among black health care professionals.

↵26. Focusing on the addition of HBCU choice to the model, attendance at an HBCU implies a large value of *u*_{2} and, since ρ_{24} > 0, the large value of *u*_{2} causes to be large. However, since ρ_{23} < 0, an individual requires an unusually large value of *u*_{3} in order to graduate, and decreases because ρ_{34} < 0. In essence, is proportional to , where λ(•) is the inverse Mills ratio and for *j* = 2, 3.

↵27. Conditioning on *y*_{ij−1} and *X*_{i} changes the distribution of *u*_{j−1} and therefore to be consistent with the condition that *y*_{ij−1}. This affects because directly affects and because *u*_{j−1} and *u*_{j} are correlated.

↵28. Consider a simple binary choice model, *y** = *X*β + *u* and *u* ∼ *iid F*, where we observe *y* = 1(*y** > 0). Then Pr(*y* = 1) = *F*(−*X*β) and . With interaction terms, the coefficient on black*male is not meaningful because one cannot go from *not* being a black male to being a black male without changing black, male, or both. Thus, the average marginal change associated with going from white male to black male is .

↵29. The seven probabilities listed in Section III have nested within them the choice of college type, which expands the total number of choices from seven to 22. For example, the educational/career paths available to individuals include: (1) Do not enter college, (2) Enter an HBCU, but do not complete a degree, (3) Enter a non-HBCU of level 5 selectivity, but do not complete a degree, (4) Enter a non-HBCU of level 5 selectivity, complete a degree, but do not become a health professional with an advanced degree, and so on.
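The point in note 28 about interaction terms can be illustrated with a short sketch. It assumes a probit link, so that Pr(*y* = 1) = Φ(*X*β) with Φ the standard normal CDF; this choice of link, the function names, and the argument `other_index` (each person's index contribution from all remaining covariates) are assumptions made for illustration and are not the paper's notation or estimates.

```python
from math import erf, sqrt
import numpy as np

def std_normal_cdf(z):
    """Standard normal CDF applied elementwise."""
    return np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in np.ravel(z)])

def avg_change_white_to_black_male(other_index, b_black, b_male, b_black_male):
    """Average change in Pr(y = 1) from switching black 0 -> 1 with male held at 1.

    The black and black*male terms must move together: a male cannot become
    black without the interaction term also switching from 0 to 1.
    """
    other_index = np.asarray(other_index, dtype=float)
    # Predicted probability for each person treated as a white male.
    p_white_male = std_normal_cdf(other_index + b_male)
    # Predicted probability for the same person treated as a black male.
    p_black_male = std_normal_cdf(other_index + b_male + b_black + b_black_male)
    return float(np.mean(p_black_male - p_white_male))
```

The effect is averaged over the sample values of the other covariates rather than read off the interaction coefficient, which is why the coefficient on black*male alone has no direct interpretation.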

↵30. Test statistics are reported for 16 out of the 22 educational/career paths due to insufficient variation in choice probabilities for six of the possible outcomes. The six paths omitted for this reason include paths that involve becoming a health professional with an advanced degree if the undergraduate college was an HBCU, a two-year institution, or a four-year non-HBCU of selectivity level 2 (the lowest selectivity for nonspecial four-year institutions), and any path that involves choosing a “special” four-year institution (level 1 non-HBCU).

↵31. Note that the standard trivariate normal density function has a covariance matrix Ω with diagonal elements equal to 1.

↵32. Note that an implication of Equation 1 is that Pr[*y*_{1i} = 0|*X*_{i}] = Pr[*y*_{1i} = 0|*X*]. Similar statements can be made about Equations 10 through 12 using Equations 1 through 3.

- Received August 2007.
- Accepted October 2008.