Abstract
We present the first large-scale international comparison of labor supply elasticities for 17 European countries and the United States using a harmonized empirical approach. We find that own-wage elasticities are relatively small and more uniform across countries than previously thought. Nonetheless, differences do exist, and they are found not to arise from different tax-benefit systems, wage/hour levels, or demographic compositions across countries, suggesting genuine differences in work preferences across countries. Furthermore, three other findings are consistent across countries: the extensive margin dominates the intensive margin; for singles, this leads to larger responses in low-income groups; and income elasticities are extremely small.
I. Introduction
The study of labor supply behavior continues to play an important role in policy analysis and economic research. In particular, the size and distribution of work hour and participation elasticities represent key information when evaluating tax-benefit policy reforms and their effect on tax revenue, employment, and redistribution. Several excellent surveys report evidence on elasticities for different countries and periods.1 However, the literature only reaches a consensus on certain aspects, establishing that own-wage elasticities are largest for married women and small or sometimes negative for men. In terms of magnitude, large variation in labor-supply elasticities is found in the literature, with little agreement among economists on the elasticity size that should be used in economic policy analyses (Fuchs et al. 1998). For instance, Blundell and MaCurdy (1999) report uncompensated wage elasticities ranging from –0.01 to 2.03 for married women while Evers et al. (2008) indicate huge variation in elasticity estimates. Admittedly, much of the variation across studies is due to different methodological choices, including the type of data used (tax register data or interview-based surveys), selection (for example households with or without children), the period of observation (see Heim 2007), and estimation method. Bargain and Peichl (2013) have collected empirical evidence focusing on 15 European countries and the United States. For each demographic group, they observe a large variance in estimates across all available studies, pointing to data year and estimation methods as the main sources of variation. The authors show that international comparisons based on existing evidence are generally imperfect and incomplete, with insufficient common support across studies to conclude about genuine differences in labor-supply responsiveness between countries. 
The only clear pattern in the literature is that elasticities are larger for women in countries where their participation rate is lower. However, estimates are missing or scarce for several E.U. countries and also some demographic groups, such as childless single individuals. Accordingly, this situation justifies a serious attempt to estimate labor-supply elasticities for a large number of Western countries in a comparable manner.
Beyond such differences in empirical methods, the following question remains: Do genuine differences that could be explained by different demographic compositions, tax-benefit systems, labor market conditions, and cultural backgrounds exist between countries? While consistent findings across a large number of countries could make some of the policy recommendations more broadly viable, inversely, contrasted results may explain different policy choices; for instance, different degrees of redistribution between welfare systems. The implicit cost of redistribution between European systems has recently received renewed attention (Immervoll et al. 2007) yet information on actual international differences in labor supply behavior was lacking. Another related question concerns whether participation decisions (the extensive margin) systematically prevail over responses in terms of work hours (the intensive margin). Indeed, this issue gives rise to the debate about whether welfare programs should be directed to the workless poor, through traditional demogrant policies, or the working poor, via in-work support (Saez 2001). Large participation responses may subsequently lead to large elasticities in the lower part of the income distribution, which is crucial for welfare analysis. (See Eissa et al. 2008.) Finally, the optimal taxation of couples, and notably the issue of joint versus individual taxation, critically relies on the knowledge of cross-wage elasticities of spouses (Immervoll et al. 2011). At present, empirical evidence on labor-supply responsiveness from an international perspective is virtually absent from the literature.2
The present paper attempts to fill this gap, providing the first set of comparable labor supply elasticity estimates for 17 E.U. countries and the United States. For this purpose, we adopt a harmonized approach that nets out possible measurement differences arising from data, periods, and methods. We benefit from a unique set of data with comparable variable definitions, estimating the same labor supply model for each country. To establish consistent cross-country comparisons, we rely on a structural discrete choice model.3 In this context, identification is usually obtained from the nonlinearity of the tax-benefit code. The present study has a complete simulation of all direct tax and transfer instruments for 18 countries at its disposal, so that we can fully exploit all nonlinearities and discontinuities in household budget constraints. In addition, we exploit some geographical (for example, across U.S. states) and time variation in tax-benefit policies for some of the countries, which allows us to estimate elasticities for all demographic groups including childless singles and individuals in couples; this makes the present study very comprehensive compared to existing studies, which typically focus on particular groups.
Our estimations are conducted on 25 representative microdata sets covering 18 countries and two years of data for seven countries. The data sets cover a relatively short time period (1998–2005), which facilitates cross-country comparison. We provide detailed estimates of own-wage elasticities for single individuals and individuals in couples, cross-wage elasticities for couples, and income elasticities for all groups. We analyze the distribution of elasticities across income groups and decompose labor-supply responses between intensive and extensive margins. Using a flexible random utility model admittedly renders our results immune to the risk of a systematic bias caused by restrictive assumptions on preferences. Nonetheless, we check whether elasticities vary with the form of the utility function, the way we introduce additional flexibility (fixed costs or mass points on certain part-time options), or the hour choice set (from four choices to a much finer discretization closer to a continuous model). The complete analysis is based on nine different specifications, three demographic groups, and 25 different countries × periods; hence, a total of 675 maximum likelihood (ML) estimations.
Our results show that own-wage elasticities, both compensated and uncompensated, are relatively small and much tighter across countries than suggested by results in the literature. In particular, estimates for married women lie in a narrow range between 0.2 and 0.6, with significantly larger elasticities obtained for countries in which female participation is lower (Greece, Spain, Ireland). Elasticities for married men, expectedly smaller, are even more concentrated while elasticities for single individuals show substantial variation with income levels. Consistent results are also found across countries with important implications for welfare and optimal tax analysis: the extensive margin systematically dominates the intensive margin; for single individuals, this contributes to larger elasticities in low-income groups in most countries; income elasticities are extremely small. The one area where differences remain concerns the cross-wage effects, consistent with substitution in spouses’ household production in Western Europe and complementarity in their leisure in the United States. Using a decomposition analysis, we rule out differences in tax policy, wage/hours levels, and demographics as explanations for cross-country differences in labor supply responses. Accordingly, our results are consistent with Western countries having genuinely different individual and social preferences—for example, different preferences for work and childcare institutions.
II. Common Empirical Approach
The principal object of examination in this study is the size of wage and income elasticities, which are standard representations of labor-supply responsiveness and particularly convenient in terms of conducting international comparisons. While the ideal methodological situation would be to use a generally agreed-upon standard estimation approach, there is no such consensus on this matter. We have opted for the estimation of discrete choice models. This approach is based on the concept of random utility maximization (see van Soest 1995; Hoynes 1996, among others), which requires the explicit parameterization of consumption-leisure preferences, for utility to be evaluated at each discrete alternative. It is not necessary to impose tangency conditions, and in principle the model is very general.4
Labor-supply decisions are reduced to choosing among a discrete set of possibilities—for example, inactivity, part-time, and full-time. In this way, both extensive and intensive margins are directly estimated; the complete effect of the tax-benefit system is easily accounted for, even in the presence of nonconvexities in budget sets; work costs, which also create nonconvexities; and joint decisions in couples are dealt with in a relatively straightforward manner.
Our methodological choice was guided by two considerations, the first of which involved the need to conduct consistent comparisons across many countries. The only realistic way of doing so was to estimate the same structural, discrete choice model separately for each country, which complies with our attempt to net out all methodological differences that hinder international comparison. The second key issue is the identification of behavioral parameters, with the main problem that unobserved characteristics (for example, being a hard-working person) may influence both wages and work preferences, potentially biasing estimates obtained from cross-sectional wage variation across individuals. In the traditional approach (for example, in MaCurdy et al. 1990), hours of work are regressed on the aftertax wage and on virtual income. The validity of the instrumental variable estimator hinges on whether the exclusion assumptions of the economic model hold.5 Therefore, a preferred approach consists of using policy changes to directly identify responses to exogenous variation in net wages. One may rely on a particular tax reform (see Eissa and Hoynes 2004, among others) or long-term variations (Blundell, Duncan, and Meghir 1998; Devereux 2004).6 In our approach, identification is mainly provided by nonlinearities, nonconvexities, and discontinuities in the budget constraint due to the tax-benefit rules of each country. Closer to the natural experiment method, some exogenous variation also stems from spatial and time variations in these rules, as discussed below.
A. Model and Identification
We opt for a flexible discrete choice model, as used in well-known contributions for Europe (van Soest 1995; Blundell et al. 2000) or the United States (Hoynes 1996; Keane and Moffitt 1998). We refer to these studies for more technical details, simply presenting the main aspects of the modeling strategy. In our baseline, we specify consumption-leisure preferences using a quadratic utility function with fixed costs. Accordingly, the deterministic utility of a couple i at each discrete choice j = 1, …,J can be written as:
$$
U_{ij} = \alpha_{ci} C_{ij} + \alpha_{cc} C_{ij}^2 + \alpha_{h_f i} H_{ij}^f + \alpha_{h_m i} H_{ij}^m + \alpha_{h_{ff}} (H_{ij}^f)^2 + \alpha_{h_{mm}} (H_{ij}^m)^2 + \alpha_{ch_f} C_{ij} H_{ij}^f + \alpha_{ch_m} C_{ij} H_{ij}^m + \alpha_{h_m h_f} H_{ij}^f H_{ij}^m - \eta_j^f \cdot 1(H_{ij}^f > 0) - \eta_j^m \cdot 1(H_{ij}^m > 0) \tag{1}
$$
with household consumption Cij and spouses’ work hours $H_{ij}^f$ and $H_{ij}^m$. The J choices for a couple correspond to all combinations of the spouses’ discrete hours (for singles, the model above is simplified to only one hour term Hij, and J is simply the number of discrete hour choices for this person). Coefficients on consumption and work hours are specified as:
$$
\alpha_{ci} = \alpha_c^0 + Z_i \alpha_c + u_i \tag{2}
$$
$$
\alpha_{h_f i} = \alpha_{h_f}^0 + Z_i \alpha_{h_f} \tag{3}
$$
$$
\alpha_{h_m i} = \alpha_{h_m}^0 + Z_i \alpha_{h_m} \tag{4}
$$
that is, they vary linearly with several taste shifters Zi (including a polynomial in age, the presence of children or dependent elders, and region). The term αci also incorporates unobserved heterogeneity, in the form of a normally distributed term ui, so that the model allows random taste variation and unrestricted substitution patterns between alternatives. The normality assumption is mainly made for convenience, and in principle could be replaced by a more flexible distribution (for instance, a discrete distribution with a finite number of mass points, see Hoynes 1996). The fit of the model is improved by the introduction of fixed costs of work, estimated as model parameters as in Callan, van Soest, and Walsh (2009) or Blundell et al. (2000). Fixed costs explain why there are very few observations with a small positive number of worked hours. These costs, denoted $\eta_j^k$ for k = f, m, are nonzero for positive hour choices and depend on observed characteristics (for example, the presence of young children).
As discussed above, this approach allows us to impose very few constraints on the model. In fact, there is nothing to impose in terms of leisure (see van Soest, Das, and Gong 2002). This is especially the case as the utility from leisure is not nonparametrically identified from fixed costs of work. For instance, only very few people work a short week, usually because of these costs, a pattern that could also be picked up by a flexible utility function. Furthermore, work need not be a source of disutility, as it is in textbook models, if staying at home is seen as a depressing activity; namely, fixed costs of work could be negative for some people. Hence, we do not attempt to interpret them literally (that is, as an income deflator); rather, we express them in utility metric. They may also pick up other, nonmonetary fixed costs of work, or account for international differences in institutional settings that are not explicitly modeled, for example, differences in childcare support in the form of subsidies or free childcare at school.7
The only restriction to our model is the imposition of increasing monotonicity in consumption, which seems a minimum consistency requirement for meaningful interpretation and policy analysis. Positive marginal utility of consumption is directly imposed as a constraint in the likelihood maximization.8 The potential restrictions due to the choice of this functional form are examined in Section III.D.
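As a minimal sketch of the specification described above, the following Python fragment evaluates a quadratic consumption-hours utility with fixed costs of work for a couple and checks the sign of the marginal utility of consumption at a set of choices. The class name, field names, and any coefficient values are hypothetical and chosen for illustration only; the check is a simple post hoc test, whereas the text imposes the constraint inside the likelihood maximization.

```python
from dataclasses import dataclass


@dataclass
class CoupleParams:
    """Hypothetical coefficients for one couple (illustrative names only)."""
    a_c: float     # consumption (household-specific in the text)
    a_cc: float    # consumption squared
    a_hf: float    # female hours
    a_hm: float    # male hours
    a_hff: float   # female hours squared
    a_hmm: float   # male hours squared
    a_chf: float   # consumption x female hours
    a_chm: float   # consumption x male hours
    a_hmhf: float  # female hours x male hours
    fc_f: float    # female fixed cost of work, in utility units
    fc_m: float    # male fixed cost of work, in utility units


def couple_utility(c, hf, hm, p):
    """Deterministic utility at one discrete choice (consumption c, hours hf, hm)."""
    u = (p.a_c * c + p.a_cc * c ** 2
         + p.a_hf * hf + p.a_hm * hm
         + p.a_hff * hf ** 2 + p.a_hmm * hm ** 2
         + p.a_chf * c * hf + p.a_chm * c * hm
         + p.a_hmhf * hf * hm)
    # Fixed costs apply only when the corresponding spouse works.
    if hf > 0:
        u -= p.fc_f
    if hm > 0:
        u -= p.fc_m
    return u


def marginal_utility_of_consumption(c, hf, hm, p):
    """Derivative of the quadratic utility with respect to consumption."""
    return p.a_c + 2.0 * p.a_cc * c + p.a_chf * hf + p.a_chm * hm


def is_monotone_in_consumption(choices, p):
    """True if marginal utility of consumption is positive at every (c, hf, hm)."""
    return all(marginal_utility_of_consumption(c, hf, hm, p) > 0
               for c, hf, hm in choices)
```

Because the fixed costs enter additively, they shift the utility of every working option without affecting the marginal utility of consumption, which is why the monotonicity check involves only the consumption and interaction coefficients.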
For each labor supply choice j, disposable income (equivalent to consumption in the present static framework) is calculated as a function
$$
C_{ij} = d(w_i^f H_{ij}^f, w_i^m H_{ij}^m, y_i, X_i) \tag{5}
$$
of female and male earnings, nonlabor income yi, and household characteristics Xi. The tax-benefit function d is simulated using calculators that we present in the next section. In the discrete choice approach, disposable income only needs to be assessed at certain points of the budget curve. Male and female wage rates $w_i^m$ and $w_i^f$ for each household i are calculated by dividing earnings by standardized work hours, rather than actual hours, in order to reduce the so-called division bias. We estimate a standard Heckman-corrected wage equation to predict wages. To further reduce the division bias, we predict wages for all observations, rather than only for nonworkers. (Note that we use the inverse Mills ratio in the prediction to account for the difference between the two groups.) The two-stage procedure, namely first estimating wage rates and subsequently using them in the labor supply estimation, is common practice. (See Creedy and Kalb 2005.)9 However, ignoring the wage prediction errors in a nonlinear labor-supply model would lead to inconsistent estimates of the structural parameters. We take these error terms explicitly into account in the labor-supply estimations, assuming, following van Soest (1995), that they are normally distributed.
The stochastic specification of the labor supply model is completed by independent and identically distributed error terms ϵij for each choice j = 1, …, J. That is, total utility at each alternative is written
$$
V_{ij} = U_{ij} + \epsilon_{ij} \tag{6}
$$
with Uij defined in Expression 1. Error terms are assumed to represent possible observational errors, optimization errors, or transitory situations. Assuming that they follow an extreme value type I (EV-I) distribution, the (conditional) probability for each household i of choosing a given alternative j has an explicit analytical solution:
$$
P_{ij} = \frac{\exp(U_{ij})}{\sum_{k=1}^{J} \exp(U_{ik})} \tag{7}
$$
The unconditional probability is obtained by integrating out the two disturbance terms—that is, preference unobserved heterogeneity and the wage error term, in the likelihood. In practice, this is achieved by averaging the conditional probability Pij over a large number of draws for these terms so the parameters can be estimated by simulated maximum likelihood. We proceed with simulated ML yet rely on Halton draws of these residuals.10
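The simulation step can be sketched as follows. This Python fragment computes conditional logit probabilities and averages them over Halton-based normal draws of the preference heterogeneity term. It is a simplified illustration rather than the estimation code used in the paper: utility is reduced to a linear form, only the preference term ui is integrated out (the wage error would require a second Halton dimension, typically with a different prime base), and all function names and arguments are hypothetical.

```python
import math
from statistics import NormalDist


def halton(n, base=2):
    """First n points of the one-dimensional Halton sequence in the given base."""
    points = []
    for i in range(1, n + 1):
        f, x, k = 1.0, 0.0, i
        while k > 0:
            f /= base
            x += f * (k % base)
            k //= base
        points.append(x)
    return points


def logit_probs(utilities):
    """Conditional logit probabilities implied by EV-I errors."""
    top = max(utilities)                      # subtract max for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulated_choice_prob(chosen, consumption, hours,
                          a_c_mean, sigma_u, a_h, n_draws=50):
    """Average the conditional probability of `chosen` over draws of u_i."""
    normal = NormalDist()
    total = 0.0
    for q in halton(n_draws):
        u_i = sigma_u * normal.inv_cdf(q)     # map uniform Halton point to N(0, sigma_u^2)
        utils = [(a_c_mean + u_i) * c + a_h * h
                 for c, h in zip(consumption, hours)]
        total += logit_probs(utils)[chosen]
    return total / n_draws
```

The simulated log-likelihood is then the sum, over households, of the log of this averaged probability evaluated at the observed choice. Halton points cover the unit interval more evenly than pseudo-random draws, so fewer draws are usually needed for a given simulation accuracy.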
Identification. The model accounts for the comprehensive effect of tax-benefit policies on household budgets, with nonlinearities and discontinuities from tax-benefit rules providing a standard source of identification for models estimated on cross-sectional data. (See van Soest 1995; Blundell et al. 2000.) Specifically, individuals with the same gross wage usually receive different net wages. Indeed, given that they are characterized by different circumstances Xi (different marital status, age, family compositions, home-ownership status, disability status) or levels of nonlabor income yi, their effective tax schedules differ; that is, they face different actual marginal tax rates or benefit withdrawal rates.11
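The identification argument can be illustrated with a toy budget constraint. In the Python sketch below, the tax-benefit rule is entirely hypothetical (the allowance, brackets, benefit amount, and withdrawal rate are invented and correspond to no actual country); it only shows how households with the same gross wage and hours can face different effective marginal tax rates because of differences in children or nonlabor income.

```python
def disposable_income(gross_earnings, nonlabor_income, n_children):
    """Hypothetical weekly tax-benefit rule; all rates and thresholds are made up."""
    allowance = 100.0
    taxable = max(0.0, gross_earnings + nonlabor_income - allowance)
    # Two-bracket income tax: 20 percent up to 300 of taxable income, 40 percent above.
    tax = 0.2 * min(taxable, 300.0) + 0.4 * max(0.0, taxable - 300.0)
    # Means-tested child benefit, withdrawn at 50 cents per unit of earnings.
    benefit = max(0.0, 60.0 * n_children - 0.5 * gross_earnings)
    return gross_earnings + nonlabor_income - tax + benefit


def effective_marginal_rate(gross_wage, hours, nonlabor_income, n_children, dh=1.0):
    """Share of the gross pay from dh extra hours that is taxed or withdrawn."""
    base = disposable_income(gross_wage * hours, nonlabor_income, n_children)
    more = disposable_income(gross_wage * (hours + dh), nonlabor_income, n_children)
    return 1.0 - (more - base) / (gross_wage * dh)
```

Under these invented rules, three households earning a gross wage of 10 and working 20 hours face different rates: a childless household with no other income pays only the lower tax bracket, a household with two children additionally loses benefit as earnings rise, and a household with substantial nonlabor income is pushed into the higher bracket. This variation in net wages at a given gross wage is the kind of variation the model exploits.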
In addition, regional variation in tax-benefit rules generates additional exogenous variation, and can be identified in our data and policy simulations for many countries. For the United States, variation across states in income tax and EITC is a well-known source of variation. (See Eissa and Hoynes 2004; Hoynes 1996.) For E.U. member states, local variation in housing benefit rules can be identified for some countries in our samples/simulations (for instance, variations across “départements” in France or municipalities in Finland). In Estonia, Hungary, and Poland, local governments provide different supplements to almost all benefits, including child benefits/allowances and social assistance. For Germany and Italy, regional variation in benefit rules also exists and is accounted for. Nordic countries operate national and local income taxation, which we account for in the case of Sweden and Finland (with municipal flat tax rates varying from 16–21 percent in Finland and 29–36 percent in Sweden). In the United Kingdom, the council tax varies between the four main regions. Local taxes on dwelling vary with Belgian regions. Regional variations in church tax rates are significant in Finland and Germany while social insurance contributions can vary by region (for example, in Germany).12
Finally, we have two years of data for seven countries. The three-year interval between the two corresponding tax-benefit systems, 1998 and 2001, covers a period in which significant tax-benefit reforms took place. We discuss and explore this additional source of exogenous variation in Section IV.C.
Elasticities. While labor-supply elasticities cannot be derived analytically in the present nonlinear model, they can be calculated by numerical simulations using the estimated model. For wage (income) elasticities, we simply predict the change in average work hours and participation rates following a marginal uniform increase in wage rates (nonlabor income). We have checked that results are similar when wage elasticities are calculated by simulating either a 1 percent or a 10 percent increase in gross wages (unearned incomes). For income elasticities, we give a marginal amount of capital income to households with zero capital income in order to include them in the calculation. For couples, cross-wage elasticities are obtained by simulating changes in female (male) hours when male (female) wage rates are increased. Standard errors are obtained by repeated random draws of the model parameters from their estimated distributions and recalculating elasticities for each draw.
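The numerical procedure described above can be sketched in a few lines of Python. The example assumes a single-person model with a flat tax and invented preference coefficients, so the numbers it produces have no empirical meaning; it only shows the mechanics of predicting mean hours and participation, raising all wages by a small uniform percentage, and taking the ratio of relative changes.

```python
import math

HOURS = [0, 10, 20, 30, 40, 50, 60]  # weekly hours, seven-point choice set


def choice_probs(wage, nonlabor, tax_rate=0.3):
    """Logit choice probabilities for one single person (hypothetical preferences)."""
    utils = []
    for h in HOURS:
        c = (1.0 - tax_rate) * wage * h + nonlabor   # toy flat-tax budget
        u = 0.02 * c - 0.01 * h - 0.0005 * h * h     # made-up coefficients
        if h > 0:
            u -= 0.5                                 # fixed cost of work
        utils.append(u)
    top = max(utils)
    weights = [math.exp(u - top) for u in utils]
    total = sum(weights)
    return [w / total for w in weights]


def mean_hours_and_participation(sample):
    """Average expected hours and participation over (wage, nonlabor) pairs."""
    hours = part = 0.0
    for wage, nonlabor in sample:
        probs = choice_probs(wage, nonlabor)
        hours += sum(p * h for p, h in zip(probs, HOURS))
        part += 1.0 - probs[0]
    n = len(sample)
    return hours / n, part / n


def own_wage_elasticities(sample, pct=0.01):
    """Hours and participation elasticities from a uniform wage increase of pct."""
    h0, p0 = mean_hours_and_participation(sample)
    bumped = [(w * (1.0 + pct), y) for w, y in sample]
    h1, p1 = mean_hours_and_participation(bumped)
    return (h1 - h0) / h0 / pct, (p1 - p0) / p0 / pct
```

Calling `own_wage_elasticities(sample, pct=0.10)` instead of the default reproduces the kind of robustness check mentioned in the text. Standard errors could be obtained in the same spirit by redrawing the preference coefficients from their estimated distribution and recomputing the elasticities for each draw.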
B. Data, Selection, and Tax-Benefit Simulations
We focus on the United States, the E.U. 15 member states (except Luxembourg), and three new member states (NMS), namely Estonia, Hungary, and Poland.13 For each country, we draw information about incomes and demographics that can be used for detailed tax-benefit simulations and labor-supply estimations from standard household surveys (data sources are specified in Appendix 1). For the EU-15, the data sets have been assembled within the framework of the EUROMOD project (see Sutherland 2007) and combined with tax-benefit simulations for either 1998, 2001, or both. When available we use both data years for a country. For the NMS, data were collected only for 2005, and policies simulated for that year, in a more recent development of the EUROMOD project.14 For the United States, we use the 2006 Current Population Survey (IPUMS-CPS), which contains information for 2005. Data sets have been harmonized within the EUROMOD project, in the sense that similar income concepts are used together with comparable variable definitions (for example for education). We explain this in more detail in Appendix 1 and, for the wage estimation, in Appendix 2. For each country, we extract three samples for the purpose of labor supply estimations: couples, single men, and women (which include single mothers). We only retain households where adults are aged between 18 and 59, available for the labor market (not disabled, retired, or in education), and we also exclude self-employed, farmers, and “extreme” situations, including very large families and those who report implausibly high levels of working hours.
Tax-benefit Simulations. For each discrete choice j and each household i, disposable income Cij is obtained by aggregating all sources of household income and calculating benefits received and taxes and social contributions paid. We cover all direct taxes (labor and capital income taxes), social security contributions, family, and social transfers. These tax-benefit calculations, represented by function d() in Expression 5, are performed using tax-benefit simulators together with information on income and sociodemographics Xi (for instance, the children composition affecting benefit payments), as previously indicated. For Europe, we use EUROMOD, a calculator designed to simulate the redistributive systems of all the EU-15 countries and of some of the NMS, which includes simulation of all direct taxes, payroll tax (social security contributions), social, and family benefits. An introduction to EUROMOD, a descriptive analysis of taxes and transfers in the European Union and robustness checks is provided by Sutherland (2007). EUROMOD has been used in several empirical studies, notably in the comparison of European welfare regimes by Immervoll et al. (2007, 2011). For the United States, calculations of direct taxes, contributions, and tax credits (EITC) are conducted using TAXSIM (version v9), the NBER calculator presented in Feenberg and Coutts (1993), augmented by simulations of social transfers (TANF, Food Stamp). Tax-benefit simulations for the United States are used in combination with CPS data in several applications (for example, Eissa, Kleven, and Kreiner 2008).15 We assume full benefit takeup and tax compliance. More refined estimations accounting for the stigma of welfare program participation—as, for example, in Keane and Moffitt (1998) or Blundell et al. (2000)—would require precise data information on the actual receipt of benefits, which is not always available or reliable in interview-based surveys. 
Chan (2013) has recently suggested an extension of Keane and Moffitt (1998) to a dynamic discrete-choice model of labor supply with welfare participation as well as time limits and work requirements. While these features are important in the U.S. context, they are less so for European countries.
Statistics. Descriptive statistics of the selected samples are presented in Appendix 1. For married women, mean worked hours show considerable variation across countries, which is essentially due to lower labor market participation in southern countries (with the noticeable exception of Portugal), Ireland and, to a lesser extent, Austria and Poland. While the correlation between mean hours and participation rates is 0.92, there is some variation in work hours among participants, with shorter work duration in Austria, Germany, Ireland, the Netherlands, and the United Kingdom. The participation of single women is lower in Ireland and the United Kingdom due to the larger frequency of single mothers. (The average number of children among single women is highest in these two countries and Poland.) There is much less variation for men, with the main notable fact being a lower participation rate for single compared to married men. Moreover, the variation in wage rates and demographic composition across countries is also noteworthy. Especially for married women, participation rates are correlated with wage rates (corr = 0.36) and the number of children (–0.61). Attached to these patterns, there may be interesting differences across countries in the responsiveness of labor supply to wages and income, and we turn to this central issue in the next sections. In Appendix 1, we take a closer look at the distribution of actual worked hours. For men, this shows the strong concentration of work hours around full time (35–44 hours per week) and nonparticipation. There is more variation for women, particularly with the availability of part-time work in some countries: A peak at 15–24 hours can be seen in Belgium or 25–34 hours in France, where some firms offer a 3/4 of a full-time contract, while the Netherlands shows high concentration in these two segments. 
The United States is characterized by a particularly concentrated distribution, around full-time and inactivity, and a relatively high rate of overtime. To accommodate the particular hours distribution of each country while maintaining a comparable framework, we suggest a baseline estimation using a seven-point discretization—that is, J = 7 for singles and J = 7 × 7 for couples, with choices from 0 to 60 hours/week (steps of ten hours). Below, we check the sensitivity of our results to alternative choice sets.
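For concreteness, the baseline choice set can be enumerated as in the short Python sketch below. The helper names are hypothetical, and the rule that assigns observed hours to the nearest discrete option is an assumption made for illustration; the text does not state how observed hours are mapped to choices.

```python
from itertools import product


def hour_choices(max_hours=60, step=10):
    """Discrete weekly hours options from 0 to max_hours in equal steps."""
    return list(range(0, max_hours + 1, step))


def nearest_choice(observed_hours, choices):
    """Assign observed hours to the closest option (assumed rounding rule)."""
    return min(choices, key=lambda h: abs(h - observed_hours))


single_set = hour_choices()                      # J = 7 options for singles
couple_set = list(product(single_set, repeat=2)) # J = 7 x 7 options for couples
```

A coarser or finer discretization, as used in the sensitivity checks, only requires changing `step`; the number of alternatives for couples grows with the square of the number of options per spouse.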
III. Results
Before presenting and discussing a large set of results regarding elasticities, we comment on the model estimation and how it fits the data.
A. Estimates and Goodness-of-Fit
Labor supply estimations are conducted for each country separately yet with the same specification (except the “region” variable, which is country-specific). Estimations are also carried out for couples, single men, and single women separately, with the results reported in Appendix 3. In summary, parameter estimates are broadly in line with usual findings. For instance, as expected, the presence of children significantly decreases the propensity to work for women, both in couples and among single mothers, in most countries. Taste shifters related to age are often significant for women in couples yet not systematically for other demographic groups. The constant of the cost of work is significantly positive for all groups while the presence of young children most often has a significantly positive impact on the work cost of women. For single men and women, higher education leads to lower costs, which can be interpreted as demand-side constraints in the form of lower search costs. (See van Soest, Das, and Gong 2002.) We cannot directly compare preferences across countries, given the large number of model parameters. While a simpler model (for instance, an LES specification) would allow us to do so, it would certainly be too restrictive. Hence, we directly focus on the comparison of labor supply elasticities in the next subsection.
The log-likelihood and the pseudo coefficient of determination (R2), reported with the estimates in Appendix 3, indicate that the fit is reasonably good: the pseudo-R2 is 0.31 on average for couples (0.28 for singles), ranging from 0.23 for the United Kingdom to 0.45 for Poland (from 0.14 to 0.40 for singles). For couples, Table 1 shows that mean predicted hours compare well with those observed, with the discrepancy less than 1 percent in most cases. There are some exceptions, with larger discrepancies for women in Portugal, Greece, and Spain. For the two latter countries, we report the distribution of observed and predicted frequencies for each choice underneath Table 1. We use a four-choice model for ease of reading, reporting the 16 combinations (Hf, Hm) for couples. We can see that the option (40, 40) is slightly underpredicted, while option (0, 40) is overpredicted. However, the overall distributions of observed and predicted hours compare relatively well even for these countries. For all countries, we have checked that satisfactory comparisons at the mean do not hide poorly predicted hour distributions. As an illustration of this, we report two additional graphs in cases where mean hours are correctly predicted (France and the Netherlands), confirming that the underlying distributions of predicted and observed choices are also well in line. Indeed, these same conclusions are obtained for the model with J = 7 choices. For single individuals, mean predicted and observed hours compare well for many countries, as shown in Table 2. However, the fit is not as good as for couples, which is a typical result in the literature (Blundell and MaCurdy 1999).16 The discrepancy is less than 5 percent in almost all cases. For three cases with the largest discrepancies (Belgian women 1998, Irish men 1998, and Portuguese women 2001), we present the hour distributions underneath Table 2 (baseline situation with seven choices).
Differences are generally due to bad predictions in terms of participation, as is the case for Irish single men (Portuguese single women), where nonparticipation is over(under)predicted. It is also due to the model not being able to reproduce the hours distribution for the workers well. This is the case for Belgian women, for whom participation rates are well predicted yet part-time options are overpredicted at the expense of full-time. This is also the case when the overall fit is good, such as in the case of French men 2001 reported in our illustration.17 The overall conclusion is that the model performs relatively well, which provides reassurance regarding the reliability of our elasticity measures.
Table 1. Predicted and Observed Mean Hours: Couples
Table 2. Predicted and Observed Mean Hours: Singles
B. The Size of Own-Wage Elasticities
Our main results on labor supply elasticities are illustrated in the graphs below, and are also reported in the Appendix 4 tables, which contain detailed own-wage hour elasticities, compensated and uncompensated, overall, and for quintiles of disposable income.18 We start with own-wage elasticities, as reported in Figure 1.19
Figure 1. Own-Wage Elasticities: Total Hours
Results for Married Individuals. We first focus on married women, the group most studied in the literature.20 Total hour elasticities fall within a very narrow range of 0.2–0.4 for several countries (Austria, Belgium, Denmark, Germany, Italy, and the Netherlands), while they are slightly smaller, around 0.1–0.2, yet significantly different from zero, in France, Finland, Portugal, Sweden, the NMS, the United Kingdom, and the United States. Furthermore, they are significantly larger, between 0.4 and 0.6, in Ireland (1998), Greece, and Spain. Accordingly, our results show that elasticities are relatively modest and hold in a narrow interval once comparable data sets, selection, and empirical strategies are used.21 However, estimates are sufficiently precise so that differences between the three aforementioned groups of countries are statistically significant. Over all countries and periods, the mean hour elasticity is 0.27 with a standard deviation of 0.16. The simple intuition that elasticities are larger when female participation is lower is broadly confirmed by the data; that is, the cross-country correlation between mean wage hour (participation) elasticities and mean worked hours (participation rates) is around –0.81 (–0.84). In Tables A8–A9, we show that elasticities are only slightly larger for women with children. They are significantly larger in a few countries and notably in the high-elasticity group (Greece, Spain, and Ireland 1998).22 For married men, results are even more compressed, with own-wage elasticities usually ranging between around 0.05 and 0.15 (see Figure 1). Over all countries/periods, the mean hour elasticity is 0.10, with a standard deviation of 0.05. Estimates are precise enough to detect statistically significant differences across some countries, yet these differences are less pronounced than for women. The correlation between elasticities and worked hours (participation) is around –0.41 (–0.64).
Compared to some of the older literature, we find total hour elasticities that are significantly larger than zero. However, as discussed below, pure intensive margin elasticities are very close to zero.
Results for Single Individuals. Numerous studies examine the labor supply of single mothers in the United Kingdom and the United States. By contrast, and despite the large increase in the number of childless single individuals over the last few decades, the labor-supply behavior of single women and men has received relatively little attention. The main reason is probably that most of the policy reforms used to estimate labor supply responses in the United States and the United Kingdom concerned families with children. The present study therefore adds valuable information to the literature by providing new estimates for all three groups and many countries. As seen in Figure 1, elasticities for single men show a little more variation than for married men, usually in a range between 0 and 0.4. They are significantly different from zero in most cases, with some exceptions. Overall, estimates are slightly larger than for married men, which is in line with lower participation rates and attachment to the labor market among young single individuals. This is particularly the case in Spain and Ireland, where estimates are significantly larger than in other countries. The number of single men with children is marginal, so we do not discuss this group separately. We observe some variation among single women (mean estimates for the pooled childless women and single mothers), usually between 0.1 and 0.5 with larger elasticities for some countries (around 0.6–0.7 in Belgium and Italy). Single mothers tend to have larger elasticities than childless women, yet differences are usually not significant (with the notable exceptions of Greece and Ireland).23 The correlation between elasticities and worked hours (participation) among single individuals is usually smaller than for couples: –0.50 (–0.50) for women and –0.32 (–0.46) for men.
C. International Comparisons
We have established that international differences in the magnitude of wage elasticities are modest provided comparable data sets, selection, and a common empirical approach are used. This is an interesting result, given the substantial differences across countries in terms of labor market conditions, institutions, and preferences/culture. Nonetheless, we have found significant differences between broad groups of countries, as discussed above, which we investigate more thoroughly in Section IV. We now focus on interesting regularities and salient differences between countries.
Extensive versus Intensive Margins. In Figure 2, we decompose total hour elasticities (that is, changes in total work hours due to a marginal wage increase) into hour changes among workers (intensive margin) and hour changes due to participation responses (extensive margin), and we clearly see that most of the response is driven by the extensive margin. This result is important for tax and welfare analyses, as motivated in the introduction. The literature has documented this for a few countries (see Heckman 1993 for the United States; Bargain and Peichl 2013 for many countries).24 However, our results show that this pattern holds almost systematically across many Western countries and for all demographic groups. Even in the rare situations where the intensive margin is nonzero, the extensive margin is larger (for example, for Dutch married women). For singles, the largest participation responses come from low-income groups, as discussed in further detail below.
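One way to formalize this split, sketched here under the assumption that expected hours are the product of the participation probability and hours conditional on working, is the following identity. The symbols $P$ and $H^{+}$ are introduced only for this illustration and are not the paper's notation.

```latex
E[H] = P \cdot H^{+}, \qquad P = \Pr(H > 0), \qquad H^{+} = E[H \mid H > 0]

\underbrace{\frac{\partial \ln E[H]}{\partial \ln w}}_{\text{total}}
  = \underbrace{\frac{\partial \ln P}{\partial \ln w}}_{\text{extensive}}
  + \underbrace{\frac{\partial \ln H^{+}}{\partial \ln w}}_{\text{intensive}}
```

Under this reading, the intensive margin is the difference between the total and the extensive terms. The exact computation behind Figure 2 may differ in detail.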
Figure 2. Own-Wage Elasticities: Intensive Versus Extensive Margins
The intensive elasticities are extremely small for all countries and all demographic groups: for example, they are lower than 0.08 for married women in all countries (except the Netherlands). Intensive margin elasticities are sometimes negative for men in couples (for example, in the United Kingdom), single men (for example, Belgium, Portugal, and Ireland 1998), and single women (Denmark). Small responses at the intensive margin are mainly due to the few possibilities of working part-time in most countries. Among exceptions where responses are significant, the extreme case is married women in the Netherlands, with an intensive margin representing almost half of the response. We conjecture that this is due to the outstanding role of part-time work in this country and the possibility of adjusting labor supply along this margin. (On average, around 25 percent of prime-age working women work part-time in the OECD, whereas around 50 percent do in the Netherlands; compare Table A2 and the discussion in the data section.) Supply-side interpretations of hour restrictions (for example, in terms of job search) are discussed in the robustness checks below and the concluding section.
Distribution of Own-wage Elasticities by Income Groups. In the tables of Appendix 4, we provide the distribution of own-wage elasticities of total hours by quintiles of the income distribution (with quintiles defined for couples and singles separately). This information is represented graphically in Figure 3, with a box-plot showing the cross-country dispersion for each quintile. In Figure 4, we show the detailed distribution of elasticities across quintiles, separately for each country. The first striking result is that there is much more variation than when only considering mean elasticities. For all groups except married men, elasticities for some income quintiles can go up to one. More precisely, for single individuals, the distribution of elasticities across income groups shows a clearly decreasing pattern, with the largest elasticities for lower quintiles. The fact that elasticities may be very heterogeneous across different earning groups, and that participation elasticities can be significantly larger at the bottom of the distribution, is crucial for welfare analysis. (See Eissa, Kleven, and Kreiner 2008; Saez 2001.) However, very few studies report this kind of information.25 Our results generalize this evidence, and show that participation elasticities indeed drive the large responses in lower quintiles for single individuals.
Figure 3. Own-Wage Elasticities by Income Quintile (Box Plots Over All Countries)
Figure 4. Wage Elasticities by Income Quintile
Results for married women do not show such a pattern and in fact point to larger elasticities at the top; Eissa (1995) finds similar results for the United States. This is consistent with the added worker theory (see Blundell, Pistaferri, and Saporta-Eksten 2012), namely that women in poor households must supplement family income while the labor supply of those in wealthier families is sensitive to financial incentives. For married men, our results show a flat or decreasing pattern, closer to that of singles, although there are some exceptions (that is, an increasing pattern in France, Italy, Spain, and the United Kingdom). Results are usually not driven by a decreasing intensive margin, but, again, rather by the participation margin. In fact, for some countries like the United States, elasticities decrease with income along the extensive margin while the intensive margin (the difference between total and extensive effects in Figure 4) seems to increase with income. This is in line with the elasticity of taxable income literature, which reports more responses at the top (admittedly due to margins not accounted for here, yet also to more adjustment possibilities for top earners). Other countries (for example, the United Kingdom) show intensive elasticities becoming negative for higher incomes, more in line with backward-bending labor supply curves.
Cross-wage Elasticities. Perhaps the most interesting difference across countries is the measure of cross-wage elasticities within couples, with estimates of uncompensated elasticities plotted with confidence intervals in the left-hand graph of Figure 5 and reported in the tables of Appendix 4. While these are usually negative and smaller in absolute value than own-wage elasticities, they are nonetheless sizeable for women in some countries, including Austria, Denmark, Germany, and Ireland, which is not an unusual result. (See, for example, Callan, van Soest, and Walsh 2009.) Cross-wage elasticities are much smaller (in absolute terms) for men, between –0.05 and 0 in most countries. Income effects being small, compensated cross-wage elasticities are close to uncompensated ones. We plot compensated elasticities for both men and women in the right-hand graph of Figure 5, in order to easily check the complementarity or substitution between spouses’ working hours. With sufficient complementarity, an increase in one spouse’s wage must increase both spouses’ working hours, that is, cross-wage elasticities are positive. Interestingly, this situation seems to characterize the United States. (Elasticities are small but significant.) It sounds reasonable that spouses enjoy spending time together, and all the more so as free time is relatively more scarce than in Europe and more likely to coincide with pure leisure. An alternative explanation could be higher assortative mating on productivity levels (compared to Europe). However, recent evidence in an intertemporal framework by Blundell, Pistaferri, and Saporta-Eksten (2012) tends to support the former explanation. By contrast, our results point to substitutability between male and female working hours in most European countries. This is consistent with, yet not exclusively explained by, the fact that nonmarket time of European couples is more often associated with household production. (See Freeman and Schettkat 2005.)
Four countries show an apparently asymmetrical situation. In fact, only the female cross-wage elasticity is positive in Poland and Hungary. (Male elasticity is not significantly different from zero.) For Spain 2001 and Italy, cross-wage elasticities are negative for men and positive for women (a similar result exists for low income groups in Aaberge, Colombino, and Wennemo 2002), yet female elasticities are not significantly different from zero. Finally, note that a large literature has attempted to test restrictions of the unitary or collective household models. (For instance, Browning and Chiappori 1998 reject Slutsky conditions on data for couples but not for singles.) Using confidence intervals for compensated cross-wage elasticities (not reported), we find rather similar patterns as for uncompensated elasticities. For the majority of countries, we cannot reject Slutsky symmetry and, hence, the unitary model.
Figure 5. Cross-Wage Elasticities
Income Elasticities. Income elasticities are plotted in Figure 6 and reported in the tables of Appendix 4.26 As is often the case in the labor supply literature, income elasticities are very close to zero and negative for a majority of countries. (See Blundell and MaCurdy 1999; insignificant income effects are also found in the literature on taxable income elasticities, see Saez, Slemrod, and Giertz 2012.) They are positive for some countries yet rarely significant in this case, with the main exceptions being Finland and Sweden.27 Considering the estimates more closely, we find that this result is driven by singles without children, located in the lowest income quintiles, and responding along the participation margin. A plausible explanation is that Nordic countries are characterized by stricter asset tests for social assistance than other E.U. countries. (See Eardley et al. 1996.) Hence, cross-sectional variation may capture the fact that those among the least productive singles in Nordic countries with nonlabor income are more likely to work, given that they are not eligible for welfare.
Figure 6. Income Elasticities
Finally, let us make a few remarks. First, the literature on optimal taxation usually assumes income effects to be zero in order to simplify the derivation of optimal tax rules. (See Saez 2001.) Our results tend to support this assumption. Second, one might ask “what is small?” For comparison, own-wage elasticities for women are computed with a 1 percent wage increment that corresponds, in additional weekly income, to between 2 and 15 times (across countries, on average) the increment in weekly nonlabor income used for income elasticity calculation. Third, for couples, male and female income elasticities are very similar (this is not directly visible from the graphs), although exceptions include Italy, Spain, and France. When ignoring Italy, where male income elasticities are very negative, the correlation between married men and women’s income elasticities is 0.79.
D. Sensitivity Checks
We conduct an extensive sensitivity analysis, focusing on married women, which is the main group studied in the recent literature.
Improving Identification: Policy Reforms. As previously discussed, identification is often improved by pooling several years of data in order to exploit exogenous variation in net wages stemming from policy reforms. For seven countries, we have two years of data at our disposal, 1998 and 2001. Indeed, the three-year interval coincides with significant reforms in these countries, including tax credit reforms in the United Kingdom (1999), France, and Belgium (2001), significant changes in income tax schedules in Germany, Spain, and Ireland, and several changes in transfers. A very detailed review of these policy changes is provided in Appendix 5. We reestimate the labor supply model for each country by pooling the two years of data and assuming stable preferences over the period, with results plotted in Figure 7 and reported in Appendix Table A12. The important point is that the overall picture does not change. For 11 of the 14 country × year observations, results are essentially unchanged compared to baseline estimates. However, for France 1998 and Spain 1998, elasticities are now smaller and more similar to those of 2001, confirming that France (Spain) is placed in the group of countries with low (high) elasticities. For Ireland 2001, the elasticity is now more similar to the 1998 estimate, placing this country in the high-elasticity group.
Figure 7. Pooling Years to Improve Identification
Specification Check. We have argued that models with discrete choices are very general, given that they do not require imposing much constraint on preferences and allow accounting for complete tax-benefit policies affecting household budgets. Nonetheless, as discussed in Section II, we may check whether our estimates are sensitive to several crucial aspects of the model specification. Results of these extensive robustness checks are provided in Appendix Table A13. The first row of each panel in this table corresponds to the baseline, namely a seven-choice model with quadratic utility and fixed costs, whereby elasticities are obtained by averaging expected hours over all observations (frequency method).
Firstly, results are not sensitive to the way that we calculate elasticities. (See discussion in Appendix 5 of frequency versus calibration methods.) Secondly, and more importantly, we check whether the main restriction of the model plays a role, that is, the fact that the choice set is discretized. The fourth and fifth rows of each panel in Table A13 report elasticities when alternative choice sets are used, namely discretizations with 4 and 13 hour choices. The model with J = 4 choices for singles (4 × 4 = 16 for couples) essentially captures the commonly agreed durations of work: nonparticipation (0), part-time (20), full-time (40), and overtime (50 hours/week). However, such a model does not adapt particularly well to the hour distribution of each country. The narrower discretization with 13 choices, from 0 to 60 hours/week with a step of five hours, and 13 × 13 = 169 combinations for couples, is more computationally demanding. However, it may pick up more country-specific peaks in hour distributions and, in fact, brings the model closer to a continuous one. Interestingly, Table A13 shows that results are very similar in all three cases (J = 4, 7, and 13), with only slightly larger elasticities observed in the four-point case for some countries (for example, Belgium and Ireland).
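The mechanics of this check can be illustrated with a minimal sketch: a conditional-logit choice over a discrete hour grid, with an elasticity obtained by comparing expected hours before and after a 1 percent wage increase. Everything specific here is an assumption made for illustration: the utility parameters, the single representative individual (the paper averages over all observations), the budget constraint without any tax-benefit system, and the 7-point grid from 0 to 60 hours in steps of 10.

```python
import math

def choice_probs(wage, hours_grid, a_c=2.0, a_cc=-0.1, a_hh=-0.25, fixed_cost=1.5):
    """Conditional-logit probabilities over a discrete hour grid.

    Utility is quadratic in income and hours, minus a fixed cost of working.
    All parameter values are hypothetical, not estimates from the paper.
    """
    utils = []
    for h in hours_grid:
        c = (wage * h + 100.0) / 100.0  # weekly income in hundreds; no taxes or benefits
        x = h / 10.0                    # hours in tens, to keep magnitudes comparable
        u = a_c * c + a_cc * c ** 2 + a_hh * x ** 2
        if h > 0:
            u -= fixed_cost             # fixed cost applies only when working
        utils.append(u)
    m = max(utils)                      # subtract the max for numerical stability
    expu = [math.exp(u - m) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

def expected_hours(wage, hours_grid):
    probs = choice_probs(wage, hours_grid)
    return sum(p * h for p, h in zip(probs, hours_grid))

def hour_elasticity(wage, hours_grid, bump=0.01):
    """Percent change in expected hours per percent change in the wage."""
    h0 = expected_hours(wage, hours_grid)
    h1 = expected_hours(wage * (1.0 + bump), hours_grid)
    return (h1 - h0) / h0 / bump

grids = {
    4: [0, 20, 40, 50],                  # nonparticipation, part-time, full-time, overtime
    7: [0, 10, 20, 30, 40, 50, 60],      # assumed 7-point grid
    13: list(range(0, 61, 5)),           # 0 to 60 hours in steps of five
}
elasticities = {J: hour_elasticity(10.0, g) for J, g in grids.items()}
for J, e in elasticities.items():
    print(f"J = {J:2d}: hour elasticity = {e:.3f}")
```

Comparing the three printed values mimics the comparison across rows of Table A13, although the numbers produced by this toy parameterization carry no empirical meaning.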
Finally, we check whether elasticities are sensitive to the functional form. Similar to van Soest, Das, and Gong (2002) for the Netherlands, we experiment with alternative specifications by increasing the order of the polynomial in the utility function: quadratic (baseline) then cubic and quartic (Rows 6 and 7 of the panels in Table A13). We also change the way flexibility is gained in the model by replacing fixed costs of work, as used in Blundell et al. (2000), with part-time dummies (last rows in Table A13). Precisely, we include dummies at the 10, 20, and 30 hour choices in the 7-choice model, as used in van Soest (1995). These parameters may be interpreted as job search costs for less common working hours, therefore including some of the labor market restrictions on the choice set.28 Results for these different specifications are relatively stable: The size of elasticities hardly changes across the different modeling choices.29 This result reinforces our main conclusions regarding international comparisons.
IV. Assessing Cross-Country Differences in Elasticity Size
The evidence presented above suggests that cross-country differences in elasticities remain, even after controlling for methodological differences. Accordingly, we attempt to isolate important factors explaining these differences in this section. We still focus on married women, mainly because this group shows the largest variation in elasticities across countries.
A. Wage and Labor Supply Levels
Hour and participation elasticities are strongly correlated with mean hours and participation levels across countries. Here, we check that larger elasticities in countries such as Greece, Ireland, and Spain are not simply due to the hour and wage levels. Denote $\epsilon_c = (\partial H_c / \partial w_c)(w_c / H_c)$ the hour elasticity for country $c$. We recompute elasticities as $\tilde{\epsilon}_c = (\partial H_c / \partial w_c)(\bar{w} / \bar{H})$, using the country-specific responsiveness $\partial H_c / \partial w_c$ while holding hour and wage at the mean levels $\bar{H}$ and $\bar{w}$ for all countries (adjusted for PPP differences in the case of wages). We focus on own-wage elasticities of total hours, reporting the results in Figure 8. The upper left panel compares elasticities in the baseline (circles) and in this “mean levels” scenario (triangles) together with their 95 percent bootstrapped confidence intervals. The two scenarios are plotted one against the other in the upper right panel. We observe little difference when holding wages and hours constant, with the only exceptions being Estonia, Hungary, and Portugal (the United States), which are pushed into the high (low) elasticity group under the mean level scenario. This is clearly due to the NMS and Portugal (the United States) having significantly lower (higher) wage rates while their female participation rates are somewhat close to the international average. The lower left (right) panel represents the “mean hour” (“mean wage”) scenario, where only hours (wages) are held at the international mean value $\bar{H}$ ($\bar{w}$). We see that high-elasticity countries like Greece and Spain are not only characterized by lower female labor supply but also by lower wage rates. However, these two effects cancel each other; consequently, these countries remain in the high-elasticity group under the total mean level scenario. The main message of this exercise is that cross-country differences are preserved when elasticities are evaluated at mean values, and must therefore be explained by other factors.30
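The recomputation described above amounts to rescaling each country's hour response by common wage and hour levels. The following sketch shows the four scenarios of Figure 8 on made-up inputs; the three countries and all numbers are hypothetical and serve only to show the arithmetic.

```python
def elasticity(dH_dw, wage, hours):
    """Hour elasticity: (dH/dw) * (w / H)."""
    return dH_dw * wage / hours

# Hypothetical inputs: (dH/dw, mean wage in PPP units, mean weekly hours)
countries = {
    "A": (0.9, 8.0, 18.0),
    "B": (0.4, 12.0, 28.0),
    "C": (0.6, 10.0, 24.0),
}

# International means (simple unweighted averages, an assumption)
w_bar = sum(w for _, w, _ in countries.values()) / len(countries)
H_bar = sum(H for _, _, H in countries.values()) / len(countries)

baseline = {c: elasticity(s, w, H) for c, (s, w, H) in countries.items()}
mean_levels = {c: elasticity(s, w_bar, H_bar) for c, (s, _, _) in countries.items()}
mean_hour = {c: elasticity(s, w, H_bar) for c, (s, w, _) in countries.items()}
mean_wage = {c: elasticity(s, w_bar, H) for c, (s, _, H) in countries.items()}

for c in countries:
    print(c, round(baseline[c], 3), round(mean_levels[c], 3),
          round(mean_hour[c], 3), round(mean_wage[c], 3))
```

In the mean-levels scenario, any remaining difference between two countries reflects only the difference in their responsiveness, since the wage and hour scaling is common to all.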
Figure 8. Effect of Wage/Hour Levels on Wage-Elasticities of Total Hours (Married Women)
B. Tax-benefit Systems
The size of hour elasticities might be influenced by differences in tax-benefit systems across countries. Precisely, baseline elasticities are calculated by incrementing gross wages by 1 percent, as is common in the literature. Accordingly, the fact that high-tax countries are characterized by smaller net wage increments could explain smaller elasticities. To check this point, we simulate a 1 percent increase in the net wage in order to cancel out differences in effective marginal tax rates (EMTR) across countries due to different tax schedules or benefit withdrawal rates. Figure 9 reports total hour elasticities in the baseline and this “net-wage increment” scenario. The right panel plots the two situations, while the left panel additionally indicates the 95 percent bootstrapped confidence intervals. Elasticities after a 1 percent increase in net wage are generally larger; indeed, a 1 percent change in gross wages corresponds to smaller increments due to taxation. However, and most importantly, cross-country variation in elasticities is not truly affected when accounting for differences in implicit taxation of labor income.
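The link between gross and net increments can be illustrated with a stylized calculation. The symbols $\tau^{m}$ (effective marginal tax rate) and $\tau^{a}$ (average tax rate) and all numbers below are assumptions made for this illustration and are not taken from the paper.

```latex
\Delta w^{\text{net}} \approx (1 - \tau^{m}) \, \Delta w^{\text{gross}}

% Illustrative numbers: w^{gross} = 20, \tau^{m} = 0.40, \tau^{a} = 0.25,
% so that w^{net} = (1 - 0.25) \times 20 = 15.
\text{1 percent gross increment, in net terms: } 0.01 \times 20 \times (1 - 0.40) = 0.12
\text{1 percent net increment: } 0.01 \times 15 = 0.15
```

In this example the net-wage scenario delivers the larger increment, which is the case whenever the marginal rate exceeds the average rate.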
Figure 9. Effect of Tax-Benefit Systems on Wage-Elasticities of Total Hours
C. Demographic Characteristics
We finally turn to the role of demographic composition. As indicated in Section III.B, important differences exist across countries in this respect, notably concerning the number of children yet also the age and education structure. Given that it is plausible that these demographic differences affect the size of mean elasticities, we decompose differences in elasticities across countries to investigate this point, using an approach similar to that in Heim (2007). Let i denote a woman’s age cohort, j her education group, and k the number of her children.31
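Let $\varepsilon_{ijk,c}$ denote the wage elasticity of total hours for a woman of type $ijk$ in country $c$. The mean elasticity in this country, $\varepsilon_c$, can be written as a weighted average $\sum_i \sum_j \sum_k P_{ijk,c}\,\varepsilon_{ijk,c}$, where $P_{ijk,c}$ denotes the proportion of women of type $ijk$ in this country. This proportion can be rewritten as $P_{ijk,c} = P_{i,c}\,P_{j|i,c}\,P_{k|ij,c}$, where $P_{i,c}$ denotes the proportion of women in age cohort $i$ in country $c$, $P_{j|i,c}$ the proportion of women in education group $j$ given membership in age cohort $i$, and $P_{k|ij,c}$ the proportion of women with $k$ children given membership in age cohort $i$ and education group $j$. Letting $\bar{P}$ denote the mean proportion of a certain type over all countries, the proportion $P_{ijk,c}$ can be expressed as:

$$
\begin{aligned}
P_{ijk,c} ={}& \bar{P}_{i}\,\bar{P}_{j|i}\,\bar{P}_{k|ij}
 + (P_{i,c} - \bar{P}_{i})\,\bar{P}_{j|i}\,\bar{P}_{k|ij} \\
 &+ P_{i,c}\,(P_{j|i,c} - \bar{P}_{j|i})\,\bar{P}_{k|ij}
 + P_{i,c}\,P_{j|i,c}\,(P_{k|ij,c} - \bar{P}_{k|ij})
\end{aligned}
\tag{8}
$$

This expression can be used to decompose the mean elasticity, where $\bar{\varepsilon}_{ijk}$ denotes the mean elasticity for type $ijk$ over all countries:

$$
\begin{aligned}
\varepsilon_c ={}& \sum_i \sum_j \sum_k \bar{P}_{i}\,\bar{P}_{j|i}\,\bar{P}_{k|ij}\,\bar{\varepsilon}_{ijk}
 + \sum_i \sum_j \sum_k (P_{i,c} - \bar{P}_{i})\,\bar{P}_{j|i}\,\bar{P}_{k|ij}\,\bar{\varepsilon}_{ijk} \\
 &+ \sum_i \sum_j \sum_k P_{i,c}\,(P_{j|i,c} - \bar{P}_{j|i})\,\bar{P}_{k|ij}\,\bar{\varepsilon}_{ijk}
 + \sum_i \sum_j \sum_k P_{i,c}\,P_{j|i,c}\,(P_{k|ij,c} - \bar{P}_{k|ij})\,\bar{\varepsilon}_{ijk} \\
 &+ \sum_i \sum_j \sum_k P_{ijk,c}\,(\varepsilon_{ijk,c} - \bar{\varepsilon}_{ijk})
\end{aligned}
\tag{9}
$$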
The decomposition starts with the overall mean weighted elasticity, a term common to all countries, while the next term denotes how elasticities vary due to the different composition of age cohorts, keeping the distributions of education and family size constant within an age group. Keeping the distribution of the number of children within education levels constant, the variation in elasticities due to different education levels is captured in the third component. The fourth term indicates the difference in elasticities due to different distributions of family size, and the last component denotes the difference in elasticities explained by different elasticities within an age-education-children cell, which can be interpreted as a residual difference due to factors other than composition effects (for instance, differences in preferences). The results of this decomposition are presented in Figure 10. We show the deviation of the country-specific elasticities from the mean elasticity that can be attributed to differences pertaining to each of the three demographic factors, as well as the residual, unexplained difference. It turns out that differences in demographic composition regarding age and education are never statistically significant while variation in family size contributes very slightly to larger elasticities in some countries, including Estonia, France, Ireland, Portugal, and Spain. However, these differences are only significant in a few cases, and certainly do not explain the bulk of country differences. After controlling for these composition effects, the residual term corresponding to “overall” differences in labor-supply responsiveness shows a significantly positive effect for Greece, Ireland, and Spain (the high-elasticity group) and a significantly negative effect for Finland, France, Sweden, the United Kingdom, and the United States (the low-elasticity group).
Therefore, we must conclude that differences in demographic compositions between countries are not responsible for variations in labor supply elasticities.32
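The five-term decomposition described above can be sketched in a few lines. This is a minimal illustration on invented data for two hypothetical countries with two age cohorts, two education groups, and two family sizes. The ordering of country-specific and mean weights in each term is one consistent choice that makes the terms add up exactly, and cross-country means are taken as simple unweighted averages; both are assumptions.

```python
from itertools import product

AGES, EDUS, KIDS = range(2), range(2), range(2)

# Invented data: proportions by age, education given age, children given
# age and education, and cell-level elasticities, for two countries.
data = {
    "X": {
        "P_age": [0.4, 0.6],
        "P_edu": [[0.7, 0.3], [0.5, 0.5]],
        "P_kid": [[[0.6, 0.4], [0.8, 0.2]], [[0.3, 0.7], [0.5, 0.5]]],
        "eps": [[[0.2, 0.5], [0.1, 0.3]], [[0.3, 0.6], [0.2, 0.4]]],
    },
    "Y": {
        "P_age": [0.5, 0.5],
        "P_edu": [[0.6, 0.4], [0.4, 0.6]],
        "P_kid": [[[0.5, 0.5], [0.7, 0.3]], [[0.4, 0.6], [0.6, 0.4]]],
        "eps": [[[0.1, 0.3], [0.1, 0.2]], [[0.2, 0.4], [0.1, 0.3]]],
    },
}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def mean_elasticity(d):
    """Country mean elasticity as a proportion-weighted average over cells."""
    return sum(d["P_age"][i] * d["P_edu"][i][j] * d["P_kid"][i][j][k] * d["eps"][i][j][k]
               for i, j, k in product(AGES, EDUS, KIDS))

def decompose(country, data):
    """Split a country's mean elasticity into common, age, education,
    family-size, and residual components."""
    cs = list(data)
    d = data[country]
    terms = {"common": 0.0, "age": 0.0, "education": 0.0, "children": 0.0, "residual": 0.0}
    for i, j, k in product(AGES, EDUS, KIDS):
        # Cross-country means of proportions and cell elasticities
        Pi_bar = mean(data[c]["P_age"][i] for c in cs)
        Pj_bar = mean(data[c]["P_edu"][i][j] for c in cs)
        Pk_bar = mean(data[c]["P_kid"][i][j][k] for c in cs)
        e_bar = mean(data[c]["eps"][i][j][k] for c in cs)
        Pi, Pj, Pk = d["P_age"][i], d["P_edu"][i][j], d["P_kid"][i][j][k]
        e = d["eps"][i][j][k]
        terms["common"] += Pi_bar * Pj_bar * Pk_bar * e_bar
        terms["age"] += (Pi - Pi_bar) * Pj_bar * Pk_bar * e_bar
        terms["education"] += Pi * (Pj - Pj_bar) * Pk_bar * e_bar
        terms["children"] += Pi * Pj * (Pk - Pk_bar) * e_bar
        terms["residual"] += Pi * Pj * Pk * (e - e_bar)
    return terms

for c in data:
    parts = decompose(c, data)
    print(c, {name: round(v, 4) for name, v in parts.items()},
          "sum:", round(sum(parts.values()), 4),
          "mean elasticity:", round(mean_elasticity(data[c]), 4))
```

The printed sum of the five components matches the country's mean elasticity, and the common term is identical across countries, so deviations from it are what a chart such as Figure 10 would display.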
Figure 10. Deviation from the Mean Hour Elasticity Due to Demographic Characteristics
D. Alternative Explanations
This leaves room for other explanations. Firstly, there may be genuine differences in work preferences, possibly due to long-lasting differences in culture and norms vis-à-vis female labor market participation. Secondly, and in a related manner, social preferences may vary across countries and lead to different institutions, notably childcare arrangements. It may be that differences in some of the estimated parameters, and particularly the fixed costs of work, reflect country heterogeneity vis-à-vis nonsimulated policies. Furthermore, differences in industrial or occupational composition might also play a role, given that employment in, for example, the Nordic countries is often reported to be more stable due to better work-family reconciliation policies. The data at hand do not allow probing such differences across countries, which we leave for future research. Finally, an explanation in terms of selection can be suggested: we find that marriage rates are significantly higher in high-elasticity countries (the ratio of married to single women is 6.3 in Ireland or 5.6 in Spain, compared to an average of 3.9 over all countries); hence, it could be that married women in these countries cover a large range of the distribution of elasticities while the relatively smaller fraction of women who marry in France, the Nordic countries, the United Kingdom, and the United States are in the low range of this distribution. If this were the case, one would expect to find larger elasticities among single women in the latter group of countries. However, our main results show that it is not the case (the cross-country correlation between elasticities of married and single women is positive, at 0.25), and thus this possible explanation can be ruled out.
V. Concluding Discussion
This paper presents new evidence on labor supply elasticities in 17 E.U. countries and the United States. Given the effort applied in adopting a common empirical approach, estimates are more comparable than is usually the case in the literature, with results extremely robust to modeling assumptions and specification tests. The main lesson from the exercise is that elasticities are more modest than usually considered, with international differences relatively small. Furthermore, we also show that the remaining variation across countries relates little to differences in tax-benefit systems, heterogeneity in demographic composition, or selection into marriage. Instead, it may rather reflect differences in individual and social preferences across countries, and primarily differences in work preferences and childcare policies, as captured by variation in labor supply parameters. As far as married women are concerned, these differences contribute to more intermittent labor force participation patterns in Greece, Ireland, and Spain, as opposed to more consistent participation and more constant hours in other countries, notably France, the Nordic countries, the United Kingdom, and the United States.33
Future work should consider both time and country variation. The present study was based on data years for which policy simulations were available for E.U. states. For a subgroup of countries, we have used two years of data, with a three-year interval characterized by important tax-benefit reforms. This source of exogenous variation is usually called upon to improve the identification of behavioral parameters. In our case, results are not very sensitive to it, pointing to the good performance of the cross-sectional identification strategy based on spatial variation and tax-benefit nonlinearities. In the elasticity of taxable income literature, changes in income between pairs of years also relate to changes in marginal tax rates between these years, albeit pooling a long panel of tax returns. (See Saez, Slemrod, and Giertz 2012.) Ideally, we would like to gather many years of data for each country, thus allowing for more exogenous variations in net wages. However, this is certainly an enormous task when trying to compare many countries and accounting for complete tax-benefit systems.
Other improvements are necessary, notably a better modeling of demand-side constraints, although this was not possible with the data at hand. A bias may stem from assuming that nonworkers choose to be so. This primarily concerns single individuals, for whom involuntary unemployment may be an issue, yet not so much married women and single mothers, two groups who frequently choose nonparticipation on a voluntary basis due to fixed costs of work and preferences. Information on local unemployment could be used to better address labor market constraints, as in Keane and Moffitt (1998); an interpretation is that job search costs are higher in a bad economy, which leads to higher utility costs of work. Rationing may not only affect participation but also hours, and, accordingly, information on both actual and desired hours of work could be used in this case to disentangle supply and demand sides. Usually, related studies simply estimate labor supply models on desired hours (for example, Callan, van Soest, and Walsh 2009). However, even when this variable is available, it is difficult to ensure that answers to the preferred hours question only reflect preferences and are not themselves contaminated by constraints (as could be the case for discouraged workers).
Despite these restrictions, we believe that the estimates provided in this paper can be useful for researchers who want to implement optimal tax or CGE models in a comparative framework and need to refer to “reasonable” values from the literature. In particular, our results can be exploited for applications in the field of taxation. Two recent studies (Immervoll et al. 2007 and 2011) have conducted international comparisons of redistributive systems in Europe, and their results could be reassessed in the light of the estimates provided in the present study. Immervoll et al. (2007) measure the implicit cost of redistribution using plausible elasticities and sensitivity analyses—yet without information on actual cross-country differences. They assume that participation elasticity decreases with income levels, and the implications of this are crucial for welfare analysis (Eissa, Kleven, and Kreiner 2008). Notably, the optimality of policies that support the working poor, compared to traditional “demogrant” policies, depends fundamentally on this assumption. While very limited evidence exists, the present study broadly supports this assumption for single individuals, providing a precise range of estimates for each country.
Moreover, international comparisons of the tax treatment of couples by Immervoll et al. (2011)—essentially the long-studied issue of joint versus individual taxation—could be reevaluated using our new evidence on couples’ labor-supply elasticities. Related to this point, Heckman (1993) noted “whether labor supply behavior by sex will converge to equality as female labor-force participation continues to increase is an open question.” This question has thus far remained open, and the present study contributes to answering it. In fact, we can draw from our results that male-female differentials in participation rates are strongly negatively correlated with male-female differentials in participation elasticities (corr = –0.89).34 Hence, the Ramsey argument against the high implicit taxation of secondary earners and subsequent deadweight loss from joint taxation (or, more frequently, from joint-income assessment for benefit or tax-credit eligibility) can now be assessed on the basis of comparable estimates for many countries.
Appendix 1 Descriptive Statistics and Hour Distribution
Table A1 presents the data sets used and the main statistics of the sample selected for wage and labor supply estimations. As further described in the next section for the wage estimations, demographics are defined across countries in a comparable manner. The number of children corresponds to children living in the household. For comparability purposes, we only define three education categories (“high,” corresponding to tertiary education and reported in Table A1, “low,” corresponding to no education and junior school, and “middle”). Table A2 reports the hour distribution for all countries. Hours are based on contract hours in order to avoid seasonality issues for data sets collected during bank holidays or other holiday periods. In all countries, earnings correspond to basic salary plus bonuses and additional payments.
Descriptive Statistics (Selected Samples)
Distributions of Weekly Worked Hours (Selected Samples)
Appendix 2 Estimates of the Wage Equation
As explained in the paper, we first proceed with a Heckman-corrected wage estimation to predict wages for all the individuals in our sample. The wage equation depends on human capital variables: cubic form of age, education, and basic family status. (Men in couples are known to earn more than single men; women with many children have often stopped working so their productivity has decreased.) We choose three education groups for comparability purposes (with “low,” corresponding to “no education or junior school,” as the omitted category); more detailed education groups would be difficult to define in a comparable way across countries. The Heckman selection correction relies on a participation probit that can be seen as a (linearized reduced form) approximation of the extensive margin of the labor supply model, with the somewhat usual exclusion restrictions for identification. (See van Soest 1995.) That means it depends on the same variables plus detailed information about children and “other” incomes. The latter correspond to partner’s and other family members’ income as well as capital income of various sources. The different income sources have been defined in a harmonized way within the EUROMOD project. (See Sutherland 2007.) The assumption of normality of the wage residual is made. Tables A3 and A4 report the results of the Heckman-corrected wage estimations for each country and for men and women separately.
Wage Estimations: Women
Wage Estimations: Men
Appendix 3 Labor-Supply Model: Estimates
In Tables A5–A7, we report the maximum-likelihood estimates of the seven-choice discrete model of labor supply. We report the estimates for each individual year, as used to calculate baseline elasticities. When two years of data are available, estimates of the utility function parameters are relatively stable over time. This reassuringly suggests that preferences do not change substantially over the three-year interval. The variable “region” corresponds to broad regional categories (for instance, Paris region versus the rest of France), so it does not compromise the identification of the model based on finer regional variation in tax-benefit rules. Broad regional information is missing in our samples for Denmark and the Netherlands. The variable “elderly” (that is, the presence of dependent parents aged 70 or above) is also ignored in the specification for Danish couples and Swedish single men, as the selected samples for these groups contained almost no such observations.
Labor-Supply Estimations: Couples
Labor-Supply Estimations: Single Women
Labor-Supply Estimations: Single Men
Appendix 4 Labor Supply Elasticities
For both years when available, and for each demographic group separately, Tables A8–A11 report the own-wage hour elasticities, compensated and uncompensated, and for quintiles of disposable income. We distinguish the hour elasticity for the subgroup of participants (pure intensive margin) and the participation elasticity (extensive margin). The extensive margin is expressed in percentage change of the employment probability (“participation”). Alternatively, it is expressed in hour changes corresponding to participation responses (“hour”), so this measure and the intensive margin sum up to the total uncompensated hour elasticity. We also show cross-wage hour elasticities for individuals in couples and income elasticities. Bootstrapped standard errors are reported in brackets for the main elasticity results.
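The way these margins fit together can be illustrated with a minimal sketch. It assumes a single set of predicted choice probabilities over seven hypothetical weekly-hour categories (0, 10, ..., 60), before and after a 1 percent wage increase. The probabilities and the function name `decompose_hours_elasticity` are illustrative assumptions and are not taken from the estimates in the tables. The split shown is one exact way of making the intensive margin and the hour-based extensive margin sum to the total hour elasticity; other decomposition paths are possible.

```python
def decompose_hours_elasticity(p_before, p_after, hours, pct_change=0.01):
    """Split the total hours elasticity into intensive and extensive parts.

    p_before, p_after: choice probabilities over the hour categories,
    before and after a wage increase of `pct_change` (hypothetical inputs).
    """
    # Expected hours over all individuals, including non-participants.
    exp_h0 = sum(p * h for p, h in zip(p_before, hours))
    exp_h1 = sum(p * h for p, h in zip(p_after, hours))
    # Participation probability: any choice with positive hours.
    part0 = sum(p for p, h in zip(p_before, hours) if h > 0)
    part1 = sum(p for p, h in zip(p_after, hours) if h > 0)
    # Mean hours conditional on participation.
    cond_h0 = exp_h0 / part0
    cond_h1 = exp_h1 / part1

    return {
        # Total response of expected hours.
        "total": (exp_h1 - exp_h0) / exp_h0 / pct_change,
        # Intensive margin: hours response among participants.
        "intensive": (cond_h1 - cond_h0) / cond_h0 / pct_change,
        # Extensive margin in hours: participation change valued at
        # the new conditional mean hours.
        "extensive_hours": cond_h1 * (part1 - part0) / exp_h0 / pct_change,
        # Extensive margin as a change in the employment probability.
        "participation": (part1 - part0) / part0 / pct_change,
    }


# Hypothetical probabilities; each list sums to one.
hours = [0, 10, 20, 30, 40, 50, 60]
p_before = [0.300, 0.030, 0.100, 0.120, 0.3500, 0.0800, 0.020]
p_after = [0.299, 0.030, 0.100, 0.120, 0.3505, 0.0805, 0.020]

result = decompose_hours_elasticity(p_before, p_after, hours)
print(result)
```

By construction, `result["intensive"] + result["extensive_hours"]` equals `result["total"]` up to floating-point error, which mirrors the adding-up property described above.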
Labor-Supply Elasticities: Married Women
Labor-Supply Elasticities: Married Men
Labor-Supply Elasticities: Single Women
Labor-Supply Elasticities: Single Men
Appendix 5 Robustness Checks
Table A12 reports estimates for seven countries where two years of data are available. We give here a detailed account of the 1998–2001 policy changes used for the additional exogenous variation discussed in the paper. The United Kingdom experienced important changes in the income tax schedule, social insurance contributions, and council taxes, as well as an increased generosity of income support for the elderly (minimum income guarantee) and for families with children. The latter also benefited from the replacement of the family credit by the more generous working family tax credit (WFTC) in 1999. (See Blundell et al. 2000.) In France, housing benefits were reformed in 2001, and a refundable tax credit for low-wage individuals was introduced that year in both France and Belgium. In Germany, the year 2001 corresponds to the first step of major income tax reforms, including a widening of the income brackets and tax cuts; child benefits were also raised by more than 20 percent over the period of interest. In Sweden, the income tax schedule changed with the introduction of an additional lower income tax bracket; a special local income tax credit for low-income earners was introduced in 2001 and child benefits were raised by 25 percent over the period. In Ireland, substantial cuts in income tax took place over 1998–2001; income tax allowances were replaced by deductible tax credits while welfare payment rates failed to keep pace with overall growth in disposable income. The Spanish personal income tax underwent a dramatic change with the reform of 1999 (reduction in the number of tax brackets from 9 to 6, cuts in the bottom and top marginal tax rates, and changes in the treatment of the family dimension through a new system of tax credits).
Results in Table A12 compare the baseline estimates to those obtained when pooling the two years of data and calculating separate elasticities for each year (“pooled years”) or the elasticity for the pooled sample (“pooled, mean elasticity”). Results are unchanged in most cases. For France, Spain, and Ireland, we now find very similar elasticities for the two years, which broadly correspond to the average of the two elasticities obtained from independent estimations.
Table A13 reports detailed estimates for the extensive specification check described in section IV.C of the paper. The preliminary check concerns the sensitivity of the results to the way we calculate elasticities. The first row of each panel in Table A13 corresponds to the baseline—that is, a seven-choice model with quadratic utility and fixed costs—whereby elasticities are obtained by averaging expected hours over all observations (frequency method). The second row reports the average elasticity over the 250 draws used to bootstrap standard errors in the baseline model. The third row shows elasticities obtained with a calibration method.35
Reassuringly, we see very little difference across the three sets of results. The following rows correspond to specification checks, as explained and commented in the paper.
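As an illustration of the frequency method, the following minimal sketch averages conditional logit choice probabilities over a synthetic sample before and after a 1 percent wage increase and derives elasticities from the implied expected hours. Everything specific here is an assumption made for the example: the seven hour values, the utility coefficients, the fixed cost, the flat-tax budget constraint (a stand-in for a full tax-benefit simulation), and the simulated wages. It is not the estimated model.

```python
import math
import random

HOURS = [0, 10, 20, 30, 40, 50, 60]  # assumed seven weekly-hour choices


def disposable_income(wage, hours, nonlabor=100.0, tax_rate=0.3):
    """Toy flat-tax budget constraint, in hundreds of euros per week."""
    return (nonlabor + (1.0 - tax_rate) * wage * hours) / 100.0


def utility(wage, hours):
    """Quadratic utility in income and hours with a fixed cost of work.

    All coefficients are made up for illustration.
    """
    c = disposable_income(wage, hours)
    fixed_cost = 0.5 if hours > 0 else 0.0
    return c - 0.02 * c**2 + 0.05 * hours - 0.002 * hours**2 - fixed_cost


def choice_probabilities(wage):
    """Conditional logit probabilities over the discrete hour choices."""
    u = [utility(wage, h) for h in HOURS]
    u_max = max(u)  # subtract the maximum for numerical stability
    expu = [math.exp(x - u_max) for x in u]
    denom = sum(expu)
    return [e / denom for e in expu]


def average_probabilities(wages):
    """Average each choice probability over all observations."""
    probs = [choice_probabilities(w) for w in wages]
    return [sum(p[j] for p in probs) / len(wages) for j in range(len(HOURS))]


def frequency_elasticities(wages, pct_change=0.01):
    """Elasticities from averaged probabilities before and after the change."""
    p0 = average_probabilities(wages)
    p1 = average_probabilities([w * (1.0 + pct_change) for w in wages])
    h0 = sum(p * h for p, h in zip(p0, HOURS))
    h1 = sum(p * h for p, h in zip(p1, HOURS))
    part0, part1 = 1.0 - p0[0], 1.0 - p1[0]
    return {
        "hours": (h1 - h0) / h0 / pct_change,
        "participation": (part1 - part0) / part0 / pct_change,
    }


random.seed(0)
wages = [random.uniform(8.0, 25.0) for _ in range(500)]  # synthetic hourly wages
elasticities = frequency_elasticities(wages)
print(elasticities)
```

With these made-up coefficients the marginal utility of income stays positive at every choice, so both elasticities are positive; their magnitudes carry no empirical meaning.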
Robustness Checks: Improving Identification by Pooling Years
Robustness Checks: Specification
Appendix 6 Assessing Cross-Country Differences in Elasticity Size
Table A14 reports the elements found in graphical form in Figures 8–10 of the paper: baseline own-wage elasticities of hours for married women (Column 1), elasticities when canceling the role of different mean work hours and wages between countries (Column 2), elasticities obtained with a 1 percent increment in net rather than gross wages (Column 3), and the elasticity decomposition used to assess the role of different demographic compositions (Columns 5–8).
Elasticities Decomposition
Footnotes
Olivier Bargain is affiliated with Aix-Marseille U. (Aix-Marseille School of Economics), CNRS and EHESS.
Andreas Peichl is affiliated with ZEW, U. of Mannheim, CESifo, ISER and IZA.
Kristian Orsini was affiliated with U. of Leuven at the time the paper was written. The authors are grateful to two anonymous referees, as well as R. Blundell, G. Kalb, D. Hamermesh, A. van Soest and participants in seminars/workshops at UCD, AMSE, IZA, ISER, Leuven, Milan, ZEW. Research was partly conducted during Peichl’s visit to the ECASS and ISER and supported by the Access to Research Infrastructures action (EU IHP Program) and the Deutsche Forschungsgemeinschaft (PE1675). They are indebted to the EUROMOD consortium and to Daniel Feenberg and the NBER for granting them access to TAXSIM. They thank Raj Chetty, Julie Berry Cullen, Hilary Hoynes for providing them with transfer calculators. The ECHP was made available by Eurostat; the Austrian version by Statistik Austria; the PSBH by the Universities of Liège and Antwerp; the Estonian HBS by Statistics Estonia; the IDS by Statistics Finland; the EBF by INSEE; the GSOEP by DIW Berlin; the Greek HBS by the National Statistical Service; the Living in Ireland Survey by the ESRI; the SHIW by the Bank of Italy; the SEP by Statistics Netherlands; the Polish HBS by the University of Warsaw; the IDS by Statistics Sweden; and the FES by the UK ONS through the Data Archive. Material from the FES is Crown Copyright and is used by permission. The usual disclaimer applies. The data used in this article can be obtained beginning January 2015 through December 2017 from the corresponding author: O. Bargain, DEFI, Chateau Lafarge, Route des Milles, 13290 Aix-en-Provence, France. Email: olivier.bargain{at}univ-amu.fr.
↵1. Those written in the 1980s focus on estimations using the continuous labor supply model of Hausman (1981) and provide evidence for individuals in couples (Pencavel 1986 for married men; Killingsworth and Heckman 1986 for married women). More recent surveys incorporate other methods (see Blundell and MaCurdy 1999) including life-cycle models (see Meghir and Phillips 2008; Keane 2011).
↵2. To our knowledge, only Evers, de Mooij, and van Vuuren (2008) gather evidence for a large set of countries with their meta estimations controlling for different dimensions, including country fixed effects and methodological differences across studies. However, there may not be enough variation across existing studies, and, moreover, not enough studies per country to isolate genuine international differences from other factors. Furthermore, the special issue of the JHR published in 1990 provided evidence from different countries using variants of the Hausman approach. (See Moffitt 1990 for an overview.) However, these studies bear methodological differences that prevent their estimates from being directly comparable.
↵3. We focus our analysis on labor supply responses in a static framework (referred to by Chetty et al. 2011 as steady-state elasticities). We exclusively analyze labor supply decisions (hours and participation) and ignore the other margins captured in the literature concerning the elasticity of taxable income. (See Saez, Slemrod, and Giertz 2012.) Arguably, these other margins partly relate to responses not directly pertaining to productive behavior, such as tax evasion. In this regard, hours of work still constitute an interesting benchmark. We also leave aside the macroeconomic literature, in which elasticities are often obtained through calibration of general equilibrium models and are usually much larger than in microeconomic studies (for example, Prescott 2004). However, macro estimates can be reconciled with micro ones when using life cycle models with human capital accumulation (Keane and Rogerson 2012). Our study also relates to the recent attempts to explain labor supply differences across countries, initiated by Prescott 2004. (See Blundell, Bozio, and Laroque 2011, for a recent statement and additional references.) While Prescott (2004) and several related studies ignore differences between the United States and Europe (and among European countries themselves) in preference/culture, we precisely aim at using microdata to characterize international differences in elasticities and the likely role of country-specific preferences.
↵4. In practice, specific utility functions are used. In Section 4.3, we check whether the degree of flexibility, or moving closer to the continuous case, affects the estimated elasticities. (See also Heim 2009, for a model combining continuous and discrete dimensions.)
↵5. Estimates are also potentially contaminated by measurement errors (for a discussion of the division bias, see Ziliak and Kniesner 1999).
↵6. For the former approach, we would need significant policy reforms (and ideally a common one) for many countries, with all occurring around the same time period. Meanwhile, the latter approach would require using panel data or many repeated cross sections for a large number of countries. However, neither case seems feasible. As noted by Imbens (2010), there are many important research questions for which no experimental or quasi-experimental setup is available, of which our large-scale comparison is one.
↵7. Note that we refrain from estimating childcare jointly with labor supply. This is not undertaken systematically in the literature, owing to data limitations (notably, the availability and market price of childcare, which can vary locally and with individual circumstances). However, some studies suggest joint estimations; see, for example, Blau and Tekin (2007).
↵8. This is achieved by choosing the smallest Lagrangian multiplier that reaches the target, that is, at least 95 percent of the observations with no negative marginal utility of income at all potential labor supply choices. The remaining observations, less than 5 percent of the samples, are simply discarded before we calculate elasticities. In practice, the discarded share is very small, as a target of more than 99 percent is achieved for most countries and demographic groups. (Detailed results are available from the authors.) We also check quasi-concavity of the utility function, which is verified for all observations that pass the positive marginal utility of income requirement.
↵9. There are actually few studies adopting simultaneous estimations of wages and labor supply (for example, van Soest, Das, and Gong 2002), given that tax-benefit simulations must be run at each iteration of the ML estimation. This is not possible in our case given that EUROMOD is not programmed in econometric software. Moreover, approximations relying on a presimulated set of disposable income for a whole range of wage values for each individual would be too time-consuming, given the large number of countries.
↵10. Halton sequences generate quasi random draws that provide a more systematic coverage of the domain of integration than independent random draws. Train (2003) explains that the accuracy can be markedly increased in the context of mixed logit models. Following Train, we use r = 100 draws from Halton sequences.
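The construction of such draws can be sketched as follows. This is a minimal illustration in pure Python, assuming a single dimension of unobserved heterogeneity, base 2, and a standard normal distribution for the random term; the function names are hypothetical and the sketch is not the authors' implementation.

```python
from statistics import NormalDist


def halton(index, base):
    """Return element `index` (1-based) of the Halton sequence in `base`."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)  # next digit of index in `base`
        index //= base
    return result


def halton_normal_draws(n_draws, base=2):
    """Map Halton points in (0, 1) to standard normal draws."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(halton(i, base)) for i in range(1, n_draws + 1)]


draws = halton_normal_draws(100)  # r = 100 draws, as in the text
print(draws[:3])
```

In base 2 the first points are 0.5, 0.25, and 0.75, which shows how the sequence fills the unit interval evenly rather than at random.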
↵11. Arguably, some of these characteristics are included in Zi and also affect preferences so the model is only parametrically identified. In practice, tax-benefit rules depend on characteristics Xi, which are much more detailed than usual taste-shifters Zi. For instance, benefit rules depend on the detailed age of all children in the household, on more detailed geographical information, etc.
↵12. However, detailed information on regions is missing for Spain, Denmark, Austria, and Portugal (countries for which we use the ECHP data), as well as the Netherlands.
↵13. In the result tables, we use the official country acronyms as follows: Austria (AT), Belgium (BE), Denmark (DK), Finland (FI), France (FR), Germany (GE), Greece (GR), Ireland (IE), Italy (IT), the Netherlands (NL), Portugal (PT), Spain (SP), United Kingdom (UK), Sweden (SW), Estonia (EE), Hungary (HU), Poland (PL), United States (US).
↵14. We make use of policy/data years available in EUROMOD at the time of writing (1998, 2001, or 2005, as indicated above), while its future developments should enable extending our results to more recent periods and more countries.
↵15. Information on tax-benefit rules for each E.U. country is available at: www.iser.essex.ac.uk/research/euromod together with modeling choices and validation of EUROMOD. For the United States, tax-benefit rules and TAXSIM are presented in detail at www.nber.org/~taxsim/.
↵16. Moreover, estimates are also slightly more precise for couples than for single individuals, and both issues may be due to less variation in labor market behavior among singles (with the exception of lone parents). Furthermore, the model for couples generally fits the data better because inactivity is more of a voluntary choice for married women than single individuals.
↵17. In order to compare the within-sample fit with out-of-sample predictions, we have also estimated the baseline model on a random half of the sample for each country, subsequently using it to predict hours for the other half. Fit measures on the holdout sample show results similar to those discussed in the text, suggesting that the flexible model used does not overfit the data in a way that would reduce external validity.
↵18. For the sake of a clear exposition, the graphs focus on the most recent year when two years are available. Appendix tables report detailed estimates for both years, based on separate estimations for each year. As we show below, preferences are relatively stable over the three-year interval considered in this case.
↵19. We focus on uncompensated elasticities. As reported in Appendix Tables A8–A11, compensated own wage elasticities are only slightly larger than uncompensated ones in most cases, owing to very small and negative income elasticities, as discussed below. They are slightly smaller in rare cases where income elasticities are positive, such as single women in Denmark.
↵20. Bargain and Peichl (2013) show that this statement is particularly true for Europe. Among all studies surveyed, they report 40 estimates for married women, 24 for married men, and 24 for single individuals (including single mothers). For the United States, they report 11 estimates for married women, nine for married men and six for single individuals (including single mothers).
↵21. When compared with the survey estimates in Bargain and Peichl (2013), we see that our estimates are very close to, or not statistically different from, past findings for Austria, Belgium, Finland, Germany, Sweden, and the United Kingdom. However, our estimates are smaller or close to the lower bound of past confidence intervals for Ireland, Italy, and the Netherlands, which could be explained, among other things, by the use of older data in previous studies (for example, in papers by van Soest and co-authors). Our estimates for the United States are very small and compare well to the most recent results (Blau and Kahn 2007; Heim 2007). U.S. studies that report larger elasticities rely on older data while it has been shown that elasticities have dramatically decreased over time in this country.
↵22. Appendix Table A1 shows that the number of couples with children is large in Ireland yet close to average in Greece and Spain. Hence, higher elasticities among married women in these countries do not seem to be driven by a higher proportion of families with children. This is confirmed by the decomposition analysis in the last section.
↵23. Drawing from the survey in Bargain and Peichl (2013), it is evident that our results are broadly in line with the available estimates for the Netherlands or Germany while several studies report comparable estimates to ours for the United Kingdom and the United States (Dickert, Houser, and Scholz 1995). However, our results point to more moderate elasticities than in Keane and Moffitt (1998) for the United States, or in several British studies. This is possibly due to the fact that we cover a more recent time period, which implies methodological differences, and that this group has become relatively larger over time (hence, less negatively selected in terms of labor market participation). Indeed, Bishop et al. (2009) report small elasticities for single women over 1979–2003, at least compared to married women, and a significant decline in wage elasticities over the period.
↵24. For the United States, Heim (2009) finds the intensive margin to be larger than the extensive one. We confirm this for the upper half of the distribution for married men hereafter. Chetty (2012) explains that responses at the intensive margin, due to potential optimization frictions, may not be detected by standard approaches.
↵25. The rare exceptions, Meghir and Phillips (2008) for the United Kingdom and Aaberge, Colombino, and Wennemo (2002) for Italy, indicate that low-educated single men significantly respond to financial incentives. The former study reports a participation wage elasticity of 0.27 for unskilled single men and zero for those with college education while the latter reports participation elasticities as high as 0.5 for single men in the lower part of the income distribution and almost zero higher up.
↵26. Data provides information on property income and investment income. However, given the many zeros, we increase nonlabor income by a marginal amount for all observations in order to compute income elasticities over all observations (bottom-coding).
↵27. Positive income elasticities are encountered in other papers as well (including two studies for Finland and Sweden, as discussed in Bargain and Peichl 2013, plus van Soest 1995, for the Netherlands, and Blau and Kahn 2007 for the United States, among others). Substitutability between time and money inputs in household production may explain this result. Indeed, an income effect may not only increase leisure (a normal good) but also decrease housework, and eventually could increase labor supply if the latter effect dominates. However, this seems to apply less to singles than to married individuals with children.
↵28. The fact that some choices may not be available to some people due to institutional constraints or individual/job characteristics can be modeled explicitly as a probability of choice availability in the log-likelihood. (See Aaberge, Dagsvik, and Strom 1995, who also allow for different wage rates at each choice.) Such a model represents a different parameterization of the present one, where dummies for specific, possibly constrained hours of work are used (van Soest 1995). As for hour restrictions, see the discussion in the concluding section.
↵29. The only exception seems to be Italy, where higher-order polynomial utility leads to larger elasticities. The difference with the baseline is only statistically significant in the case of participation elasticities, and partly disappears when we restrict the condition of participation to people working at least five hours a week when calculating elasticities. (Indeed, there are a number of initial nonworking women for whom the predicted number of weekly hours is very small after the wage increase used to calculate elasticities; the additional restriction is reasonable if we consider that it is unusual to observe such small values.)
↵30. We have previously highlighted the importance of distributional differences across countries in the labor supply responses to wages. Thus, differences in elasticity size across countries may lie in the tails. To check this, we have also replicated the decomposition of elasticities at the first and fifth income quintiles (available from the authors). That is, we have assessed international differences in elasticities at each quintile by decomposing these differences when holding wages and hours fixed at quintile-specific levels. Our conclusions do not change: Most of the country differences in responses at these income levels remain after controlling for differences in wage and hour levels.
↵31. In our application, we retain three age groups (aged 18–35, 36–45, and 45–59), two education groups, and three family sizes (no children, 1–2 children, and 3 children or more). Refining with three education groups leads to too many empty cells.
↵32. We have checked that alternative decomposition paths—given the path dependency of the method—provide similar results. Similar conclusions are also obtained when using the “net wage” elasticities.
↵33. This result corroborates the findings of Heim (2007) regarding the time variation of elasticities in the United States. Considering time rather than cross-country variation, Heim (2007) also finds that higher participation rates coincide with much smaller elasticities, and that this trend is not due to demographic changes but more likely to shifts in work preferences over time.
↵34. The gender participation gap in Nordic countries is below 10 points, coinciding with insignificant gender differences in labor supply elasticities. In Spain or Greece, men’s participation is above women’s by a large margin (around 50 points), and the gender difference in elasticities is significant and larger than 0.45. Most E.U. countries and the United States are somewhere between these two extreme cases.
↵35. The frequency approach implies averaging the probability of each discrete choice over all households before and after a change in wage rates or unearned income. The calibration method, consistent with the probabilistic nature of the model at the individual level, consists of repeatedly drawing a set of J+1 random terms for each household from an EV-I distribution (together with terms for unobserved heterogeneity), which generate a perfect match between predicted and observed choices. (See Creedy and Kalb 2005.) The same draws are kept when predicting labor supply responses to an increase in wages or nonlabor income. Averaging individual responses over a large number of draws provides robust transition matrices.
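The calibration step can be sketched with simple rejection sampling: error terms are redrawn until the observed choice maximizes utility, and the same draws are then reused after the change. This is a minimal illustration under strong assumptions. The deterministic utilities are made up, only three choices are used instead of J+1, and the unobserved heterogeneity terms are omitted. Rejection sampling is a stand-in chosen for clarity and is not necessarily the procedure used in the paper.

```python
import math
import random


def gumbel(rng):
    """One draw from the type-I extreme value (EV-I) distribution."""
    u = rng.uniform(1e-12, 1.0 - 1e-12)  # keep both logarithms finite
    return -math.log(-math.log(u))


def predicted_choice(utilities, eps):
    """Index of the alternative with the highest total utility."""
    total = [u + e for u, e in zip(utilities, eps)]
    return total.index(max(total))


def calibrated_draws(utilities, observed, rng, max_tries=100000):
    """Redraw error terms until the observed choice is the optimum."""
    for _ in range(max_tries):
        eps = [gumbel(rng) for _ in utilities]
        if predicted_choice(utilities, eps) == observed:
            return eps
    raise RuntimeError("no draw compatible with the observed choice")


def transition_matrix(u_before, u_after, observed, n_draws=50, seed=0):
    """Average transitions from observed to predicted choices over draws."""
    rng = random.Random(seed)
    n_choices = len(u_before[0])
    counts = [[0.0] * n_choices for _ in range(n_choices)]
    for ub, ua, obs in zip(u_before, u_after, observed):
        for _ in range(n_draws):
            eps = calibrated_draws(ub, obs, rng)
            # The same draws are kept when predicting the response.
            counts[obs][predicted_choice(ua, eps)] += 1.0 / n_draws
    return counts


# Hypothetical utilities for three households and three choices.
u_before = [[0.0, 0.5, 0.2], [0.3, 0.1, 0.0], [0.0, 0.4, 0.6]]
u_after = [[0.0, 0.6, 0.4], [0.3, 0.3, 0.3], [0.0, 0.5, 0.8]]
observed = [1, 0, 2]

matrix = transition_matrix(u_before, u_after, observed)
for row in matrix:
    print(row)
```

Each row of `matrix` corresponds to an observed choice and each column to the choice predicted after the change, so the entries sum to the number of households.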
- Received July 2012.
- Accepted July 2013.