Abstract
We investigate the relationship between management practices and long working hours by combining large-scale establishment panel data on management practices with the corresponding employee data on overtime hours in the manufacturing sector. We find that the adoption of more structured bonus and promotion practices is correlated with an increasing probability of workers working more than short-to-medium overtime hours. In addition, the adoption of more structured production monitoring and targeting practices is associated with a lower probability of workers working long overtime hours, resulting in narrowing disparities in overtime hours across workers within establishments.
I. Introduction
Overwork can have negative consequences on workers’ welfare and labor productivity. From the workers’ perspective, it may worsen work–life balance, health (Pega et al. 2021; Cygan-Rehm and Wunder 2018; Berniell and Bietenbeck 2020), and overall life satisfaction (Hamermesh, Kawaguchi, and Lee 2017). Work–life balance is a serious issue in countries such as Japan, where the percentage of employees working more than 50 hours per week exceeds the OECD average of 11 percent (OECD 2021). Overwork may impair workers’ health, and in the worst-case scenario, it may result in death (karoshi), which can also result in the employer being sued. The WHO and the ILO estimate that 745,000 deaths from ischemic heart disease or stroke globally in 2016 were attributed to working very long hours, defined as more than 55 hours per week (Pega et al. 2021). Overwork can also be harmful on the production side. Several empirical studies show that the marginal output of an extra hour of work substantially declines after the worker works for a number of hours (for example, Pencavel 2015, 2016). To avoid these issues, firm managers are generally considered responsible for preventing employees from overworking while inducing a moderate amount of effort.
However, how firms’ management practices are related to working hours remains largely unknown. In the economics literature, the determinants of working hours have been discussed in a simple demand-and-supply framework featuring concave utility and production functions to derive short-run equilibrium hours of work (for example, Hamermesh 1993). In this simple model, actual working hours can be long, depending on the functional forms of utility and production. More recently, bonus and promotion schemes, that is, human resource management (HRM) practices for coping with moral hazard and hidden ability, have been shown to be important determinants of long working hours (for example, Gicheva 2013; Sousa-Poza and Ziegler 2003; Landers, Rebitzer, and Taylor 1996). However, not only HRM practices but also non-HRM practices, such as production monitoring and targeting, may affect working hours. For example, inappropriate monitoring and targeting practices can cause an inefficient allocation of tasks across workers and time, which results in long working hours for some workers. In such a case, improving monitoring and targeting practices could reduce working hours, especially for workers working long hours. The relationship between management practices and working hours has been difficult to investigate empirically because of the limited availability of large-scale data on management practices that can be linked to working hours.
To fill this gap, this study investigates the empirical relationship between management practices and working hours by matching a large panel of Japanese firm management practice data with the overtime hours of employees. The firm management practice data were obtained from the Management and Organizational Practice Survey in Japan (JP-MOPS), conducted by the Japanese Cabinet Office in 2017. The Management and Organizational Practice Survey (MOPS) is an internationally coordinated project that collects information on establishments’ management practices in 2010 and 2015 regarding non-HRM practices, such as production monitoring and targeting practices, and HRM practices, such as bonuses, promotions, and displacement. The JP-MOPS is a part of the international MOPS project, and its questionnaire is carefully harmonized with that of the 2015 US-MOPS (Bloom et al. 2019). We match the JP-MOPS data in the manufacturing sector with information on the overtime hours of multiple workers in the same establishments around the same years, taken from another government survey. The establishment-level panel structure of our matched data enables us to focus on changes in management practices by controlling for unobserved establishment fixed characteristics affecting both management practices and overtime hours.
We find that HRM and non-HRM practices play different roles in the management of overtime hours. On the one hand, the adoption of more structured bonus and promotion (HRM) practices that evaluate workers based on individual performance predicts a higher probability of working above short-to-medium overtime hours. A plausible explanation is that such high-powered incentive schemes prevent moral hazard and induce some workers to exert greater effort, as widely discussed in the literature on pay for performance and career concerns (Bloom and Van Reenen 2011; Holmström 1999). On the other hand, the introduction of more structured production monitoring and targeting (non-HRM) practices predicts a lower probability of workers working long hours, in particular, those with more than 45 hours of overtime (equivalent to around 50 hours of work per week). A plausible explanation is that these practices enable the identification and reduction of problems that trigger long overtime hours.1 Hence, our results suggest that the adoption of more structured monitoring and targeting practices is associated with a concentration of the distribution of overtime hours toward zero. This evidence calls for greater attention to firms’ non-HRM practices as a means of achieving labor market policy goals, such as reducing excessive overtime work and improving labor productivity. In addition to these results, we find only a weak and insignificant association between changes in management practices and labor turnover. Hence, we interpret our results as short-run outcomes before significant compositional adjustments occur.
This paper broadly relates to three strands of literature. First, this work relates to previous studies on labor demand-side factors that determine working hours focused on employers’ responses to tax and regulations on working hours and compensation schedules (for example, Hamermesh 1993; Trejo 1991; Crépon and Kramarz 2002).2 However, empirical studies relating working hours to firm management practices are limited, with a specific focus on HRM. Among these, Bell and Freeman (2001), Gicheva (2013), and Frederiksen, Kato, and Smith (2018) show that longer working hours predict an increased probability of future promotion, higher wages, and wage growth, which they interpret as evidence supporting firms’ usage of certain promotion incentive mechanisms. Our contributions to this literature are threefold. First, we link the direct measures of HRM practices (bonus and promotion policies) obtained from an establishment survey to working hours. Second, we examine the roles of non-HRM practices (monitoring and targeting) as well as HRM practices in determining working hours. Finally, we analyze changes in the distribution of working hours within establishments, which have rarely been examined in extant empirical studies.
Second, our study also builds on existing empirical studies using direct measures of both HRM and non-HRM practices obtained from large-scale surveys of firm managers (Bloom and Van Reenen 2007; Bloom et al. 2019). These studies show that management practices explain a range of firm performance measures, including productivity, profitability, and growth. There is also a substantial body of evidence concerning the causal effect of this relationship based on field experiments on certain sets of firms (for example, Bloom et al. 2013; Bruhn, Karlan, and Schoar 2018; Gosnell, List, and Metcalfe 2020). Nevertheless, studies linking management practices to labor market outcomes are limited. A few exceptions are Bender et al. (2018) and Cornwell, Schmutte, and Scur (2021). Using cross-sectional data on management practices matched with panel data on employee earnings, they show that firms using structured management practices tend to hire and retain workers with higher abilities. In contrast, we use a panel structure of management practices matched with employees’ working hours to distinguish the effects of management practices from time-invariant firm-specific unobserved factors. This enables us to examine the relationship between changes in management practices and changes in labor outcomes of establishments, which is one of our major contributions. We show that excluding the effects of firm-fixed unobserved factors largely alters the observed relationship between management practices and working hours.
Third, our findings are consistent with the results of laboratory experiments that show that some management technologies lead to leveling workload across workers (Bewley 2000; Bartling and von Siemens 2011). This leveling may improve firm productivity, assuming that the marginal product of labor diminishes as working hours increase, as empirically supported by Brachet, David, and Drechsler (2012), Pencavel (2015, 2016), and Collewet and Sauermann (2012). This could partly explain why structured management practices improve employee productivity and firm performance, as shown in previous studies (for example, Bloom et al. 2013; Gosnell, List, and Metcalfe 2020; Kambayashi, Ohyama, and Hori 2021).
In the following, Section II presents the data of the study. In Section III, we describe our analysis of working hours, management practices, and the results. Section IV presents the results of the analysis of labor turnover and within-establishment outcomes, and Section V presents the conclusions.
II. Data
A. Data on Establishments’ Management Practices
The MOPS is an internationally coordinated governmental survey that collects information on the management practices of establishments. The original survey was conducted in 2011 by the U.S. Census Bureau for the manufacturing sector (Buffington et al. 2017), and the second wave in the United States was implemented in 2016 (Buffington et al. 2017; Bloom et al. 2019). Several other countries have conducted near-identical surveys, including Pakistan in 2015 (Lemos et al. 2016) and the UK in 2017 (Office for National Statistics 2018). In Japan, the Economic and Social Research Institute of the Japanese Cabinet Office funded and directed the Japanese version of the MOPS (JP-MOPS) administered in January 2017.
The original US-MOPS contained 16 management questions concerning non-HRM practices (production monitoring and targeting) and HRM practices (bonuses, promotions, and displacement). These questions are similar to those in the World Management Survey initiated by Bloom and Van Reenen (2007). Each question in the JP-MOPS is the precise Japanese-language translation of a corresponding question in the 2015 US-MOPS. The questionnaire asked about practices in the base year (2015 in the case of the JP-MOPS) and retrospectively about practices five years before the base year (2010 in the case of the JP-MOPS). Responses were obtained from those responsible for overall management practices in the establishment, typically presidents, executive officers, plant managers, or general or department managers (Online Appendix Table A.1). Less than 0.4 percent of the respondents were from human-resource-related departments, and approximately 90 percent of the respondents had tenures longer than five years, with a median tenure of about 21 years.3
The JP-MOPS sample is based on the 2014 Economic Census for Business Frame, which is a complete census of Japanese establishments in 2014, where 56,237 establishments met the criterion for being survey targets: private manufacturers with 30 or more employees.4 Based on a stratified random sampling by two-digit industry and size, 36,052 establishments were selected and sent questionnaires in the second week of January 2017. After two follow-up calls, we received 11,405 responses (31.6 percent response rate). In general, the establishments that participated in the JP-MOPS were larger than those that participated in the 2014 Japanese Census of Manufacturers (Online Appendix Table A.2).
The non-HRM section of the survey questionnaire contains questions regarding production monitoring and targeting practices, which mainly involve the collection and utilization of information about the production process by the establishment. Online Appendix Table A.3 shows the list of exact questions and response options.5 A question about monitoring asks, “What best describes what happened at this establishment when a problem in the production process arose?” The response options vary from “No action was taken” to “We fixed it and took action to make sure that it did not happen again, and had a continuous improvement process to anticipate problems like these in advance.” The other questions on monitoring are related to the variety of key performance indicators (KPIs) collected, the frequency with which these are reviewed, and the existence of display boards showing the KPIs. Questions about targeting ask, “How easy or difficult was it for this establishment to achieve its production targets,” for which the available responses range from “Possible to achieve without much effort” to “Only possible to achieve with extraordinary effort.” The remaining questions about targeting ask firms about the time frame of the production targets and who are aware of those targets.
For the HRM questions, the survey asks whether there are any performance bonuses and, if so, the determinants of the bonuses (whether individual, team, establishment, and/or firm performance) and what proportion of workers are covered by the bonuses. The questions on promotion concern the primary manner in which managers and non-managers are promoted, that is, whether it is solely based on performance and ability or if other factors such as tenure or family connections also matter. Finally, the questions about displacement ask, “When is an underperforming non-manager usually reassigned or dismissed?” to which each respondent selects a response from “Within 6 months of identifying non-manager underperformance,” “After 6 months of identifying non-manager underperformance,” and “Rarely or never.” The same questions were also asked about the managers.
Following the literature, we first score the response to each question by assigning a value ranging from zero to one (for details, see Online Appendix Table A.3).6 For the monitoring section, a higher score is assigned when the management uses more KPIs, reviews them more frequently, shares them more widely, and continuously improves the production process. For the targeting section, a higher score is assigned when the management uses a composite of long- and short-term perspectives with a moderate level of difficulty to be achieved when designing its production targeting. The scores for the bonus section are higher when bonuses cover a wider range of workers and depend on individual rather than team, establishment, or company performance. For the promotion section, scores are higher if promotion is based solely on performance and ability rather than other factors, such as tenure and family connections. The scores for the questions about displacement are higher if the establishment is more inclined to reassign or dismiss underperforming workers quickly.
To aggregate these scores for empirical analysis, Bloom et al. (2019) employed a single score referred to as the “overall management score,” which is the unweighted average of the scores of the 16 questions. However, while aggregating a single score may be appropriate for the analysis of firm performance, it may not be ideal when analyzing worker outcomes. Given that HRM practices have been considered as the main channel through which management affects worker outcomes in the literature on personnel economics, we separate the responses into HRM and non-HRM practices. Furthermore, we disaggregate HRM practices into those related to wages and employment. More concretely, we used three scores by taking the unweighted average of the scores across three groups of questions: monitoring and targeting (MOPS Questions 1–8), bonus and promotion (MOPS Questions 9–14), and displacement (MOPS Questions 15–16). We show that further disaggregation provides qualitatively similar results, although the statistical power is reduced.
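The aggregation described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `component_scores`, the dictionary layout, and the example responses are hypothetical, and the sketch assumes that each of the 16 question scores has already been scaled to the unit interval and that no responses are missing.

```python
from statistics import mean

# Question groups as described in the text (MOPS question numbers).
GROUPS = {
    "monitoring_targeting": range(1, 9),   # Questions 1-8
    "bonus_promotion": range(9, 15),       # Questions 9-14
    "displacement": range(15, 17),         # Questions 15-16
}

def component_scores(question_scores):
    """Return the unweighted average score per group and overall.

    question_scores: dict mapping question number (1-16) to a score in [0, 1].
    """
    out = {
        name: mean(question_scores[q] for q in questions)
        for name, questions in GROUPS.items()
    }
    # Overall management score: unweighted average of all 16 questions.
    out["overall"] = mean(question_scores[q] for q in range(1, 17))
    return out

# Hypothetical establishment; the scores are made up for illustration only.
example = {q: 0.5 for q in range(1, 9)}
example.update({q: 1.0 for q in range(9, 15)})
example.update({15: 0.0, 16: 0.5})
scores = component_scores(example)
```

Because the groups contain different numbers of questions, the overall score is not the simple average of the three component scores; it weights the monitoring and targeting group most heavily.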
B. Data on Employees
We match the JP-MOPS data to the Basic Survey on Wage Structure (BSWS) using the establishment identifiers provided in the establishment census. The BSWS is an employer–employee matched survey conducted annually by the Japanese Ministry of Health, Labour and Welfare (MHLW) on June 30. The main information we use from the BSWS is employee-level overtime hours worked. The sample size is large, covering approximately one million workers across approximately 50,000 establishments each year. The sampling is done in two stages. First, establishments are selected using a stratified sampling method from private establishments with five or more employees. Second, within each establishment selected in the first stage, employees are selected using a uniform sampling method with a predetermined sampling rate depending on the industry and establishment size.7 The survey is a repeated cross-section, covering approximately 5 percent of establishments across Japan each year.
The survey asks human resource managers about the employees’ total overtime hours worked, as well as the total scheduled hours worked in June.8 As such, we use the total overtime hours worked in June for our main outcome variables. As a reference for the definition of scheduled and overtime hours, in Japan, the standard workweek is 40 hours per week, above which work hours are considered overtime. This implies that, for instance, the definition of long working hours in OECD statistics (OECD 2021), working for more than 50 hours per week, is the same as around 40 hours or more of overtime per month in Japan. An overtime premium of at least 25 percent must be paid for overtime of less than 60 hours per month and at least 50 percent for overtime of more than 60 hours. In addition, the survey collected worker attributes, such as education, gender, tenure, and age. Throughout our analysis, we limit our attention to full-time workers who are strictly less than 60 years of age. We also exclude employees in managerial positions, anticipating varied effects on management and other employees.9
C. Data on Production in Manufacturing Establishments
In addition, we match the data to the Census of Manufacturers, which is a survey of all manufacturing establishments with four or more employees conducted by the Japanese Ministry of Economy, Trade, and Industry. We use information on production, such as materials, fuel, and electricity costs, to control for establishment-level demand shocks.
D. Matching Data
We construct management–employee matched panel data by combining the data presented above. Note that the data have a panel structure only at the establishment and not at the employee level because it is not possible to follow workers over time in the BSWS. The overall Venn diagram of the relationships between the data being matched is provided in Online Appendix Figure A.1. The JP-MOPS includes retrospective questions about management practices in 2010 and about those in 2015, allowing us to construct establishment-level panel data on management practices. We then match the JP-MOPS data on management practices at the end of 2010 and 2015 to the two years of employee outcomes (in the BSWS) for June 2010–2011 and 2015–2016, respectively, assuming that management practices tend to be stable over half a year. Among the original 225,984 establishment–year observations in the BSWS in 2010, 2011, 2015, and 2016, there were 40,769 establishment–year observations in the manufacturing and private sectors. Among these, we can match 5,529 establishment–year observations to the JP-MOPS, yielding a matching ratio of 14 percent from the BSWS establishment–year observations.10 We then match the Census of Manufactures in 2010–2011 and 2015–2016 to use the information on the total cost of materials, fuel, and electricity for controlling establishment-level demand shocks.
Since our main empirical strategy exploits the establishment panel structure, the sample for our analysis consists of 72,180 employee–year observations in the BSWS for 827 unique establishments whose employees are observed during both the first (2010–2011) and second (2015–2016) periods, whose management practice scores from the JP-MOPS are observed, and for which the information on the total cost of intermediate inputs is not missing.11 We examine the differences in worker characteristics between the matched and unmatched samples and find that their basic statistics are similar (Online Appendix Table A.2). The distributions of overtime hours in the matched and unmatched samples also look alike in both 2010 and 2015 (Online Appendix Figure A.2).12
E. Descriptive Statistics on Management Practices
Online Appendix Table A.5 provides summary statistics of the establishment-level variables for the establishments in our sample. The mean of the overall management score in 2010 was 0.507, with a standard deviation of 0.166. Online Appendix Table A.6 examines the establishment characteristics associated with a higher overall management score in 2010. Overall, larger establishments with more traditional Japanese employment practices (proxied, for example, by long tenure) tended to have higher scores on management practices in 2010. This is consistent with the fact that the original measures of management scores in Bloom and Van Reenen (2007) referred to the best practices from Japan, such as lean manufacturing, as well as practices from Western countries.
Between 2010 and 2015, the overall management score improved by about 0.044 on average, although some 33 percent of establishments reported no change in their overall management score. It is also notable that a substantial proportion (11 percent) of establishments reported a deterioration in their management scores (for the distribution of changes in scores, see Online Appendix Figure A.3). Furthermore, there are variations in the directions and magnitudes of the changes in the monitoring and targeting scores, the bonus and promotion scores, and the displacement scores within establishments (see Online Appendix Figure A.4, which shows a large variation in the changes of management scores among the three components).13
We briefly explore what factors explain the changes in management practices. First, Online Appendix Table A.7 details the results of a decomposition analysis of the changes in overall management practices by industry and region. Around 18 percent of the variation is explained by industry fixed effects, while prefecture fixed effects explain only 8 percent, suggesting the possibility of knowledge spillover within industries. Second, in the sample of establishments matched with intermediate inputs in the Census of Manufacturers, changes in the intermediate inputs (a proxy for demand shocks) explain little of the changes in management practices after controlling for the industry and prefecture fixed effects. A similar result is obtained when we regress changes in management practices on production shocks in the baseline years (2009–2011), measured by changes in the log of shipment values in these years (see Online Appendix Table A.8 for the results). This result implies that changes in management practices are not likely to be driven by changes in production scale or firm-specific demand changes. Third, establishments may have changed management practices to prevent employees from working long overtime hours. In this case, our main analysis would suffer from reverse causality. However, neither the establishment’s average overtime hours nor the share of employees working more than 45 hours of overtime significantly predicts later changes in management practices (see Online Appendix Table A.9 for the results). Fourth, firms’ employment practices may affect the adoption rate of new management practices. In particular, we consider traditional Japanese employment practices that have been characterized by strong employment protection and long tenure, resulting in a high share of home-grown employees (defined as regular workers who were hired directly upon graduation, or haenuki in Japanese) (Hashimoto and Raisian 1985).
Indeed, we find that the average tenure and share of home-grown employees negatively predict the changes in management practices. In addition, the fraction of women among the employees of the establishment in the base years positively predicts changes in management practices. Also, establishment size (measured by the log of shipment) in the base years negatively predicts changes in management practices. The detailed results are reported in Online Appendix Table A.9. These results imply that management practices tend to improve more in small, nontraditional Japanese establishments.
F. Descriptive Statistics of Overtime Hours
The basic summary statistics for the employee-level variables are given in Online Appendix Table A.10, and more detailed summary statistics for overtime hours are provided in Online Appendix Table A.11.14 While around 45 percent of workers worked less than ten hours of overtime, 11 percent of workers worked for more than 45 hours of overtime per month in 2015–2016. Compared to other countries, the share of workers working long hours is high in Japan, ranking in the top five in the OECD (OECD 2021). Japanese corporate culture is often argued to be a cause of long working hours (OECD 2018). In addition, traditional Japanese employment practices may promote long working hours through the mutual expectation between employers and employees due to their long-term relationships (Kambayashi and Kato 2016).15 Such a large share of employees working long hours has been considered a major policy concern in Japan and a possible cause of health problems, work–life imbalance, work-related stress, and in the worst case, karoshi (death caused by overwork).
In the following analysis, we examine the relationships between management practices and various levels of overtime hours. In particular, to define “long” working hours in our setting, we consider overtime of more than 45 hours per month as a key threshold. Japanese workers’ compensation rules specify that when a worker dies from a brain or heart disease and has worked more than 45 hours of overtime in the preceding month, overwork is considered a possible cause of death. This implies that 45 hours of overtime work is considered the threshold at which the risk of karoshi begins to increase. This threshold is comparable to the criteria used elsewhere. For example, the EU Working Time Directive specifies that the maximum hours of work per week, including overtime hours, should be 48 hours. This implies that, if the legal workweek is 40 hours, as in Japan, the maximum overtime hours per month should be around 32 hours. Similarly, the WHO and ILO jointly report that 3.6 percent of all deaths globally due to ischemic heart disease and 6.9 percent of those due to stroke are attributable to working more than 55 hours per week (Pega et al. 2021), which is about 60 hours of overtime per month under the Japanese legal workweek. Finally, to define “short” working hours, we consider overtime of more than ten hours per month, which roughly corresponds to more than 42 hours of work per week.
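The conversions between weekly hours and monthly overtime used in this paragraph can be reproduced with simple arithmetic. The sketch below is only illustrative: the function names are hypothetical, and the four-week month is an assumption inferred from the figures quoted in the text (48 hours per week giving about 32 hours, 55 hours per week giving about 60 hours), not a convention stated by the source.

```python
WEEKS_PER_MONTH = 4   # assumption: a simple four-week month
LEGAL_WORKWEEK = 40   # hours per week, as stated in the text for Japan

def monthly_overtime(hours_per_week):
    """Monthly overtime implied by a weekly total of hours worked."""
    return max(hours_per_week - LEGAL_WORKWEEK, 0) * WEEKS_PER_MONTH

def implied_weekly_hours(overtime_per_month):
    """Weekly total hours implied by a monthly overtime figure."""
    return LEGAL_WORKWEEK + overtime_per_month / WEEKS_PER_MONTH

eu_cap = monthly_overtime(48)             # EU weekly cap of 48 hours
who_ilo = monthly_overtime(55)            # WHO/ILO threshold of 55 hours
short_cutoff = implied_weekly_hours(10)   # "short" overtime threshold
long_cutoff = implied_weekly_hours(45)    # "long" overtime threshold
```

Under a calendar-average month of roughly 4.33 weeks, the implied monthly figures would be somewhat larger, so these conversions are best read as approximations.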
III. Results: Overtime Work and Management Practices
A. Graphical Illustration
We start by graphically illustrating the distribution of overtime hours and relating them to the overall management practices of establishments over the target years. To do this, we divide the sample of establishments into two groups according to whether the overall management score improved during 2010–2015 more than the sample median. Figure 1 depicts the distributions of overtime hours among male workers16 in 2010 (light gray) and 2015 (transparent) in the surveyed month for the two groups of establishments.17 For both groups of establishments, overtime hours increased on average between 2010 and 2015, possibly reflecting the overall economic expansion. However, there were differences in the way hours increased across the two groups. In establishments with more improved management scores, the increase in positive overtime hours mostly corresponds to an increase in the share of workers working between ten and 50 hours of overtime. In contrast, in establishments with less improved management scores, the increase in positive overtime hours largely corresponds to an increase in individuals working overtime for more than 50 hours, combined with a slight increase in workers working overtime for 20–50 hours. Overall, Figure 1 illustrates the importance of considering the effects on the distribution of overtime hours, not only the effects on the average overtime hours.
Figure 1. Distribution of Overtime Hours by Changes in Management Scores
Notes: We divide the sample of establishments into two groups according to whether the change of the overall management scores from 2010 to 2015 is below (left panel) or above (right panel) the median of the changes in the sample. The figures show the histograms of overtime hours among male workers by two groups of establishments and by periods 2010–2011 (light gray) and 2015–2016 (transparent). All workers working for more than 50 hours are top-coded as above 50 hours.
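The grouping and top-coding behind this figure can be sketched as follows. This is a hedged illustration, not the authors' procedure: the function names, the ten-hour bin width, the treatment of establishments exactly at the median (assigned to the "below" group), and the sample values are all assumptions made for the example.

```python
from collections import Counter
from statistics import median

def split_by_median_change(score_change):
    """Label each establishment 'above' or 'below' the median score change.

    score_change: dict mapping establishment id to the 2010-2015 change
    in the overall management score. Ties at the median go to 'below'
    (an assumption; the source does not say how ties are handled).
    """
    cutoff = median(score_change.values())
    return {
        est: ("above" if change > cutoff else "below")
        for est, change in score_change.items()
    }

def topcoded_histogram(overtime_hours, bin_width=10, top=50):
    """Count workers per overtime bin; hours above `top` share one bin."""
    counts = Counter()
    for hours in overtime_hours:
        if hours > top:
            counts[f">{top}"] += 1
        else:
            # Exactly `top` hours falls into the last regular bin.
            lo = min(int(hours // bin_width) * bin_width, top - bin_width)
            counts[f"{lo}-{lo + bin_width}"] += 1
    return dict(counts)

# Hypothetical inputs for illustration only.
groups = split_by_median_change({"A": 0.00, "B": 0.10, "C": 0.02, "D": 0.08})
hist = topcoded_histogram([0, 5, 12, 50, 61, 80])
```

Comparing such histograms across the two groups and the two periods is what allows the figure to show where in the distribution overtime increased, rather than only whether the average rose.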
B. Average Overtime Hours and Management Practices
Next, we explore this relationship in a regression framework, controlling for the observed worker and establishment characteristics. We first examine the general relationships between management practices and average overtime hours and then move to investigate the implications for the distribution of overtime hours.
First, without controlling for establishment fixed effects, in Column 1 of Table 1, overtime hours are regressed on the overall management score. We control for year fixed effects and basic worker characteristics, measured by age, age-squared, tenure, tenure-squared, female dummy, and three education dummies.18 In addition, noting that management practices could be correlated with establishment-level demand shocks, we control for the amount of flexible inputs used in production, as measured by the log of the total cost of intermediate inputs (materials, fuel, and electricity). The standard errors are clustered at the level of the establishment. The results show that the use of more structured management practices is positively and significantly correlated with longer overtime hours.
Table 1. Average Overtime Hours and Management Practices
However, as shown in Column 2, controlling for establishment fixed effects drastically reduces the coefficient and increases the standard error, making the estimated coefficient negative and statistically insignificant. This result highlights the importance of controlling for establishment fixed effects to understand within-establishment variation in management practices and hours. The positive and strong cross-sectional correlation between management practices and overtime hours could be driven by various other kinds of establishment characteristics that are fixed over time and the results of matching and sorting between firms and workers that have happened over a long preceding period.
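A toy example may help show why a cross-sectional correlation can change sign once establishment fixed effects are removed. The sketch below uses made-up numbers for two hypothetical establishments observed in two periods, and it omits the worker controls, year fixed effects, and clustered standard errors of the actual specification; it illustrates only the within transformation, not the estimates in Table 1.

```python
from collections import defaultdict

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's mean (the within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

# Hypothetical data: establishment id, management score, overtime hours.
est = ["E1", "E1", "E2", "E2"]
score = [0.3, 0.4, 0.7, 0.8]
overtime = [10.0, 9.0, 30.0, 29.0]

# Pooled regression mixes between- and within-establishment variation.
pooled = ols_slope(score, overtime)
# Fixed-effects regression uses only within-establishment variation.
within = ols_slope(demean_by_group(score, est), demean_by_group(overtime, est))
```

In this constructed example the pooled slope is positive because the establishment with higher scores also has permanently longer overtime, while the within slope is negative because overtime falls in each establishment as its score rises.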
It is common in the literature on management practice to use a single aggregated score of management practices (for example, an average of all scores, as shown above), owing to the similarity of the effects of these practices on production performance (Bloom and Van Reenen 2007). However, when we consider the relationship of such practices with working hours, each component of the practices should be examined separately because of the distinct roles played by each component. In Column 3, we disaggregate the overall management score into three closely related components: bonus and promotion (wage-related HRM), monitoring and targeting (non-HRM), and displacement scores (employment-related HRM). We find that the coefficient of the monitoring and targeting score is negative, while the coefficient of the bonus and promotion score is positive, and that both coefficients are relatively large. However, the standard errors remain large. This can be attributed to the heterogeneous effects of these management practices across workers, particularly by the level of overtime hours worked, as examined in the subsequent section.
C. Distribution of Overtime Hours and Management Practices
Even if management practices do not alter the average overtime hours in the establishment, they may affect the distribution of overtime hours within the establishment, as presented in the graphical illustration. In particular, the marginal effect of the management practice score on overtime hours can be heterogeneous, depending on the level of overtime hours. One potential approach to examine this possibility is to estimate a quantile regression. However, it is not straightforward to control for establishment and year fixed effects in a quantile regression (although we do so as a robustness check in Section IV and show that the results are similar). Therefore, as our main specification, we estimate a standard ordinary least squares (OLS) regression using an indicator variable of working overtime for more than k hours, where k = 5, 10, . . ., 50, for individual i in establishment j in year t, denoted by 1(OHijt > k), as the dependent variable. Specifically, we estimate the following equation:

for each k = 5, 10, . . ., 50. Managementjt is a vector of establishment j’s management scores in year t that contains the bonus and promotion score, the monitoring and targeting score, and the displacement score. The subscript t indicates one of the four years 2010, 2011, 2015, or 2016. Xijt is a vector of worker i’s attributes, including age, age-squared, tenure, tenure-squared, dummy variables for educational attainment, and gender. As in Table 1, we control for establishment demand shocks (Demandjt) using the log of the cost of intermediate inputs, measured by the total cost of materials, fuel, and electricity, as a proxy for flexible inputs. We also include year and establishment fixed effects to absorb unobserved economy-wide shocks and time-invariant establishment-level factors affecting overtime hours. Standard errors are clustered at the level of the establishment.
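As a sketch, the specification described above can be written as follows. The variable names follow the text; the coefficient symbols (β, γ, δ), the fixed-effect symbols (μ for establishments, τ for years), and the error term ε are notation assumed here for illustration and may differ from the original equation.

```latex
\mathbf{1}(OH_{ijt} > k)
  = Management_{jt}\,\beta^{k}
  + X_{ijt}\,\gamma^{k}
  + \delta^{k}\,Demand_{jt}
  + \mu_{j}^{k} + \tau_{t}^{k}
  + \varepsilon_{ijt}^{k},
\qquad k = 5, 10, \ldots, 50 .
```

Because the dependent variable is an indicator, each element of β^k in this sketch is read as the change in the probability of working more than k overtime hours associated with a one-unit change in the corresponding management score.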
Table 2 provides the results. The estimated coefficients of the bonus and promotion score are positive and statistically significant for overtime thresholds from five to 35 hours. These magnitudes are large, implying that an increase in the bonus and promotion score by 0.1 (roughly equivalent to one standard deviation of the changes in the score from 2010 to 2015¹⁹) is associated with a 4.7 percent increase in the share of workers working more than ten hours of overtime. The magnitudes of the coefficients peak at thresholds of around ten to 20 hours and become smaller for longer hours.
Table 2. Distribution of Overtime Hours and Management Practices
Conversely, the coefficients of the monitoring and targeting score are negative for all ranges; they are small in magnitude for overtime above five to ten hours and become larger for longer hours. In particular, for overtime above 40–50 hours, the coefficients are statistically significant at the 10 percent level. The coefficients are large; for example, an improvement in the monitoring and targeting score by 0.1 (roughly equivalent to one standard deviation of the changes in the score from 2010 to 2015²⁰) is associated with a 7.7 percent reduction in the share of workers working more than 45 hours.²¹
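The following sketch shows how a linear probability coefficient can be converted into a percent change of this kind. It assumes that the percent figures are expressed relative to the sample share of workers above each threshold, which is how the marginal contributions in Figure 2 are described. All numbers, including the coefficient, are hypothetical and are not taken from Table 2.

```python
# Hypothetical monthly overtime hours for ten workers.
overtime_hours = [0, 3, 8, 12, 12, 18, 25, 33, 47, 60]

# Share of workers with 1(OH > k) = 1 for k = 5, 10, ..., 50.
thresholds = range(5, 55, 5)
share_above = {
    k: sum(h > k for h in overtime_hours) / len(overtime_hours)
    for k in thresholds
}

def percent_of_mean(coef, delta_score, mean_share):
    """Change in the share implied by a linear probability coefficient,
    expressed as a percentage of the baseline share."""
    return 100.0 * coef * delta_score / mean_share

# A hypothetical coefficient of 0.07 and a 0.1 increase in the score.
effect_10 = percent_of_mean(coef=0.07, delta_score=0.1,
                            mean_share=share_above[10])
print(share_above[10], round(effect_10, 2))
```

The same coefficient implies a larger percent change at thresholds where the baseline share is small, so percent effects for long overtime can be sizable even when the coefficient itself is modest.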
The striking differences in the results between HRM and non-HRM practices imply that they play different roles in managing working hours. We discuss possible interpretations of each of their distinct roles below.
D. Roles of HRM Practices
We start by exploring the role of wage-related HRM practices in more detail. An improvement in the bonus and promotion score here means either a change from a fixed wage scheme to a group-based incentive scheme or a change from a group-based to an individual-based incentive scheme. Hence, a natural interpretation of the results for the bonus and promotion score is that such changes mitigate moral hazard (hidden action problems), inducing some workers to exert greater effort in the form of longer working hours, as widely shown in the literature on performance pay and career concerns (Bloom and Van Reenen 2011; Holmström 1999). Interestingly, our results show that the coefficients of the bonus and promotion score are large only for short overtime hours, but small and insignificant for excessively long hours. One explanation is that the marginal product of labor decreases with additional overtime hours, as shown by Brachet, David, and Drechsler (2012); Pencavel (2015, 2016); and Collewet and Sauermann (2012), while the marginal cost of effort possibly increases. Therefore, payment schemes based on individual output would not encourage workers to work beyond the level at which the marginal effort cost exceeds the marginal increase in (expected) payment. In other words, the variation in coefficient sizes may reflect the balance between the marginal cost and the marginal benefit of working an additional hour at each level of overtime hours.
Alternatively, this result may also be interpreted through an adverse selection (hidden information) model. In theory, adverse selection induces some workers to signal their ability by working long hours, and the introduction of a promotion scheme based on individual performance can alleviate this problem (Sousa-Poza and Ziegler 2003; Landers, Rebitzer, and Taylor 1996). More specifically, the bonus and promotion score in the JP-MOPS improves when promotion schemes change from reflecting factors other than individual performance to evaluation solely based on individual performance. Such a change potentially comes together with improved measures of individual performance or ability (for example, individual production output) to be used for promotion decisions. According to signaling models (Sousa-Poza and Ziegler 2003; Landers, Rebitzer, and Taylor 1996), this could reduce the need for workers to signal ability by working long hours. This mechanism would reduce the number of workers who work long overtime hours and increase the number who work short overtime hours instead. While this explains the positive and decreasing coefficients of the bonus and promotion score for short and medium hours, it does not explain the nonnegative coefficients for long hours. However, we should note that our data on bonus and promotion schemes do not explicitly indicate whether the measures for individual performance evaluation still include working hours. If they do, this may account for the nonnegative coefficients of the bonus and promotion score for long overtime hours. A better way to examine the implication of the signaling hypothesis is to look at the interaction with non-HRM practices (monitoring and targeting scores), which potentially convey information about the degree of data collection regarding individual performance. This is discussed in Section III.F.
Next, we explore whether the results of wage-related HRM practices in Table 2 are mainly driven by changes in payment schemes (short-term incentives) or changes in promotion schemes (long-term incentives). One way to examine this question is to divide the sample by tenure. The theory of career concerns predicts that the incentive for promotion exists only during the earlier career stages and, therefore, should not affect long-tenured workers. However, there is no strong reason to believe that performance pay differentially affects long- and short-tenured workers. To examine this possibility, in Panel A of Table 3, we estimate the equations for overtime above ten (“short”) and 45 (“long”) hours using the sample divided by tenure group, with tenure being less than the median (11 years) (“junior”) or not (“senior”). On average, junior workers tend to work longer overtime hours than senior workers do. The coefficients of the bonus and promotion score are larger for junior workers than for senior workers at both levels of overtime. Notably, although the coefficient for long hours is large, positive, and significant among junior workers, this coefficient is insignificant and small among senior workers. This pattern supports the view that the results of wage-related HRM practices are mainly driven by promotion schemes. As a robustness check, we run the same specification by disaggregating the bonus (MOPS Questions 9–12) and promotion scores (Questions 13–14) and find that the coefficients of promotion are positive for junior workers and negative for senior workers for short and long overtime, although these differences are insignificant (Online Appendix Table A.12).²²
Table 3. Distribution of Overtime Hours and Management Practices by Workers’ Characteristics
Another interesting aspect of HRM practices and working hours is gender. Female workers tend to work fewer overtime hours than male workers on average, and this is considered a source of the gender wage gap, through promotion (Gicheva 2013). Panel B in Table 3 divides the sample by gender. The bonus and promotion coefficients are larger for female workers for both short and long hours. The coefficients for female workers are sizable; for example, an increase in the bonus and promotion score of 0.1 is associated with a 30.4 percent increase in the share of workers working longer hours. As a result, the improvements in bonus and promotion scores are associated with a narrower gender gap in the share of workers working long overtime hours. One potential explanation for this result is that performance-based promotion policies help break the gender glass ceiling, which may otherwise prevail, for example, under tenure-based policies. An alternative interpretation is that working hours signal the workers’ commitment to the firm, and such a signal is stronger for female workers, consistent with the model and evidence provided by Kato, Ogawa, and Owan (2017). As before, the results suggest that promotion schemes rather than payment schemes (bonuses) are the main driver of the results for HRM practices. This is consistent with the study by Bandiera et al. (2021), which showed that men and women are equally affected by performance pay.
E. Roles of Non-HRM Practices
Next, for the interpretation of the results on non-HRM practices (monitoring and targeting), one of the key features to note is that all of their coefficients in Table 2 are negative. One explanation is that these practices eliminate waste in production, reducing the total labor hours needed to produce a given amount of output. For instance, collecting and reviewing detailed data on the production progress may alert managers to problems in the production system, and continuously improving a production system could reduce problems that lead to a waste of labor hours. Another key feature of the results is that the coefficient sizes for shorter hours are small, and they become larger for longer hours. One interpretation of this feature is that monitoring and targeting practices also allow leveling of workload across workers and time. In theory, in a simple world where all workers have the same concave individual-level production function of working hours, the most efficient workload allocation is for every worker to work the same hours every day. However, the actual workload can be heterogeneous across workers and days for various reasons, such as miscommunication (bad monitoring practices), planning mistakes (bad targeting practices), and workers’ preference for working long hours (Hamermesh and Slemrod 2008). In such a case, leveling of the workload across workers and time through more frequent and closer production monitoring improves allocative efficiency. In addition, setting and sharing targets at a moderate level of difficulty would help adjust workloads across time, thereby reducing excessively long overtime and possibly increasing short overtime. The negative coefficients, even at short hours in our results, can be explained if we combine this hypothesis with the waste elimination hypothesis. 
Another explanation is that workload reallocation does not necessarily result in an increase in short to medium overtime if the work can be done by other workers or at other times within standard work hours. Such reallocation would imply an increase in efficiency during standard work hours, which we cannot measure in our data.
The hypothesis above suggests that structured monitoring and targeting practices control excessive overtime, and the magnitude of this effect should be larger for workers who tend to work more of it. A typical example of such workers would be junior workers, who tend to have stronger career concerns and may have insufficient experience in managing their own time and workload. Consistent with this prediction, in Panel A of Table 3, we find that the negative coefficient of the monitoring and targeting score is large and statistically significant at the 5 percent level for overtime above 45 hours for junior workers but smaller and statistically insignificant for senior workers.²³
We further explore this possibility by dividing the sample of workers by their propensities for working overtime over ten and 45 hours. By estimating the probability of working overtime above ten or 45 hours as a function of gender, age, tenure, and education level, we exploit the full variation in workers’ observed characteristics. We use the 2009 BSWS data, as it is the year before our baseline year, and estimate probit models. The results yield some basic characteristics of workers working overtime hours (Online Appendix Table A.14). Consistent with previous results, junior and male workers tended to work longer overtime. In addition, the probability of working overtime increases but is strongly concave in age, and college graduates tend to work overtime less than high school graduates. We use these estimated probit models to predict the propensity of working overtime for all workers in our analysis sample in 2010–2011 and 2015–2016. We then divide the sample by the median of the estimated propensity scores. Panel C of Table 3 reports the results. The monitoring and targeting score coefficients are large, negative, and statistically significant for workers with a greater propensity to work long hours and insignificant for the other workers. The coefficients are sizable, suggesting that a 0.1 increase in the monitoring and targeting score is associated with a 7.7 percent reduction in the share of workers working long hours. This result supports our conjecture that structured monitoring and targeting practices control excessive amounts of overtime by workers who are likely to do more of it.
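The sample split by predicted propensity can be sketched as follows. The probit coefficients below are hypothetical placeholders rather than estimates from the 2009 BSWS, and the sketch uses only gender, tenure, and a college dummy, whereas the models in the text also include age and finer education categories. Each worker's propensity is the standard normal CDF evaluated at the linear index, and the sample is divided at the median of these predictions.

```python
from statistics import NormalDist, median

# Hypothetical probit coefficients (NOT estimates from the 2009 BSWS).
COEF = {"const": -1.0, "male": 0.5, "tenure": -0.03, "college": -0.2}

def propensity(worker):
    """Predicted probability Phi(x'b) of working long overtime."""
    index = COEF["const"] + sum(
        COEF[name] * worker[name] for name in ("male", "tenure", "college")
    )
    return NormalDist().cdf(index)

# Hypothetical workers in the analysis sample.
workers = [
    {"id": 1, "male": 1, "tenure": 2, "college": 0},
    {"id": 2, "male": 0, "tenure": 20, "college": 1},
    {"id": 3, "male": 1, "tenure": 15, "college": 1},
    {"id": 4, "male": 0, "tenure": 4, "college": 0},
]

scores = {w["id"]: propensity(w) for w in workers}
cutoff = median(scores.values())

# Split the sample at the median predicted propensity.
high = sorted(i for i, p in scores.items() if p > cutoff)
low = sorted(i for i, p in scores.items() if p <= cutoff)
print(high, low)
```

Fitting the propensity model on data from before the baseline year, as the text describes, keeps the split from depending on overtime outcomes observed in the analysis period.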
F. Interaction between HRM and Non-HRM Practices
In the analysis so far, we implicitly assumed that each management practice is independently related to overtime hours. However, HRM and non-HRM practices may work as substitutes or complements. In particular, if better monitoring and targeting practices provide more accurate information about individual performance to the firm, they may also influence the way that bonuses and promotions affect overtime hours. Whether these practices work as substitutes or complements depends on the underlying features of asymmetric information. Recall that we have two interpretations of the effects of the bonus and promotion score: hidden action and hidden information hypotheses. Under the hidden action hypothesis, the two practices are likely to be complements because a shift to individual-level performance-based pay/promotion would induce greater effort when accurate information about individual performance is measured and reported to the employer, compared to the case where such information is not measured or reported. In contrast, under the hidden information hypothesis, the practices are likely to be substitutes because workers have less need to signal their ability by working overtime if individual performance is accurately measured and reported to the firm (Sousa-Poza and Ziegler 2003; Landers, Rebitzer, and Taylor 1996).
To examine these hypotheses, we estimate the following equation, which adds to Equation 1 an interaction term between the bonus and promotion score and the monitoring and targeting score.²⁴

where BPjt, MTjt, and DPjt denote the bonus and promotion score, the monitoring and targeting score, and the displacement score, respectively. The result is presented in Table 4. The estimates of αk are positive and statistically significant for overtime over ten hours, implying complementarity of the two practices. This result supports the hidden action hypothesis that the two practices are net complements. Note that this result does not reject the signaling hypothesis. While the two practices can have the elements of both complements and substitutes, the evidence suggests that the complementary effects are stronger.
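As a sketch, the interacted specification described here can be written as follows. BP, MT, DP, X, Demand, and the interaction coefficient α^k follow the text; the remaining coefficient symbols, the fixed-effect symbols (μ, τ), and the error term are notation assumed for illustration.

```latex
\mathbf{1}(OH_{ijt} > k)
  = \beta_{BP}^{k}\,BP_{jt}
  + \beta_{MT}^{k}\,MT_{jt}
  + \alpha^{k}\,(BP_{jt} \times MT_{jt})
  + \beta_{DP}^{k}\,DP_{jt}
  + X_{ijt}\,\gamma^{k}
  + \delta^{k}\,Demand_{jt}
  + \mu_{j}^{k} + \tau_{t}^{k}
  + \varepsilon_{ijt}^{k}

\frac{\partial\, \Pr(OH_{ijt} > k)}{\partial\, BP_{jt}}
  = \beta_{BP}^{k} + \alpha^{k}\,MT_{jt}
```

Under this sketch, a positive α^k means that the association between the bonus and promotion score and overtime becomes stronger as the monitoring and targeting score rises, which is the sense in which the two practices are complements.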
Table 4. Testing Complementarity of Bonus–Promotion and Monitoring–Targeting
To better illustrate this result, Panel A of Figure 2 plots the estimates of the “marginal contribution” of the bonus and promotion score (defined by ) in percentage terms of the mean of the dependent variable, where ΔBPjt is one standard deviation of the change of the bonus and promotion score from 2010 to 2015. The panel depicts these values at three different levels of the monitoring and targeting score: MTjt is at the tenth percentile, at the sample average, and at the 90th percentile in 2010. At the average level of MTjt, the marginal contributions of BPjt are positive and around 5–10 percent of the dependent variable for all levels of overtime hours. The marginal contributions are larger when MTjt is at the 90th percentile. This is consistent with the hidden action hypothesis, which implies that the introduction of pay/promotion tied to individual performance induces greater effort when the employer collects more accurate information about individual performance. Interestingly, when MTjt is at the tenth percentile, the marginal contribution of the bonus and promotion score is small for short overtime hours, and it is even negative for long overtime hours. This result could imply that the bonuses and promotions tied to individual performance induce only little additional effort under poor performance monitoring and targeting practices. Also, in such a case, the bonuses and promotions tied to group/firm performance could induce relatively more effort than those tied to individual performance.
Figure 2. Complementarity of Bonus–Promotion and Monitoring–Targeting: Graphical Representation
Notes: The figure visualizes the results of Table 4. In Panel A, we plot the estimates of the marginal contribution of bonus and promotion score, defined by , where ΔBPjt is one standard deviation of the change of the bonus and promotion score from 2010 to 2015. The panel depicts this value at three levels of the monitoring and targeting score (MTjt): the tenth percentile, the sample average, and the 90th percentile in 2010. In Panel B, we plot the estimates of the marginal contribution of monitoring and targeting score, defined by , where ΔMTjt is one standard deviation of the change of the monitoring and targeting score from 2010 to 2015. The panel depicts this value at three levels of the bonus and promotion score (BPjt): the tenth percentile, the sample average, and the 90th percentile in 2010. The marginal contributions in the y-axis are shown in percentage terms of the mean of the dependent variable (that is, an indicator of working overtime above the hours shown on the x-axis).
Panel B of Figure 2 similarly plots the estimates of the “marginal contribution” of the monitoring and targeting score (defined by ) in percentage terms of the mean of the dependent variable, where ΔMTjt is one standard deviation of the change of the monitoring and targeting score from 2010 to 2015. At the average level of BPjt, the marginal contributions of MTjt are negative and decreasing in the level of overtime hours. The marginal contributions of MTjt are negative and large at the tenth percentile of BPjt, while they are close to zero at the 90th percentile of BPjt. This is possibly because two countervailing effects of MTjt are offsetting each other. On one hand, an increase in MTjt could imply collecting more accurate information about individual performance, and this would induce greater effort of workers, especially when their pay/promotions are more strongly tied to the individual performance (as implied by the positive interaction term). On the other hand, an increase in MTjt could also represent improving efficiency of workload allocation across workers and time, which would lead to a reduction of long overtime (as discussed in Section III.E), irrespective of the level of BPjt. Therefore, at a low level of BPjt, the second effect dominates the first effect, while the two effects offset each other at a high level of BPjt.
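A short numerical sketch may help fix ideas about how such marginal contributions vary with the level of the other score. It assumes that the marginal contribution of one score takes the form (own coefficient + interaction coefficient × level of the other score) × one-standard-deviation change, divided by the mean of the dependent variable. All coefficient values and percentile levels below are hypothetical, chosen only to reproduce the qualitative pattern described in the text, and are not the estimates in Table 4.

```python
def marginal_contribution(own_coef, interaction, other_level, delta_own, mean_dep):
    """(own_coef + interaction * other_level) * delta_own,
    as a percentage of the mean of the dependent variable."""
    return 100.0 * (own_coef + interaction * other_level) * delta_own / mean_dep

# Hypothetical values (NOT the Table 4 estimates).
B_BP, B_MT, ALPHA = -0.10, -0.32, 0.40   # own coefficients and interaction
DELTA, MEAN_DEP = 0.1, 0.2               # score change and baseline share
levels = {"p10": 0.2, "mean": 0.5, "p90": 0.8}  # levels of the other score

# Panel A analogue: contribution of BP at different levels of MT.
bp_effect = {
    name: marginal_contribution(B_BP, ALPHA, mt, DELTA, MEAN_DEP)
    for name, mt in levels.items()
}
# Panel B analogue: contribution of MT at different levels of BP.
mt_effect = {
    name: marginal_contribution(B_MT, ALPHA, bp, DELTA, MEAN_DEP)
    for name, bp in levels.items()
}
for name in levels:
    print(name, round(bp_effect[name], 2), round(mt_effect[name], 2))
```

With a positive interaction term, the contribution of the bonus and promotion score rises with the monitoring and targeting score, while the negative contribution of the monitoring and targeting score shrinks toward zero as the bonus and promotion score rises.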
Overall, the result above suggests that the relationship between the management practices and disparities of overtime hours across workers can depend on the baseline level of the management practice scores. In particular, the adoption of more structured bonus and promotion practices is associated with increasing disparities at a high level of monitoring and targeting score, while it is associated with decreasing disparities at a low level of monitoring and targeting score. However, the introduction of more structured monitoring and targeting practices is generally associated with a reduction in disparities of overtime hours across workers at almost any level of bonus and promotion score.
G. Other Channels
We have explored the explanations for the changes in actual overtime hours in the previous sections. An alternative interpretation of the result on non-HRM practices is that a higher monitoring score is associated with changes in reported overtime hours (not necessarily with actual overtime hours) through better monitoring of overtime hours.²⁵ If some workers overreport hours, better monitoring of hours may reduce the reported hours to the level of actual hours. Since our data on working hours are in principle a payroll record of the workers, they indicate the working hours for which the establishment paid the workers. Therefore, under this alternative explanation, the results imply that an improvement in the monitoring score eliminates a part of the establishment’s wage bill that was used to pay workers to produce nothing. However, we argue that this alternative does not explain our results. Under this assumption, the coefficients of monitoring and targeting should be uniform across all levels of overtime hours (because workers have an incentive to misreport their overtime hours regardless of how many hours they work), while we observe large and significant coefficients only for long overtime hours.
Finally, the results are robust to alternative specifications. First, we find similar results when we change the variable controlling for demand shock from the intermediate input cost to the log of the establishment’s sales (Online Appendix Table A.15). Second, our main results are robust to controlling for possible differential trends of overtime hours by the establishment’s baseline characteristics (see Online Appendix Table A.15 for the results). As discussed in Section II.E, we found that small non-Japanese establishments tend to improve management practices more than other types of establishments. If these establishments have differential time trends of overtime hours for reasons that are not related to changes in management practices, our main results are biased. We examined this possibility by additionally controlling in Equation 1 for the establishment’s baseline characteristics (including the proxy variables for the traditional Japanese employment practices) interacted with the second-period (years 2015–2016) dummy and found similar patterns to the baseline results.²⁶ Third, we test whether the results are driven by recall errors in retrospective questions on practices in 2010.²⁷ We found that our results remain qualitatively the same when we restrict the sample to JP-MOPS respondents whose tenure is longer than seven years (Online Appendix Table A.16). Fourth, the results are also robust to redefining categorical management scores by dropping MOPS questions regarding managers (Questions 11, 12, 14, and 16 in Online Appendix Table A.3), as these questions may be less relevant for workers in our sample.²⁸ As Online Appendix Table A.17 shows, using the redefined scores gives a similar result, although the coefficient estimates become noisier.
IV. Discussions
A. Labor Turnover and Management Practices
The results in the previous sections could be potentially affected by changes in worker composition through hiring and separation, which, in turn, may be influenced by changes in displacement policy or other management practices. To examine this possibility, we estimate the association between changes in management practices and changes in labor turnover measured using three types of data.
First, we use the data on hiring and separation at establishments from the Survey of Employment Trends (SET) conducted by the MLHW (see Online Appendix A.1 for details about the SET). From the data, we define the “hiring rate” (or “separation rate”) by the total number of hires (or separations) of full-time workers in a calendar year divided by the number of full-time workers at the beginning of the year at each establishment. We then match the hiring and separation rates in 2010 and 2011 to the management data for 2010 from the JP-MOPS and those in 2014 and 2015 to the management data for 2015. Table 5 shows the results of regressing the hiring and separation rates on the management scores, year fixed effects, establishment fixed effects, and controlling for demand shocks by either the log of intermediate input cost or the log of shipment value in the year.
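The rate definitions above amount to the following simple calculation, shown as a sketch with hypothetical counts. The function name and the guard against a zero denominator are choices made here for illustration.

```python
def turnover_rates(hires, separations, workers_at_start):
    """Hiring and separation rates of full-time workers in a calendar year,
    relative to the number of full-time workers at the start of the year."""
    if workers_at_start <= 0:
        raise ValueError("workers_at_start must be positive")
    return hires / workers_at_start, separations / workers_at_start

# Hypothetical establishment: 300 full-time workers on January 1,
# 12 hires and 9 separations during the year.
hiring_rate, separation_rate = turnover_rates(12, 9, 300)
print(hiring_rate, separation_rate)
```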
Table 5. Labor Turnover
Overall, the results indicate only a weak association between changes in management practices and changes in labor turnover. In particular, we may expect that modification of management policies toward more frequent displacements results in a higher rate of separation, in which case there may be negative or positive selection of the remaining workers (Jovanovic 1979). However, the coefficients of the displacement score for the separation rate are small and statistically insignificant.²⁹ This result is not surprising, given the persistent prevalence of the Japanese long-term employment system in which full-time workers are rarely fired in normal times. The result suggests that, although changes in the displacement score might have resulted in a few cases of actual displacement of full-time workers, they did not have a large effect on worker composition. The coefficients of the other management scores were also small and statistically insignificant. One limitation of this analysis, however, is that the sample size of the SET that matched the JP-MOPS was small, with around 400 establishment–year observations. Thus, we complemented our analysis using other types of data.
The second data set is the employee sample of the BSWS. We aggregate the employee data at the establishment–year level by computing the share of workers hired in the past five years, using the information on tenure. We then decompose this share into new graduate and mid-career hires using information on age. The remaining workers, whose tenure is more than five years, are those who have stayed in the establishment for the preceding five years. In Online Appendix Table A.18, we regress the share of newly hired workers and that of staying workers on management practices, controlling for year fixed effects, establishment fixed effects, and the log of intermediate input cost. The estimated coefficients of the displacement score are small and insignificant, which is consistent with earlier results. Furthermore, we find only a weak and insignificant association between changes in other management scores and changes in the share of newly hired workers.³⁰
Lastly, we can also use the establishment survey of the BSWS, which asks establishments about the number of new graduate hires. Using these data matched with the JP-MOPS, we find only an insignificant association between changes in hiring of new graduates and changes in management practice scores (Online Appendix Table A.19). The results are similar when we decompose the number of new graduate hires by gender.
In sum, these results indicate only a weak and insignificant relationship between changes in management practices and changes in labor turnover. This result may reflect two features of our empirical study. First, the Japanese labor market is known for its slow and inflexible labor adjustment for full-time workers relative to other OECD countries. For instance, comparing the average rate at which workers flow into and out of unemployment³¹ during 1968–2009 across countries, Barnichon and Garda (2016) show that Japanese inflow and outflow rates are approximately 0.5 percent and 20 percent, respectively, while those in the United States are approximately 3.5 percent and 57 percent, respectively. Second, we examine changes in management and outcomes over only five years, which may be too short for the adoption of new management practices to affect the composition of workers. Therefore, our results are best interpreted as short-run relationships between management practices and labor outcomes before substantial compositional adjustments occur. Although a long-run analysis is outside the scope of this paper, more structured management may gradually attract workers who prefer these practices, in which case the long-run relationships between management practices and overtime hours may also be affected by these newly hired workers’ preferences for overtime hours.
B. Inequality of Overtime Hours within Establishments
The results of our analysis suggest that improving some of the management scores, in particular the monitoring and targeting score, is associated with declining inequality in overtime hours within establishments. Next, we directly test this hypothesis in two ways. The first is the use of quantile regression. Note that, while our baseline specification allows us to examine changes in the likelihood of working above certain hours, it does not fully describe changes in the shape and location of within-establishment distributions of overtime hours. Quantile regressions shed light on these aspects. Following the quantile regression method with panel fixed effects proposed by Canay (2011), we use the residuals after the mean regression of individual overtime hours on establishment fixed effects and the log of the total cost for intermediate inputs. We then regress the residual on the change in the overall management scores from 2010 to 2015, an indicator for observations in 2015 or 2016, and their interaction term by quantile regression. The estimated coefficients of the interaction term, summarized in Panel A of Figure 3, indicate that an improvement in the overall management scores is associated with small increases in overtime hours at around the 45th and 50th percentiles of the distributions and large decreases at around the 80th to 95th percentiles.³² When we repeat a similar process for estimating the quantile regression for each of the three disaggregated scores (Figure 3, Panels B–D), we find that the former is driven by changes in the bonus and promotion scores, and the latter is mostly driven by changes in the monitoring and targeting scores, which is consistent with our baseline results.
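The logic of this two-step procedure can be sketched in a deliberately simplified form. The sketch below is a stand-in and not the estimator used in the paper: it removes only establishment means in the first step (omitting the intermediate input cost), and it replaces the continuous change in the management score with a hypothetical binary "improver" indicator. With a binary indicator, a period indicator, and their interaction, the interaction coefficient of a quantile regression corresponds to a difference-in-differences of cell quantiles, which is what the code computes. All data are hypothetical.

```python
from statistics import mean

def quantile(values, pct):
    """Nearest-rank empirical quantile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling
    return ordered[rank - 1]

# Hypothetical records: (establishment, improver, post, overtime hours).
# Establishment A improves its score and compresses overtime after 2015;
# establishment B does not change.
records = []
for hours in (0, 10, 20, 30, 60):
    records.append(("A", 1, 0, hours))
    records.append(("B", 0, 0, hours))
    records.append(("B", 0, 1, hours))
for hours in (5, 10, 20, 25, 40):
    records.append(("A", 1, 1, hours))

# Step 1: remove establishment means (a simplified fixed-effects step).
est_mean = {
    e: mean(h for est, _, _, h in records if est == e) for e in ("A", "B")
}
residuals = {}
for est, improver, post, hours in records:
    residuals.setdefault((improver, post), []).append(hours - est_mean[est])

# Step 2: difference-in-differences of residual quantiles.
def did_quantile(pct):
    q = {cell: quantile(vals, pct) for cell, vals in residuals.items()}
    return (q[(1, 1)] - q[(1, 0)]) - (q[(0, 1)] - q[(0, 0)])

for pct in (10, 50, 90):
    print(pct, did_quantile(pct))
```

In this constructed example the upper quantiles fall and the lower quantiles rise in the improving establishment relative to the other one, which illustrates how quantile-specific estimates can reveal a compression of the distribution that a mean regression would miss.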
Figure 3. Quantile Regressions for Overtime Hours
Notes: The figure summarizes the results of quantile regressions for overtime hours. We first obtain the residuals after the mean regression of individual overtime hours on establishment fixed effects and the log of the total cost for intermediate goods. We then regress the residuals on the change in the management scores from 2010 to 2015, an indicator for observations in 2015 or 2016, and their interaction term using quantile regressions. As the management score, Panel A uses the overall management score, Panel B uses the bonus and promotion score, Panel C uses the monitoring and targeting score, and Panel D uses the displacement score. The estimated coefficients of the interaction term are presented in the figure. The 95 percent confidence intervals are calculated based on standard errors estimated by bootstrapping with 500 replications. As the distributions of overtime hours are truncated at zero, nonzero changes at the lower percentiles of the distributions are observed only for a limited fraction of establishments. The gray areas indicate the magnitude of this issue by color. As shown in Online Appendix Figure A.5, (i) about three-quarters of the establishments have zero as the 10th percentile of the establishment’s distribution of overtime hours, (ii) about half of the establishments have zero as the 20th percentile of the distribution of overtime hours, and (iii) about a quarter of the establishments have zero as the 45th percentile of the distribution of overtime hours.
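The two-step logic of the quantile regressions can be illustrated with a minimal sketch on synthetic data. It is not the estimation code used in the paper and it makes several simplifying assumptions: the change in the management score is replaced by a hypothetical binary indicator `high` (an establishment with a larger improvement), so that the second-step quantile regression on a constant, `high`, `post`, and their interaction is saturated and its interaction coefficient reduces to a difference-in-differences of cell quantiles; the first step partials out a single control by the within transformation; and no bootstrap standard errors are computed. All function names, field names, and numbers are illustrative.

```python
import math
from collections import defaultdict


def lower_quantile(values, tau):
    """Empirical tau-quantile; it is one minimizer of the check loss."""
    ordered = sorted(values)
    rank = max(1, math.ceil(tau * len(ordered) - 1e-9))  # guard against float noise
    return ordered[rank - 1]


def step1_residuals(rows):
    """Residuals from OLS of overtime on establishment fixed effects and log cost.

    Implemented by demeaning within establishments and then partialling out
    the single control (Frisch-Waugh).
    """
    by_estab = defaultdict(list)
    for r in rows:
        by_estab[r["estab"]].append(r)
    y_dm, x_dm = [], []
    for r in rows:
        group = by_estab[r["estab"]]
        y_dm.append(r["overtime"] - sum(g["overtime"] for g in group) / len(group))
        x_dm.append(r["log_cost"] - sum(g["log_cost"] for g in group) / len(group))
    sxx = sum(x * x for x in x_dm)
    slope = sum(x * y for x, y in zip(x_dm, y_dm)) / sxx if sxx > 0 else 0.0
    return [y - slope * x for x, y in zip(x_dm, y_dm)]


def interaction_coefficient(rows, tau):
    """Second step: interaction coefficient of the saturated quantile regression."""
    cells = defaultdict(list)
    for r, e in zip(rows, step1_residuals(rows)):
        cells[(r["high"], r["post"])].append(e)
    q = {cell: lower_quantile(v, tau) for cell, v in cells.items()}
    return (q[(1, 1)] - q[(1, 0)]) - (q[(0, 1)] - q[(0, 0)])


def make_demo_rows():
    """Synthetic workers: establishment A compresses its hours in the second period."""
    spread = [5.0 * i for i in range(10)]       # 0, 5, ..., 45 overtime hours
    compressed = [18.0 + i for i in range(10)]  # 18, 19, ..., 27 overtime hours
    rows = []
    for estab, high in (("A", 1), ("B", 0)):
        for post, log_cost in ((0, 2.0), (1, 2.2)):
            base = compressed if (high and post) else spread
            for hours in base:
                rows.append({"estab": estab, "high": high, "post": post,
                             "log_cost": log_cost,
                             "overtime": hours + 10.0 * log_cost})
    return rows


if __name__ == "__main__":
    demo = make_demo_rows()
    for tau in (0.5, 0.9):
        print(tau, round(interaction_coefficient(demo, tau), 3))
```

In this constructed example the interaction coefficient is small and positive at the median and large and negative at the 90th percentile, which mimics the qualitative pattern described for Panel A; the magnitudes carry no empirical meaning.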
The second way to test the implications of within-establishment inequality is to regress establishment–year-level inequality indicators on the management scores, controlling for year and establishment fixed effects, and demand shocks. In Table 6, we first regress the average overtime hours worked in the establishment on the overall management score in Column 1 and on the three disaggregated management scores in Column 2. Consistent with the results in Table 1, the coefficient of the overall management score is positive and insignificant, and the coefficient of the bonus and promotion score is positive and significant. In Columns 3–4 and Columns 5–6, the dependent variables are the difference in overtime hours between the maximum and the minimum in the establishment and the difference between the 95th and 5th percentiles in the establishment, respectively. The coefficients of the overall management score are negative in Columns 3 and 4, which seems to be mainly driven by the negative coefficients of the monitoring and targeting score. Although these coefficient estimates are noisy due to low statistical power, their sizes and directions are consistent with the earlier results, suggesting that establishments that improved their monitoring and targeting scores narrowed the gap between the high and low tails of the distribution of overtime hours.
Table 6. Distribution of Overtime Hours within Establishments
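The establishment-level inequality regressions described in this subsection can be sketched as follows, again on synthetic data and under stated assumptions rather than as the paper's implementation. The sketch computes the three indicators (mean, maximum minus minimum, and 95th minus 5th percentile) for each establishment and period, and then uses the fact that, in a balanced two-period panel, a regression with establishment and period fixed effects is equivalent to regressing the change in the indicator on the change in the score with an intercept. The demand-shock control is omitted for brevity, the linear-interpolation percentile convention is an assumption, and all names and numbers are hypothetical.

```python
import math


def percentile(values, p):
    """Percentile with linear interpolation between order statistics (p in [0, 100])."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def inequality_indicators(hours):
    """Establishment-period summary measures of overtime hours."""
    return {
        "mean": sum(hours) / len(hours),
        "max_min": max(hours) - min(hours),
        "p95_p5": percentile(hours, 95) - percentile(hours, 5),
    }


def first_difference_slope(d_score, d_outcome):
    """OLS slope (with intercept) of the outcome change on the score change."""
    n = len(d_score)
    mean_x = sum(d_score) / n
    mean_y = sum(d_outcome) / n
    sxx = sum((x - mean_x) ** 2 for x in d_score)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(d_score, d_outcome))
    return sxy / sxx


def slopes_by_indicator(panel):
    """One first-difference slope per inequality indicator."""
    d_score = [e["d_score"] for e in panel]
    slopes = {}
    for name in ("mean", "max_min", "p95_p5"):
        d_y = [inequality_indicators(e["hours_post"])[name]
               - inequality_indicators(e["hours_pre"])[name] for e in panel]
        slopes[name] = first_difference_slope(d_score, d_y)
    return slopes


# Synthetic establishments: larger score improvements go with narrower hours gaps.
demo_panel = [
    {"d_score": 0.0, "hours_pre": [0, 10, 20, 30, 40], "hours_post": [0, 10, 20, 30, 40]},
    {"d_score": 0.1, "hours_pre": [0, 10, 20, 30, 40], "hours_post": [5, 10, 20, 30, 35]},
    {"d_score": 0.2, "hours_pre": [0, 10, 20, 30, 40], "hours_post": [10, 15, 20, 25, 30]},
]

if __name__ == "__main__":
    for name, slope in slopes_by_indicator(demo_panel).items():
        print(name, round(slope, 3))
```

By construction, the synthetic data yield a zero slope for the mean and negative slopes for both dispersion measures, the pattern one would expect if score improvements compress the tails without shifting average hours.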
V. Concluding Remarks
Since overwork can cause problems for both productivity and workers’ lives, firms’ management of long working hours is an important issue for society. Nevertheless, how firms’ management practices are related to working hours is still largely unknown. For economists, how working hours are affected by both labor demand and supply factors has been a topic of great interest. However, to date, despite recent empirical studies (using large-scale data obtained from surveys of firm managers) showing that HRM and non-HRM factors strongly affect firm performance, empirical evidence linking working hours and management practices has been limited to a few studies showing the link between promotion and long working hours relying on worker-side data. Therefore, in this study, we obtained a large set of panel data concerning management practices of Japanese manufacturing establishments and matched them with the overtime hours data of their employees.
We obtained the following results. The adoption of more structured bonus and promotion practices (wage-related HRM) is associated with an increasing share of employees working short-to-medium overtime hours. This result is consistent with the standard theoretical prediction that compensation and promotion schemes tied to individual performance induce worker effort. However, we also find that the introduction of more structured monitoring and targeting practices (non-HRM) is associated with reductions in long overtime hours. One explanation is that when firms collect detailed data on production processes and systems and review them regularly, they can make continuous improvements to these processes and systems, which in turn allows leveling of workload and time across workers and reduces the problems that lead to unnecessary overtime. In addition, setting production targets at a moderate level of difficulty and sharing them across workers helps adjust workload and time across workers, which in turn also reduces the problems that cause unnecessary overtime. As a result, using more structured non-HRM practices seems to play a key role in narrowing the disparity in hours across workers within establishments.
Finally, we find only a weak and insignificant association between changes in management practices and changes in hiring and separation rates. This implies that changes in worker composition induced by changes in management practices do not significantly affect the above results. Although this result does not exclude the possibility that changes in management practices lead to significant compositional adjustments in the long run, we interpret our results as short-run relationships between the changes in management practices and those in the labor market.
Acknowledgments
The authors thank Nick Bloom, Catherine Buffington, Renata Lemos, and Atsushi Ohyama for their valuable inputs in designing the Management and Organizational Practice Survey in Japan (JP-MOPS). They are also grateful for the helpful comments from the participants at the 2018 Empirical Management Conference, the 2018 Transpacific Labor Seminars, and the 2019 EALE Conference, and from two anonymous referees. This work was supported by JSPS KAKENHI Grant Numbers JP18H03633 and JP19H00592. The ESRI had the right to review the paper prior to its circulation. The authors have no relevant or material financial interests that relate to the research described in this paper. This paper was previously titled “Management Practices Meet Labor Market Outcomes.” The data used in the analysis are the microdata of government statistics, which we are not allowed to make public. Researchers can gain access to the data by submitting a written application to the relevant Japanese government bodies: the Ministry of Health, Labor and Welfare for the Basic Survey of Wage Structure, the Economic and Social Research Institute (ESRI) of the Cabinet Office for the Management and Organizational Practice Survey in Japan, and the Ministry of Economy, Trade and Industry for the Census of Manufacturers. A second Online Appendix of Replication Materials is provided.
Footnotes
↵1. Work design theories of management literature describe such a link between management practices and working hours. For example, see the literature survey by Grant and Parker (2009).
↵2. There are also studies concerning the business cycle as a determinant of hours of work (Prescott 2004; Rogerson 2006).
↵3. Asking recall questions is a common feature of the international MOPS project. This is based on a finding from the original US-MOPS data that confirms the high response quality of such questions (Bloom et al. 2019), especially for respondents whose tenure is above five years (that is, those who were in the same position at the time being recalled). As a robustness check, we estimate the main specification by limiting the sample to respondents who were in the same position in 2010 and confirm that the results are qualitatively the same as the main results.
↵4. One difference in the survey design of the JP-MOPS compared to the US-MOPS is that the JP-MOPS only includes establishments employing 30 or more workers. This is mainly because the MOPS was expected to be matched with other governmental surveys to minimize the burden on respondents; detailed information about establishments in the Japanese Census of Manufacturers is also collected only for establishments with 30 or more workers.
↵5. For the MOPS questionnaire itself, see https://www.census.gov/programs-surveys/mops/technical-documentation/questionnaires.html (accessed July 19, 2024) for the US-MOPS and https://www.esri.cao.go.jp/jp/esri/prj/current_research/service/manage/chosa_seizo.pdf (accessed July 19, 2024) for the JP-MOPS.
↵6. After removing establishments with missing scores for at least ten questions, we are left with 10,806 establishments in the JP-MOPS data.
↵7. The sampling rate is 1/2 for establishments with 30–99 employees and 1/5 for manufacturing establishments with 100–499 employees, and it varies considerably across industries for larger establishments; see Ministry of Health, Labor and Welfare (2021) for details. Since the response rate of the BSWS was approximately 82 percent in the manufacturing sector in 2010, the sample selection bias caused by nonresponse is considered to be small.
↵8. The BSWS is a copy of firm payroll records. In Japan, there is a set of necessary information about working conditions that each labor contract must include, and employers are legally obliged to retain the relevant data as payroll records under Articles 107–109 of the Labor Standards Act.
↵9. These are employees whose title is either Bucho (department manager/head) or Kacho (section manager/head) in the BSWS. They account for approximately 3 percent of employees who are strictly less than 60 years of age. These managers do not include those who actually supervise other workers but are not titled as above, such as line manager/head and unit manager/head (Higuchi and Kambayashi 2018).
↵10. Note that the BSWS includes establishments with fewer than 30 employees. Among 21,612 establishment–year observations of management practices (or 10,806 observations of unique establishments) in the JP-MOPS, 26 percent are matched to the BSWS.
↵11. Among the 5,529 establishment–year observations we match from the BSWS to the JP-MOPS, some establishments are observed only in the first (2010–2011) or only in the second (2015–2016) period, in which case these observations cannot be used in our panel analysis controlling for establishment fixed effects. Therefore, we exclude these observations from our analysis sample.
↵12. Furthermore, we examine the differences in labor outcomes (wages and overtime hours) between the matched and unmatched samples and find few systematic differences unexplained by worker characteristics. Moreover, when we regress the labor outcomes on worker characteristics by matched and unmatched samples, we observe similar features of the coefficients. See Online Appendix Table A.4 for the results.
↵13. The displacement score is substantially lower in Japan than in the United States at only 0.24 in the JP-MOPS compared to 0.51 in the US-MOPS (Buffington et al. 2017). Strikingly, about 66.3 percent (73.0 percent) of the JP-MOPS sample responded with a “rarely or never” when asked whether an underperforming nonmanager (manager) was reassigned or dismissed, compared to 33.2 percent (42.8 percent) in the US-MOPS. Given this stark difference, together with the widespread long-term employment in the Japanese employment system (Hashimoto and Raisian 1985), it is possible that establishment characteristics relating to worker displacement play a distinct role in determining working hours.
↵14. The minimum value of overtime hours in our data is zero. The BSWS does not contain information on the worker’s contractual hours; thus, we do not analyze “negative overtime hours,” that is, hours worked below the contractual hours.
↵15. Since we earlier found that management practices were likely to evolve less in establishments applying traditional Japanese employment practices, we conduct a robustness check to examine the possibility that such employment practices drive our main results. We find that our main conclusion is robust to controlling for possibly differential trends of overtime hours by proxy variables for such employment practices. See Section III.G for more details.
↵16. In this graphical illustration, we focus on the sample of male workers because the incidence of long working hours is concentrated among male workers (as shown in Online Appendix Table A.11); hence, their figures are suitable for illustration.
↵17. The cases of zero overtime hours are included in Figure 1 as zero to five overtime hours. The share of workers doing any positive hours of overtime increased by 0.048 in establishments with less improved management scores and by 0.029 in establishments with more improved management scores.
↵18. To control for the worker’s educational attainment, we use dummy variables for middle school graduates, junior/technical college graduates, and college graduates (the reference group is high school graduates).
↵19. For example, a change in the bonus and promotion score by 0.1 can be roughly achieved if an establishment changes the bonus system of the nonmanagers to be considered on the basis of team performance rather than establishment performance and modifies their promotion system to be based solely on individual performance rather than both individual performance and tenure.
↵20. For example, a change in the monitoring and targeting score by 0.1 can be achieved if an establishment increases the number of KPIs collected from 1–2 to 3–9 and starts to show the KPIs on not just one but multiple display boards.
↵21. We obtain a similar sizable result for monthly overtime of over 60 hours, although the coefficient is estimated less precisely. The coefficient of the monitoring and targeting score for overtime of over 60 hours is −0.033 with a standard error of 0.026, and the mean of the dependent variable is 0.045.
↵22. The insignificant results for the promotion score can be attributed to the high correlation between changes in bonus and promotion scores. For example, regressing the change in promotion scores on the change in the bonus score in the MOPS establishment-level sample, the coefficient is 0.23 with a standard error of 0.06. However, regressing the change in the displacement score on the change in the bonus score gives a much smaller coefficient of 0.14 with only a slightly higher standard error of 0.07.
↵23. When we alternatively divide the sample by age, we find that younger workers are more likely to work long hours, and the negative coefficient of the monitoring and targeting score for long hours is larger for them than for older workers (Online Appendix Table A.13).
↵24. The use of the interaction term for testing complementarity of factors in production function is common in the literature. See, for instance, Bresnahan, Brynjolfsson, and Hitt (2002).
↵25. A higher monitoring score should not be directly interpreted as better monitoring of hours. Note that questions on monitoring are asked about production, but not about workers. In addition, the respondents were mostly top-level managers of the establishments, and there were only a few respondents from HRM departments. Even so, there is a possibility that monitoring practices regarding production and hours are correlated.
↵26. In this analysis, we assume that the time trend (that is, the conditional expectation of the change in the dependent variable from 2010–2011 to 2015–2016) is linear in the establishment’s baseline characteristics. An alternative specification that avoids such a linearity assumption is to control for categorical variables constructed from the establishment’s baseline characteristics interacted with a second-period dummy. We also conducted this exercise and found qualitatively similar results. Results are available upon request.
↵27. According to Bloom et al. (2019), a comparison of management practice scores in 2010 using the 2010 and 2015 US-MOPS, which asked the same retrospective questions as ours, shows that the tenure of the manager responding to the survey at the establishment is an important determinant of recall errors. In particular, they show that the response quality is high if the respondent had been in the establishment at least one year before the period of recall.
↵28. Although we dropped some managers in the BSWS from our analysis sample, the definition of these managers is solely based on job titles and does not include some workers who actually supervise other workers (Higuchi and Kambayashi 2018). On the other hand, the JP-MOPS defines managers more broadly within its questionnaire based on task content. Therefore, questions regarding managers in the JP-MOPS would still apply to managers who remain in our BSWS analysis sample. For this reason, we use all MOPS questions, including those regarding managers, in our main specification.
↵29. The coefficients imply that a one standard deviation increase in the displacement score is associated with only a 0.70 percentage point (= 0.162 × 0.043 × 100) increase in the separation rate, while the average separation rate in the sample is 10 percent.
↵30. For example, an improvement in the monitoring and targeting score by 0.1 (which is roughly equivalent to the standard deviation of changes in this variable) is associated with only a 1.8 percent reduction in the share of hired workers.
↵31. Given the lack of internationally comparable data on recent hiring rates across countries, labor turnover into and out of unemployment is a useful approximation of the frequency of hiring and separation.
↵32. Since the distributions of overtime hours are truncated at zero, it is noteworthy that nonzero changes at lower percentiles of the distributions are observed only for a limited fraction of the establishments. Therefore, the results for lower quantiles may not be informative because of selection issues (see Online Appendix Figure A.5 for more details).
- Received April 1, 2021.
- Accepted May 1, 2022.
This open access article is distributed under the terms of the CC-BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0) and is freely available online at: https://jhr.uwpress.org.