Abstract
We identify the effects of employment on intimate partner violence (IPV) by collaborating with 27 large companies in Ethiopia to randomly assign jobs to equally qualified female applicants. The job offers increase employment, total hours worked, income, earnings, and earnings shares within couples in the short and medium run, but we find no effects on our main preregistered outcome, physical IPV. In particular, we can reject relatively small positive increases of physical IPV. In the short run, job offers reduce emotional abuse by 26 percent.
I. Introduction
Female employment is on the rise in the poorest countries of the world, driven in part by a general shift from agriculture to service sector jobs and light manufacturing (Heath and Jayachandran 2016). This trend is strong in Ethiopia, where the manufacturing sector is growing quickly and provides many jobs for women (Gelb et al. 2017). Improved employment opportunities for women have been shown to increase their human capital, delay fertility, and mobilize career aspirations, and are generally believed to increase female empowerment (Jensen 2012; Heath and Mobarak 2015; McKelway 2021). The effects of women’s employment on intimate partner violence (IPV)1 are, however, ambiguous. On one hand, employment may reduce women’s risk of IPV by increasing their bargaining power and improving outside options. Employment may also increase emotional well-being, by increasing economic security, which in turn may reduce IPV (Buller et al. 2018). On the other hand, it may fuel aggressive responses from partners who view their status as threatened or intend to extract some of the extra resources brought by the job. If IPV increases with female employment, the net utility gain of female employment at the individual level becomes uncertain (Heath and Jayachandran 2016). In addition to being harmful to the victim, IPV has also been shown to entail substantial costs to society and to affect children (Carrell and Hoekstra 2010; Pollak 2004; Doyle and Aizer 2018; Aizer 2011). Fearon and Hoeffler (2014) estimate that the global costs of IPV amount to more than 5 percent of world GDP and that the costs of IPV in Sub-Saharan Africa amount to almost 15 percent of the regional GDP.
We investigate the effects of women’s employment on IPV in Ethiopia using a large-scale preregistered randomized field experiment. Qualified female job applicants were randomly assigned to job offers that substantially increased earnings and job probabilities in our six-month, 12-month, and 18-month follow-up surveys. The job offers also increased the total hours worked and total income. We find no effect on our main preregistered outcome, physical or sexual abuse, and we can reject relatively small increases in IPV. We find that being offered a job even decreases emotional violence after six months, but our longer-term results suggest that this effect is unstable over time. We find no difference in reporting of physical abuse when comparing direct questioning to elicitation by means of a list experiment. To the extent that reporting bias still exists, we would expect it to be in the direction of more reporting by employed women, which makes our rejection of small increases in IPV even stronger.
This work contributes to a rapidly growing literature on IPV in economics. Economists have investigated a range of different determinants of IPV, such as education (Erten and Keskin 2018; Gulesci, Meyersson, and Trommlerova 2018), property rights (Amaral 2017), culture and social norms (Alesina, Brioschi, and La Ferrara 2016; Tur-Prats 2019), divorce laws (Brassiolo 2016; Stevenson and Wolfers 2006; García-Ramos 2017), weather shocks (Miguel 2005; Cools, Flatø, and Kotsadam 2020; Abiona and Koppensteiner 2018; Sekhri and Storeygard 2014), and gender ratios (Amaral and Bhalotra 2017). They have also investigated the effects of interventions to reduce partner violence, such as female police stations (Amaral, Nishith, and Bhalotra 2021), mandatory arrest laws and no-drop policies (Iyengar 2009; Aizer and Dal Bo 2009), gender and entrepreneurship training (Green et al. 2015; Bulte and Lensink 2018), awareness raising (Villanger 2020), and edutainment (Banerjee, La Ferrara, and Orozco 2019; Green, Wilke, and Cooper 2020). There is also a literature on the male motives of partner violence, focusing on expressive factors such as relieving frustration (Tauchen, Witte, and Long 1991), information asymmetries and signaling (Anderberg et al. 2016; Anderberg, Mantovan, and Sauer 2018), emotional cues (Card and Dahl 2011), and instrumental reasons such as resource extraction (Bloch and Rao 2002).
By estimating the causal effects of jobs on IPV, our paper is most closely related to the literature on female employment and IPV. In particular, we provide strong evidence against the existence of large average individual-level increases of IPV in our setting. Previous studies from various contexts2 that have investigated the question with quasi-experimental methods have all investigated the effects of employment at the aggregate level, with mixed results. These studies rely on stronger identifying assumptions than we do.3 There are also randomized experiments on related areas of study, such as the effects of cash transfers (for example, Haushofer et al. 2019; Hidrobo, Peterman, and Heise 2016; Heath, Hidrobo, and Roy 2020; Angelucci 2008) and microcredit (Pronyk et al. 2006). These studies often find that increased resources to women either reduce IPV or have no effect.4 Cash transfers and microcredit are, however, likely to have effects that differ from those of formal employment. Women’s employment directly challenges men’s breadwinner status and provides access to social networks (Cools and Kotsadam 2017). In addition, working outside of the home may affect time spent together and time spent on other tasks, such as household work, which may affect conflict propensities.
Access to a wide battery of related outcomes, which are highly correlated with abuse, enables us to speak to the literature on other effects of female employment apart from IPV (see Heath and Jayachandran 2016 for an overview of this literature). In particular, we find no effects on decision-making power, attitudes towards gender equality, acceptance of abuse, or controlling behavior.
Our results also speak to the larger literature on the effects of industrialization on individual well-being. Blattman and Dercon (2018) find that industrial job offers in Ethiopia did not increase wages or even the probability of being employed after one year.5 In contrast, we find that the job offers increase earnings and that there are still differences in employment probabilities after 18 months. As such, our results are more in line with results from observational studies, and in particular with Getahun and Villanger (2018), who find that employment in Ethiopian flower farms increased welfare for rural women.
II. Employment and IPV
The correlation between individual level female employment and IPV is generally positive in Sub-Saharan Africa (Guarnieri and Rainer 2018) and even more so in areas with higher acceptance of abuse (Cools and Kotsadam 2017) and in countries with less gender equality (Heise and Kotsadam 2015). The literature using quasi-experimental designs has found that local-level female employment reduces abuse in the United States and the United Kingdom (Aizer 2010; Anderberg et al. 2016) and increases abuse in Mexico (Davila 2018) and in areas of Spain with stronger male breadwinner norms (Tur-Prats 2017).
The effects of female employment are generally thought to be moderated by macro-level factors, such as acceptance of divorce, the share of women working, male identity norms, and the degree of acceptance of abuse in society. One possible reason for the positive correlation between employment and IPV in developing countries is that partnership dissolution may be costlier for financial or social reasons, and, therefore, the outside option is practically nonexistent or further away (Bhalotra et al. 2018; Doyle and Aizer 2018). This could lead to more resource extraction and is the reason provided by Bulte and Lensink (2018) for why a gender and entrepreneurship training in Vietnam increased IPV. They argue that the results are driven by increased female incomes in combination with a large stigma associated with divorce, which leaves few real outside options. Vyas and Watts (2009) point to a pioneering hypothesis whereby the risk of IPV may be largest for the women who take the first jobs in an area because they break with norms about women’s roles. Such norm transgressions may, for instance, induce violent responses from partners for emotional reasons. Consistent with the pioneering hypothesis, Heise and Kotsadam (2015) find that the positive association between abuse and working for cash is strongest in countries where fewer women work. Cools and Kotsadam (2017) argue that community-level attitudes toward abuse are also likely to be important by giving a sort of impunity to husbands who want to reinstate their power within the household. They find a larger positive correlation between employment and abuse for women in areas where wife-beating is considered more acceptable. Kotsadam, Østby, and Rustad (2017) find that mining increases female employment and that it leads to higher levels of IPV in areas with higher levels of acceptance of IPV.
This is also consistent with the finding by Tur-Prats (2017) that the response to better labor market conditions for women is increased violence in parts of Spain with a traditional nuclear family tradition and no effects in areas of Spain with a traditional stem family tradition. She interprets her results in an identity framework where men lose identity utility if their breadwinner role is threatened in traditional cultures. The effects of employment on IPV are thus argued to be context dependent.
III. The Context and The Field Experiment
A majority of the population in Ethiopia works in agriculture. The culture is generally described as patriarchal, and there is a widespread acceptance of IPV (Kedir and Admasachew 2010). While women’s legal rights with respect to divorce and civil liberties are formally equal to men’s, informal rules and adverse cultural norms affect family relations, and in practice women often lose their property when divorcing (CEDAW 2011). Using data from the World Values Survey (WVS) and from the Demographic and Health Surveys (DHS), we show in Online Appendix Figure A1 that Ethiopia scores low on acceptability of divorce and high on acceptance of abuse. According to the theories outlined in Section II, both factors lead us to expect that employment would be more likely to lead to increased IPV in Ethiopia than in many other places.
The Ethiopian manufacturing sector is growing quickly, and the Ethiopian government is actively accommodating foreign direct investors. One way of doing so is to build industrial parks to provide economies of scale for the potential investors. We worked with 27 firms within such industrial parks. More specifically, our intervention centered on shoe and garment factories in five different regions: Tigray, Amhara, Oromia, SNNP, and Dire Dawa. In the factories, the women earned on average 1021 ETB (around 38 dollars) per month at the first follow-up, and they usually worked for eight hours per day, six days a week.
The factories’ standard procedure of hiring was to advertise bulks of positions by posting on the front gate, by word of mouth, and on local job boards. The applicants were asked to gather on a specific day and were screened for eligibility using verbal and physical tests. The companies we collaborated with were hiring new workers and were willing to slightly alter their recruitment process. They first assessed all job applicants and determined whether each applicant was eligible for the job or not. Then, from the pool of eligible candidates, we created lists of women having partners. From the lists with eligible and partnered entry-level applicants, we randomly assigned around half (depending on the number of available positions and the number of available partnered women) to either receiving a job offer in the given factory (Treatment) or to a control group. The randomization was possible since there was a large surplus demand for jobs. The randomization was done using computers, and the lists were sent back via email. All applicants were informed about the procedure before the randomization was conducted.
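The assignment step described above can be illustrated with a minimal sketch. It assumes one list of eligible, partnered applicants per hiring round and a known number of open positions; the function name, the applicant identifiers, and the seed are hypothetical and are not taken from the study's actual randomization code.

```python
import random


def assign_within_list(applicant_ids, n_positions, seed=0):
    """Randomly offer jobs to n_positions applicants drawn from one eligible list.

    Returns a dict mapping each applicant to 1 (job offer) or 0 (control).
    """
    if n_positions > len(applicant_ids):
        raise ValueError("more positions than eligible applicants")
    rng = random.Random(seed)      # seeded so the draw can be reproduced
    shuffled = list(applicant_ids)
    rng.shuffle(shuffled)          # random order within the list (the block)
    offered = set(shuffled[:n_positions])
    return {a: int(a in offered) for a in applicant_ids}


# Hypothetical list: 10 eligible partnered applicants, 5 open positions.
applicants = [f"applicant_{k}" for k in range(10)]
assignment = assign_within_list(applicants, n_positions=5, seed=42)
n_offered = sum(assignment.values())
```

Because the draw is made separately within each list, the list itself is the blocking unit, which is why list fixed effects enter the estimating equation.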
IV. Data and Empirical Strategy
The women were interviewed by female enumerators from an independent survey team before they started working. This baseline data collection took place between March 2016 and March 2018, depending on when the firms were hiring, and each follow-up data collection was conducted around six months after the previous interview.6
The survey contains modules on demographic and background information, including measures of earnings and other socioeconomic variables. We developed a comprehensive module for IPV containing questions on both attitudes and experience with IPV. We also include questions on decision-making similar to the questions in the Demographic and Health Surveys (DHS).
We interviewed 1871 partnered women at baseline. Of these, 374 were not randomly allocated to jobs due to a misunderstanding in one place and due to internet problems during the state of emergency in another.7 We still collected data for these women, but we do not include them in our main analysis.8 Out of the 1463 randomly assigned women in our baseline sample, we managed to interview 1262 for the first follow-up, 1174 for the second follow-up, and 1073 for the third follow-up.
Our main specification is:
\[ Y_{i,t_1} = \alpha + \beta\,\mathit{Treatment}_i + \gamma\,\mathit{List}_i + \delta\,Y_{i,t_0} + X_{i,t_0}'\theta + \varepsilon_i \]
where i indexes individuals, t_0 refers to baseline values, and t_1 is the first follow-up. We also show results for t_2 and t_3, that is, for the more medium-run follow-up surveys. Y_{i,t_1} will most often be a measure of abuse (see below). Treatment_i is a dummy variable equal to one if the woman was randomized to get the job offer and zero if not. Its coefficient captures the so-called intention-to-treat effect, and it gives us an estimate of the total effect of being randomized to get a job offer. We always include List_i, a set of list fixed effects (the blocking variables), as women are randomized within this unit. As long as treatment status is randomly assigned, we do not expect any baseline differences between treated and control women. We include control variables in some specifications to see if we can increase precision. In particular, we include abuse at baseline, Y_{i,t_0}, and a vector of individual-level baseline controls, X_{i,t_0} (described below). We use robust standard errors.9
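As an illustration of the specification without controls, the sketch below estimates the treatment coefficient with list fixed effects by demeaning outcome and treatment within each list, which gives the same coefficient as including list dummies. The HC0 form of the robust standard error is an assumption (the text does not say which variant is used), and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def itt_with_list_fe(y, treatment, list_id):
    """Treatment coefficient from OLS with list fixed effects.

    Uses the within transformation (demeaning by list) and returns the
    coefficient together with an HC0 heteroskedasticity-robust standard error.
    """
    groups = defaultdict(list)
    for idx, g in enumerate(list_id):
        groups[g].append(idx)

    y_dm = [0.0] * len(y)
    d_dm = [0.0] * len(y)
    for members in groups.values():
        y_bar = sum(y[i] for i in members) / len(members)
        d_bar = sum(treatment[i] for i in members) / len(members)
        for i in members:
            y_dm[i] = y[i] - y_bar
            d_dm[i] = treatment[i] - d_bar

    sdd = sum(d * d for d in d_dm)
    if sdd == 0:
        raise ValueError("no within-list variation in treatment")
    beta = sum(d * v for d, v in zip(d_dm, y_dm)) / sdd
    resid = [v - beta * d for d, v in zip(d_dm, y_dm)]
    se = sum((d * e) ** 2 for d, e in zip(d_dm, resid)) ** 0.5 / sdd
    return beta, se


# Toy data (hypothetical): two lists of four applicants each.
y = [1, 0, 0, 0, 1, 1, 0, 1]
treatment = [1, 0, 1, 0, 0, 1, 0, 1]
list_id = ["A"] * 4 + ["B"] * 4
beta, se = itt_with_list_fe(y, treatment, list_id)
```

In the toy data the treated-minus-control difference is 0.5 within each list, so the estimated coefficient is 0.5.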
Our main outcome variable, physical or sexual abuse in the last three months (Physical abuse for short), is set equal to one for women who answer that they had a partner do one of the following to them during the last three months prior to being interviewed: pushing, shaking, slapping, throwing something, twisting an arm, striking with a fist or something that could cause injury, or kicking or dragging (any of which is classified by the DHS as “less severe violence”); attempting to strangle or burn, threatening with a knife, gun, or other type of weapon, or attacking with a knife, gun, or other type of weapon (any of which is classified by the DHS as “severe violence”); or physically forcing intercourse or any other sexual acts, or forcing her to perform sexual acts with threats or in any other way (any of which is classified by the DHS as “sexual violence”).
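The coding of the main outcome can be sketched as follows. The item keys are hypothetical shorthand for the survey questions listed above, not the actual variable names in the data; the indicator equals one if any act in any of the three DHS categories is reported for the last three months.

```python
# Hypothetical item keys grouped by the DHS categories named in the text.
LESS_SEVERE = ("pushed", "shaken", "slapped", "thrown_at", "arm_twisted",
               "struck", "kicked", "dragged")
SEVERE = ("strangle_attempt", "burn_attempt", "weapon_threat", "weapon_attack")
SEXUAL = ("forced_intercourse", "forced_sexual_acts")


def physical_abuse_indicator(responses):
    """Return 1 if any listed act is reported, and 0 otherwise.

    `responses` maps item keys to True/False; missing items count as False.
    """
    items = LESS_SEVERE + SEVERE + SEXUAL
    return int(any(responses.get(item, False) for item in items))


example = physical_abuse_indicator({"slapped": True, "kicked": False})
```

Emotional violence items are deliberately left out of this indicator, since they are measured as a separate outcome.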
It is important to apply accurate descriptions of the violence that has occurred in order to maximize disclosure (Ellsberg et al. 2001), so we ask about a wide range of abusive acts using indicators of internationally validated standardized IPV measures. We base the questions and sequencing on the WHO Violence Against Women Instrument (Ellsberg and Heise 2002) and the Conflict Tactics Scales (Straus 2017). Using a modified Conflict Tactics Scale (CTS) has several advantages compared to many other data sets on violence. A characteristic of CTS is that it uses several different questions regarding specific acts of violence. In this way the measure is less likely to be polluted by different understandings of what constitutes violence. CTS is also argued to reduce underreporting, as it gives respondents multiple opportunities to disclose their experiences of violence (La Mattina 2017).
In Table 1, we see that around 29 percent of the women in the sample have ever been physically abused, and around 13 percent have been so during the last three months before the first follow-up. Notably, we see that the rate of recent abuse in the full sample of treatment and control individuals has decreased from 19 to 13 percent from baseline to the first follow-up. In addition to our main outcome, we also measure emotional violence and controlling behaviors. The questions about emotional violence are the same as in the DHS surveys and are coded as one if the partner humiliated, threatened, or insulted the woman.10 We follow Heise and Kotsadam (2015) and create a variable for the number of controlling issues in the last three months by adding the number of positive responses to questions regarding jealousy and controlling and manipulating behaviors.11
Table 1. Descriptive Statistics
As a proxy for bargaining power and female empowerment, we create a decision-making index based on 12 different questions.12 For each of the 12 questions, we create a dummy variable that equals one if the woman decides and zero otherwise.13 We then add the 12 variables together and divide by 12 to get an index ranging between zero and one.14
The survey also includes 11 questions on a wider set of attitudes toward gender equality. We recode each of these questions into dummy variables so that one is gender unequal.15 We again create an index where we add the dummies together and divide by 11.
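Both indexes follow the same construction: a set of 0/1 items is summed and divided by the number of items. A minimal sketch, with a hypothetical function name and hypothetical respondent data:

```python
def share_index(dummies):
    """Average of 0/1 items, giving an index between zero and one."""
    if not dummies:
        raise ValueError("at least one item is required")
    if any(d not in (0, 1) for d in dummies):
        raise ValueError("items must be coded 0 or 1")
    return sum(dummies) / len(dummies)


# Hypothetical respondent who decides on 3 of the 12 decision-making items.
decision_index = share_index([1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0])

# Hypothetical respondent with no gender-unequal answers on the 11 attitude items.
attitude_index = share_index([0] * 11)
```

Note that the two indexes point in different directions: a higher decision-making index means more decisions made by the woman, while a higher attitude index means more gender-unequal attitudes.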
The vector of individual-level controls is all taken from the baseline survey. Employment at baseline is based on the answer to the survey question, “Have you ever had a formal job with salary before?” From this we create the variable Any formal wage job (ever), which equals one if the answer is yes. Table 1 shows that around 31 percent of women have ever had a formal job at any time before the survey.
We also collected data on attitudes toward IPV by asking the same questions as the main ones used in the DHS surveys. For each of the five variables, we code them as one if the respondent agrees that a husband is justified in beating his wife in the five following situations: she goes out without telling him, she neglects the children, she argues with him, she refuses to have sex with him, or she burns the food. Following previous research (for example, Cools and Kotsadam 2017), we also create a variable Father beat mother, which is equal to one if the respondent answers yes to the question, “As far as you know, did your father ever beat your mother?”
We include a set of demographic variables. We retain the continuous coding of age in years and dummy code the religious affiliation of our respondents. The majority are Orthodox Christians, and we let that be the base category (together with the few people answering Catholic or Other) and create dummies for the other two main denominations (Muslim and Protestant). We recode the years of schooling variable into low (<10 years), medium (10 years), and high (>10 years) and use low education as the base category.
We test for baseline balance on these variables both individually and together by regressing treatment on the variables one by one while controlling for the blocking variables (Lists). As many variables are tested, we do not necessarily expect all of them to be statistically insignificant. We see in Column 1 of Table 2 that some of the variables are correlated with treatment status, so that showing results both with and without controls will be useful. We will also present results using an “optimal” set of control variables chosen by means of a double-debiased LASSO approach (Belloni, Chernozhukov, and Hansen 2014). In Column 2, we see that being Muslim and having seen your father abuse your mother are statistically significantly correlated with treatment when we include all variables at the same time. Most importantly, however, we find that the variables cannot predict treatment status together in an F-test (F = 1.26, p = 0.26). In Columns 3 and 4, we test how the same control variables predict our main outcome variable, physical IPV, at follow-up, and we note that they do (F = 3.43, p = 0.06), but that physical IPV at baseline is the only strong predictor. We note that Muslim, which is the variable with the strongest imbalance in treatment probability, is not correlated with physical IPV.
Table 2. Balance Tests and Predictions of Control Variables
In Online Appendix Section A.2, we compare data from our survey to data from the DHS. The rates of physical IPV are similar, and comparing our data to the same areas in the DHS, the numbers are similar also with respect to employment. If we only include women that have ever had a formal job in the same geographical areas, we note that the rates of physical IPV are even more similar. We also show that there is variation across our study areas with respect to levels of physical abuse, employment, divorce rates, and acceptance of abuse as measured in the DHS.
We have several measures that enable us to investigate the effects of job assignment on job take-up and earnings. In the six-month follow-up analysis, we create a variable, Any wage job last 6 months, which equals one if the respondents answer affirmatively to either of the two questions, “Did you start working at Factory X?” (the one where the respondent applied) or “Have you had any other formal salaried job with salary since the last interview?” For the later follow-up analyses (at 12 and 18 months), we instead create a dummy variable based on earnings from any wage job (where one equals positive earnings).16
Because not all women offered a job start working and some women not offered a job at this time are able to find another job, we do not expect treatment to perfectly predict job status. To measure and to some extent account for imperfect compliance, we also estimate instrumental variables (IV) models. It should be noted that the exclusion restriction need not hold for variables such as earnings and income shares, as it is likely that getting a job affects a person’s identity in addition to the effects it has on income. We therefore prespecified that the intention to treat specification is the main specification. The IV models should rather be seen as explorative tests of mechanisms for the results.
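With a single binary instrument and no covariates, the IV estimate reduces to the Wald ratio: the effect of the job offer on the outcome divided by the effect of the job offer on employment. The sketch below shows this simplified case. It omits the list fixed effects used in the paper's models, and the function name and toy data are hypothetical.

```python
from statistics import mean


def wald_iv_estimate(outcome, employed, offered):
    """Wald estimator: reduced form divided by first stage.

    offered  -- 1 if randomized to a job offer, 0 otherwise (the instrument)
    employed -- 1 if the woman had a wage job, 0 otherwise (the instrumented variable)
    outcome  -- outcome of interest, for example an abuse indicator
    """
    def diff_by_offer(values):
        treated = [v for v, z in zip(values, offered) if z == 1]
        control = [v for v, z in zip(values, offered) if z == 0]
        return mean(treated) - mean(control)

    reduced_form = diff_by_offer(outcome)   # intention-to-treat effect
    first_stage = diff_by_offer(employed)   # effect of the offer on employment
    if first_stage == 0:
        raise ValueError("the offer does not shift employment; IV is undefined")
    return reduced_form / first_stage


# Toy data (hypothetical): 8 women, half of them offered a job.
offered = [1, 1, 1, 1, 0, 0, 0, 0]
employed = [1, 1, 1, 0, 1, 0, 0, 0]
outcome = [0, 0, 1, 0, 1, 1, 0, 0]
iv_estimate = wald_iv_estimate(outcome, employed, offered)
```

The ratio scales the intention-to-treat effect by the share of women whose employment status is moved by the offer, which is only interpretable as a causal effect of employment if the offer affects the outcome through employment alone.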
V. Main Results
We start by showing the effects of the randomization on employment related variables after six months in Panel A of Table 3. We see a large effect on the probability of having had any wage job during the last six months. While 29 percent in the control group have had such a job, this share increases to 69 percent for the treatment group. At the first follow-up, around 63 percent of the women had started working in the factory where they applied for jobs. The main reason women did not start working was that they were not satisfied with the wage and working hours they were offered. At the first follow-up, around one-third of the women who started working in the factory they initially applied for had quit. The main reason for quitting was dissatisfaction with the salary. Most of the women in the control group who had worked between baseline and follow-up had worked in a factory or in retail.
Table 3. First Stages: Effects of Treatment on Employment and Earnings in Different Waves
We also see large effects on earnings and on the woman’s share of couple earnings and incomes. Women’s earnings from wage jobs more than double (Column 2), their share of within-couple wage earnings increases (Column 3), and the probability that a woman earns more than her partner increases from 18 percent to 32 percent (Column 4).17 In Column 5, we show that incomes from any source are also higher in the treatment group. In Panels B and C, we show the longer-term effects on the same variables, and we note that, while the effects are smaller in later periods, there are still effects after 18 months. As we show in Online Appendix Table A1, the results are very similar if we add control variables. In Online Appendix Table A2, we also see that job offers lead to more total hours worked overall, less leisure time, less time spent on social and religious activities, and more travel time.18
In Table 4, we show the effects of job offers (Treatment) on IPV. The results show that Treatment is not statistically significantly related to physical IPV in any of the follow-up waves, but we note that all coefficients on physical abuse are negative. In Panel A, we see the results from the first follow-up data. In Column 1, we show the results from our main specification, which includes only the list fixed effects. The coefficient for Treatment is −0.01, and conducting an equivalence test with two one-sided t-tests (TOST), we can reject effects more negative than −0.043 and more positive than 0.023.19 Hence, we can reject relatively small effects, especially for increased physical abuse. The results are very similar if we add the vector of individual level baseline controls, as we show in Column 2.20
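The logic of the equivalence test can be sketched under a normal approximation to the t-tests used in the paper. The inputs are illustrative assumptions rather than reported values: a point estimate of −0.01 and a standard error of 0.02, chosen so that the resulting bounds land close to those quoted above. The function names are hypothetical.

```python
from statistics import NormalDist


def tost_rejectable_bounds(estimate, std_error, alpha=0.05):
    """Tightest lower and upper effect sizes rejected by two one-sided tests.

    Each one-sided test is run at level alpha, so the bounds coincide with a
    (1 - 2 * alpha) confidence interval around the estimate.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    return estimate - z * std_error, estimate + z * std_error


def tost_equivalent(estimate, std_error, lower, upper, alpha=0.05):
    """True if both 'effect <= lower' and 'effect >= upper' are rejected."""
    z = NormalDist().inv_cdf(1 - alpha)
    rejects_lower = (estimate - lower) / std_error > z
    rejects_upper = (upper - estimate) / std_error > z
    return rejects_lower and rejects_upper


# Assumed inputs for illustration only (the standard error is not from the text).
lower_bound, upper_bound = tost_rejectable_bounds(-0.01, 0.02)
```

Because the point estimate is negative, the rejectable bound on the positive side sits closer to zero than the bound on the negative side, which is why increases in physical abuse can be ruled out more tightly than decreases.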
Table 4. Reduced Form Effects of Treatment Assignment on IPV in Different Waves
We see in Columns 3 and 4 that there is a negative effect on emotional violence in the first follow-up data. This effect is large and suggests that emotional violence is reduced by 5.3 percentage points (26.5 percent from the mean in the control group). In Online Appendix Table A5, we show that the estimated effect on emotional violence seems to be driven by all three components (humiliation, threats, and insults) being reduced. In general, the control variables do not do much to affect the estimates, but they do not affect the standard errors much either. In Online Appendix Table A6, we show that the results are also similar when using an “optimal” set of controls, using a double-debiased LASSO regularization approach (Belloni, Chernozhukov, and Hansen 2014). Notably, the only selected control variables are the outcome variables measured at baseline. This analysis was not prespecified.
In Panels B and C, we show the longer-run effects, and we note that the effects on physical abuse are relatively stable, but the effects on emotional abuse seem less robust. In particular, the coefficient is very close to zero and not statistically significant in the 12-month follow-up. While we preregistered the analyses of emotional abuse, we still view the results as exploratory, as this is not our main outcome. All in all, however, it seems as if job offers did not cause an increase in IPV.21
We show in Online Appendix Table A9 that attrition is unrelated to treatment status. The only variable correlated with attrition is age; older women are less likely to attrite. In all, we conclude that selective attrition does not seem to be a problem for the analysis.22
In Table 5, we show results for our main variable to be instrumented, Any wage job last six months. In Columns 1 and 2 of Panel A, we show the ordinary least squares (OLS) relationships between baseline wage job and physical abuse. We note that the correlation is positive, as in previous literature focusing on Africa and as in the DHS survey for Ethiopia in 2016 (where women employed last year have a two percentage point higher IPV rate last year). In Columns 3 and 4, we show the OLS results for emotional abuse. We note that while these coefficients are also positive, they are not statistically significant. In Panel B, we show the causal effects of having had a wage job during the last six months when it is instrumented by the randomized job offer. In Columns 1 and 2, we see that the coefficient is negative for physical abuse, but it is not statistically significant. The IV results for emotional abuse are negative and statistically significant. In Online Appendix Tables A10 and A11, we present the results from IV models with other employment related variables.
Table 5. Correlations and Effects of Wage Jobs on IPV
There may be several reasons why employment does not affect physical abuse. For instance, it could be that employment does not affect important mediators. It is additionally possible that employment affects mediators, but in ways that together have offsetting effects on IPV. Decision-making, attitudes toward gender equality, attitudes toward abuse, and controlling behavior are factors that are likely mediators for how employment could impact abuse. We show in Table 6 that there is indeed a correlation between these variables and physical abuse at baseline (except for the gender equality index, for which the correlation is not statistically significant). For emotional abuse there is no statistically significant correlation with the decision-making index either. In Online Appendix Table A14, we see that the number of controlling issues is the only one of these potential mediators that is statistically significantly correlated with severe physical violence and sexual violence.
Table 6. Correlation at Baseline between Abuse and Potential Moderators
In Table 7, we see that there is no treatment effect on any of these variables. To check the robustness of the results on decision-making with respect to coding choices (Peterman et al. 2021), we show in Online Appendix Table A12 that the results are similar if we instead code the decision-making variable based on a median split or if we use a principal component analysis to create the index. In Online Appendix Table A13, we also show the estimated effects on answers to each of the questions that comprise the decision-making and equality indexes. We see that there is only one statistically significant effect of job offers out of all the gender equality variables. Women in the treatment group are 4.5 percentage points more likely to agree that “It is okay for women to travel or to leave the house for several nights to do business.”
Table 7. Reduced Form Effects on Potential Mediators
We find very little evidence of effect heterogeneity with respect to any of the baseline control variables (see Online Appendix Section A.1). In particular, there is no statistically significant difference in the effects for women of different ages, religion, or education levels. Neither is there any difference for women with different attitudes towards domestic violence or who had different experiences with their fathers abusing their mothers. We further note that there is no difference in the effects for women who had been employed before or not, nor between women who had recently been abused before or not.23 In total, we note that there is very limited evidence for heterogeneous treatment effects. This may be due to lack of important variables in our data, such as baseline attitudes of the husbands, or because women applying for jobs in our context are relatively similar, so there is little variation in the baseline measures.
VI. Addressing Reporting Issues: Results from List Experiments
Reported abuse is a function of both abuse and the propensity to report it, and we cannot separately identify the two. When asking about experience with IPV, we worry that individuals may conceal their experiences to conform to social norms or because they are ashamed. If such social norm bias is related to employment, it can seriously undermine the credibility of our self-reported measures. In particular, employment can be expected to increase reporting. Getting a job exposes the women to many settings in which they are expected to speak their minds, such as salary negotiations and asking for leave. Working together with other women, away from their husbands, may also lead to discussions of IPV.
While we believe that underreporting may occur in our data, we still think that the problem is limited due to the careful data collection. One indication of this is the high reported prevalence and the high acceptance of violence in the data. In any case, there are no available data on IPV from other sources (for example, from the police or hospitals) at the local level in Ethiopia. Even if such data existed, it is unlikely that reporting bias would be lower. Using DHS data, Palermo, Bleck, and Peterman (2014) show that there is much larger underreporting to formal sources than in surveys. In fact, only 7 percent of the women who reported IPV in the DHS surveys had reported to a formal source.
To investigate the issue of underreporting and social desirability bias, we randomly divided a sample into two groups and asked respondents to count the number of true statements on a list that either includes a sensitive statement or not, in a so-called “list experiment.” By comparing the number of statements reported as true across the two groups, we get a measure of prevalence without any specific individual having revealed her own status. By also asking a question about the sensitive statement directly to the list control group, we can assess the degree of underreporting by comparing the results from the two different ways of asking. The degree of underreporting can then also be compared across subgroups, for example, those offered a job and those not, or those employed and those not employed. Three papers use list experiments to investigate underreporting of IPV across subgroups, and none of them finds it to be correlated with employment (Peterman et al. 2018; Agüero and Frisancho 2017; Joseph et al. 2017). Bulte and Lensink (2018), however, evaluate an empowerment course and find that using list experiments affects the conclusions.
We conduct the list experiment on a sample of 367 women (254 of whom are in our main sample) who were participating in an empowerment course after the first follow-up survey in January–April 2018. On the final day of the course, we had them answer a questionnaire. The data collection started with detailed instructions on how to answer the questions (Online Appendix Figure A3). In Figure 1, we show the control and treatment questions when the variable of interest is “My partner sometimes hits me.” The control list includes four statements that we are not interested in and that are used only to get an average to compare with the other group. The treatment list includes the same statements and adds the statement of interest. The control statements are chosen to avoid ceiling and floor effects and to include items that are negatively correlated so as to increase power (Glynn 2013). To take a concrete example, let us say that the list control group answers that two of the four statements are true on average, and the list treatment group answers that 2.5 of the statements are true on average. Since the only difference between the two groups is the extra statement on IPV, we would infer that 50 percent of the individuals in the list treatment group had experienced IPV.
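The logic of the concrete example can be sketched in a few lines of Python. This is only an illustrative difference-in-means calculation, not the estimation code used for the paper: the function name `list_experiment_estimate` and the item counts are hypothetical, chosen so that the group means are 2.0 and 2.5 as in the example.

```python
from math import sqrt
from statistics import mean, variance


def list_experiment_estimate(control_counts, treatment_counts):
    """Difference-in-means prevalence estimate with an unpooled standard error.

    Each element is the number of statements one respondent reports as true.
    """
    estimate = mean(treatment_counts) - mean(control_counts)
    se = sqrt(
        variance(control_counts) / len(control_counts)
        + variance(treatment_counts) / len(treatment_counts)
    )
    return estimate, se


# Hypothetical answers (not data from the study).
control = [1, 2, 3, 2, 2, 2, 1, 3]    # four control statements, mean 2.0
treatment = [2, 3, 3, 2, 3, 2, 2, 3]  # same statements plus the sensitive one, mean 2.5

estimate, se = list_experiment_estimate(control, treatment)
print(f"Estimated prevalence: {estimate:.2f} (SE {se:.2f})")
```

With so few hypothetical respondents the standard error is large relative to the estimate, which mirrors the general point that list experiments trade privacy for statistical precision.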
Figure 1. List Experiment for the Question “Partner Sometimes Hits”
We also included another list to measure “Partner punched last 3 months.” The list treatment group got the list shown in Online Appendix Figure A4, and the list control group got a list without Item 2.
In Table 8, we show the results of the list experiments. We see that individuals getting the list with the additional statement about the partner sometimes hitting answer 0.18 more true statements on average. The interpretation is that 18 percent of the individuals have partners who sometimes hit them. When asking the question directly to the control group, we see that 15 percent answer that they have partners who sometimes hit them. While the directly reported share is lower, the difference is neither large nor statistically significant. For the list experiment with “been punched by your husband in the last three months,” we get a larger difference, but it is not statistically significant either. We see that people in the list control group answer that around 1.5 of the four control items are true on average for both lists. In Online Appendix Section A.3, we show the results of additional robustness tests and balance checks.
Table 8. List Experiment
Moving over to differences in reporting across subgroups, we split the samples into those offered a job (treated) and not (control) and into those employed at baseline or not. As seen in Figure 2, which shows the point estimates and 95 percent confidence intervals, there does not seem to be a difference for the statement “partner sometimes hits” for any of these groups. An important caveat to these analyses is that jobs may affect the control items as well, so the results should be interpreted with care. Another disadvantage is that the list experiment leads to relatively noisy estimates. Online Appendix Figure A5 shows the same type of figure for the second list experiment.
Figure 2. List Experiment: “Partner Sometimes Hits” by Subgroups
Notes: Treated and control refer to the randomization of job offers in the field experiment. List refers to the estimated prevalence of having a partner who sometimes hits, as measured in the list experiment. Direct refers to the prevalence when using a direct survey question. Difference refers to the list experiment estimate minus the direct estimate. The difference-in-difference estimate for the treated and control, that is, the effect as measured with the list experiment, is not statistically significant (p = 0.25). 95 percent confidence intervals are shown.
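As a rough illustration of the quantities described in these notes, the sketch below computes the underreporting gap (list estimate minus direct estimate) within each subgroup and then the difference between the two gaps. It is a minimal sketch under stated assumptions: the function names and all numbers are hypothetical, the direct answers are taken from the list control group as described in the text, and no standard errors or p-values are computed, so it does not reproduce the reported test.

```python
from statistics import mean


def underreporting_gap(control_counts, treatment_counts, direct_answers):
    """List-experiment prevalence minus directly reported prevalence.

    direct_answers holds 0/1 replies to the direct question, asked of the
    list control group.
    """
    list_prevalence = mean(treatment_counts) - mean(control_counts)
    direct_prevalence = mean(direct_answers)
    return list_prevalence - direct_prevalence


# Hypothetical subgroup data (not data from the study).
offered = {
    "control_counts": [1, 2, 2, 1],    # mean 1.5
    "treatment_counts": [2, 2, 2, 1],  # mean 1.75
    "direct_answers": [0, 0, 1, 0],    # 25 percent report directly
}
not_offered = {
    "control_counts": [1, 2, 1, 2],    # mean 1.5
    "treatment_counts": [2, 2, 2, 2],  # mean 2.0
    "direct_answers": [0, 1, 0, 0],    # 25 percent report directly
}

gap_offered = underreporting_gap(**offered)
gap_not_offered = underreporting_gap(**not_offered)
diff_in_diff = gap_offered - gap_not_offered

print(f"Gap, offered a job: {gap_offered:.2f}")
print(f"Gap, not offered a job: {gap_not_offered:.2f}")
print(f"Difference between gaps: {diff_in_diff:.2f}")
```

A difference between gaps close to zero would indicate that the propensity to underreport does not vary with the job offer, which is the pattern the figure is meant to assess.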
While we can never completely rule out that being offered a job affects reporting, we find the results reassuring. In addition, we are not particularly worried about researcher demand effects whereby the respondents would answer the questions in a way to try to please the enumerators. First, neither the enumerators nor the respondents had any reason to believe that the main interest lies in investigating IPV. The survey was framed as one “to study the lives of women seeking work in the industrial sector in Ethiopia.” The survey is also long (it takes between 60 and 90 minutes to complete the interviews), and only a small subset of the questions are about IPV.
In our data, abuse decreases for both treatment and control women from baseline to the first follow-up. We do not know why abuse has declined in our sample. It may be that general changes in Ethiopian society and in our areas in particular (such as high growth, increased male and female employment rates, and political liberalization) reduce IPV.24 It may also be that reporting of abuse decreases when women are interviewed several times. We do not believe this to be the case for several reasons. First, we would expect more reporting over time as the women build up a relationship with the enumerators. Second, previous studies have not found any evidence for such survey effects, even when explicitly testing for them (Haushofer et al. 2019). For social desirability to affect the internal validity of our conclusions, it would have to be the case either that abuse increases, or decreases less, in the treatment group but that these women do not want to tell us (anymore), or that abuse decreases in the treatment group but those in the control group do not want to tell us that they are still abused. As we do not observe any effects of treatment on the acceptance of abuse, we find such effects particularly unlikely.
VII. Conclusion
Intimate partner violence (IPV) is harmful and costly for society (Fearon and Hoeffler 2014). It is related to a host of negative outcomes for the women who are abused and the people around them (Carrell and Hoekstra 2010; Pollak 2004; Doyle and Aizer 2018; Aizer 2011). The theoretical predictions on the effects of jobs on IPV are ambiguous, and fears have been raised that increased IPV may make the utility of female employment negative, at least in the short run (Heath and Jayachandran 2016).
We identify the individual-level effects of formal employment on IPV by randomly assigning job offers to equally qualified applicants, in collaboration with large companies in Ethiopia. We find that being offered a job does not increase physical or sexual abuse, despite finding large effects on the probability of working, on time use, and on the women’s absolute and relative incomes. We find that job offers reduce emotional violence in the short run, but the longer-term results suggest that this effect is not stable over time. There are no effects of job offers on attitudes toward gender equality, attitudes toward abuse, decision-making power, or controlling behavior. Hence, we do not find any effects on the mediators that we have tested. There are many other possible mediators that we do not have in our data, however, such as reduced stress and negative affective states for the husbands.
It is difficult to know why there is a positive correlation between employment and abuse in the cross-section, but our results suggest that it may be driven by selection rather than being a causal relationship. In addition, the margin at which we study the effects is one where everyone applies for a job, and it could be the case that it is the decision to apply that causes violence. Furthermore, the women applying for jobs may be only the ones who do not expect abuse to increase. It could also be that contextual-level employment is more important than individual-level employment. In a bargaining framework, improved employment opportunities increase the bargaining power of all women, including those who are currently not employed, and hence the contextual level of employment may be what determines outside options and threat points (Aizer 2010).
The context in which we are investigating the effects is one where we should expect the increases in abuse following job offers to be large if men use IPV to extract resources from the women. Acceptance of abuse is high, and acceptance of divorce is low in Ethiopia. Finding that job offers do not increase abuse in such a setting is comforting, and we view it as possible that job offers could be protective in other settings with different moderating macro-level factors. The results may also differ depending on specific characteristics of the jobs, such as their empowering potential, how far away they are, and their wages. We strongly urge future studies to conduct similar field experiments in different settings, so we will learn whether there is no relationship overall or whether our results stand out in some way. Finally, we hope that future studies using randomization of job offers can investigate mechanisms to a larger extent, preferably by combining quantitative and qualitative methods.
Acknowledgments
The authors thank Sara Cools, Eliana la Ferrara, David McKenzie, Amber Peterman, Gaute Torsvik, Ole Røgeberg, and Henning Øien for valuable comments. Funding from the Norwegian Research Council and an anonymous donor is acknowledged. The research presented in this paper has IRB approval from The Norwegian Center of Research Data (number 55793). An analysis plan was preregistered at the AEA RCT registry (number AEARCTR-0002569) before any follow-up data were received. The plan, a paper with all prespecified analyses (“Populated PAP”), and replication material are found at https://andreaskotsadam.wordpress.com/publications/.
Footnotes
↵1. We mainly use the terms IPV or abuse in this paper, and we take it to mean physical (including sexual) and emotional violence against women perpetrated by their partners.
↵2. The United States (Aizer 2010), Spain (Tur-Prats 2017), Sweden (Ericsson 2020), the United Kingdom (Anderberg et al. 2016), Mexico (Davila 2018), Turkey (Erten and Keskin 2021), and India (Amaral, Bandyopadhyay, and Sensarma 2015; Chin 2012).
↵3. Most previous studies use Bartik instruments. The identifying assumptions in such models have been scrutinized lately, and reanalyses of papers using the method have shown that identification usually comes from the industry shares (Goldsmith-Pinkham, Sorkin, and Swift 2020). In addition, Adao, Kolesár, and Morales (2019) show that inference in popular Bartik designs is problematic due to correlated residuals across areas with similar industry shares.
↵4. Across the 56 quantitative outcomes included in a review by Buller et al. (2018), more than half were statistically insignificant. Baranov et al. (2021) find statistically significant average reductions in a recent meta-analysis, however.
↵5. They found that an entrepreneurial program had larger effects on employment in the short run, but going back to the sample five years later they found complete convergence in employment across all groups over time (Blattman, Franklin, and Dercon 2019). As compared to the five firms they study, our firms are more geographically spread out and our sample includes mostly married women, whereas their sample includes mostly single men and women.
↵6. There is some variation in timing due to a state of emergency and insecurities in some areas at some points in time.
↵7. The lists and the results from the randomization were sent back and forth by email, and when the internet was not working, the randomization was not carried out. In the case of the misunderstanding, the field worker had not understood that they should wait for the randomized lists to return. In both of these cases, the women were assigned to jobs in a nonrandomized way.
↵8. The results including these women are very similar, and none of the conclusions change if we do include them, as we show in Online Appendix Table A7.
↵9. There is no need to cluster the standard errors at the factory level since the randomization is at the level of the individual (Abadie et al. 2017).
↵10. See survey questions 13–15b in the survey provided in Online Appendix Section A.4 for exact wording.
↵11. See questions 7b–11b.
↵12. We have 15 different questions in the survey on intrahousehold decision-making. Not all questions apply to all people in the sample, however. For example, the decision to send a child to school has missing values for all individuals who do not have children. We therefore preregistered that we would use the 12 questions that were more likely to apply to everyone (questions J1.03–J1.15 in the survey).
↵13. If the individual decides together with the partner we code the variable as one only if she has “a lot” of input into the decision (that is, category 4 on the J1B questions) and otherwise as zero.
↵14. There is an active discussion in the literature on how to best measure decision-making power. Concerns have been raised that small changes in indicator construction may lead to different results, and a recommendation is to first conduct work on the perceptions of decision-making (Seymour and Peterman 2018; Peterman et al. 2021). Our data collection was not preceded by any work on perceptions, but we will present results from different types of indexes and for each variable separately.
↵15. See questions GA1–GA11 in the survey. We recode, for example, 1 or 2 to be 1 on statement GA1 and 3 or 4 on statement GA2.
↵16. This was not prespecified in the analysis plan, but we change it anyway, as it makes little sense to continue to base the variable on whether they started working at the factory.
↵17. If we use earnings from any source the results are similar, increasing from 15 to 31 percent.
↵18. The women still live at home and travel to work each day, and we see that travel time is increased by treatment. Women in the control group spend 4.65 hours travelling per week, and this increases by 0.65 hours for the treatment group.
↵19. Alternatively, the 95 percent confidence interval is [−0.049, 0.029], which is very similar. We prefer the one-sided equivalence tests for conceptual reasons, as one tests whether effects are larger than a highest value and lower than a lowest value. In practice, it makes little difference. If we pool all post-treatment waves together in order to maximize power, we can reject effects more negative than −0.031 and more positive than 0.013 (see Online Appendix Table A3). We did not preregister pooling the waves.
↵20. Breaking the effect down by different components of physical abuse, we see in Online Appendix Table A4 that there does not seem to be any effect on less severe, severe, or sexual abuse.
↵21. We also use data on where the individuals lived at baseline to explore whether there seem to be spillover effects. We use the lowest administrative level in Ethiopia, the kebele, which is either a neighborhood or a village. We then calculate the share of people in each kebele who are treated, and we include this variable and its interaction with Treatment. This allows us to investigate whether there are effects of randomly having more individuals from one's kebele assigned to a job and whether the treatment effect differs for people with a larger share of treated in their kebele. As seen in Online Appendix Table A8, the answer to both of these questions is no. These analyses were not preregistered.
↵22. Applying so-called Lee bounds (Lee 2009) to our data, the effects on physical abuse are never statistically significant, and the upper bound on the effect on emotional abuse is not statistically significant.
↵23. We also tested whether there was a difference in effects between those who had ever been abused and those who had not. In the theoretical model of Anderberg et al. (2016), such a situation offers the most interesting case in terms of revealing information about husband type. The prediction is that men will be less likely to signal that they are of the abusive type in situations where women have a better outside option. This would also be consistent with Tankard et al. (2019), who find that a savings intervention in Colombia reduced the risk of IPV only for women never abused at baseline. We find no difference in the effects across these groups. We also used the generic machine learning approach by Chernozhukov et al. (2018) to search for heterogeneous treatment effects, but an omnibus test suggests that there is no treatment effect heterogeneity in the data.
↵24. The political liberalization during this time was extraordinary and rapid. For instance, Ethiopia saw the release of thousands of political prisoners, opposition groups and media were allowed to work more freely, and a peace deal with Eritrea led to the Nobel Peace Prize being awarded to the new prime minister.
- Received July 2021.
- Accepted May 2022.
This open access article is distributed under the terms of the CC-BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0) and is freely available online at: https://jhr.uwpress.org.