What Is Civil War? Conceptual and Empirical Complexities of an Operational Definition
Author(s): Nicholas Sambanis
Source: The Journal of Conflict Resolution, Vol. 48, No. 6 (Dec., 2004), pp. 814-858
Published by: Sage Publications, Inc.
Stable URL: https://www.jstor.org/stable/4149797
Accessed: 08-11-2018 15:30 UTC

What Is Civil War?
CONCEPTUAL AND EMPIRICAL COMPLEXITIES OF AN OPERATIONAL DEFINITION

NICHOLAS SAMBANIS
Department of Political Science
Yale University

The empirical literature on civil war has seen tremendous growth because of the compilation of quantitative data sets, but there is no consensus on the measurement of civil war. This increases the risk of making inferences from unstable empirical results. Without ad hoc rules to code its start and end and differentiate it from other violence, it is difficult, if not impossible, to define and measure civil war. A wide range of variation in parameter estimates makes accurate predictions of war onset difficult, and differences in empirical results are greater with respect to war continuation.

Keywords: civil war; Correlates of War; data sets; coding rules

Advances in the empirical literature on civil war depend critically on [...] and refinement of data on civil war occurrence. Most civil war lists rely [...] Correlates of War (COW) project.1
Since the first COW list was published, there has been little peer review of COW coding rules. Within the walls of the COW project, there has been substantial debate on how to improve data quality. That debate, however, has not benefited from open scholarly discourse.2 Given the project's prominence, many large-N studies use COW data unquestioningly or limit themselves to making only minor changes to COW data.3

1. Singer and Small (1972, 1994), Small and Singer (1982), Sarkees and Singer (2001), and Sarkees (2000).
2. Personal communication (July 31, 2001) with Stuart Bremer, founding member of the Correlates of War (COW). COW recently initiated an online forum for public debate, but the forum [...]
3. Walter (2002) and Collier and Hoeffler (2001) use COW data without making any [...] Mason and Fett (1996) make only minor changes.

AUTHOR'S NOTE: For comments, I thank Keith Darden, Michael Doyle, James Fearon, [...] Gleditsch, Håvard Hegre, Stathis Kalyvas, David Laitin, John Mueller, Monica Toft, Ben Valentino, Elisabeth Wood, participants in the Laboratory in Comparative Ethnic Processes (Dartmouth M[...] 2001), and participants in the University of British Columbia's workshop on "Econometric [...] War." I also thank Annalisa Zinn, Douglas Woodwell, Katherine Glassmyer, Ana Maria A[...] Rose Armentia Bordon for research assistance. All files necessary to replicate the analysis [...] be found at http://pantheon.yale.edu/-ns237/index/research.html#Data and www.yale.edu/unsy[...]

JOURNAL OF CONFLICT RESOLUTION, Vol. 48 No. 6, December 2004 814-858
DOI: 10.1177/0022002704269355
© 2004 Sage Publications

Currently, about a dozen research projects have produced civil war lists based on apparently divergent definitions of civil war, but there is less pluralism here than one might think. Most projects do not conduct [...] historical research and depend heavily on COW.
The result may be the replication of errors due to the original COW coding rules and uncertainty about whether [...] definitions generate different results.

I take a close look at the operational definition of civil war and review several coding rules used in the literature and analyze their implications for our understanding of civil war. To make the discussion concrete, I refer to specific cases. This [...] generates insights into our ability to measure civil war and distinguish it from other forms of political violence. Drawing on those insights, I propose a new coding rule and [...] a new civil war list. I then measure the impact of differences across coding rules by regressing the same civil war model on 12 different versions of war onset and prevalence. Although it may be impossible (and some would argue undesirable) to reach a consensus on a single definition and measurement of civil war, it is important to know if our understanding of civil war is affected by different coding rules. The analysis shows that some of our substantive conclusions are sensitive to differences in coding rules, whereas others are remarkably robust.

Significant differences across civil war lists are mainly due to disagreement on three questions: What threshold of violence distinguishes civil war from other forms of internal armed conflict? How do we know when a civil war starts and ends? How can we distinguish between intrastate, interstate, and extrastate wars? Answers to these questions are not only relevant for the purposes of accurate coding, but they also reveal the degree to which we share a common understanding of the concept of civil war.

I discuss these issues as a way to explore what differences across coding rules tell us about the meaning of civil war. I do not offer new theory that helps distinguish civil war from other forms of violence.4 Rather, I want to see if available [...] definitions (coding rules) allow us to measure and analyze civil war as a distinct category with a "natural history" (cf. McAdam, Tarrow, and Tilly 2001), a [...] that is different from that of other forms of political violence (cf.
Tilly [...]). I implicitly accept here the premise that civil wars are different from other [...] consider if the coding rules we have at our disposal are sufficient to make an empirical distinction between civil war and other violence.

I find that it is not possible to arrive at an operational definition of civil war without adopting some ad hoc way of distinguishing it from other forms of armed conflict. Although a core set of "ideal" cases of civil war may exist, too many cases are sufficiently ambiguous to make coding the start and end of the war problematic and to question the strict categorization of an event of political violence as a civil war, as opposed to an act of terrorism, a coup, genocide, organized crime, or international war.5 In the end, it may be difficult to study civil war without considering how [...] in conflict shift from one form of violence to another, or it may be profitable to [...] political violence in the aggregate, rather than cut across that complex phenomenon with arbitrary definitions. This article is a first step in exploring [...], providing an analytical review of existing coding rules that highlights the [...] accurately defining and measuring civil war. It also improves currently available coding rules at the margin, so as to produce a civil war list that is more consistent with the core elements of most working definitions of civil war.

4. I offer such a theoretical discussion in a book-length manuscript, currently in progress. For a discussion of why we should distinguish civil war from other political violence, see Sambanis [...]
5. Proceeding theoretically, rather than empirically, Tilly (2003, 14) makes a similar argument, [...] that civil war does not have a distinct causal logic. It is a form of "coordinated destruction" [...] that includes various other forms of political violence that generate salient "short-run damage" [...] perpetrated by highly coordinated actors (the two dimensions that describe Tilly's typology [...]).

HOW WOULD WE KNOW A CIVIL WAR IF WE SAW ONE?
In their seminal study Resort to Arms, Small and Singer (1982, 210) defined civil war as "any armed conflict that involves (a) military action internal to the metropole, (b) the active participation of the national government, and (c) effective resistance by both sides." The main distinction they drew between civil (internal or intrastate) and interstate or extrastate (colonial and imperial) war was the internality of the war to the territory of a sovereign state and the participation of the government as a combatant. Civil war was further distinguished from other forms of internal armed conflict by the requirement that state violence should be sustained and reciprocated and that the war exceeds a certain threshold of deaths (typically more than 1,000).

This definition is deceptively straightforward. It is, I will argue, difficult, if not impossible, to develop an operational definition of civil war without adopting ad hoc coding rules to distinguish civil wars from other forms of political violence and to accurately code war onset and termination. First, it is often difficult to distinguish extrastate from intrastate wars: for example, the Russian civil war in Chechnya in the 1990s might be considered as a war of decolonization, similar to Cameroon's war of independence in 1954.

Second, it is unclear what degree of organization is required of the parties to distinguish a civil war from one-sided, state-sponsored violence. In some cases, the government has ceased to exist, but we still code a civil war (e.g., Somalia [...]). In other cases, the government may be fighting a war by proxy using militias in seemingly disorganized intercommunal clashes (as in Kenya's Rift Valley from [...] to 1993), but most would not classify such a case as a civil war. Elsewhere, [...] organizations are indistinguishable from criminal networks or ragtag militias [...].

Third, if we focus on a numerical threshold of deaths to identify wars, how do we deal with the problem of unreliable reporting and incomplete data?
Even with [...] data, should termination be coded only on the basis of significantly reduced [...] or should we also focus on discrete, easily coded events, such as peace [...]?

Fourth, given that violence during civil war is typically intermittent, how do we determine when an old war stops and a new one starts? And how can we distinguish the end of a civil war from the beginning of a period of politicide, terrorism, or other forms of violence?

The COW project has provided a valuable public good to the profession by coding cases of war, but it has not offered much guidance on how to answer these difficult questions. It may also have caused some confusion. One source of conceptual confusion is the lack of clarity on the threshold of deaths used to distinguish civil war from other violence. Small and Singer (1982, 213) used an annual death threshold of 1,000 deaths in their early coding efforts, although they later seem to have abandoned this criterion. Many authors still operate under the assumption that the COW project uses an annual battle-death criterion.6 This uncertainty stems from the fact that COW coding rules have changed at least three times. The first major change was the [...] an annual death criterion for civil wars in Resort to Arms, whereas no such requirement existed for international wars in The Wages of War 1816-1965 (Singer and Small 1972). Then the annual death criterion for civil wars was relaxed (Singer and Small 1994), but it was still required for extrasystemic wars, although now deaths [...] by nonsystem members were counted (Singer and Small 1994, 5).7 In a third set of changes, Sarkees (2000, 129) and Sarkees and Singer (2001) classified [...] extrasystemic wars as civil wars while abandoning the annual death criterion for extrastate wars.8
It is unclear why the definition of extrasystemic war kept changing and if the new rules were used to recode all armed conflicts in empires and [...] throughout the period covered by COW.9 More important, if data on annual deaths had been collected to code civil wars according to the old (annual) criterion up to the [...] revision, why would COW researchers not have made use of those data, instead of [...] much of that information by using a binary variable denoting the onset and termination of civil war? To fully evaluate the research that went into compiling [...] lists, one needs access to COW's raw data and coding rules.10

COW's efforts to constantly refine its data and improve its coding rules are [...]. But this process raises the following question: were the new rules [...] reapplied to historical data? When the annual death criterion was abandoned, how did COW coders determine the beginning and end of a civil war? Did they code the first year with 1,000 deaths as the onset of the war?11 Or did they code the start [...] that the cumulative death toll surpassed the 1,000 mark? An example that

6. See Walter (2002), Gleditsch et al. (2002, 617), and Licklider (1995, 682).
7. According to Singer and Small (see codebook, 1994), interstate system membership is defined in terms of "certain minimal criteria ... at least 500,000 total population and either diplomatic recognition by at least two major powers or membership in the League of Nations or United Nations." Extrasystemic wars are fought between "a nation that qualifies as an interstate system member ... [and] a political entity [...] an interstate system member." They are subdivided into imperial and colonial wars. I refer to extrasystemic and extrastate wars interchangeably, as they both refer to colonial and imperial wars.
8.
As I was working on final revisions of this study, I received an e-mail message from Sarkees (March 15, 2004), stating that the 1,000 battle-death threshold has never been abandoned by the COW project and that poor editing of the Sarkees (2000) article gives readers this mistaken impression. Several members of the COW team had read early versions of this study, and they had never corrected [...]. Moreover, I have referred to codebooks and publications from COW that counter Sarkees's recent [...].
9. A concern is that information on deaths of colonial subjects might have been systematically [...] in empires than information on deaths of insurgents in nation-states, particularly in the pos[...].
10. The Wages of War, 1816-1965 and Resort to Arms include appendices on included and excluded wars, but little, if any, explanation is given for exclusions. Most cases were dropped because they did not meet the death threshold (Small and Singer 1982, 330).
11. This may be the implicit COW coding rule, according to a communication with Stuart Bremer (July 31, 2001). However, Gleditsch et al. (2001, 16) note that "any conflict coded by COW as [...] than 1,000 battle-deaths overall is recorded as a civil war for all years (even years of inactivity [...] before the cumulative death toll reached 1,000)."

demonstrates potential errors in the application of the new rules is the Algerian war of the 1990s. This war, which some data sets code as starting in 1992, is omitted by Singer and Small (1994), even though five other wars that started in 1992 were included (revealing that their coding extended through 1992). Perhaps the war had not caused more than 1,000 deaths in 1992 (it actually had), but the revised COW list, which goes up to 1997 (Sarkees and Singer 2001), includes the Algerian war [...] date.
Because the coding rules were the same in the two COW data sets [...] a coding error in the 1994 version, the coded start of the war in the [...] suggests that the war reached the 1,000 death mark only after 1992, and the [...] was then back-coded to the start of the violence in 1992.

The cumulative death criterion added few wars to the COW list. For the period from 1816 to 1979, 13 wars were added to the 1994 list (107 wars were included in the 1982 list). Only 5 of these 13 wars had been included in Appendix B of Resort to Arms (Small and Singer 1982), and 3 of these 5 were excluded because of lack of system membership of the participants or because the violence was characterized as a massacre. Thus, shifting to the cumulative death criterion added only 2 wars to the list.12

The cumulative death criterion introduces some problems. First, it is harder to know when to code the start of the war. If we code it the first year the killing begins, then we will not be able to study violence escalation because the outbreak of minor violence will be subsumed in the period of "civil war." One research group that tries to avoid this problem is Gleditsch et al. (2001, 2002), who code a "war" when they count more than 1,000 deaths in a single year and do not code the beginning of the war during the first year with more than 25 deaths. But their data set does not have a rule for coding war termination, so we do not know precisely how to code their wars in a way that is compatible with other data sets that use the cumulative death criterion. The problem is created by the so-called "intermediate" armed conflict category, in which there are more than 100 but fewer than 1,000 deaths in a given year.
If we had a conflict with, say, 600 people killed in the first year, then 3 years with virtually no deaths, and then another year with 600 deaths, this might qualify as a civil war according to the cumulative death criterion but would be coded as two distinct events of intermediate violence in the Gleditsch et al. (2001, 2002) data set.13 The annual death threshold solves this

12. See, again, Sarkees's comments noted earlier. She argues that COW never abandoned the 1,000 annual death threshold. But this raises new questions. For example, the Uppsala data set and the related data set by Gleditsch et al. (2002), which use an annual death threshold of 1,000 to code civil war, should have few differences from COW lists if COW used an annual threshold. But, as I show later, the differences between these data sets are large. Take the example of Cambodia: Singer and Small (1994) code two civil wars, one from 1970 to 1975 and another from 1979 to 1991. Gleditsch et al. (2002) and Strand, Wilhelmsen, and Gleditsch (2003) code a first war in 1967, a second war from 1970 to 1975, a third war in 1978, and a fourth in 1989 (these all have different "conflict sub-IDs" in the Strand, Wilhelmsen, and Gleditsch data set; hence, they are considered as different "cases" or war starts).
13. The data set by Gleditsch et al. (2001, 2002) has undergone many revisions (one current and four old versions can be found online at http://www.prio.no/cwp/armedconflict [accessed June 22, 2004]). Later versions have addressed (although not entirely) the problem of coding war onset/termination by assigning a "conflict ID" and "sub-ID" to each conflict and considering conflicts to be different if or when they switch from an intrastate to an interstate war (and vice versa), when the parties and issues are different, or when there are more than 10 years with fewer than 25 deaths per year.
However, it is still not clear to me if a "same" conflict that switches from "war" to "minor" and back to "war" should be considered a single "war" for the purposes of comparison with war lists that use a cumulative death criterion.

problem because an end to the war would be coded whenever violence drops below 1,000 deaths. But this creates the opposite problem of coding too many wars in what is essentially the same conflict, if levels of violence fluctuate widely [...].

One way around these problems is to stop trying to code and analyze civil war as a distinct phenomenon and, instead, to code levels of violence along a continuum (by country and year). Wars might then be identified as periods with many violent [...], in addition to other characteristics of the violence, depending on the author's theoretical argument (e.g., the degree of organization of the parties, the presence of [...] violence, etc.). The analysis would then try to explain political violence first and then, by refining the theory, explain why violence takes different forms.

A second problem with the cumulative-death criterion is that it could simply [...] to code as civil wars many small conflicts that slowly accumulate deaths. In [...], this criterion creates inherently right-censored data, making it hard to code war termination. By consistently applying the cumulative death criterion, Sarkees and Singer (2001) could code as a civil war any minor conflict or terrorist campaign that [...] deaths per year for more than 40 years. Fearon and Laitin (2003, 76) attempt [...] this with a rule that 100 deaths must occur every year on average in an ongoing [...]. But, in combination with the 1,000 aggregate-deaths threshold, this creates a logical problem.
They would not code as a civil war a conflict that caused 9[...] over 9 years, but they would code a conflict that caused 1,000 deaths over 10 years.

A useful example to consider is Northeast India (Nagaland), where Fearon and Laitin (2003) code an ongoing civil war since 1952. I was not able to find evidence that many (say, more than 100) deaths per year occurred in armed conflict there [...] to 1961. According to Gleditsch et al. (2002, appendix) and Small and Singer (1982, 339), there was no war or intermediate violence during any year of the [...] Nagaland. This case illustrates not only the problem of how to code war termination with the cumulative threshold but also the related difficulty of how to handle [...] chronologically overlapping insurgencies in the same country. Combining [...] concentrated insurgencies in India's Northeast states may be reasonable and would probably satisfy the aggregate-death threshold in the period considered by Fearon and Laitin. But a strict application of the cumulative-death rule in such cases is problematic, given that in other countries, chronologically and even geographically overlapping insurgencies are often treated as separate conflicts. How to distinguish among these rebellions is not always easy. Burma, Chad, India, Ethiopia, and Zaire (in the 1960s) are all countries that pose difficulties in distinguishing among various rebellious groups and periods of violence.15 In the absence of a clear standard of how to handle such complicated cases, a rule of thumb should be to code a "civil war" in countries with many overlapping insurgencies when the violence escalates markedly [...] the start of low-level hostilities. In the case of India, this means that if we

14. Moreover, that 100 is the average number of deaths per year means that a conflict with [...] of deaths in the first year can be considered ongoing, even if annual deaths after the first year fall [...].
15. For example, Gleditsch et al. (2002) and Strand, Wilhelmsen, and Gleditsch (2003) distinguish between the Serb and Croat rebellions in Bosnia, but most others combine the two insurgencies into a single "Bosnian war" event.
At least five separate rebellions were ongoing from 1960 to 1965 in Zaire (Democratic Republic of the Congo), and all data sets typically combine these events into a single [...].

[...] combine the rebellions in the Northeast states, a civil war should be coded as starting in the [...] 1980s, when violence escalated in Assam, Tripura, and Manipur.

Despite these problems, a cumulative-death threshold becomes more [...] if it is combined with a criterion that military activity be sustained for the duration of the conflict. Both Sarkees and Singer (2001) and Fearon and Laitin (2003) apply [...], but it is unclear how sustained military activity is best defined. To focus on [...] number of deaths per year reintroduces the problems of counting annual deaths. If we can accurately count 100 deaths per year, we might be able to accurately count higher numbers of deaths and be better off using these numbers as our dependent variable. We might instead count battles, requiring at least one battle per year between the same parties. But perhaps counting "battles" would create a definitional problem more severe than counting deaths, and again we would need an arbitrary "number of battles" threshold, which would exclude cases of low-level insurgency [...] where battles are not easily distinguishable from terrorist activity (as in Peru's civil war).

To reach a balance between the pros and cons of the absolute versus annual thresholds, we must consider a few issues related to how we measure war magnitude:

1. What level of violence qualifies as civil war?
2. Should this be an absolute or relative level?
3. Should we only count battle deaths or also civilian deaths?

ARRIVING AT AN ABSOLUTE THRESHOLD OF DEATHS TO CODE A CIVIL WAR

A characteristic of civil war that distinguishes it from other forms of violence is that it causes large-scale destruction.
A high threshold of deaths can set wars apart from riots, terrorism, and some coups (although not necessarily from pogroms [...]). But there is nothing inherently "right" about the 1,000-deaths threshold used in the literature, and strict application of that threshold can cause us to drop several cases that satisfy all other characteristics of civil war. A range of 500 to 1,000 deaths could, in principle, be equally consistent with a common understanding of civil war as [...] that causes major destruction. Using a range rather than a single cutoff point may be better, given the highly skewed distribution of civil war deaths. In 145 civil wars that started between 1945 and 1999, the mean number of deaths is 143,883 (with a standard deviation of 374,065), and the median is 19,000. Despite the high average, [...] have caused fewer than 1,500 deaths, and some barely reached 1,000 deaths. [...] cases (for example, the Taiwanese revolt in 1947, the Dar ul-Islam rebellion [...] in Indonesia, or the fighting in Croatia after its independence in 1992) satisfy [...] or all of the other criteria for civil wars: they were fought by well-organized [...] with political agendas, challenging the sovereign authority, and violence was reciprocal. Given the poor quality of our data (recall how hard it was to count the dead after the Twin Towers attack on September 11, 2001, in New York [...] how much harder it would be to measure civil war deaths in Angola in 19[...] 1951) and the skewed distribution of deaths, we should use a more flexible coding rule, such as a range of deaths, instead of the 1,000-death threshold.16 We could provisionally code a war onset at the year that we count 100 to 500 deaths and keep it as a civil war if coders count more than 1,000 deaths in total within 3 years [...]. This incorporates the sustained fighting rule because events with low deaths in the first year would be coded as wars only if armed conflict is sustained and causes [...] more deaths within a 3-year period.
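The provisional onset rule just described can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the procedure used to build the data set: it assumes annual death counts are available per country as a mapping from year to deaths, it treats any year with at least 100 deaths as a candidate onset, and it reads "within 3 years" as the onset year plus the two following years. The function name `provisional_onset` and its parameters are hypothetical.

```python
from typing import Dict, Optional


def provisional_onset(
    deaths_by_year: Dict[int, int],
    onset_min: int = 100,   # lower bound of the provisional onset range in the text
    total_min: int = 1000,  # cumulative deaths needed to keep the case as a civil war
    window: int = 3,        # assumption: onset year plus the two following years
) -> Optional[int]:
    """Return the first year that qualifies as a war onset, or None."""
    for year in sorted(deaths_by_year):
        if deaths_by_year[year] < onset_min:
            continue  # too little violence to open a provisional onset
        total = sum(deaths_by_year.get(y, 0) for y in range(year, year + window))
        if total > total_min:
            return year  # sustained fighting confirmed within the window
    return None


# Hypothetical annual death counts, for illustration only.
escalating = {1990: 150, 1991: 400, 1992: 600}
low_level = {1990: 150, 1991: 120, 1992: 130, 1993: 110}

print(provisional_onset(escalating))  # 1990
print(provisional_onset(low_level))   # None
```

The sketch leaves war termination undefined, as the rule itself does; it only shows how a low first-year count can still yield an onset once the cumulative total is reached.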
The value of this coding rule is that it can be applied to back-code events by integrating existing databases that have information on lower level armed conflicts.

FOCUSING ON THE RELATIVE MAGNITUDE OF VIOLENCE

Another problem with the absolute threshold is that it does not reflect [...] magnitude as well as a relative (per capita) measure would. If we add a per capita deaths measure to our coding rules, we would be less likely to miss armed conflicts in small nations that produced few deaths but were nonetheless dramatic [...] for the history of those countries. A 1-year conflict that killed 500 people (a number in the range suggested above to code war onset) in a country with 500,000 inhabitants (the smallest population size allowed by most coding rules) [...] to deaths at the level of 0.001 of the population. A conflict of equal [...] in a country of 20 million people would have caused 20,000 deaths and would have been coded as a civil war in all data sets. We could use the 0.001 threshold [...] and wars that do not exceed the 500 to 1,000 deaths mark could still be coded as civil wars if they meet the per capita deaths criterion.

An example that illustrates the need for greater flexibility in applying thresholds is the Dhofar Rebellion in Oman (1965-1976). This was an insurgency by an ethnically organized rebel group (the Dhofar Liberation Front [DLF]) against the Sultan of Oman. The rebels recruited mainly from the Qara, a small ethnic group [...] in the mountains of the Dhofar province, and the group presented itself as [...] Maoist. Young potential recruits were sent to school to receive political [...] (Price 1975, 7; Connor 1998, 156-57). There was an element of Islamic [...] in the group's ideology until mass defections of Islamist soldiers [...] Bait Ma'ahshini massacre (Connor 1998, 159). Some scattered terroris[...] the populous north, but fighting was largely confined to mountainous [...] were supported by the People's Republic of Yemen in their campaign [...] mountains (Price 1975, 3).
The pattern of armed conflict and the organization of the rebellion are consistent with a common understanding of civil war, but the conflict is typically excluded from civil war lists because of a low death count, even though, in per capita terms, this conflict caused more damage than many others that are included as civil wars.

The 1,000-deaths criterion may lead us to include more cases of civil wars in large countries, if more populous countries can more easily produce several insurgencies that can cause high levels of deaths. Ethnic rioting in Nigeria or state repression and popular dissent in China is much more likely to generate large numbers of deaths than a coup in Fiji or Cyprus. This may be one reason that a variable controlling for a country's population is among the most significant and robust explanatory variables in models of civil war onset.

16. In my data set, I include cases that may fall just short of the 1,000 mark and identify them as potentially ambiguous cases.

Consider the omission from most data sets of the Greco-Turkish war in Cyprus that started in 1963. This war was excluded by Singer and Small (1972, 398; Small and Singer 1982, 340) due to an insufficient death count. The conflict caused around 1,000 deaths and certainly met all other criteria for civil war: there were battles between organized military units, involving the government against rebel militias, and parties with local recruitment and a clearly articulated political agenda. Even if only 500 deaths had occurred in Cyprus in that period, this would have amounted to 0.001 of the population. A war with the same intensity in a country with 100 million inhabitants would have caused 100,000 deaths, a massive tragedy that would have been coded in all data sets.
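The per capita arithmetic in this example can be made explicit. The following is a minimal sketch, assuming one possible reading in which a conflict qualifies if it meets either the absolute threshold or the 0.001 per capita level; the helper names `deaths_at_level` and `meets_death_criterion` are hypothetical, and the population of 500,000 is simply the figure implied by 500 deaths at the 0.001 level. Integer arithmetic (deaths per 1,000 inhabitants) is used to avoid floating-point rounding at the boundary.

```python
def deaths_at_level(population: int, per_thousand: int = 1) -> int:
    """Deaths implied by a per capita level (1 per 1,000 equals 0.001)."""
    return population * per_thousand // 1000


def meets_death_criterion(
    deaths: int,
    population: int,
    absolute_min: int = 1000,   # conventional absolute threshold
    per_thousand_min: int = 1,  # 0.001 of the population, as in the text
) -> bool:
    """Assumed reading: qualify on the absolute OR the per capita threshold."""
    per_capita_ok = deaths * 1000 >= per_thousand_min * population
    return deaths >= absolute_min or per_capita_ok


# The same intensity scaled to a country of 100 million inhabitants.
print(deaths_at_level(100_000_000))                # 100000

# 500 deaths in a country of 500,000 reach the 0.001 level.
print(meets_death_criterion(500, 500_000))         # True

# 500 deaths in a country of 100 million meet neither threshold.
print(meets_death_criterion(500, 100_000_000))     # False
```

Whether the per capita route should also carry an absolute floor is left open here; the sketch only reproduces the arithmetic of the comparison.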
A per capita measure would capture these cases, but we would also need to preserve a relatively high absolute threshold so as not to run the opposite risk of selecting too many small conflicts in small countries (or dropping "relatively" small wars in larger countries).

Creating a per capita measure is difficult and labor intensive. What we must not do is compute per capita deaths in conflicts that have already been selected by COW or other projects on the basis of an absolute-death threshold. Rather, coding must be based on primary research to measure deaths on a per capita basis in all countries with any political violence. A starting point might be the Gleditsch et al. (2001) list of minor armed conflicts. So far, no death figures are available for those conflicts, and we know only that they have caused more than 25 deaths per year and fewer than 1,000 for the duration of the conflict. Getting better data for those conflicts might facilitate the construction of a per capita measure.

An alternative (and less costly) approach is to code per capita deaths for a shorter period (e.g., from 1980 onwards), allowing us to construct a data sample based on the combination of absolute and per capita death measures (post-1980) and another sample using only the absolute threshold. We could then estimate a model on the two samples of the post-1980 period and compare the results. We could then do a sort of out-of-sample test by estimating the same model on each sample period to see if the model's predictions are consistent across periods using either sample. If the results are not influenced by our coding rules, then going through the trouble of coding a per capita measure might not be necessary.

THE EFFECTIVE RESISTANCE CRITERION AND MEASURING BATTLE DEATHS VERSUS TOTAL DEATHS

The third crucial question is whether we should code battle deaths only or also civilian deaths due to the war.
The battle-deaths criterion is the legacy of the COW project and usually refers to military deaths (the measure was initially used to count deaths in interstate wars). Small and Singer (1982, 213) claim that they [...] "civilian as well as military deaths" in civil wars, but it is not clear if they have counted civilian deaths due to rebel attacks. Civilian deaths must be counted in death totals.17

17. This position is gaining ground: see Sarkees and Singer (2001, 12) and Valentino, Huth, and Balch-Lindsay (2001).

Civilians are targeted in civil war and are disproportionately affected by humanitarian disasters created by combatants to hold civilian populations hostage and gain control of territory.18 One might also consider counting refugees and internally displaced persons as a measure of the human cost of the war (cf. Doyle and Sambanis [...]).

To get an accurate measure of civilian deaths, we must also determine if deaths due to politicide that occur in and around civil war should be counted. Small and Singer (1982, 215) counted deaths due to "acts of massacre committed ... during [...] of a civil war." That seems reasonable, but a harder question is, how do we [...] deaths due to acts of massacre that flank periods of civil war? Should we code a war as having started if civilian massacres occur for a period (say, 1 year) and [...] confrontation ensues with deaths that exceed 1,000 in the next year? What about the reverse situation, where a civil war gives way to civilian massacres? Distinguishing periods of civil war from periods of massacres clearly is very hard, as the case of Cambodia illustrates.

Most data sets code civil war onsets in 1970 and 1978 in Cambodia but no war from 1976 to 1977, which is a period that corresponds to civilian massacre (the killing fields). Small-scale battles between the Khmer and Vietnamese-backed troops did take place near the Thai border in that period, but it is unclear how many people died as a result of armed conflict.
Hence, most data sets code this as a period of politicide that is distinct from the civil war. Yet, if we could have established that 100 or so people had been killed on the side of the stronger party between 1976 and 1978, we would have coded an ongoing war in that period (especially in data sets using the cumulative-death criterion and where effective resistance is not measured as a percentage of the total number of deaths).

In light of these difficulties, a conservative strategy is to count deaths due to massacres that occur right before or after the war as war-related deaths. The main difficulty is to find evidence of effective resistance. According to Small and Singer (1982, 214-15), effective resistance implies that the stronger side should suffer at least 5% of the casualties of the weaker side. Fearon and Laitin (2003) relaxed this coding rule and measured effective resistance by 100 state deaths. They do not, however, specify if these deaths must occur in the first year of the war or cumulatively throughout the war. Without data on the distribution of state deaths throughout the war, we are left rudderless in trying to distinguish periods of "civil war" from periods of "politicide" or "genocide." (Could we say with certainty that 50 government troops were not killed in the jungles of Cambodia in 1976 and another 50 in 1977?) Some find the distinction easier to make if massacres occur before a "civil war" starts, as in the case of the 1966 massacres in Nigeria, which are typically thought of as separate from the Biafran civil war, which started in 1967.19 It might also be easier to distinguish massacres from civil war if the massacres occur after a formal end to the war is reached (i.e., a peace treaty). But if the war peters out and mostly one-sided violence starts (as in the 1984-19[..] massacres in Ndebele, Zimbabwe), the distinction is harder to make, because the war may restart within 1 or more years as a result of such massacres. Thus, the effective-resistance criterion cannot help us establish with certainty the difference between civil war and politicide, and we would need to use an annual effective-resistance criterion to make a clear distinction.

18. It might even be desirable to count civilian deaths that are indirectly the result of the war as, for example, in the case of war-related starvation deaths in Ethiopia and Somalia. However, this is a slippery slope, as it is unclear how far one could trace the health effects of civil war (see Ghobarah, Huth, and Russett 2003 for such an effort).

19. In May-July 1966, massacres of 30,000 Ibos in the Northeast took place, including massacres during the Northern leaders' coup and murder of Ironsi and Ibo leaders. Biafra achieved de facto independence by the end of 1966, and war started in 1967.

Just as Fearon and Laitin (2003) require 100 deaths per year to code an ongoing civil war, we could require 10 state deaths per year to distinguish a civil war from a politicide.20 But this type of arithmetic seems inappropriate, not least because it trivializes and imposes a petty discipline on the complex violence that occurs during civil war. What if insurgents suffering, say, 1,000 deaths per year can kill 60 soldiers or police officers in one year, none in the next year, and then 5 per year for the next 8 years (for a total of 100 in 10 years)? Would this be any more or any less of a civil war than a conflict in which 20 soldiers are killed per year for 5 years? The problem of counting effective resistance each year might be avoided if we require 100 deaths per year on average for the duration of the conflict, but this would obviously require a higher threshold of the state deaths criterion and would lead us to reclassify some civil wars as politicides or genocides.21 What if we forget about the question of distribution of state deaths over time?
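How much the classification depends on the way the rule is written can be illustrated with a short sketch. It applies three hypothetical variants of the state-deaths criterion (a cumulative total of 100, a floor of 10 in every year, and an average of 10 per year) to the two stylized conflicts described above. The function names, the default thresholds, and the yearly series are assumptions made for illustration; they are not part of any data set's coding scheme.

```python
from statistics import mean

# Stylized yearly counts of state (stronger-party) deaths, built from the
# two hypothetical conflicts in the text. These are not real data.
CONFLICT_A = [60, 0] + [5] * 8   # 60, then none, then 5 per year for 8 years
CONFLICT_B = [20] * 5            # 20 per year for 5 years


def meets_cumulative_rule(state_deaths, total=100):
    """Cumulative rule: state deaths summed over the whole conflict."""
    return sum(state_deaths) >= total


def meets_annual_rule(state_deaths, per_year=10):
    """Annual rule: every single year must reach the per-year floor."""
    return all(d >= per_year for d in state_deaths)


def meets_average_rule(state_deaths, per_year=10):
    """Average rule: mean yearly state deaths over the conflict's duration."""
    return mean(state_deaths) >= per_year


for name, series in [("A", CONFLICT_A), ("B", CONFLICT_B)]:
    print(name, sum(series), len(series),
          meets_cumulative_rule(series),
          meets_annual_rule(series),
          meets_average_rule(series))
```

Under these assumed thresholds, both conflicts satisfy the cumulative and the average variants, but only the second satisfies the annual variant, even though each involves 100 state deaths in total. Averaging also conceals the fact that most state deaths in the first conflict occur in its opening year.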
Should an absolute criterion of 100 state deaths be the single most important factor in deciding whether to classify an insurgency as a civil war if the insurgency meets all other criteria for a civil war? An example is the Gamaat Islamiya (Islamic Group) and Islamic Jihad's insurgency in Egypt from 1992 to 1997. According to one source, around 1,200 people were killed over 5 or 6 years, which meets the cumulative death criterion.22 The rebels were organized and had a political ideology. Guerilla insurgency was sustained, and attacks were directed against police, security forces, government ministers, tourists, and the Coptic minority. Whether we code this case as terrorism or civil war currently rests on whether the state suffered 100 or more deaths. This seems a weak criterion on which to base the classification of any case.23

Another difficult case is Argentina's "dirty war," in which Harff (2003) codes a politicide from 1976 to 1980, Fearon and Laitin (2003) code a civil war from 1973 to 1977, Singer and Small (1994) do not code a civil war, and Gleditsch et al. (2002) code a war in 1975 and "possibly" in 1976 and 1977. Official statistics from the Argentine military cite 492 deaths (including civilian officials) due to "terrorist attacks" from 1969 to 1980.24 Guerillas suffered deaths starting in 1969 until 1972 (more than 100 per year), and it is hard to know how many people were killed between 1973 and 1977, the period sometimes coded as a "civil war." Total deaths, including guerillas and civilians, in that period seem not to have reached 1,000, particularly if we do not count deaths after the coup of 1976, which brought a new regime to power, and so an end to the "war" could be coded in 1976. The new regime engaged in a massive purge of suspected guerillas and their supporters; starting in 1976, the new regime killed an estimated 9,000 to 30,000 people in the period from 1976 to 1983. During that period, deaths incurred by the state were few, so we could not code an ongoing civil war after the 1976 coup. In the absence of clear rules on how sustained violence and effective resistance must be coded over time (i.e., how many state deaths must occur per year), it is unclear if and when we can code this conflict as a civil war or rather a low-level insurgency, combined with a coup, and followed by a politicide.

20. The 1,000 cumulative-death criterion divided by the 100 deaths-per-year rule gives a 10-year window for minor conflicts to be labeled civil wars. Dividing the Fearon and Laitin (2003) 100 state-deaths effective-resistance rule by the same 10-year window yields a 10 state-deaths-per-year rule.

21. Also, averaging state deaths might not resolve the problem, if all or most deaths occurred early in the conflict.

22. See my online supplement for information on this and all cases discussed in the text.

23. The Kenyan "shifta" war in the 1960s is another such case that no data set includes as a civil war, most likely because of the difficulty of finding information on the effective-resistance criterion. I found sources listing dozens of killings of police and military but no clear evidence of more than 100, although the number and apparent intensity of battles described in several sources suggest that there was effective resistance.

24. See a more detailed discussion and a list of sources in the online supplement.

This brief discussion should reveal that most of the coding rules currently used to measure civil war are somewhat ad hoc and that the problem is often exacerbated by the low quality of data on deaths due to armed conflict. There is considerable room for measurement error here, and such error will make it harder to establish empirically the differences between civil war and other forms of political violence.
Thus, in the presence of these problems, one might argue that coding rules should not be applied too strictly and that, when we are faced with an ambiguous case, we should err on the side of caution, including such cases while making it possible to identify them at the analysis stage. In the civil war list that I have compiled by applying a new set of coding rules that try to address the issues raised here, I have included ambiguous cases but have flagged them both in the data set and in a supplementary document that explains the coding for each case. This allows analysts to make their own decisions about which conflicts to drop or explore further to confirm that they are accurately coded.

CLASSIFYING AND ANALYZING EXTRASYSTEMIC WARS

Another issue that sometimes accounts for coding differences across data sets is the difficulty of coding extrastate (colonial, imperial) wars. I argue that these wars may reasonably be considered to be different from other civil wars and excluded from civil war lists.

The concept of a territorial state is central to the definition of civil war but creates some problems in the classification of the so-called extrastate (or extrasystemic) wars. The extrasystemic-wars category in COW included colonial and imperial wars. Imperial wars were defined as wars between "system member[s] versus independent nonmember[s] of the interstate system" (Singer and Small 1972, 382). Colonial wars were defined as wars between a "system member versus an ethnically different, nonindependent, nonmember of the interstate system" (Singer and Small 1972, 382). Extrasystemic wars were classified as a distinct category due to a conceptual distinction between wars that are "peripheral to the center of government (or the metropole)" and therefore qualitatively different from wars that take place within the core territory of the metropole (Sarkees 2000, 126).
This was a normative definition because it interpreted the nature of the relationship between governments and dependent or independent regions within the state's territory. This normative element stands in contrast to mechanistic definitions of civil war (such as the death threshold). Moreover, a consideration of power relationships was absent from the definition of interstate wars, which were defined entirely based on a territorial conception of the state. Over time, the "emphasis on the state member as a territorial entity necessitated a re-thinking and ultimate abandonment of the metropole [and periphery] distinction" (Sarkees 2000), leading the COW project to redefine its coding rules and reclassify extrasystemic wars as civil wars.

Identifying extrasystemic wars is more complicated as a result of the way coding rules evolved. Initially, coding rules for extrasystemic wars differed from the rules for intrastate war: "whereas an interstate war which qualified on [...] battle deaths would be included if the total fatality figure for the protagonists reached the 1000 mark, the member itself (including system member allies) [had] to sustain 1000 battle fatalities in order for the extrasystemic war to [be included]" (Singer and Small 1972, 36), and "if the war lasted longer than a year, [battle deaths] had to reach an annual average of 1,000" (Small and Singer 1982, 56; Sarkees 2000, 129). Eventually, that criterion was relaxed, and nonsystem member deaths were taken into consideration. But could we assume that we would get an accurate count of colonized people's deaths in uprisings against the metropole given the imperial powers' contempt for human life in their colonies?
If data measurement problems in extrasystemic wars are systematically different from data problems in other civil wars, then combining the two categories may bias analyses.25 The Mau-Mau rebellion in Kenya, for example, was excluded from COW because it caused too few deaths on the side of the United Kingdom (Singer and Small 1972, 397). It was added to the 1994 and 2000 updates of COW as a result of counting Kenyan deaths. But how many other such wars failed even to appear in the original "excluded wars" lists in the first COW publications? One such example is the Rwandan revolution (against Belgium), which Fearon and Laitin (2003) code as a war between 1956 and [...], but this case does not appear in any COW list. A lingering concern with extrasystemic wars, therefore, is if the original coding rules have hampered our ability to code these wars according to the current definition without having to engage in painstaking and resource-intensive historical research.

Another important question is if extrasystemic wars should be coded as taking place in the metropole (i.e., in the territory of the system member) or in the territory of the nonsystem member. Given our territorial conception of the state (and of civil war), it seems clear that the war should be coded in the territory of the metropole, but this is not established practice across the literature. Gleditsch et al. (2001), for example, code extrasystemic wars under colonized states. Doyle and Sambanis (2000) and Licklider (1995) code some extrasystemic wars as civil wars (where the country was under trusteeship, as in the case of Namibia, or in the case of the war in Zimbabwe/Rhodesia from 1972 to 1980, given Rhodesia's independent status). Other scholars have committed more obvious errors in coding, for example, Bangladesh's war of independence in 1971 as taking place in Bangladesh rather than Pakistan (Leitenberg 2001), when it is clear that Bangladesh did not exist as an independent state until after the conclusion of that war. In Sarkees and Singer (2001), the intention seems to be to code extrasystemic wars as having taken place in the territory of the system member.

25. This may be an empirical question because governments everywhere have an incentive to underreport the number of their subjects that they kill.

This brings up another difficulty. On one hand, extrastate wars that are coded as taking place in colonized states will have to be dropped from quantitative analyses because data for the "usual suspect" explanatory variables (e.g., democracy, gross domestic product [GDP]) are typically not available for dependent territories. On the other hand, if the war is coded as taking place in the territory of the system member, then explanatory variables have to be adjusted for the whole empire. Estimating average GDP per capita or the level of ethnic fractionalization for entire empires is a very difficult task and could only be done by taking shortcuts that might compromise the quality of the data. Consider also that averaging GDP per capita for entire empires will have the effect of turning highly developed countries, such as France or England, into middle- to low-income countries. If GDP per capita is used as a proxy for state strength, then averaging GDP values for empires will have the effect of underestimating state capacity in the metropoles (although it may accurately capture state capacity with respect to the rest of the empire). But distance from the metropole and military technology are important missing variables here that could explain the metropole's reach.26

Empires were uniquely autocratic regimes, in which subjects lived under [...] forms of government, and one argument for setting extrastate wars apart is that the legal structure of empires prevented the articulation of colonized people's grievances (voice) and left them with rebellion as their only option.
Might right-hand-side controls for the level of democracy or autocracy capture this pressure to rebel in empires? Consider the example of France, which, according to Fearon and Laitin (2003), had several civil wars from 1945 to 1960. France scores as a "deep" democracy for the relevant period in the Polity IV database (Marshall and Jaggers 2000). However, the standard notion of democracy is that it provides checks and balances to resolve disputes peacefully and that it affords the right of political representation to individuals and groups. But this notion does not apply to the periphery of an empire, where disputes are resolved despotically, even when they are resolved by consensus in the metropole. Fearon and Laitin attempt to correct for this by using polity scores for empires that are weighted by the share of the colonized subjects to the population of the metropole, thereby reducing the polity score of countries, such as France or England. But it is not clear to me that the combinations of regime characteristics that the Polity IV database uses are even applicable to cases of foreign domination o[...]. "Watering down" France's, Britain's, or Belgium's democracy score in this way is likely to make these countries seem like so-called "anocracies" (i.e., regimes that are in the middle of the Polity range). But one could argue that as far as colonial subjects were concerned, they were living in autocracies, whereas citizens in the metropole were living in a democracy. Thus, considering France or England as anocracies is probably wrong, and the meaning of an anocracy is here inconsistent with the hypotheses that scholars usually test by using the anocracy measure.27

26. Moreover, the waves of decolonization wars in the 1950s and 1960s were all related to systemic changes that affected entire groups of countries in similar ways. This, at a minimum, calls for adding appropriate controls for systemic trends.

Perhaps all these problems can be addressed by adding appropriate right-hand-side controls to the regressions by denoting, for example, if a country possesses colonies. Regional inequality, for example, may be more significant in explaining violence in a world full of empires than in a world composed mainly of nation-states.28 But even trying to measure inequality in the British empire or in the Belgian empire is likely to be systematically more difficult than measuring it in the average nation-state. If so, estimates based on the variables with systematic coding errors in empires are likely to be biased. Thus, abandoning the distinction between metropole and periphery may have been ill-advised. At its core, the earlier COW distinction highlighted that some peripheries were constitutionally excluded from the political process, which presented insurmountable obstacles to the peaceful resolution of their disputes with the metropole.

Despite these arguments, in many respects, extrastate wars are indeed like civil wars: rebels are mostly locally recruited, the groups have a political agenda, and the governing authority is involved in the fighting. Thus, analysts could add them to their civil war lists, but doing so would necessitate adding appropriate controls to the statistical model, as outlined above. Yet, only colonial wars should be considered for inclusion.29 Imperial wars of annexation do not meet the territoriality criterion because the nonsystem member's territory was never part of the metropole's territory before the war.
Wars in East Timor, Tibet, or the Western Sahara could be considered civil wars because (a) the fighting that started during a war of annexation continues after the territory is annexed by an imperial power, and/or (b) the annexation is accepted by the international community.30

27. One hypothesis (Hegre et al. 2001) is that anocracies are prone to violent rebellion because they are neither autocratic enough to preempt or crush rebellion nor democratic enough to resolve disputes peacefully. Many imperial powers were both autocratic enough (in their colonies) and democratic enough (in their metropoles).

28. A common notion of empire is that it uses the periphery to extract resources with no regard for equality; this may also be true in some nation-states, although there is likely to be a systematic difference in the degree of regional inequality between empires and nation-states with the same level of democracy at the metropole.

29. Wars of decolonization may be combined with lists of secessionist wars because they are likely to share some of the same causal logic. However, the difficulties associated with the measurement of key variables in empires (see above) would still apply.

30. Such cases in my data set include Morocco (Western Sahara), Indonesia (East Timor), and Israel (West Bank and Gaza). Excluded is, for example, the Malaysian Insurgency, because the level of violence after independence was very low, although deaths during the phase of decolonization were high (see online supplement). If there is an intervention by another state to prevent annexation of the territory, then this should be considered an interstate conflict, even if local parties join the fighting.

CODING CIVIL WAR

Drawing on the preceding analysis, I propose an operational definition of civil war that resolves some of the problems we have encountered so far. Parts of my definition are new, and parts are based on other, widely used coding rules. I provide a list of civil wars that conforms to the coding rule proposed here and later use the new coding in an empirical analysis. Extensive notes that justify the coding of each case are available in a supplement posted online.31

An armed conflict should be classified as a civil war if

(a) The war takes place within the territory of a state that is a member of the international system32 with a population of 500,000 or greater.33

(b) The parties are politically and militarily organized, and they have publicly stated political objectives.34

(c) The government (through its military or militias) must be a principal combatant. If there is no functioning government, then the party representing the government internationally and/or claiming the state domestically must be involved as a combatant.35

(d) The main insurgent organization(s) must be locally represented and must recruit locally. Additional external involvement and recruitment need not imply that the war is not intrastate.36 Insurgent groups may operate from neighboring countries, but they must also have some territorial control (bases) in the civil war country and/or the rebels must reside in the civil war country.37

(e) The start year of the war is the first year that the conflict causes at least 500 to 1,000 deaths.38 If the conflict has not caused 500 deaths or more in the first year, the war is coded as having started in that year only if cumulative deaths in the next 3 years reach 1,000.39

(f) Throughout its duration, the conflict must be characterized by sustained violence, at least at the minor or intermediate level. There should be no 3-year period during which the conflict causes fewer than 500 deaths.40

(g) Throughout the war, the weaker party must be able to mount effective resistance. Effective resistance is measured by at least 100 deaths inflicted on the stronger party. A substantial number of these deaths must occur in the first year of the war.41 But if the violence becomes effectively one-sided, even if the aggregate effective-resistance threshold of 100 deaths has already been met, the civil war must be coded as having ended, and a politicide or other form of one-sided violence must be coded as having started.42

(h) A peace treaty that produces at least 6 months of peace marks an end to the war.43

(i) A decisive military victory by the rebels that produces a new regime should mark the end of the war.44 Because civil war is understood as an armed conflict against the government, continuing armed conflict against a new government implies a new civil war.45 If the government wins the war, a period of peace longer than 6 months must persist before we code a new war (see also criterion k).

(j) A cease-fire, truce, or simply an end to fighting can also mark the end of a civil war if they result in at least 2 years of peace.46 The period of peace must be longer than what is required in the case of a peace agreement because we do not have clear signals of the parties' intent to negotiate an agreement in the case of a truce/cease-fire.47

(k) If new parties enter the war over new issues, a new war onset should be coded, subject to the same operational criteria.48 If the same parties return to war over the same issues, we generally code the continuation of the old war, unless any of the above criteria for coding a war's end apply for the period before the resurgence of fighting.

31. Go to http://pantheon.yale.edu/-ns237/index/research.html#Data. Note that this coding rule is more demanding than most of the others. Sometimes, the information to code each case with certainty is not available, so I identify those cases in the notes. I usually err on the side of caution and include ambiguous cases, identifying them in the data set so that researchers can decide if they want to include them or drop them from the analysis.

32. This includes states that are occupying foreign territories that are claiming independence (e.g., West Bank and Gaza in Israel and Western Sahara in Morocco). A strict application of this coding rule could drop those cases in which the international community (through the United Nations) rejects the state's claims of sovereignty on the occupied territories.

33. We could include countries after their population reaches the 500,000 mark or, from the start of the period, if population exceeds the 500,000 mark at some point in the country series. If a civil war occurs in a country with population below the threshold, we could include it and flag it as a marginal case. Cases of civil war close to the 500,000 mark are Cyprus in 1963 (578,000 population) and Djibouti in 1991 (450,000 population). The per capita death measure would allow us to relax the population threshold.

34. This should apply to the majority of the parties in the conflict. This criterion distinguishes insurgent groups and political parties from criminal gangs and riotous mobs. But the distinction between criminal and political violence may fade in some countries (e.g., Colombia after 1993). "Terrorist" organizations would qualify as insurgent groups according to this coding rule, if they caused violence at the required levels for war (see other criteria). Noncombatant populations that are often victimized in civil wars are not considered a "party" to the war if they are not organized in a militia or other such form, able to apply violence in pursuit of their political objectives.

35. Extensive indirect support (monetary, organizational, military) by the government to militias might also satisfy this criterion (e.g., Kenya during the Rift Valley ethnic clashes), although here it becomes harder to distinguish civil war from communal violence. In some cases, where the state has collapsed, it may not be possible to identify parties representing the state because all parties may be claiming the state, and these conflicts will also be hard to distinguish from intercommunal violence (e.g., Somalia after 1991).

36. Intrastate war can be taking place at the same time as interstate war.

37. This weeds out entirely interstate conflicts with no local participation. The Bay of Pigs, for example, would be excluded as a civil war because the rebels did not have a base in Cuba prior to the invasion. Some cases stretch the limits of this definitional criterion, for example, Rwanda in the late 1990s, when ex-FAR (Rwandan Armed Forces) recruits with bases in the Democratic Republic of the Congo engaged in incursions and border clashes against government army and civilians. If this is a civil war, then so is the conflict between Lebanon-based Hezbollah and Israel (assuming the other criteria are met).

38. This rule can be relaxed to a range of 100 to 1,000 because fighting might start late in the year (cf. Senegal or Peru). Given the lack of high-quality data to accurately code civil war onset, if we do not have a good estimate of deaths for the first year, we can code the onset at the first year of reported large-scale armed conflict, provided that violence continues or escalates in the following years. Note that in the data set, I also code the start/end month, where possible. In some cases, my coding rules can be used to identify the start month (e.g., in cases where the war causes 1,000 deaths in the first month of armed conflict). But in most cases, the month only indicates the start of major armed conflict or the signing of a peace agreement, which can give us a point of reference for the start or end of the war, respectively.

39. This rule also suggests when to code war termination if the 3-year average does not add up to 500. In such a case, we can code the end of the war at the last year with more than 100 deaths unless one of the other rules applies (e.g., if there is a peace treaty that is followed by more than 6 months of peace).

40. This criterion makes coding very difficult because data on deaths throughout the duration of a conflict are hard to find. However, such a coding rule is necessary to prevent one from coding too many war starts in the same conflict or coding an ongoing civil war for years after the violence has ended. Three years is an arbitrary cutoff point but is consistent with other thresholds found in the literature. The data notes (see online supplement) give several examples of cases in which the coding of war termination has been determined by this criterion. A more lenient version would be a 5-year threshold with fewer than 500 deaths.

41. This criterion must be proportional to the war's intensity in the first years of the war. If the war's onset is coded the first year with only 100 deaths (as often happens in low-intensity conflicts), then we would not be able to observe effective resistance in the first year of the war if we defined effective resistance as 100 deaths suffered by the state.

42. This criterion distinguishes cases in which insurgent violence was limited to the outbreak of the war and, for the remainder of the conflict, the government engaged in one-sided violence. A hypothetical example is a case in which insurgents inflicted 100 deaths on the government during the first week of fighting, and then the government defeated the insurgents and engaged in pogroms and politicide for several years with no or few deaths on the government's side. If we cannot apply this rule consistently to all cases (due to data limitations), then periods of politicide at the start or end of the war should be combined with war periods. This implies that civil wars will often be observationally equivalent to coups that are followed by politicide or other such sequences of different forms of political violence.

43. Treaties that do not stop the fighting are not considered (e.g., the Islamabad Accords of 1993 in Afghanistan's war; the December 1997 agreement among Somali clan leaders). If several insurgent groups are engaged in the war, the majority of groups must sign. This criterion is useful for the study of peace transitions but may not be as important if researchers are interested in studying civil war duration, for example.

44. Thus, in secessionist wars that are won by the rebels who establish a new state, if a war erupts immediately in the new state, we would code a new war onset in the new state (an example is Croatia from 1992 to 1995), even if the violence is closely related to the preceding war. A continuation of the old conflict between the old parties could now count as an interstate war, as in the case of Ethiopia and Eritrea, which fought a war between 1998 and 2000 after Eritrea's successful secession from Ethiopia in 1993.

45. This criterion allows researchers to study the stability of military victories. Analysis of the stability of civil war outcomes would be biased if we coded an end to civil war through military victory only when the victory was followed by a prolonged period of peace. This would bias the results in favor of finding a positive correlation between military outcomes and peace duration. This criterion is important for analyzing war recurrence but not necessarily war prevalence.

Using these coding rules, I have coded 145 civil war onsets between 1945 and [...] (2.08% of 6,966 nonmissing observations).
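Criteria (e) and (f) are mechanical enough to be sketched in code. What follows is a minimal illustration, assuming a hypothetical mapping from calendar year to estimated deaths. The function names, the reading of "the next 3 years" as the three years following the candidate year, and the sample figures are assumptions made for the example; none of them is taken from the data set or its supplement.

```python
def code_onset_year(deaths_by_year, first_year_min=500,
                    cumulative_min=1000, window=3):
    """Return the first year satisfying the onset rule (e), or None.

    A year qualifies if it has at least `first_year_min` deaths, or if it has
    some deaths and the following `window` years total `cumulative_min`.
    """
    for year in sorted(deaths_by_year):
        deaths = deaths_by_year[year]
        if deaths >= first_year_min:
            return year
        if deaths > 0:
            following = sum(deaths_by_year.get(year + k, 0)
                            for k in range(1, window + 1))
            if following >= cumulative_min:
                return year
    return None


def has_lull(deaths_by_year, start, end, floor=500, window=3):
    """Rule (f): True if any `window`-year span within [start, end]
    has fewer than `floor` deaths in total."""
    for first in range(start, end - window + 2):
        total = sum(deaths_by_year.get(first + k, 0) for k in range(window))
        if total < floor:
            return True
    return False


# Invented figures for a single hypothetical conflict.
example = {1990: 200, 1991: 300, 1992: 400, 1993: 400,
           1994: 50, 1995: 20, 1996: 10}

print(code_onset_year(example))        # onset year under rule (e)
print(has_lull(example, 1990, 1993))   # sustained violence through 1993?
print(has_lull(example, 1990, 1996))   # lull once the violence subsides
```

In this invented series the first year falls short of 500 deaths, but the three following years add up to 1,100, so the onset is coded in 1990. The 3-year window starting in 1993 totals 470 deaths, which is the kind of lull that criterion (f) and note 39 use to date the war's end.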
Without coding new war onsets in tries with already ongoing civil wars, the number of civil wars is 119 (1.93% o nonmissing observations). Out of these cases, 20 are "ambiguous"-that is, they not meet one or more of the coding rules. DIFFERENCES IN THE CODING OF CIVIL WAR AND THEIR SUBSTANTIVE IMPLICATIONS The discussion thus far has established some of the sources of disagreem civil war lists. Disagreements over the coded year of onset and termination may matter for the inferences drawn when we analyze civil war onset, d recurrence using different data sets. The extent of disagreement over the co civil war onset is apparent in Table l a-d, which presents correlations b starts during the period from 1960 to 1993-a period covered by most data unit of analysis is the country-year. The dependent variable is civil war on variable. All years of no war are coded equal to 0. There are two versions o starts: in version (a), I code a 1 when a civil war starts and drop observati ing war in that country until the war ends. Thus, if another war starts in th try while another war is ongoing, we would not consider it. In version ( whenever a war starts, even if another war is ongoing. Country-years with starts are coded 0, and in this way, we end up with more war starts.'5 46. Peace implies no battle-related deaths or, in a lenient version of this criterion, fewer de lowest threshold of deaths used to code war onset, that is, fewer than 100 deaths per year. 47. These situations are different from those in which there is no violence as a result of ar down without a cease-fire agreement, which would fall under criterion (f). 48. These incompatibilities must be significantly different, or the wars must be fough groups in different regions of the country. For example, we would code three partially over Ethiopia (Tigrean, Eritrean, Oromo) between the 1970s and the 1990s. 
New issues alone should not be sufficient to code a new war because there is no "issue-based" classification in the definition of civil war. We could apply such a rule if we classified civil wars into categories, for example, secessionist wars or revolutions over control of the state. In addition to having new issues, most parties must also be new before we can code a new war onset.
49. See the online supplement for a summary of the definitions and operational criteria used by major data projects. The period was selected to include the Collier and Hoeffler (2001) data set (1960-1999) in the analysis, given the prominence of that study in the literature. Most of the other data sets start in 1945. I compare results from a smaller number of data sets covering the entire post-1945 period.
50. Collier and Hoeffler (2001), Hegre et al. (2001), and Sambanis (2001) use version (a). Fearon and Laitin (2003) use version (b).

TABLE 1
Correlations among Civil War Lists, 1[...]

Columns: 1 = COW 1994; 2 = COW 2000; 3 = Collier and Hoeffler (2001); 4 = Licklider (1995); 5 = Gleditsch et al. (2001); 7 = Fearon and Laitin (2003); 8 = Leitenberg (2001); further columns (Doy[...] Sambani[...]) [...]

a. Version (a) of War Onset (3,1[...]

            warst1a  warst2a  warst3a  warst4a  warst5a  warst7a  warst8a
warst1a     1.00
warst2a     0.96     1.00
warst3a     0.82     0.83     1.00
warst4a     0.74     0.75     0.71     1.00
warst5a     0.42     0.46     0.52     0.57     1.00
warst7a     0.69     0.70     0.70     0.70     0.54     1.00
warst8a     0.60     0.66     0.56     0.66     0.46     0.59     1.00
warst9a     0.70     0.70     0.66     0.66     0.46     0.67     0.59
warst10a    0.69     0.69     0.80     0.68     0.48     0.72     0.55
warst11a    0.76     0.76     0.75     0.79     0.53     0.77     0.61
warstnsa    0.74     0.74     0.73     0.83     0.51     0.80     0.62

b. Version (b) of War Onset (3,503 Obse[...]

            warst1b  warst2b  warst3b  warst4b  warst5b  warst7b  warst8b
warst1b     1.00
warst2b     0.92     1.00
warst3b     0.83     0.83     1.00
warst4b     0.57     0.55     0.62     1.00
warst5b     0.37     0.45     0.46     0.40     1.00
warst7b     0.58     0.61     0.61     0.58     0.49     1.00
warst8b     0.51     0.55     0.49     0.50     0.37     0.56     1.00
warst9b     0.74     0.70     0.68     0.52     0.40     0.58     0.51
warst10b    0.69     0.67     0.80     0.60     0.41     0.61     0.46
warst11b    0.67     0.66     0.73     0.68     0.45     0.63     0.50
warstnsb    0.64     0.64     0.68     0.69     0.44     0.69     0.51

c. Correlations among Lists of Civil War Prevalence

            atwar1   atwar2   atwar3   atwar4   atwar5   atwar7   atwar8
atwar1      1.00
atwar2      0.94     1.00
atwar3      0.88     0.90     1.00
atwar4      0.75     0.76     0.82     1.00
atwar5      0.62     0.69     0.70     0.68     1.00
atwar7      0.67     0.70     0.73     0.75     0.65     1.00
atwar8      0.63     0.67     0.66     0.70     0.59     0.70     1.00
atwar9      0.62     0.66     0.64     0.66     0.53     0.70     0.62
atwar10     0.70     0.73     0.77     0.74     0.64     0.76     0.66
atwar11     0.69     0.73     0.75     0.78     0.66     0.78     0.67
atwarns     0.69     0.74     0.77     0.79     0.66     0.83     0.69

d. War Onset

Data Set                       Number of War Starts   Country[...]
Singer and Small (1994)        [...]
Sarkees and Singer (2000)      [...]
Collier and Hoeffler (2001)    [...]
Licklider (1995)               58
Gleditsch et al. (2001)        [...]
Fearon and Laitin (2003)       [...]
Regan (1996)                   116
Doyle and Sambanis (2000)      [...]
Sambanis (current study)       [...]

NOTE: Years for which the depe[...] civil wars in those years. Leiten[...]

As is evident from Table 1a,b, the correlation between most pairs of civil war lists is low. Among the lowest correlations (0.42 and 0.46) are those between Gleditsch et al. (2001) and the two versions of the Correlates of War data set. These correlations are even lower with version (b) of the dependent variable (0.37 and 0.45, respectively).51 The correlation between Gleditsch et al. (2001) and my data set is 0.44 [version (b)], and the correlation with Fearon and Laitin (2003) is 0.69 [version (b)]. The most highly correlated civil war lists are the two versions of the COW data set.

The correlation between war prevalence in any two lists would be higher if we compared wars that started within the same 2- or 3-year period (see Fearon and Laitin 2003 for such a comparison). But that would be a comparison of war lists [...] onsets. A difference of 2 to 3 years in the coding of war onset is significant because the values of right-hand side variables, such as economic growth or political instability, would be immediately influenced by war, which would in turn influence the results. In Table 1c, I correlate civil war prevalence (combined onset and duration) across the different lists. If two lists included different war starts but the wars overlapped for most of their duration, then the correlation between the two databases would be higher than in Table 1a,b. Indeed, the correlations between pairs of lists are now higher (e.g., between Gleditsch et al. 2001 and COW2, it rises to 0.69). Thus, there is more disagreement on the question of war onset and termination than there is over whether a war happened at all.

However, there is still considerable disagreement about which armed conflicts should be classified as civil wars. Many wars are coded in only one out of a [...] sets. Table 1d lists the number of war starts and total country-years of war in the different lists from 1960 to 1993.53 The highest number of war starts (116) is in Regan's (1996) list and the lowest two in Licklider's (1995) list (58) and Singer and Small's (1994) list (61).

Now that I have established that there is substantial variation in the coding of war onset and termination in most data sets, it is appropriate to ask if these differences have substantive implications. To answer that question in a tractable manner, I estimate the same civil war model on all versions of the civil war onset variable and measure the variation in parameter estimates.
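The two versions of the onset variable, and the kind of pairwise correlation reported in Table 1, can be sketched as follows. This is an illustration under stated assumptions rather than the replication code: the function names, the input format, and the eight-year country series are hypothetical, and `None` stands in for a dropped observation.

```python
from math import sqrt


def code_onset(new_start, earlier_war_ongoing):
    """Build version (a) and version (b) onset series for one country.

    new_start[i] is 1 if a war starts in year i; earlier_war_ongoing[i] is 1
    if a war that started in an earlier year is still being fought in year i.
    Both inputs are hypothetical, equal-length yearly lists.
    """
    version_a, version_b = [], []
    for start, ongoing in zip(new_start, earlier_war_ongoing):
        # Version (b): 1 whenever a war starts; ongoing-war years are kept as 0.
        version_b.append(1 if start else 0)
        # Version (a): years of ongoing war are dropped, including any new
        # war that starts while an earlier war is still ongoing.
        version_a.append(None if ongoing else (1 if start else 0))
    return version_a, version_b


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Hypothetical country: one war starts in year 1, a second, overlapping war in year 3.
starts = [0, 1, 0, 1, 0, 0, 0, 0]
ongoing = [0, 0, 1, 1, 1, 1, 0, 0]
onset_a, onset_b = code_onset(starts, ongoing)
print(onset_a)  # the overlapping war is not counted in version (a)
print(onset_b)  # both starts are counted in version (b)

# A second hypothetical list that dates the later war one year after the first list.
other_list_b = [0, 1, 0, 0, 1, 0, 0, 0]
print(round(pearson(onset_b, other_list_b), 2))
```

In this toy series a one-year difference in the dating of a single war is enough to pull the correlation between two onset lists well below 1, which is the mechanism behind the low onset correlations discussed above.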
To isolate the differences that are due to the coding of the dependent variable, I use the same sources of data for the right-hand side variables and analyze the same set of countries over the same period (1960 to 1993, annual observations).54 Thus, all differences in parameter estimates should be due to differences in the coding rules.55

51. Here I consider only wars from the Gleditsch et al. (2001, 2002) list. In the regressions later on, I add a second measure that includes minor and intermediate armed conflicts from that list.
52. If we expand the comparison to the entire period from 1945 to 1999, the correlations are smaller. My data set correlates with others as follows: 0.62 with Sarkees and Singer (2000), [...] Gleditsch et al. (2001), 0.63 with Fearon and Laitin (2003), and 0.67 with Doyle and Sambanis (2000).
53. The number of observations across models differs in Table 2 because I drop observations of ongoing war. Thus, any differences are directly the result of coding war onset and continuation. [...] number of observations for the Collier-Hoeffler variable is slightly smaller because the lagged [...] is missing for the year 1960 since their data set starts in 1960, and we cannot know how they [...] coded the preceding year. Singer and Small's (1994) data set has fewer observations because it [...] events in 1993, and Leitenberg's (2001) data set ends in 1990. Other models all have the same [...]
54. Leitenberg's (2001) data set is an exception because it ends with the end of the cold war.

I also restrict the analysis to a smaller number of lists covering the entire period from 1945 to 1999.

The model is very similar to those developed by Fearon and Laitin (2003) and Collier and Hoeffler (2001). Because these studies are by now well known, I do not discuss the theory behind the selection of variables.
Briefly, civil war onset should be less likely the higher the level of development, proxied by real per capita income (gdpl1).56 Anocracies (anoc2l1), states at the mid-range in the Polity IV [...] have a higher risk of war as they are neither as effective as autocracies in [...] as good as democracies in peaceful conflict resolution (Fearon and Laitin 2003; Hegre et al. 2001). States with political instability (inst3l1) and regime transitions [...] higher risk of war onset.57 Ethnic heterogeneity (ef) should increase the risk of war onset by pitting groups with different preferences against each other.58 [...] theories expect a different relationship between this variable and civil war, and empirical results on the link between ethnicity and civil war are mixed. I control for population (measured by the natural log of population, lpopnsl1), which has been shown to be significant and positively correlated with war onset.59 I also control for the percentage of Muslims in the population (muslim) as a proxy of part of the "clash of civilization" hypothesis and as a measure of religious division.60 I control for economic growth, measured as annual percentage change in the level of [...] income (grol1), because Collier and Hoeffler (2001) find this to be significant and negatively correlated with war onset. I control for countries that are oil exporters (oil2l1).61 Such countries are thought to be at higher risk of war for a number of reasons; the most commonly encountered hypothesis is that oil co[...] institutions or that it generates incentives for secessionist war. I control for mountainous terrain (mtnl1), following Fearon and Laitin, who view terrain [...] the technology of insurgency (mountains provide hideouts for rebels) [...] for a variable measuring time at peace since the last war (pwt). All right-hand side variables are lagged 1 year.62

55. A risk is that the degree of multicollinearity in the data may be different across civil war lists, influencing standard errors differentially. But this would be a result of differences in the coding of the dependent variable, so it does not pose a problem for the analysis.
56. I use data from Fearon and Laitin (2003) for this variable.
According to civil wa[...] state capacity should discourage rebellion or allow the state to repress it in its early stages [...] development levels should discourage rebellion by raising the economic opportunity costs of violence.
57. I use the "Polity 2" series from the Polity IV data project, version 2002, to construct the anocracy and instability variables.
58. Ethnic heterogeneity was constructed by Fearon (2003) and is available in Fearon and Laitin's (2003) replication data set. This is a constant, so I do not lag it.
59. I used data from the World Bank and other sources to complete the population [...] supplement for more information.
60. The source for this variable is Fearon and Laitin (2003).
61. This series has several differences from the respective series in Fearon and Laitin [...] the same underlying sources have been used (World Bank data). I discuss all differences in the supplement.
62. Fearon and Laitin (2003) do not lag all variables (e.g., oil is not lagged) and use the value for the second period in the country series to fill in missing values at the start of the country series for those variables that are lagged.

This model captures the basic logic of prominent theories of civil war.63 I do not include Fearon and Laitin's (2003) dummy variable for "new states" because that variable perfectly predicts the outcome when I lag all independent variables.64 [...] it is not clear to me what that variable measures that is not already captured by controls for "instability," "anocracy," and "income." "New" states are more likely to have a civil war than "old" states, as Fearon and Laitin found, but they are also more ethnically diverse than "old" states. Controlling for "new states" may artificially [...] the significance of ethnic fractionalization (ef).
I say "artificially" because the fact that these states are "new" may well be causally linked to their high ethnic fragmentation.65 Most "new states" are former colonies with higher ethnic fractionalization than countries in which a nation-building process has reduced the effective level of ethnic fragmentation. It might therefore be the case that the "newness" of the state e[...] high degree of ethnic fractionalization. Alternatively, looking further upstream in the chain of causality, the "newness" of the state may also be a consequence of [...] for self-determination by ethnically distinct groups that a distant metropole [...] attempted or succeeded to integrate into a single nation in the predecessor state. Either way, if there is a causal connection between new state and ef, we cannot control for both in the regression, and ef is the more theoretically interesting con[...]

We can now check if the correlations between these key variables and civil war are influenced by the coding of the dependent variable. In what follows, I estimate [...] models corresponding to 12 different coding rules and two versions of the [...] variable and for overall war prevalence. I review these estimates, looking for changes in significance levels and very large changes in coefficient estimates (more [...] standard deviations), because such volatility would have important policy implications in assessing the link between changes in the explanatory variables and the probability of civil war onset. I present the results of these estimations in [...] tables.66 Table 2 includes estimation results for 12 regressions using version (a) of the dependent variable, and Table 3 summarizes the range of these parameter estimates. Table 4 includes estimation results for 12 regressions using version (b) of the [...]

63. This is, of course, only one of several possible specifications of the model. I do not include [...] variables that others have found significant because I find the theoretical justification for including them problematic or because other variables included in the model are likely to capture much of what those variables are measuring.
For example, I do not include a control for noncontiguous territory because it is, by itself, not a good control of state strength without also controlling for military technology and a[...] project military power. It may also be the case that stronger states only have the capacity to maintain noncontiguous territories. Although this certainly does not apply to each case, an equality of means test shows that countries with noncontiguous territories have significantly higher average income ($5,855) than countries with only contiguous territories ($3,223). The t statistic for this test is -17.78.
64. For example, in my data set, there are only five war starts that occur in the second year of the country series and for which income is nonmissing, but all five are dropped due to missing values for the growth variable.
65. A logit regression of "new state" on "ethnic fractionalization" (ef) yields a statistically significant and large coefficient (2.69) for ef (z value = 5.52) with 316 observations (I restricted the sample to the first two observations in each country series).
66. I do not report fixed-effects results because fixed-effects estimation is very sensitive to measurement error (hence we can expect the coding rule differences to influence the results). I present th[...] the online supplement. There are important differences across war lists, even among variables tha[...] in ordinary logit models.

TABLE 2
Probit Models of Civil War Onset, 19[...]

Columns: warst1 = COW 1994; warst2 = COW 2000; warst3 = Collier and Hoeffler (2001); warst4 = Licklider (1995); warst5 = Gleditsch et al. (2001) (Wars); warst6 = Gleditsch et al. (2001) (All); warst7 = Fearon and Laitin (2003); warst8 = Leitenberg (2001); further columns [...]

Variable                   warst1     warst2     warst3     warst4     warst5     warst6     warst7     warst8
GDP                        -0.109     -0.114     -0.097     -0.075     -0.039     -0.047     -0.093     -0.06[...]
                           (0.032)    (0.040)    (0.033)    (0.030)    (0.020)    (0.018)    (0.032)    (0.0[...]
GDP growth                 -0.384     -0.237     0.493      0.564      -1.218     -0.560     -0.225     0[...]
                           (1.015)    (0.898)    (0.810)    (0.760)    (0.798)    (0.542)    (0.787)    (1.0[...]
Instability                0.339      0.321      0.131      0.238      0.263      0.228      0.239      0.[...]
                           (0.147)    (0.134)    (0.153)    (0.154)    (0.126)    (0.104)    (0.143)    (0.1[...]
Anocracy                   0.212      0.233      0.199      0.234      0.217      0.287      0.251      0.2[...]
                           (0.141)    (0.121)    (0.133)    (0.147)    (0.130)    (0.089)    (0.136)    (0.1[...]
Oil exporter               0.274      0.222      0.180      0.188      0.204      0.170      -0.060     0[...]
                           (0.186)    (0.157)    (0.119)    (0.149)    (0.145)    (0.150)    (0.142)    (0.1[...]
Ethnic fractionalization   0.118      0.144      0.175      0.209      0.317      0.524      -0.006     0[...]
                           (0.225)    (0.206)    (0.200)    (0.217)    (0.258)    (0.180)    (0.210)    (0.2[...]
Population (log)           0.102      0.090      0.104      0.113      0.111      0.090      0.109      0.10[...]
                           (0.032)    (0.030)    (0.032)    (0.037)    (0.033)    (0.032)    (0.033)    (0.0[...]
Terrain                    0.005      0.005      0.006      0.003      0.004      0.002      0.004      0.0[...]
                           (0.003)    (0.003)    (0.003)    (0.002)    (0.002)    (0.002)    (0.003)    (0.0[...]
Percentage Muslim          -0.001     0.001      0.002      0.003      0.002      0.003      0.003      0.0[...]
                           (0.002)    (0.002)    (0.001)    (0.001)    (0.002)    (0.001)    (0.001)    (0.0[...]
Peace duration             -0.003     -0.004     0.000      -0.006     -0.012     0.000      -0.002     -0.0[...]
                           (0.005)    (0.006)    (0.006)    (0.006)    (0.005)    (0.004)    (0.005)    (0.0[...]
Constant                   -3.844     -3.626     -4.014     -4.147     -3.987     -3.668     -3.903     -4.0[...]
                           (0.563)    (0.527)    (0.582)    (0.718)    (0.587)    (0.506)    (0.574)    (0.5[...]
Observations               3,770      3,861      3,815      3,765      3,926      3,530      3,611      3,3[...]
Log likelihood             -229.21    -263.55    -254.81    -224.91    -290.74    -435.29    -250.53    -238[...]
Wald χ2(10)                67.38      75.69      72.15      73.59      102.21     82.62      89.31      4[...]
Pseudo-R2                  0.0933     0.0995     0.0876     0.0944     0.0877     0.0719     0.0934     0.0[...]

NOTE: Coefficients (standard errors) are presented. Ongoing wars dropped afte[...] GDP = gross domestic product.
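A reader can check the significance statements made about Table 2 directly from the printed coefficients and standard errors. The sketch below does this for the ethnic fractionalization row, using only the seven columns that are fully legible in this copy (warst1 to warst7). It assumes a two-sided test against the standard normal distribution at the .05 level, applied to rounded published values, so it is an approximation and not the test procedure of the original study.

```python
from statistics import NormalDist

# Ethnic fractionalization row of Table 2: (coefficient, standard error).
# Only the seven fully legible columns are included.
ef_estimates = {
    "warst1": (0.118, 0.225),
    "warst2": (0.144, 0.206),
    "warst3": (0.175, 0.200),
    "warst4": (0.209, 0.217),
    "warst5": (0.317, 0.258),
    "warst6": (0.524, 0.180),
    "warst7": (-0.006, 0.210),
}


def z_and_p(coef, se):
    """Return the z statistic and two-sided p-value under a standard normal."""
    z = coef / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


results = {name: z_and_p(coef, se) for name, (coef, se) in ef_estimates.items()}
significant = [name for name, (_, p) in results.items() if p < 0.05]

for name, (z, p) in results.items():
    print(f"{name}: z = {z:6.2f}, p = {p:.3f}")
print("significant at .05:", significant)
```

Among these seven columns only warst6, the Gleditsch et al. (2001) list that includes all armed conflicts, clears the .05 threshold for ethnic fractionalization.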
TABLE 3
Summary of Parameter Estimates from Table 2

Model 1 Variables                      Observations  Mean     Standard Deviation  Minimum  Maximum
GDP coefficient                        12            -0.084   0.024               -0.114   -0.039
GDP SE                                 12            0.029    0.006               0.018    0.040
Growth coefficient                     12            -0.031   0.550               -1.218   0.816
Growth SE                              12            0.776    0.152               0.542    1.026
Instability coefficient                12            0.273    0.066               0.131    0.382
Instability SE                         12            0.136    0.016               0.104    0.154
Anocracy coefficient                   12            0.217    0.048               0.082    0.287
Anocracy SE                            12            0.127    0.014               0.089    0.147
Oil coefficient                        12            0.177    0.104               -0.060   0.371
Oil SE                                 12            0.146    0.018               0.119    0.186
Ethnic fractionalization coefficient   12            0.255    0.139               -0.006   0.524
Ethnic fractionalization SE            12            0.211    0.019               0.180    0.258
Population log coefficient             12            0.101    0.012               0.076    0.116
Population log SE                      12            0.033    0.002               0.030    0.037
Terrain coefficient                    12            0.003    0.002               0.001    0.006
Terrain SE                             12            0.003    0.000               0.002    0.003
Muslim coefficient                     12            0.001    0.001               -0.001   0.003
Muslim SE                              12            0.001    0.000               0.001    0.002
Peace duration coefficient             12            -0.004   0.003               -0.012   0.000
Peace duration SE                      12            0.005    0.001               0.004    0.006
Constant coefficient                   12            -3.820   0.224               -4.147   -3.370
Constant SE                            12            0.582    0.053               0.506    0.718

NOTE: Mean coefficients and standard err[...]

[...]ent variable. Table 4 models allo[...] (warl1) on the probability of [...] summarizes the range of param[...] One of the most robust variab[...]nificant, although its coefficie[...] from -0.039 to -0.114. In Table 4, [...] variation. We can see in Tables 3 [...] 2 standard deviations, depending [...]tionship between income and ci[...] Growth (grol1) is never signifi[...] in Tables 2 and 4.68 Other nonsig[...] growth, and the switching s[...]

67. In Fearon and Laitin's (2003) data s[...]try from 1945 to 1999. In my data set, th[...] variable captures some of the time dep[...] that I use in Table 2 models.
68. Using a 3-year growth rate did not change this.
By lagging growth, we lose the first two observations in the country series because we must lag gdpen by 1 year to calculate the growth rate (gdpgro), which cannot be calculated for the first year of the series. I coded one version of the growth variable, grol1m, where

TABLE 4
Probit Models of Civil War Onset, 1960-1[...]

Columns: warst1b = COW 1994; warst2b = COW 2000; warst3b = Collier and Hoeffler (2001); warst4b = Licklider (1995); warst5b = Gleditsch et al. (2001) (Wars); warst6b = Gleditsch et al. (2001) (All); warst7b = Fearon and Laitin (2003); warst8b = Leitenberg (2001); further columns [...]

Variable                   warst1b    warst2b    warst3b    warst4b    warst5b    warst6b    warst7b    warst8b
GDP                        -0.126     -0.134     -0.092     -0.077     -0.056     -0.054     -0.098     -0.088     -0[...]
                           (0.035)    (0.039)    (0.029)    (0.028)    (0.023)    (0.016)    (0.029)    (0.030)
GDP growth                 -0.295     -0.072     0.195      0.584      -1.087     -0.895     -0.351     0.431
                           (0.914)    (0.791)    (0.716)    (0.738)    (0.804)    (0.560)    (0.774)    (0.896)
Instability                0.316      0.348      0.130      0.217      0.284      0.134      0.167      0.393
                           (0.161)    (0.145)    (0.145)    (0.152)    (0.119)    (0.094)    (0.131)    (0.143)
Anocracy                   0.314      0.289      0.226      0.215      0.204      0.247      0.239      0.170
                           (0.139)    (0.127)    (0.121)    (0.143)    (0.132)    (0.082)    (0.124)    (0.123)
Oil exporter               0.358      0.291      0.194      0.222      0.240      0.302      0.111      0.373
                           (0.180)    (0.155)    (0.130)    (0.156)    (0.141)    (0.116)    (0.140)    (0.153)
Ethnic fractionalization   0.095      0.157      0.295      0.260      0.391      0.591      0.084      0.07[...]
                           (0.212)    (0.199)    (0.203)    (0.222)    (0.277)    (0.169)    (0.188)    (0.228)
Population (log)           0.090      0.083      0.094      0.129      0.143      0.133      0.131      0.10[...]
                           (0.031)    (0.029)    (0.032)    (0.045)    (0.035)    (0.033)    (0.031)    (0.028)
Terrain                    0.005      0.006      0.006      0.002      0.003      0.002      0.004      0.001
                           (0.003)    (0.003)    (0.003)    (0.002)    (0.002)    (0.002)    (0.002)    (0.003)
Percentage Muslim          -0.001     0.000      0.003      0.003      0.002      0.002      0.003      -0.00[...]
                           (0.002)    (0.002)    (0.001)    (0.001)    (0.002)    (0.001)    (0.001)    (0.001)
War at (t - 1)             -0.351     -0.351     -0.447     -0.677     -0.005     -0.103     -0.274     -0.37[...]
                           (0.192)    (0.176)    (0.200)    (0.236)    (0.169)    (0.107)    (0.124)    (0.215)
Constant                   -3.669     -3.575     -3.942     -4.520     -4.733     -4.353     -4.289     -3.747
                           (0.554)    (0.516)    (0.581)    (0.787)    (0.605)    (0.530)    (0.545)    (0.530)
Observations               4,045      4,179      4,099      4,178      4,179      4,179      4,179      3,782
Log likelihood             -248.10    -289.52    -266.58    -237.15    -334.42    -561.33    -303.22    -270.64
Wald χ2(10)                63.13      75.12      79.28      74.27      102.19     101.28     75.07      40.53
Pseudo-R2                  0.0936     0.1026     0.0863     0.0952     0.0908     0.0844     0.0838     0.0715

NOTE: Ongoing wars coded 0 if no new war starts. Coefficients (standard errors) ar[...] GDP = gross domestic product.
TABLE 5
Summary of Parameter Estimates from Table 4

Model 2 Variables                      Observations  Mean     Standard Deviation  Minimum  Maximum
GDP coefficient                        12            -0.093   0.024               -0.134   -0.054
GDP SE                                 12            0.028    0.006               0.016    0.039
Growth coefficient                     12            -0.199   0.489               -1.087   0.584
Growth SE                              12            0.705    0.137               0.529    0.914
Instability coefficient                12            0.249    0.085               0.130    0.393
Instability SE                         12            0.131    0.020               0.094    0.161
Anocracy coefficient                   12            0.243    0.043               0.170    0.314
Anocracy SE                            12            0.121    0.016               0.082    0.143
Oil coefficient                        12            0.238    0.080               0.111    0.373
Oil SE                                 12            0.140    0.021               0.106    0.180
Ethnic fractionalization coefficient   12            0.280    0.155               0.075    0.591
Ethnic fractionalization SE            12            0.206    0.027               0.169    0.277
Population log coefficient             12            0.104    0.026               0.050    0.143
Population log SE                      12            0.033    0.005               0.028    0.045
Terrain coefficient                    12            0.003    0.002               0.001    0.006
Terrain SE                             12            0.002    0.000               0.002    0.003
Muslim coefficient                     12            0.001    0.001               -0.001   0.003
Muslim SE                              12            0.001    0.000               0.001    0.002
Lagged war coefficient                 12            -0.286   0.212               -0.677   0.101
Lagged war SE                          12            0.164    0.041               0.107    0.236
Constant coefficient                   12            -3.915   0.493               -4.733   -2.911
Constant SE                            12            0.587    0.081               0.516    0.787

NOTE: Mean coefficients and standard errors are [...]

[...] endogeneity problem with this variab[...]ting the date of onset of civil war "wr[...]ficient sign of this variable.69 A pos[...] that, in the immediate postwar period, [...] periods of high risk of war recurren[...] war recurrence might also be high-g[...] be for growth to reduce the risk of ci[...]

Political instability is mostly signifi[...] in two regressions (Collier and Hoeff[...] in another two cases, it is significant [...]

[...] lagging is delayed by one observation, so this r[...] Tables 2 and 4, this adds only five observation[...] instead of grol1 makes anocracy slightly more [...]sions in Table 2 and ef a little less significant in [...] growth do not change.
69. Given my concern with endogeneity h[...] regressions. See versions of Tables 2, 3, 4, and [...] very similar to the ones presented here.
[...] less robust, being significant in only half of the regressions and nonsignificant in the other half.

Anocracy borders statistical significance in six regressions in Table 2 and is significant only in Gleditsch et al.'s (2001) list, which combines all armed conflicts. Similarly, it is significant in only six out of the twelve regressions in Table 4. Its coefficient ranges from a low of 0.170 (Leitenberg 2001 list) to a high of 0.314 (Singer and Small 1994 list) in Table 4, and the range widens in Table 2 from a low of 0.082 to a high of 0.287.

Oil exports are basically nonsignificant in Table 2, although the picture is more mixed in Table 4, where the variable is significant in four regressions and borderline significant in another three.70 Its coefficient varies wildly from -0.060 to 0.371 (Table 3; it is less unstable in Table 5). This variable is not robust to changes in the coding of war onset.

Ethnic heterogeneity (ef) is positive and almost always nonsignificant with a large standard error, except for regression 6 (Gleditsch et al. 2001, all armed conflicts) in Table 2. Ef is nearly significant (at the .10 level) in Regan's (1996) model (warst9), where again a number of additional wars are included because Regan uses a lower death threshold (200 deaths). In Table 4, ef is more often significant (in two models at the .05 level and in three more at the .10 level), although its coefficient has a wide range, from 0.075 to 0.591 (see Table 5). That ef is generally nonsignificant in Table 2 and much more significant in Table 4 may have something to do with those extra wars that we are able to include when we use version (b) of war onset.71

The coefficient of population (lpopnsl1) is quite stable and always positive and highly significant, except for model 9 in Table 4, where it is not significant.
This interesting result seems to justify my earlier claim that the significance of population may be an artifact of the high absolute threshold of deaths imposed to classify a civil war. In model 9, Regan's lower death threshold captures more small wars in small countries.

In both Tables 2 and 4, mountainous terrain (mtnl1) is relatively stable and is only significant in the Correlates of War and Collier and Hoeffler (2001) data sets. Mountainous terrain is a key variable in theories of civil war that emphasize the opportunity structure for rebellion (e.g., Fearon and Laitin 2003; Collier and Hoeffler 2001), but it is generally not statistically significant.

The percent of Muslims in a country (muslim) is marginally significant in two regressions and in another two at the .10 level (see Table 2) and becomes somewhat more significant in Table 4. There, we find it significant in the Fearon and Laitin (2003) model (warst7b) with a positive sign, despite the findings of these authors that religious division is nonsignificant in models of civil war onset. However, the period analyzed seems to matter, as later (see Table 6) we will find that the significance of

70. I use the variable oil2, lagged once (oil2l1). If I were to use Fearon and Laitin's (2003) oill1 series (lagging it once), it would be significant in model 6 and, at the .10 level, in models 8 and 9 and nonsignificant in all other models, including some of those in which oil2l1 was significant (COW 1994 and my data).
71. Countries with more than one war at the same time have slightly higher ethnic fractionalization. There are 26 such cases (of chronologically overlapping wars in my data set), and a means test for ethnic fractionalization shows a statistically significant 10-point difference between them and the other cases.
TABLE 6
Probit Models of Civil War Onset, 1945-1999

Columns: 1-3 = Gleditsch et al. (2001) (Wars); 4-6 = Fearon and Laitin (2003); 7 = Doy[...] Sambanis [...] Expa[...]; further columns [...]

Variable                   1          2          3          4          5          6          7
GDP                        -0.065     -0.079     -0.078     -0.095     -0.104     -0.099     -0.086
                           (0.022)    (0.023)    (0.023)    (0.025)    (0.025)    (0.024)    (0[...]
GDP growth                 -0.962     -          -1.350     -0.396     -          -1.443     -0.1[...]
                           (0.581)    -          (0.539)    (0.537)    -          (0.617)    (0.3[...]
Instability                0.305      0.279      0.278      0.222      0.176      0.186      0.22[...]
                           (0.095)    (0.088)    (0.089)    (0.101)    (0.096)    (0.097)    (0[...]
Anocracy                   0.157      0.244      0.214      0.265      0.332      0.296      0.31[...]
                           (0.108)    (0.090)    (0.095)    (0.101)    (0.093)    (0.096)    (0[...]
Oil exporter               0.294      0.310      0.301      0.145      0.192      0.173      0.1[...]
                           (0.127)    (0.128)    (0.135)    (0.128)    (0.119)    (0.124)    (0[...]
Ethnic fractionalization   0.417      0.268      0.236      0.254      0.209      0.245      [...]
                           (0.217)    (0.211)    (0.208)    (0.175)    (0.166)    (0.164)    (0[...]
Population (log)           0.140      0.137      0.144      0.142      0.141      0.148      0.0[...]
                           (0.028)    (0.025)    (0.027)    (0.029)    (0.029)    (0.030)    (0[...]
Terrain                    0.003      0.002      0.002      0.003      0.003      0.003      0.002
                           (0.002)    (0.002)    (0.002)    (0.002)    (0.002)    (0.002)    (0[...]
Percentage Muslim          0.000      -          -          0.002      -          -          [...]
                           (0.002)    -          -          (0.001)    -          -          (0.0[...]
War at (t - 1)             0.010      0.016      -0.007     -0.309     -0.342     -0.364     -0.30[...]
                           (0.150)    (0.143)    (0.145)    (0.097)    (0.104)    (0.102)    (0[...]
Constant                   -4.670     -4.500     -4.571     -4.581     -4.472     -4.568     -3.71[...]
                           (0.487)    (0.447)    (0.474)    (0.504)    (0.489)    (0.514)    (0[...]
Observations               5,893      6,092      6,051      5,893      6,092      6,051      5,893
Log likelihood             -445.83    -491.67    -479.81    -408.14    -456.90    -442.08    -477.9[...]
Wald χ2(df)                112.68     112.73     111.37     84.87      70.12      77.65      63.[...]
Pseudo-R2                  0.0910     0.0870     0.0943     0.0908     0.0897     0.0957     0.078[...]

NOTE: Ongoing wars coded 0 if no new war starts. Coefficients (standard errors) are pr[...] GDP = gross domestic product.

muslim disappears when we analyze the entire 1945 to 1999 period. But these results suggest that newer wars (post-1960) may have some different characteristics from older wars (1945-1959) with respect to the role of religion.

Peace duration is significant only in a single regression (Gleditsch et al. 2001, warst5) in Table 2. We get a much more mixed picture in Table 4, where war in the previous period (warl1) is significant and negative in about half of the regressions and nonsignificant in the other half.72

Comparing the high- and low-death thresholds in the two regressions that are based on Gleditsch et al.'s (2001) data reveals important substantive differences between lower level violence and civil war. In model 6 (all armed conflict), we find that ethnic heterogeneity, percentage Muslim, and oil exporter status (in Table 4) are all significant, whereas they become nonsignificant when we restrict the analysis to civil wars. Instability, by contrast, is significant only for civil wars and does not affect the risk of lower level violence significantly. This comparison is particularly instructive because the same research team did the coding of both lists, and any differences in parameter estimates must be attributed to the death threshold.
Also, these regressions highlight the differences between the two versions of the dependent variable. Using version (a), instability enters significantly in both regressions 5 (civil wars) and 6 (all armed conflict), and oil exports are nonsignificant. Using version (b), oil exports become significant for regression 6, and the coefficient for instability drops [...] half (see Table 4).

In Table 6, I ran the same regressions, restricting the number of civil war lists so that I could analyze the entire 1945 to 1999 period and use the most recen[...] documented data sets. Here, again, we observe that income and population are robustly significant, and anocracy becomes more significant (although not in regression 1). Instability is significant in most lists, though marginally nonsignificant in Fearon and Laitin's (2003) list in regressions 5 and 6, where I use their cod[...] and do not lag the first observation in each country series. Anocracy is m[...] than previously. Growth is now significant and negative in three of four lists, [...] we do not lag the first observation, which results in artificial starting values for the country series. Adding growth to regressions 3, 6, and 10 causes us to lose some observations, which reduces the significance of some variables (e.g., ef in regr[...]

Although there is more agreement here than in the previous tables, we [...] some important differences in the results on ethnic fractionalization (now [...] in two lists and borderline significant in regression 1 with Gleditsch et al.'s [...]) and with respect to oil exporter status, which is positive and significant in two of four lists (Gleditsch et al. 2001; Sambanis 2004). Similarly, war in the previous period is significant and negative only in two out of four lists (Fearon and Laitin 2003; Doyle and Sambanis 2000). Other variables (percentage Muslim and mountainous terrain) are consistently nonsignificant.

72.
It would have been interesting to explore the effects on war onset of interactions between this variable and other right-hand side variables but, because there are only a few instances of chronologically overlapping civil wars in the same country, these interaction terms might have had the effect of overfitting the model to a few observations.

The results presented in Tables 2, 4, and 6 suggest wide-ranging agreement on the robustness of income and population and cast doubt on the robustness of other variables used in civil war models, especially when we consider the tru analyzed in Tables 2 and 4. The most significant differences are with respect to the impact of oil exports, ethnic heterogeneity, and war in the previous period. There seems to be agreement that the mountainous terrain variable, which is a significant variable in Collier and Hoeffler's (2001) model, is not robust to alternative measures of civil war. There is also no evidence of a robust association between civil war and the percentage of Muslims in a country. GDP growth is generally nonsignificant and may well be endogenous to levels of violence.

If we restrict our analysis to the three most recent data sets (Gleditsch et al. 2001; Fearon and Laitin 2003; and Sambanis, this study), then we find more agreement among key variables. But when we extend the analysis back to 1945, we must also consider whether there was something different about these earlier civil wars, particularly those that started before 1945 (a few civil wars are left-censored at 1945).

The significance of ethnic heterogeneity (ef) in some of the regressions in Table 6 is worth exploring further, given the prominence of that variable in the civil war literature. Both Fearon and Laitin (2003) and Collier and Hoeffler (2001) make strong statements against the significance of ethnic heterogeneity as a factor leading to civil war.
I have shown that there is a very strong relationship between ethnic heterogeneity and an aggregate indicator of armed conflict and a much weaker one with civil war. To the extent that violence escalates from minor to high levels, we should find ethnic fractionalization to be significant in a dynamic model of violence escalation. But when we look at civil wars alone, why might ef not be significant in the Fearon and Laitin (2003) data set and analysis, given some of the results presented here?

Lagging right-hand side variables makes a difference, as we saw in Table 6. Fearon and Laitin include in their analysis 10 wars that occur in the first year of the country series. Using their data and dropping those wars by lagging all explanatory variables in their replication data set brings ef very close to statistical significance (see online supplement). Those wars are dropped in my data set (except where I note that I replace missing observations at the start of each country series with values for the second year of the country series, as Fearon and Laitin do). For some variables (e.g., GDP), this approach to preserving observations makes sense and approximates what any imputation program would do. However, other variables, such as instability, economic growth, and anocracy, are harder to "impute" for the start of the country series, especially for new states.73

Note also that among those dropped observations, the mean level of ethnic fractionalization for cases of war onset (using my civil war list) is higher than that for cases of no war. Thus, restoring those observations should not reduce the significance of ef in my data. Indeed, I followed Fearon and Laitin's approach to preserve those observations and recomputed the model in Table 6 (regressions 3, 6, 10). Ethnic fractionalization is nonsignificant in the Fearon and Laitin model, but it is still significant using my data (it is marginally above the .05 level), so differences in coding rules influence the results on that variable. That said, ef seems very sensitive to coding rules. Notice, for example, that by keeping 158 observations in regressio lagging the first observation) using Gleditsch et al.'s (2001) civil war list, the coefficient for ef drops from 0.42 in regression 1 to 0.24. This suggests that the results are fragile. But the question also hinges on whether we can use this arbitr scheme to preserve those observations.

73. In Fearon and Laitin's (2003) replication data set, for example, out of a total of 966 country-years of political instability, only 4 were years of instability in a newly established state, which sounds implausible almost by definition. In fact, the instability variable, as defined by Fearon and Laitin (a greater than 2 change in the Polity scale), cannot be measured for a country's first year of independence.

Another, perhaps more substantive, reason for dropping those few cases the first year of the country series may be that several of them are left-censored. For example, the wars in Greece or the USSR all started before 1945, even th data sets list them as starting on or after 1945. If violence was ongoing before then, it would have affected the right-hand side variables in 1945. Two of Greece and South Korea (also Philippines) are particularly important for the nonsignificance of ef in the Fearon and Laitin (2003) model because they have low ef scores. In sum, there may be something different about the wars that started during the few years after World War II. Most were communist insurgencies w "ethnic" dimension, and the mean value of ef for countries at war from 1 was about half that of countries at war in the 1950s, 1960s, or any other d 1999.
Thus, losing those cases by lagging explanatory variables explains why ef comes close to significance using Fearon and Laitin's model and data.

I now turn to an analysis of civil war prevalence. Prevalence is defined as of onset and continuation of war, so the dependent variable is coded 1 for all war. I assume that civil war is a first-order Markov process (i.e., there is no time dependence beyond the first period) and estimate a dynamic probit model by adding to the previous model a set of interaction terms between a variable denoting the prevalence of civil war in the previous period and all right-hand side variables. The estimated coefficients of the interaction terms can, after a minor adjustment, be interpreted as estimates of the relationship between the independent variables and war continuation, conditional on the occurrence of war in the previous period. The adjustment involves correcting the coefficient and standard errors of the interaction terms. For example, the coefficient of gdpll with respect to war continuation is the sum of the coefficients for gdpll and wlgdp: (-0.124 + 0.293). The standard error is the square root of the variance of gdpll plus the variance of wlgdp plus two times their covariance. The estimates in Table 7 are already adjusted in this way; thus, next to coefficient estimates for the interaction terms indicate significance with respect to war continuation. Estimates of the linear terms refer to civil war onset,

The overall picture is one of substantial differences across civil war lists prevalence brings us a step closer to analyzing war duration, and it is not given the large differences in war onset and termination highlighted in Table we should see differences in estimates of civil war prevalence.

GDP is again robust and negative in all regressions, although population is not always significant with respect to war onset. Most variables (except GDP) are significant in only one or a few models, and instability is the most often significant 12 models). Growth is not significant.74

74.
Using the variable grollm (instead of groll) again adds only five observations and does not change the results substantially.

TABLE 7
Dynamic Probit Models of Civil War Prevalence

atwar1 = COW 1994; atwar2 = COW 2000; atwar3 = Collier and Hoeffler (2001); atwar4 = Licklider (1995); atwar5 = Gleditsch et al. (2001) (Wars); atwar6 = Gleditsch et al. (2001) (All); atwar7 = Fearon and Laitin (2003); atwar8 = Leitenberg (2001)

                               atwar1   atwar2   atwar3   atwar4   atwar5   atwar6   atwar7   atwar8
GDP                            -0.126   -0.135   -0.098   -0.103   -0.061   -0.051   -0.100   -0.087
                              (0.038)  (0.042)  (0.031)  (0.034)  (0.024)  (0.017)  (0.033)  (0.031)
GDP growth                     -0.425   -0.281    0.418    0.460   -1.187   -0.692   -0.307     0.72
                              (0.995)  (0.899)  (0.778)  (0.733)  (0.854)  (0.557)  (0.829)  (0.985)
Instability                     0.361    0.348    0.122    0.259    0.289    0.258    0.259    0.403
                              (0.149)  (0.136)  (0.157)  (0.154)  (0.130)  (0.106)  (0.141)  (0.155)
Anocracy                        0.217    0.241    0.194    0.226    0.218    0.307    0.262    0.197
                              (0.142)  (0.124)  (0.135)  (0.150)  (0.135)  (0.092)  (0.137)  (0.131)
Oil exporter                    0.327    0.273    0.206    0.269    0.237    0.201   -0.051    0.421
                              (0.206)  (0.178)  (0.127)  (0.172)  (0.164)  (0.156)  (0.147)  (0.187)
Ethnic fractionalization        0.033    0.062    0.130    0.077    0.257    0.483   -0.064     0.17
                              (0.226)  (0.211)  (0.196)  (0.242)  (0.261)  (0.175)  (0.206)  (0.220)
Population (log)                0.053    0.043    0.058    0.038    0.054    0.057    0.077     0.02
                              (0.033)  (0.032)  (0.029)  (0.031)  (0.034)  (0.031)  (0.032)  (0.031)
Terrain                         0.004    0.005    0.006    0.003    0.003    0.002    0.004    0.002
                              (0.003)  (0.003)  (0.003)  (0.002)  (0.003)  (0.002)  (0.003)  (0.002)
Percentage Muslim              -0.001    0.001    0.003    0.003    0.002    0.002    0.003    0.000
                              (0.002)  (0.002)  (0.002)  (0.001)  (0.002)  (0.001)  (0.001)  (0.001)
War(t - 1) * Muslim            -0.005   -0.003   -0.001   -0.007   -0.002    0.000   -0.003    0.001
                              (0.004)  (0.004)  (0.003)  (0.004)  (0.003)  (0.003)  (0.003)  (0.003)
War(t - 1) * GDP                0.169    0.147    0.033    0.201   -0.197    0.029    0.082   -0.003
                              (0.139)  (0.110)  (0.099)  (0.111)  (0.066)  (0.053)  (0.068)  (0.058)
War(t - 1) * Growth            -0.021   -0.369    0.198    1.395   -0.451    1.036    1.200    0.419
                              (0.745)  (0.615)  (0.653)  (0.838)  (0.828)  (0.644)  (1.148)  (0.592)
War(t - 1) * Instability       -0.533   -0.371   -0.266   -0.164   -0.172   -0.152   -0.256    0.158
                              (0.241)  (0.223)  (0.211)  (0.217)  (0.190)  (0.179)  (0.191)  (0.173)
War(t - 1) * Anocracy           0.077    0.144    0.339    0.115    0.336    0.019    0.112    0.405
                              (0.238)  (0.215)  (0.193)  (0.194)  (0.175)  (0.189)  (0.200)  (0.154)
War(t - 1) * Oil               -1.001   -0.643   -0.508   -0.286   -0.312   -0.283   -0.088   -0.198
                              (0.555)  (0.514)  (0.421)  (0.383)  (0.436)  (0.334)  (0.354)  (0.367)
War(t - 1) * Ethnic             1.561    1.379    0.761    1.001    1.233    0.306    1.330    0.562
                              (0.717)  (0.587)  (0.541)  (0.772)  (0.484)  (0.443)  (0.553)  (0.370)
War(t - 1) * Population         0.220    0.200    0.253    0.209    0.221    0.244    0.257     0.19
                              (0.043)  (0.043)  (0.039)  (0.045)  (0.040)  (0.036)  (0.041)  (0.030)
War(t - 1) * Terrain            0.000    0.000    0.000    0.008    0.005    0.005    0.002    0.004
                              (0.006)  (0.005)  (0.004)  (0.005)  (0.004)  (0.004)  (0.005)  (0.004)
_cons                          -3.016   -2.842   -3.245   -2.898   -3.201   -3.095   -3.381   -2.683
                              (0.573)  (0.548)  (0.486)  (0.551)  (0.550)  (0.512)  (0.549)  (0.525)
Observations                    4,045    4,179    4,099    4,178    4,179    4,179    4,179    3,782
Log likelihood                -337.62  -398.73  -375.13  -366.71  -431.94  -687.92  -371.13  -417.82
Pseudo-R2                      0.6996   0.6834   0.7199   0.7465   0.6134   0.6510   0.7887    0.701
Wald χ2: 1,050.19 1,169.87 1,270.07 952.18 988.45

NOTE: Coefficients (standard errors) are presented. Bold indicates significance at .05 or higher. Parameters for all interaction terms have been adjusted to refer to war continuation only. GDP = gross domestic product.

The picture becomes much more unstable when we consider results with respect to war continuation in the shaded portion of Table 7. Here we see that the interaction between income and lagged war prevalence is often nonsignificant, but occasionally it is significant with switching signs: in Gleditsch et al.'s (2001) civil war list (a high income significantly reduces war continuation, whereas in the two models based on Doyle and Sambanis's (2000) data (atwar10, atwar11), higher income once war is ongoing has the effect of increasing the risk of the war continuing. In (1996) and Licklider's (1995) data, the same relationship is borderline signifi .10). Instability while war is ongoing significantly reduces the risk of war continuation in two models (Singer and Small 1994; Regan 1996) and is nonsignificant in the rest.75 Similarly, anocracy is significant in one model and borderline in another two but otherwise nonsignificant. Oil exports are generally nonsignificant. There is wide agreement that populous countries will have longer wars and no evidence that countries with significant Muslim populations or mountainous terrain will have longer civil wars.

Interesting results emerge again with respect to ethnic fractionalization (ef). The interaction term with ef is significant and positively correlated with war continuation in four regressions (e.g., in regression 7, using Fearon and Laitin's 2003 civil war list) and close to significant in one more. Thus, the results with respect to this variable are genuinely divided: about half the models would tell us that a very diverse country would have long wars, once a war actually started, whereas the other half would point to no statistically significant relationship between ethnicity and war duration.
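The adjustment applied to the interaction terms in Table 7 (summing the linear and interaction coefficients and combining their variances and covariance) can be sketched in a few lines. This is an illustrative sketch only: the two coefficients are the ones quoted in the text for gdpll and wlgdp, whereas the variance and covariance values are hypothetical placeholders, because the article does not report them. The function name `continuation_effect` is likewise an assumption made for this example.

```python
import math

def continuation_effect(b_main, b_inter, var_main, var_inter, cov):
    """Effect of a variable on war continuation: the sum of its linear
    coefficient and its War(t-1) interaction coefficient, with the
    standard error of that sum."""
    coef = b_main + b_inter
    variance = var_main + var_inter + 2.0 * cov
    if variance < 0:
        raise ValueError("variance of the sum must be non-negative")
    return coef, math.sqrt(variance)

# Coefficients quoted in the text for gdpll and wlgdp.
b_gdp, b_wgdp = -0.124, 0.293
# Hypothetical variances and covariance, for illustration only.
var_gdp, var_wgdp, cov = 0.0015, 0.0200, -0.0011

coef, se = continuation_effect(b_gdp, b_wgdp, var_gdp, var_wgdp, cov)
z = coef / se  # compared against a normal critical value
print(f"continuation coefficient = {coef:.3f}, se = {se:.3f}, z = {z:.2f}")
```

The summed coefficient, 0.169, matches the War(t - 1) * GDP entry in the first column of Table 7; the standard error printed by the sketch depends entirely on the placeholder variances.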
There is more support for the positive association between ethnic fractionalization and war continuation in Table 8, where I present prevalence results restricted to the four civil war lists that span the entire period from 1945 to 1999. Here, we see the same sort of substantial disagreement with respect to the effects of income on war continuation: in one model, the coefficient is significant and negative; in another, it is significant and positive; and in the other two, it is nonsignificant.76 The more we push the data, the more the differences in coding rules will matter.

CONCLUSION

This article does three things. First, it demonstrates that there are substantial differences across civil war lists with respect to the coding of the onset and termination of civil war. Exploring those differences analytically reveals some conceptu

75. I am not concerned here with the theoretical explanation of these results. How reduce war duration is unclear. It may be that a regime transition toward significant democracy (which would be coded as instability) satisfies some of the rebels' demands and leads to war termin of the instability variable may be affected by the occurrence of war, but it is not my purpose these questions of endogeneity.

76. The differences actually become greater if we do not lag the first observations in each country series (giving 6,051 observations). With respect to onset, growth is significant and negative in regressions, instability is significant in one regression, anocracy is significant in all regres are significant in one regression, and population is significant in one regression. With respect to war continuation, there is no significant change from the differences noted in Table 8 (see the results in the online supplement).
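The disagreement over income and war continuation reported for Table 8 can be checked directly from the printed coefficients and standard errors. The sketch below assumes that the printed standard errors for the interaction terms are the adjusted ones described in the table note, and that a two-sided normal test at the .10 level approximates the bolding rule in that note; the function name `classify` and the dictionary layout are choices made for this example, not part of the replication files.

```python
# Adjusted War(t - 1) * GDP estimates as printed in Table 8:
# civil war list -> (coefficient, standard error).
TABLE8_WAR_GDP = {
    "Gleditsch et al. (2001), wars": (-0.153, 0.073),
    "Fearon and Laitin (2003)": (0.042, 0.048),
    "Doyle and Sambanis (2000, expanded)": (0.111, 0.041),
    "Sambanis (this study)": (0.022, 0.026),
}

Z_CRIT_10 = 1.645  # two-sided 10% critical value, normal approximation

def classify(coef, se, z_crit=Z_CRIT_10):
    """Label an estimate by sign and significance using z = coef / se."""
    z = coef / se
    if abs(z) < z_crit:
        return "nonsignificant"
    return "significant and positive" if z > 0 else "significant and negative"

summary = {name: classify(c, s) for name, (c, s) in TABLE8_WAR_GDP.items()}
for name, verdict in summary.items():
    print(f"{name}: {verdict}")
```

Under these assumptions the four lists split exactly as the text describes: one significant and negative, one significant and positive, and two nonsignificant.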
TABLE 8
Dynamic Probit Models of Civil War Prevalence, 1945-1999

(1) Gleditsch et al. (2001) (Wars); (2) Fearon and Laitin (2003); (3) Doyle and Sambanis (2000, Expanded); (4) Sambanis (This Study)

                                   (1)       (2)       (3)       (4)
GDP                             -0.073    -0.095    -0.102    -0.085
                               (0.025)   (0.027)   (0.026)   (0.025)
GDP growth                      -1.006    -0.204    -0.187    -0.113
                               (0.626)   (0.473)   (0.411)   (0.392)
Instability                      0.273     0.243     0.219     0.213
                               (0.108)   (0.108)   (0.103)   (0.109)
Anocracy                         0.172     0.291     0.274     0.294
                               (0.112)   (0.114)   (0.115)   (0.099)
Oil exporter                     0.276     0.067     0.148     0.148
                               (0.146)   (0.151)   (0.147)   (0.130)
Ethnic fractionalization         0.220     0.115     0.295     0.221
                               (0.218)   (0.192)   (0.199)   (0.200)
Population (log)                 0.055     0.078     0.034     0.043
                               (0.027)   (0.027)   (0.026)   (0.024)
Terrain                          0.003     0.003     0.001     0.002
                               (0.002)   (0.002)   (0.003)   (0.003)
Percentage Muslim                0.001     0.002     0.000     0.002
                               (0.002)   (0.001)   (0.001)   (0.001)
War(t - 1) * GDP                -0.153     0.042     0.111     0.022
                               (0.073)   (0.048)   (0.041)   (0.026)
War(t - 1) * Growth              0.108     0.288    -0.399     0.580
                               (0.615)   (0.791)   (0.495)   (0.661)
War(t - 1) * Instability        -0.108    -0.309    -0.169    -0.040
                               (0.184)   (0.178)   (0.153)   (0.166)
War(t - 1) * Anocracy            0.319     0.054     0.159     0.021
                               (0.196)   (0.200)   (0.146)   (0.177)
War(t - 1) * Oil                -0.103    -0.042    -0.228    -0.087
                               (0.323)   (0.291)   (0.302)   (0.284)
War(t - 1) * Ethnic              1.004     1.052     0.700     0.639
                               (0.417)   (0.349)   (0.303)   (0.353)
War(t - 1) * Population          0.211     0.265     0.197     0.226
                               (0.035)   (0.032)   (0.029)   (0.032)
War(t - 1) * Terrain             0.004     0.003     0.002     0.003
                               (0.005)   (0.003)   (0.003)   (0.003)
War(t - 1) * Muslim             -0.001    -0.002     0.000     0.000
                               (0.003)   (0.002)   (0.003)   (0.002)
_cons                           -3.159    -3.501    -2.689    -2.880
                               (0.436)   (0.448)   (0.429)   (0.412)
Observations                     5,893     5,893     5,893     5,893
Wald χ2                         791.89  1,736.07  1,898.86  2,022.78
Log likelihood                 -596.74   -548.11   -694.25   -671.34
Pseudo-R2                       0.5797    0.7731    0.6963    0.7185

NOTE: Coefficients (standard errors) are presented. Bold indicates significance at .10.
Parameters for all interaction terms have been adjusted to refer to war continuation only. GDP = gross domestic product.

about the meaning of civil war and can serve as the foundation for a theoretical discussion that establishes what civil war is and how it can be distinguished from other forms of political violence. Second, it proposes a new coding rule for civil war that a to sidestep some of the problems identified in other coding rules and offers a new list of civil wars that is based on this new coding rule. Third, it measures the substantive implications of differences in coding rules by formally comparing the empirical results that we get from a civil war model when we use 12 different rules to code civil war.

The quantitative literature on civil war reveals a remarkable degree of disagreement on how to code the onset and termination of wars, and the literature is fuzzy on how to distinguish among different forms of political violence. This implies the need for theorizing about civil war and then for proper measurement of the concept. Given the coding complexities I have identified, researchers should conduct robustness tests using different civil war lists and justify their coding decisions as transparently as possible.

Differences in the coding of civil war have substantive implications, but perhaps not as large as one might have expected. The results presented here show that the estimated coefficients of most variables vary widely as a result of changes in the coded onset of civil war. In a few cases, the sign of the coefficient also changes, and significance levels also vary across data sets. Several variables are not robust to changes in the coding of civil war onset, most strikingly in the case of lagged war prevalence, ethnic fractionalization and oil exports, and, to a much lesser extent, anocracy and instability. Predictions of when and where civil war might occur are likely to depend critically on which coding rule we use.
More important, estimates of the substantive effects of policy interventions to reduce the risk of civil war by manipulating the level of income, political instability, or any of the other "manipulable" variables in the model will vary widely, depending on the coding rule for war onset and termination.

At the same time, some variables are remarkably robust to coding differences. Income level and population size are very robust and significantly associated with civil war. That significance levels for these variables do not change much despite large differences in coding rules is likely because these variables change slowly over time. The problem that this highlights is that we cannot rely on them to make accurate predictions of the timing of war onset. Other variables (especially mountainous terrain, Muslim population, and economic growth) are consistently nonsignificant (with minor exceptions).

The results from models of war prevalence suggest that predictions of civil war duration will be even less accurate than predictions of civil war onset. There is greater instability of empirical results in the prevalence model, so analyses of civil war duration will be much more affected by differences in the coding rules. Robustness tests are therefore essential when analyzing war duration or termination.

Overall, civil war models seem to be good at identifying countries with lon proclivities to civil war. But if the models are further developed to be able to predict the timing of war onset better, then we are likely to see coding differences in the dependent variable affect the parameter estimates more significantly (the changes we found in the coefficient sign of economic growth, which changes substantially over time, are an indication of this). Thus, differences in coding will begin to matter if the theoretical models "catch up" to the data by including more time-sensitive variables.
As theory develops more, it seems likely that analysts of civil wars will have to go data to develop a higher degree of consensus on the meaning of civil war can produce accurate and credible predictions of where and when civil war

That said, I have no clear answer on whether it is better to have a single definition and coding rule for civil war or if it is more beneficial to have differing definitions. On the one hand, some standardization of our coding rules would reflect theoretical c what civil war is, and homogeneity in measurement would rule out one possible source of difference in empirical results. On the other hand, that we have many suggests that the concept of civil war may mean different things to diffe and coding rules should reflect differences in subjective understandings of the concept. Analyzing the differences across coding rules can be instructive because the differences map out the space within which we might be able to fin understanding of civil war.

A substantive result from this analysis, and to which I must return, is that ethnic fractionalization may not be as nonsignificant a correlate of civil war as many have argued. Its significance hinges perhaps too much on the coding rules for but ethnic fractionalization is clearly important in explaining a broad category of armed conflict that includes minor insurgency. This is an important result, considering how difficult it was to clearly distinguish between civil war and other forms of violence. Ethnic fractionalization was sometimes significantly correlated with war onset and war continuation, but these results were at times affected by wars that are dropped when we lag explanatory variables by 1 year. However, differences in coding rules seem to explain the nonsignificance of that variable in Fearon and Laitin's (2003) results. In other runs (see supplement), religious fractionalization and the size of the largest confession were significant and often dominated t ethnic fractionalization.77 Thus, ethnoreligious identity may have been written off too quickly as a correlate of large-scale armed conflict-even "civil war" scholars.
Several other substantive results are worth noting. Economic growth may be endogenous to civil war, which implies the need for a different estimation strategy for models that include that variable. The political variables of anocracy and instability, which are central to some models of civil war, are sensitive to the coding rules but much less so if we examine the entire post-1945 period. The significance of the population variable may be tied to the high threshold of violence used to identify civil war and distinguish it from minor violence. Mountainous terrain, a key measure of the technology of insurgency in civil war in some models, is not robust to changes in the coding rules, and neither is the measure used to identify countries that are major exporters of oil. Peace duration or war in the previous year is also sensitive to changes in coding. Moreover, the significance of some of these variables is affected by whether observations of ongoing war are dropped (e.g., anocracy is much more consistently significant in Table 4 than in Table 2). Finally, the period analyzed influences some results due to the small number of war starts: for example, instability is more consistently significant in Table 6 than in Table 4.

77. For an analysis of the sensitivity of empirical results in the civil war literature to small changes in the specification of the model, see Hegre and Sambanis (2004).

Despite these differences and difficulties, the conclusion from this study should not be that coding wars and analyzing them quantitatively is a futile exercise. Rather than abandon these efforts, I favor redoubling them by improving the coding rules, applying them transparently to the data, and studying the implications of differences in coding rules.
The legacy of the Correlates of War project is that we now have at our disposal replicable data on civil war that we can analyze quantitatively to point to the theoretical and empirical complexities of defining and measuring civil war. This suggests ways of building on that legacy.

REFERENCES

Collier, Paul, and Anke Hoeffler. 2001. Greed and grievance in civil war. Policy Resear Bank.
Connor, Ken. 1998. Ghost force: The secret history of the SAS. London: Weidenfeld and Nicolson.
Doyle, Michael, and Nicholas Sambanis. 2000. International peacebuilding: A theoretical and quantitative analysis. American Political Science Review 94 (4): 779-801.
Fearon, James D. 2003. Ethnic and cultural diversity by country. Journal of Economic Growth 8:195-222.
Fearon, James D., and David D. Laitin. 2003. Ethnicity, insurgency, and civil war. American Political Science Review 97 (1): 75-90.
Ghobarah, H., Paul Huth, and Bruce Russett. 2003. Civil wars kill and maim people, long after the fighting stops. American Political Science Review 97 (2): 189-202.
Gleditsch, Nils Petter, Håvard Strand, Mikael Eriksson, Margareta Sollenberg, and Peter Wallensteen. 2001. Armed conflict 1945-99: A new dataset. Unpublished paper, PRIO, Oslo, Norway.
Gleditsch, Nils Petter, Peter Wallensteen, Mikael Eriksson, Margareta Sollenberg, and Håvard Strand. 2002. Armed conflict 1946-2001: A new dataset. Journal of Peace Research 39 (5): 615-37.
Harff, Barbara. 2003. No lessons learned from the Holocaust? Assessing risks of genocide and political mass murder since 1955. American Political Science Review 97 (1): 57-73.
Hegre, Håvard, Tanja Ellingsen, Scott Gates, and Nils Petter Gleditsch. 2001. Toward a democratic civil peace? Democracy, political change, and civil war, 1816-1992. American Political Science Review 95:33-48.
Hegre, Håvard, and Nicholas Sambanis. 2004. Sensitivity analysis of empirical results in the quantitative literature on civil war. Unpublished manuscript, PRIO and Yale University.
Leitenberg, Milton. 2001. Deaths in wars and conflicts between 1945 and 2000. Paper prepared for the Conference on Data Collection in Armed Conflict, June 8-9, Uppsala, Sweden.
Licklider, Roy. 1995. The consequences of negotiated settlements in civil wars, 1945-1993. American Political Science Review 89 (3): 681-90.
Marshall, Monty, and Keith Jaggers. 2000. Polity IV project [Codebook and data files]. Accessed from www.bsos.umd.edu/cidcm/inscr/polity/.
Mason, David, and Patrick Fett. 1996. How civil wars end: A rational choice approach. Journal of Conflict Resolution 40:546-68.
McAdam, Doug, Sidney Tarrow, and Charles Tilly. 2001. Dynamics of contention. Cambridge, UK: Cambridge University Press.
Price, D. L. 1975. Oman: Insurgency and development. Conflict studies. London: Institute for the Study of Conflict.
Regan, Patrick. 1996. Conditions for successful third party interventions. Journal of Conflict Resolution 40 (1): 336-59.
Sambanis, Nicholas. 2001. Do ethnic and nonethnic civil wars have the same causes? A theoretical and empirical inquiry (Part 1). Journal of Conflict Resolution 45 (3): 259-82.
Sambanis, Nicholas. 2004. Expanding economic models of civil war using case studies. Perspectives 259-79.
Sarkees, Meredith Reid. 2000. The Correlates of War data on war: An update to 1997. Conflict Management and Peace Science 18 (1): 123-44.
Sarkees, Meredith Reid, and J. David Singer. 2001. The Correlates of War datasets: The totality of war. Paper prepared for the 42nd Annual Convention of the International Studies Association, February 20-24, Chicago.
Singer, J. David, and Melvin Small. 1972. The wages of war, 1816-1965: A statistical handbook. New York: John Wiley.
Singer, J. David, and Melvin Small. 1994. Correlates of War project: International and civil war data, 1816-1992 [Computer file, Study #9905]. Ann Arbor, MI: Interuniversity Consortium for Political and Social Research [distributor].
Small, Melvin, and J. David Singer. 1982. Resort to arms: International and civil war, 1816-1980. Beverly Hills, CA: Sage.
Strand, Håvard, Lars Wilhelmsen, and Nils Petter Gleditsch, in collaboration with Peter Wallensteen, Margareta Sollenberg, Mikael Eriksson, Halvard Bulhaug, and Jan Ketil Rod. 2003. Armed conflict dataset codebook, version 1.2a. Accessed June 21, 2004, from http://www.prio.no/page/Project_detail//9244/44532.html?PHPSESSID=e92e0a3b72738fae3ff8585c12cd291b/.
Tilly, Charles. 1978. From modernization to revolution. New York: Random House.
Tilly, Charles. 2003. The politics of collective violence. Cambridge, UK: Cambridge University Press.
Valentino, Benjamin, Paul Huth, and Dylan Balch-Lindsay. 2001. Draining the sea: Mass killing, genocide, and guerilla warfare. Unpublished manuscript, Stanford University.
Walter, Barbara. 2002. Committing to peace: The successful settlement of civil wars. Princeton, NJ: Princeton University Press.