Ethnic and Cultural Diversity by Country*

Department of Political Science, Stanford University, Stanford, CA 94305-6044, USA

For their empirical evaluation, several active research programs in economics and political science require data on ethnic groups across countries. "Ethnic group," however, is a slippery concept. After addressing conceptual and practical obstacles, I present a list of 822 ethnic groups in 160 countries that made up at least 1 percent of the country population in the early 1990s. I compare a measure of ethnic fractionalization based on this list with the most commonly used measure. I also construct an index of cultural fractionalization that uses the structural distance between languages as a proxy for the cultural distance between groups in a country.

Keywords: ethnic fractionalization, cultural diversity, heterogeneity, economic growth

JEL classification: O5, D7

1. Introduction

Does ethnic diversity lower a country's economic growth rate or level of public good provision, as Easterly and Levine (1997) and Alesina et al. (1997) claim? Are more ethnically divided states more civil war prone? Less (or more) likely to experience stable democracy or democratic transitions? More likely to have highly fractionalized party systems if they are democratic? What factors differentiate ethnic groups that rebel or have secessionist movements from those that do not? What differentiates the ethnic groups whose members mobilize in the political sphere from those that remain "latent?"1

Empirical efforts to answer any of these questions require that we collect data on ethnic groups in different countries, and before any such data is collected, we need a list or sample of ethnic groups for some set of countries.

This paper discusses conceptual and practical problems involved in constructing a cross-national list of ethnic groups (Sections 2 and 3) and then presents the results of an effort to carry out the task (Sections 4-8). Restricting attention to groups that had at least 1 percent of country population in the 1990s, I identify 822 ethnic and "ethnoreligious" groups in 160 countries. Hypothetically, my objective is to include those groups that would be listed most often if randomly chosen individuals in the country in question were asked "what are the main ethnic (or racial or ascriptive) groups in this country?" I lack the resources to carry out such a survey and have not done so. For lack of something better, I rely on the secondary sources and the existing lists discussed below. The list offered here
should be viewed as a continual work in progress, to be improved as more country-specific expertise, or actual survey data, is brought to bear case by case.

Section 5 presents descriptive statistics. Section 6 proposes a simple way to use the data to represent how ethnic structures differ across countries, as opposed to an aggregate measure of ethnic diversity. Section 7 compares the standard diversity measure constructed from my data with the commonly used measure based on the Atlas Narodov Mira, published by Soviet ethnographers in 1964. In Section 8, I use the data to construct an index of cultural fractionalization that uses a measure of the structural distance between languages to take into account the cultural relationship between ethnic groups in a country. For instance, Belarus and Cyprus have somewhat similar ethnic structures, but the groups in Belarus are culturally much closer than those in Cyprus. Using the structural distance between languages as a proxy for the extent of cultural difference, the cultural fractionalization measure attempts to take such cultural proximity into account.

1.1. Relation to Existing Work

As noted, the best known and most widely used similar effort was carried out by a team of Soviet ethnographers in the early 1960s, and published as Atlas Narodov Mira. Their list of "ethnolinguistic" groups and population figures has been employed by several generations of political scientists, sociologists, and, more recently, economists to produce cross-national estimates of "ethnic fractionalization."2 Most empirical studies concerning the implications of ethnic diversity, such as Easterly and Levine's work on economic growth, have employed this measure. The Soviet team mainly used language to define groups, but sometimes included groups that seem to be distinguished by some notion of race rather than language, and quite often used national origin (e.g., Anglo-Canadians are listed in the United States).

More recently, Ted R.
Gurr and his collaborators have developed a list of "minorities at risk" in 115 countries, along with a remarkable array of variables coding group characteristics, situations, and experiences (Gurr, 1996). This data set allows for the first time large-N research on the correlates of group oppression, protest, and rebellion. Unfortunately, the selection criteria for the sample (the groups must be judged "at risk" in one or more of four ways) render problematic efforts to use the data to draw inferences about these phenomena. The difficulty is the same as that of trying to learn the effect of SAT scores on academic performance by looking only at elite colleges. If we consider only oppressed or disadvantaged groups, we are truncating variation on the independent variable, and thus making it harder to detect a relationship between (say) discrimination and rebellion. This sample selection problem in minorities at risk (MAR) is one of the motivations for the present study.

Alesina et al. (2002) attempt to distinguish between ethnic, linguistic, and religious groups in a sample of about 190 countries, and then use their lists to construct measures of ethnic, linguistic, and religious fractionalization. Although it is not clear how "ethnic" and "linguistic" groups are themselves distinguished (as they allow), the descriptive statistics for their "ethnic" measure look broadly similar to those for the measure



constructed here. Roeder (2002) has made available a series of fractionalization measures for 1961 and 1985 based almost entirely on Soviet ethnographic sources; his measures appear to be closer to those of the Atlas Narodov Mira than to mine, though I have not seen the group list that underlies his estimates.

2. Coding Ethnic Groups

"Primordialists" are said to believe that ethnic groups are fixed, either biologically given entities, or, if they are social conventions, that they are deeply rooted, clearly drawn, and historically rigid conventions. Anyone with primordialist leanings should be quickly disabused of them by undertaking to code "ethnic groups" in many different countries.3 It rapidly becomes clear that one must make all manner of borderline-arbitrary decisions, and that in many cases there is simply no single right answer to the question "What are the ethnic groups in this country?" Constructivist or instrumentalist arguments about the contingent, fuzzy, and situational character of ethnicity seem amply supported.

Take, for example, the United States. What are its ethnic (or racial) groups? Let us make things much easier by restricting attention to groups with at least 1 percent of country population.4 If we consult official census categories, we get three "races" (white, African American, and Asian) and an additional group, Hispanic, which the government emphatically declares is "not a race." Is this the right list for the United States? Why not disaggregate Hispanic into Puerto Rican Americans, Mexican Americans, Cuban Americans, and so on, or likewise for Asian?5 Why not distinguish between Arab Americans, German Americans, Italian Americans, Irish Americans, and so on? And why should we use the current census categories, when earlier censuses formulated categories quite differently (Nobles, 2000)?

Looking farther afield, does Somalia have a single ethnic group (Somalis), or several, corresponding to the major clans? If the latter, at what level in the extremely detailed hierarchical system of clan and subclan do we locate the "ethnic groups" for our list? What about castes in India? What about Berbers in several North African countries, where a large majority of the population could, if they wished, claim Berber descent, but attitudes vary on whether to characterize oneself as "Arab" or "Berber?" What about many Latin American countries, where the lines between "indigenous" and "mestizo," and between "mestizo" and "white," are often vague to the point of being imperceptible or situation dependent? What about Sudan, where one might code oneself as a (black, not Arab) Southerner in one context, but as a Dinka in another?

An explicit definition of "ethnic group" could help with some of these questions, but it is important to see that it could not plausibly solve all of them. For example, any definition of ethnic group that said unambiguously that "Hispanic" is an ethnic group in the United States but "Cuban American" is not is prima facie implausible. The nature of the concept of "ethnic group" is such that there can be multiple ways to specify the set of ethnic groups in a country, all of which include more-or-less equally valid "ethnic groups."

This observation has an important implication for social science research that uses



measures of ethnic diversity to explain outcomes such as economic growth or political violence. If there are multiple plausible ways of listing a country's "ethnic groups," we must be careful that we do not, in effect, choose the coding that best supports our theory, after the fact. Somalia was viewed by the Soviet ethnographers in 1960 as a highly homogeneous nation of ethnic Somalis sharing religion, language, and customs. This was a perfectly plausible coding then and it remains so today. Since the civil war of the 1990s, however, analysts seeking to explain poor prior economic growth or the war itself would be drawn to argue that Somalia is highly fractionalized ethnically along clan lines, and thus a good example of the proposition that ethnic heterogeneity causes poor economic performance and civil strife. Designating Somalia as highly fractionalized is not implausible either, whether for 1960 or 1990.

Or consider Botswana, a case often used to support the argument that "Africa's growth tragedy" is explained by ethnic heterogeneity. With its large Tswana ethnic group, Botswana can be plausibly coded as homogeneous by African standards, and its economy has performed very well. Yet Botswana's ethnic structure is fundamentally similar to Somalia's: the Tswana are divided into eight subtribes that are socially and politically consequential. If for some reason Botswana's economy had done poorly over the last 30 years, and if it had seen significant internal fighting along tribal lines, it would have been viewed ex post as confirmation of the "regularity" that greater ethnic diversity makes for low growth and a risk of civil conflict!

So what can be done? Many of the problematic cases noted above have a common origin: Where to locate the "ethnic group" when there are two groups, and group B is a subset of group A? One approach is to avoid a decision, instead incorporating these set/subset relations in the structure of the data.
That is, we might code multiple "levels" of ethnic groups. In the United States, for example, the census categories would form level 1, a disaggregation by country of origin level 2 (Mexican-American, Vietnamese American, etc.), and so on. Following Scarritt and Mozaffar (1999), I partially build such structure into the group lists for sub-Saharan Africa, where this issue is particularly common and difficult.

But for many purposes, such as producing a cross-national measure of ethnic diversity, we will want a single list of groups for a country. It is not evident that the "levels" would correspond across countries, making it sensible to compute "level 1 fractionalization," "level 2 fractionalization," etc. Moreover, sets and subsets are not the only problem we encounter. Should Mexico be divided between "indigenous" and "mestizo/white," or should "white" be broken out? Or if we are listing hyphenated Americans for the United States, do we include, say, "German Americans," even if this is at best a vague category rather than a group in the sense of a set of people who recognize and feel motivated to act on the basis of this membership?

Implicit in the idea of an ethnic group is the idea that members and non-members recognize the distinction and anticipate that significant actions are or could be conditioned on it. So it is natural and perhaps necessary that the "right list" of ethnic groups for a country depend on what people in the country identify as the most socially relevant ethnic groupings. I adopt this approach for the list discussed below, in principle if not literally in practice.

Ideally, "the right list" that I am seeking would be defined by a standard procedure like the following:



1. Randomly sample a large number of people in the country.
2. Ask each of them to list the major or main ethnic groups in the country.
3. Show them or read a list of many possible formulations of the ethnic groups in the country, and ask them to say of which they consider themselves members.
4. Repeat (3), asking them to say of which groups on the list most other people in the country would consider them to be members.


5. Ask them to try to rank the groups they identified in (3) according to how strongly they identify with the group (e.g., which is "most important to you," or some such language).

Such a survey could be useful for many interesting purposes besides that of constructing a list of ethnic groups by country (for which I would expect to analyze responses to question 2). It could be used to assess the degree of social consensus on what are the country's "ethnic groups," which might not be particularly high in many cases. If taken at multiple points in time, it could be used to study the political or economic determinants of "situational ethnicity," factors that lead people to see this-or-that element of their "identity repertoire" as more or less important at different times (Laitin, 1998; Posner, forthcoming). Rankings by importance in question 5 could allow a more subtle and nuanced mapping of levels of ethnic identity and possibilities for reformulation and coalitions. The differences between answers to questions 3 and 4 could allow an inquiry into gaps between subjective understandings and objective assessments of ethnicity (e.g., many white Americans might identify "Asian" as a race, but how often would "Asians" self-identify this way?).

Without survey data of this sort, we are forced to review existing lists and secondary sources to apply this standard as best we can. The main sources employed are discussed in Section 4.

Before proceeding, I stress two points that follow from the observation that what the ethnic groups in a country are depends on what the people in the country think they are at a given time. First, it cannot be assumed, without argument, that ethnic distinctions are wholly exogenous to other political, economic, and social variables of interest. For example, poor economic performance could exacerbate distributional struggles, causing people to see and act along lines of ethnic division that were formerly considered unimportant. By contrast, robust economic growth might lead to the downplaying of
ethnic divisions and a greater emphasis on national identity. If Botswana seems more ethnically homogeneous than Somalia does at this point, it may be that this is in part a result rather than a cause of economic growth. Likewise, many examples, such as Somalia, show that political violence can lead or force people to identify more strongly along ethnic lines that formerly were less salient (Kaufmann, 1996; Fearon and Laitin, 2000b). This may be an argument for using a list of ethnic groups constructed in 1960, such as the Atlas Narodov Mira, to study subsequent economic growth or political conflict.6

Second, we cannot use the list to ask empirically why some possible ethnic groups



become actual ethnic groups at a given time, or why ethnic as opposed to other political cleavages develop. We might want to know why possible ethnic groups such as German-American or Scots-Irish do not have the same social and political salience that white and black do in the United States at present, or why Kenyan national electoral coalitions are structured by divisions among Kikuyu, Luo, Kamba, Kalenjins, etc., rather than between men and women, or rich and poor.7 Obviously, if a criterion for inclusion in the list is that people in the country in question see the category as an ethnic group, then we do not have a list (or sample) of all hypothetically possible ethnic (or other) groups. Nonetheless, a list of "actual" or existing ethnic groups would be a prerequisite for such a study. The trick would be constructing the list of "potential" ethnic (or other) groups. Since it is not clear that the population of "all possible ethnic groups in a country" is well-defined, even in theory, some sort of case-control approach would be necessary.8

3. Ethnicity

I argued in Section 2 that no plausible definition of "ethnic group" will by itself imply a unique list of groups for a country. Still, a definition would be useful to bound the phenomenon we are trying to capture, and to address questions like the following. Are Protestants and Catholics in Northern Ireland, or Bosnian Serbs and Muslims, to be included if the only significant cultural difference is religion?

Standard definitions of "ethnic group" in terms of a shared belief of common ancestry and/or shared cultural features are problematic (Fearon and Laitin, 2000a). It is almost always possible to give examples of groups that fit the definition taken literally, but that are not intuitively "ethnic," or of groups that do not fit the definition but that are often described as "ethnic."

Fearon and Laitin (2000a) attempt to deal with this problem by examining the implicit rules that people (or at least English speakers) use to decide which groups are "ethnic" in everyday talk.
They argue that in common speech a group may be designated as "ethnic" if the group is larger than a family and membership in the group is reckoned primarily by a descent rule. These are the core criteria, although the concept may be further restricted to rule out cases such as castes and nobility, or groups that are "legislated" into existence (i.e., have no "naturalized history" as a group). It is worth noting that shared cultural features seem to play no necessary role in whether a group can be described as "ethnic" in everyday talk. For example, "Jews" are often described as an ethnic group despite lacking a common language, universally shared customs, or even common religious practice (since non-believers are typically included in the group, and it is contested whether conversion can make one ethnically Jewish). Somali clans are frequently referred to as "ethnic" formations, even though their members do not see the clans as culturally distinct in any significant respect.

The results of the ordinary language analysis also help explain when groups sharing a common religion will be considered "ethnic": namely, when membership in the group is reckoned primarily by descent rather than by public confession of faith. In Bosnia, thugs distinguished between Serbs and Muslims on the basis of local knowledge and records concerning descent, not by tests of religious faith or practice. In the United States, one can



make oneself Protestant or Catholic by adopting the appropriate religious practices and beliefs, something that is hard or impossible to do in Northern Ireland.

Another approach to definition, in several ways more useful for the purpose of constructing a list by countries, is to employ the idea of "radial categories" advanced by linguists and cognitive scientists (Lakoff, 1987; see Collier and Mahon, 1993, for a discussion with respect to political science). In practice, people may understand the meaning of a concept X by reference to prototypical cases. Less prototypical cases may not share all the features of a prototype, and yet still be validly classed as Xs, at least in some circumstances. For example, the prototypical ethnic group has the following features:

1. Membership in the group is reckoned primarily by descent by both members and non-members.
2. Members are conscious of group membership and view it as normatively and psychologically important to them.
3. Members share some distinguishing cultural features, such as common language, religion, and customs.
4. These cultural features are held to be valuable by a large majority of members of the group.

5. The group has a homeland, or at least "remembers" one.
6. The group has a shared and collectively represented history as a group. Further, this history is not wholly manufactured, but has some basis in fact.
7. The group is potentially "stand alone" in a conceptual sense; that is, it is not a caste or caste-like group (e.g., European nobility or commoners).

The term "radial" comes from the observation that by taking away one or more of these features, one may get types of "ethnic groups" that are not prototypical but nonetheless are often seen as ethnic groups. For example:

Take away 2, 4, and 6 (and possibly others except 1), and you get an ethnic category rather than an ethnic group. The extent or degree to which these conditions apply might be said to determine the "groupness" of a group (Brubaker, 2002).

Take away 5 (and possibly others except 1) and you get some nomadic ethnic groups, such as the Roma.

Take away 7 and you get castes in South Asia, or noble/commoner distinctions in Europe.
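The "take away a feature" logic of a radial category can be illustrated with a minimal sketch in Python. It is only an illustration: the function name classify, the labels it returns, and the order in which the rules are checked are assumptions introduced here, and actual cases are matters of degree rather than of features being strictly present or absent.

```python
# Hypothetical sketch: features are numbered 1-7 as in the list of
# prototype features; the labels and rule order are illustrative assumptions.
PROTOTYPE = frozenset(range(1, 8))


def classify(features):
    """Return an illustrative label for a group given the set of
    prototype features (integers 1-7) that it exhibits."""
    present = frozenset(features)
    if not present <= PROTOTYPE:
        raise ValueError("features must be integers from 1 to 7")
    missing = PROTOTYPE - present

    if 1 in missing:
        # Descent-based membership is the one feature never taken away.
        return "outside the radial category"
    if not missing:
        return "prototypical ethnic group"
    if 7 in missing:
        # Not conceptually "stand alone": castes, noble/commoner distinctions.
        return "caste-like group"
    if {2, 4, 6} <= missing:
        # No shared consciousness, valuation of culture, or shared history.
        return "ethnic category"
    if 5 in missing:
        # No homeland: some nomadic groups.
        return "ethnic group without a homeland"
    return "other non-prototypical ethnic group"


if __name__ == "__main__":
    print(classify({1, 2, 3, 4, 5, 6, 7}))  # all seven features present
    print(classify({1, 3, 5, 7}))           # features 2, 4, and 6 taken away
    print(classify({1, 2, 3, 4, 6, 7}))     # feature 5 taken away
    print(classify({1, 2, 3, 4, 5, 6}))     # feature 7 taken away
```

Checking feature 7 before the others is a design choice made for the sketch; a group missing several features at once could reasonably be labeled differently.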



In assembling the list discussed below, I am looking for groups that meet the "prototype" conditions as much as possible. This implies that I allow groups distinguished from others in the same country primarily by religion provided that they meet condition 1 (membership has a strong descent basis) and condition 2 (self-consciousness as group). It also implies that I do not count castes in South Asia as ethnic groups, even though I readily admit that they share an important "family resemblance" to ethnic groups through the descent criterion, and could be validly considered as ethnic groups in some research designs (Horowitz, 1985; Chandra, 2000).

I believe that a large majority of the groups in the list discussed below meet the conditions for a "prototypical" group fairly well, although for a number of cases, especially in Asia and Africa, the extent to which 2, 4, and 6 are met is unclear. These continents have many groups that are identified by some language commonality, which in most cases does mark some cultural similarity. But the extent of their "groupness," or sense of common identity (conditions 2, 4, and 6) is not clear from the sources I have been able to consult.



Working with Alex Rosas, Christina Maimone, and Atsuko Suga, I used the CIA's World Factbook online for a "first pass." The Factbook's numbers and designations were then compared with those in Encyclopedia Britannica (EB) and, when possible, the relevant Library of Congress Country Study (LCCS). Significant discrepancies between these sources prompted an investigation using country-specific sources. For a number of countries, and particularly for Latin America, LCCS provides a nuanced discussion of the nature of ethnic identity. These were often used to modify the Factbook's listing. For example, for choices about whether to code "whites" separate from "mestizos" in Latin America I followed LCCS when possible.

I also compared the Factbook, EB, and LCCS groups and numbers with the minority groups listed in the Minorities at Risk data set. Although MAR does not purport to cover all ethnic minorities in a country, it has the advantage of including groups that are almost all "mobilized" or have a non-trivial level of "groupness." In a few cases I included a group they identified but which does not appear in the Factbook.

The Factbook generally does not list the large non-citizen populations that inhabit many Western European countries and many of the Persian Gulf states. Excluding these seems hard to defend if we want a list of ethnic groups in a country at a given time: would a country with 50 percent white citizens and 50 percent black noncitizens be properly regarded as ethnically homogeneous? For information on noncitizens, I consulted recent census figures for OECD countries, and a variety of web sources both for these and for the Gulf states.9

The sub-Saharan African countries pose special problems. In general they are remarkably ethnically diverse, and Africans often manifest their multiple ascriptive affiliations in highly complex, situation-dependent ways. At the time of access, the Factbook was unusable for much of the continent, providing either uninformative or superficial breakdowns (e.g., Bantu/Nilotic, or a statement about the total number of ethnic



groups in the country). Fortunately, Scarritt and Mozaffar (1999) have carefully constructed a list of over 300 "ethnopolitical" groups in 48 African countries. Working from Morrison et al. (1989) and a large number of country-specific accounts, Scarritt and Mozaffar sought to list ethnic groups with "contemporary or past political relevance" at the national level. For my purposes, an important advantage of their data is that they required country-specific evidence on the shared awareness and political significance of an ethnic category in order to include it, so that these are more likely to be "real" groups.

A disadvantage is that for my purposes political significance is too restrictive. For example, for Burkina Faso Scarritt and Mozaffar list only the Mossi, at 50 percent of the population, because they found no evidence that the other ethnic groups had any "political relevance" at the national level (the other groups are excluded by the Mossi). So we returned to Scarritt and Mozaffar's main source, Donald Morrison et al.'s Black Africa: A Comparative Handbook, and tried to identify those groups greater than 1 percent of country population that were omitted under the political-relevance rule. We reconsidered all countries for which the sum of the group percentages in Scarritt and Mozaffar's list was less than 95.10 Parallel to the process for the rest of the world's countries, we took Morrison (1989) as our base, and then compared Morrison's list with those provided by the Summer Institute of Linguistics' Ethnologue, and Levinson (1998). Significant discrepancies were resolved by resort to country-specific sources.11 For a number of these countries (for example, Chad, Congo-Brazzaville, the Democratic Republic of Congo, and Liberia) I do not have great confidence that all of the groups listed accurately reflect how people in the country mentally divide the social terrain in ethnic terms. The
sources used overwhelmingly identify groups by shared languages, but there are often so many closely related languages/dialects that it is difficult to know where perceptions of groupness attach most strongly.

An innovative feature of Scarritt and Mozaffar's (1999) data is that they code groups at three levels of aggregation which they term "national dichotomy," "middle-level of aggregation," and "lower level of aggregation." The first refers to situations where "Virtually the entire population ... is at least fairly intensely politicized as part of one side or the other of a long-standing national ethnopolitical dichotomy" (89). Hutu/Tutsi in Rwanda and Burundi, and Mainlanders/Zanzibaris in Tanzania, are examples of this coding. The "middle level" lists ethnopolitical groups and in some cases coalitions of groups that act together politically, but which do not necessarily partition the whole population. "Lower level" groups are broken out under middle-level groups in some cases, where there is a "significant cleavage within middle-level ethnopolitical groups" (90).

Above, I noted that a major obstacle to listing a country's "ethnic groups" is that people commonly have multiple ascriptive attachments organized in set/subset relationships: Hispanic/Mexican-American, for example, or (black) Southerner/Nuer in Sudan. One way to deal with this issue is simply to incorporate it in the structure of the data, coding groups at different levels. Although Scarritt and Mozaffar's three levels are not motivated by this same observation about the structure of ethnic attachments, in practice the codings for their three levels tend to reflect the set/subset phenomenon noted here. For instance, Scarritt and Mozaffar code Kalenjins and Luhyas in Kenya at the middle level of aggregation, but also list a number of Kalenjin and Luhya tribes at the lower level.
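What coding groups at different levels implies for a summary measure can be seen in a small sketch in Python. It assumes the conventional fractionalization formula, one minus the sum of squared population shares (the chance that two randomly drawn individuals belong to different groups). The country, the group names, and the shares are hypothetical and are not taken from the data set described here.

```python
from collections import defaultdict

# Hypothetical country: each record is
# (lower-level group, middle-level group it belongs to, population share).
# Groups with no subdivision are their own middle-level group.
GROUPS = [
    ("A1", "A", 0.30),
    ("A2", "A", 0.20),
    ("B", "B", 0.40),
    ("C", "C", 0.10),
]


def shares_at_level(groups, level):
    """Sum population shares by group name at the chosen level
    ("lower" or "middle")."""
    if level not in ("lower", "middle"):
        raise ValueError("level must be 'lower' or 'middle'")
    totals = defaultdict(float)
    for lower, middle, share in groups:
        key = lower if level == "lower" else middle
        totals[key] += share
    return dict(totals)


def fractionalization(shares):
    """One minus the sum of squared shares (conventional formula)."""
    return 1.0 - sum(s * s for s in shares)


lower = shares_at_level(GROUPS, "lower")
middle = shares_at_level(GROUPS, "middle")
frac_lower = fractionalization(lower.values())    # about 0.70
frac_middle = fractionalization(middle.values())  # about 0.58

if __name__ == "__main__":
    print(round(frac_lower, 2), round(frac_middle, 2))
```

In this made-up case the same country looks noticeably more diverse when subgroups A1 and A2 are counted separately, which is why the choice of level has to be made on grounds other than the result it produces.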



In the raw data used to generate the ethnic group list examined below I have preserved and in some cases added to Scarritt and Mozaffar's scheme of three levels of aggregation. In future work I would like to rationalize and extend this approach to the rest of the world's countries. Such data would provide a richer and more accurate rendering of the organization of ethnicity across countries. For present purposes, however, I have gone through the sub-Saharan countries and selected the level of aggregation that produces a list of groups that, in the mid-1990s, are judged to be collectively closest to the "prototypical case," as assessed by additional country-specific research. This task is made less subjective than it may sound by the facts that (1) Scarritt and Mozaffar code only 12 cases of "national dichotomies" and these are mainly obvious cases, and (2) in many cases there are virtually no "lower" level groups listed. But there are certainly some difficult countries here, such as Somalia (should ethnic groups be measured at the subclan level or just Hawiye, Issaq, Darod, etc.?).

5. Descriptive Statistics

The list of ethnic groups resulting from the procedures described above has 822 groups in the 160 countries that had over half a million in population in 1990.12 Table 1 provides descriptive statistics for the sample as a whole and for six cultural regions.

Considering the sample as a whole, we find that the "average country" has about five ethnic groups that are larger than 1 percent of the population, with half of the world's countries having between three and six such groups (this is the interquartile range). Tanzania, with 22 groups, tops the list, while Papua New Guinea (PNG), with zero, is the
Table 1. Descriptive statistics on ethnic groups larger than 1 percent of country population, by region.

                                  World    West(a)  NA/ME    LA/Ca    Asia     EE/FSU   SSA(b)
No. of countries                  160      21       19       23       23       31       43
  Share of total                           0.13     0.12     0.14     0.14     0.19     0.27
No. of groups                     822      68       70       84       108      141      351
  Share of total                           0.08     0.09     0.10     0.13     0.17     0.43
Groups/country                    5.14     3.24     3.68     3.65     4.70     4.55     8.16
Std. Dev.                         3.51     2.1      1.95     1.03     3.28     2.11     4.45
Max. no. of groups                22(c)    9        9        6        13       12       22
Min. no. of groups                0(d)     1        1        2        0        1        2
Avg. pop. share of largest group  0.65     0.85     0.68     0.69     0.72     0.73     0.41
Avg. pop. share of 2nd largest    0.17     0.09     0.19     0.21     0.16     0.15     0.20
Countries with a group > 50%      0.71     1.00     0.84     0.78     0.78     0.90     0.28
Countries with a group > 90%      0.21     0.62     0.21     0.17     0.22     0.19     0.02

Notes: (a) Includes Australia, New Zealand, and Japan; (b) includes Sudan; (c) Tanzania; (d) Papua New Guinea is coded as having no ethnic groups that meet the 1 percent threshold.



somewhat anomalous minimum. How can a country have zero ethnic groups? Recall that I am coding only ethnic groups that make up over 1 percent of country population. The sources I have consulted are consistent in characterizing the primary ethnic units of PNG as extremely small. The US State Department's Background Notes for PNG are indicative of what the anthropologists say as well:

"The indigenous population of PNG is one of the most heterogeneous in the world. PNG has several thousand separate communities, most with only a few hundred people. . . The isolation created by the mountainous terrain is so great that some groups, until recently, were unaware of the existence of neighboring groups only a few kilometers away."13

While broad classifications, such as Papuans/Melanesians, or Highlanders/Sepik Valley/etc., are sometimes mentioned, there is general agreement that PNG citizens' primary ethnic attachments are to these very small groups, which are almost always differentiated by language. By the ethnic fractionalization measure discussed in the next section, PNG approximates a perfectly fractionalized state.

Returning to Table 1, we see that about 70 percent of the countries in the world have an ethnic group that forms an absolute majority of the population, although the average share of such groups is only 65 percent of population and only 21 percent of countries are "homogeneous" in the weak sense of having a group that claims 9 out of 10 residents. The average size of the second largest ethnic group, or largest minority, is surprisingly large, at 17 percent. This is not due to the influence of a single highly diverse region, such as sub-Saharan Africa. Seventeen percent is close to the average size of the largest minority in every region except the West, where the largest minorities tend to be smaller (and the majority ethnic group larger).

Turning to regional variation, what is most striking is how much more ethnically divided are the sub-Saharan African countries. With 351 groups coded, Africa accounts for about
are thesubSaharanAfrican of than1 percent of but of groups(larger quarter all countries 43 percent theworld'sethnic While therestof theworld'sregionsaveragebetween3.2 and 4.7 groupsper population). share thaneight.The averagepopulation countries' the averageis greater country, African in less thana majority, sharp is in thesecountries 42 percent, ethnicgroup of thelargest ' to contrast all otherregions. SubSaharan Africahas only one 'highlyhomogenous" have an ethnic and majority. (Rwanda,withHutusat 90 percent), less thana third country is feature the regionalstatistics how small are the aggregate of A second interesting of betweenthecountries North differences Africa/Middle East, LatinAmerica/Caribbean, Soviet Union. The Westerncountriesare somewhat Asia, and EasternEurope/Former morediverse countries considerably are the morehomogeneous, and,as noted, subSaharan ethnic on average.But therestof theworld'sregionsshowbroadlysimilar demographies. and of The averagenumber groupspercountry theaveragesize of thetop twogroupsare thatare "homogeneous" or thathave an of all quite similar.The percentage countries Eastern in similar thisset (although are also fairly ethnic Europehas a somewhat majority and small number of ethnicmajorities "homogeneous" countries). higher proportion have that theseregions does notimply in Of course,similarity broadethnic demography or similarethnicpolitics,interethnic relations, economic or politicaloutcomes.To the



contrary, we know that they do not. Although hardly definitive, these data suggest that scholars who want to explain differences in political or economic outcomes by reference to cross-national differences in ethnic demography may face an uphill task.

6. Ethnic Structures

In cross-national studies of economic growth, political violence, and other outcomes in political economy, analysts most often use ethnic fractionalization as a measure of ethnic diversity. This is defined as the probability that two randomly selected individuals in a country will be from different ethnic groups. But many hypotheses and arguments in the literature refer not just to measures of ethnic diversity like this one, but to more fine-grained conceptualizations of ethnic structure. For example, Horowitz (1985) and others say that ethnic conflict is more likely in countries with an ethnic majority and a large ethnic minority, as opposed to homogenous or highly heterogeneous countries. Reilly (2000/01) observes that fractionalization is ill-suited to capture different structures of ethnic cleavages - for instance, highly fragmented (PNG), bipolar (Cyprus), multipolar and balanced (Bosnia), dominant majority (Sri Lanka), or dominant minority (Burundi).

A simple way to use these data to get a sense of how ethnic structures vary around the world is to graph the population share of the second largest group (the largest minority)

Figure 1. Ethnic structures by region.



against the share of the largest group (the plurality group). This is undertaken in Figure 1 for each region, using an abbreviation of the country name as the plotting symbol. If the share of the largest group is p, then by definition the share of the second largest is no greater than the smaller of p and 1 - p. Thus the points in these graphs necessarily fall within a triangle with vertices at (0, 0), (0.5, 0.5), and (1, 0). Countries located near the (1, 0) vertex have a large ethnic majority and are thus relatively homogeneous (e.g., Tunisia). Countries located nearer to (0, 0) are highly fragmented (e.g., Tanzania and Uganda; PNG would be approximately at (0, 0)). A country near to (0.5, 0.5) is roughly "bipolar," with two large ethnic groups dividing most of the population (e.g., Fiji). Finally, a country located near the middle of the x-axis has a single plurality group and a highly fragmented set of ethnic minorities (e.g., India).14

Figure 1 illustrates more dramatically how different are "ethnic structures" in subSaharan Africa from those in other regions. Whereas countries with no ethnic majority are fairly rare in the rest of the world, this is the norm in Africa. Most African countries cluster on the left side of the triangle, around a point that implies a plurality group of about 22 percent, with the second largest slightly less than this. The figure also shows considerable variation in ethnic structures within Africa. Rwanda, Burundi, Lesotho, Swaziland, and Zimbabwe have a large majority group and a minority that makes up almost all of the rest of the population. Botswana is coded as having a large majority (the Tswana) and a set of smaller minorities.15 Mauritania and Djibouti are fairly evenly divided between two large groups,16 while there is a set of countries (e.g., Mali, Burkina Faso, and Namibia) that has a relatively large plurality group with the rest of the population divided among quite small groups.
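The plotting coordinates used in Figure 1 can be sketched in a few lines. This is a minimal illustration, not the author's code: the function names `structure_point` and `in_triangle` and the example shares are hypothetical, chosen only to show why every country must fall inside the triangle with vertices (0, 0), (0.5, 0.5), and (1, 0).

```python
def structure_point(shares):
    """Return (largest share, second largest share) for one country.

    `shares` is a list of group population shares; a country with a
    single coded group gets a second-largest share of 0.
    """
    ordered = sorted(shares, reverse=True)
    largest = ordered[0]
    second = ordered[1] if len(ordered) > 1 else 0.0
    return largest, second


def in_triangle(largest, second, tol=1e-9):
    """True if the point lies in the triangle (0,0), (0.5,0.5), (1,0).

    The second largest share can exceed neither the largest share
    nor what is left over once the largest group is removed.
    """
    return 0.0 <= second <= min(largest, 1.0 - largest) + tol


# Hypothetical shares, for illustration only (not taken from the data set).
examples = {
    "bare majority plus fragments": [0.51, 0.24, 0.05, 0.05, 0.05],
    "roughly bipolar": [0.50, 0.45, 0.05],
    "highly fragmented": [0.10, 0.09, 0.08, 0.07],
}
for label, shares in examples.items():
    x, y = structure_point(shares)
    print(label, (x, y), in_triangle(x, y))
```

A point such as (0.8, 0.3) fails the check, since only 20 percent of the population remains once the largest group is removed.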
Outside of Africa, the figure shows the West and Eastern Europe/FSU as the regions with the largest clusters of relatively homogeneous states. Among the less homogeneous countries in the West, the largest ethnic minority tends to make up about half of the population outside of the majority group. This is true for EE/FSU as well, but these countries show much more variation around this pattern. The USSR had and Kyrgyzstan has a bare majority group and a large number of small ethnic minorities; the Baltic states are approximately bipolar with a moderate sized majority (i.e., the titular) group; Kazakhstan is not typical of subSaharan countries, and former Yugoslavia had a structure not too far from an evenly balanced bipolarity.

Latin America and the Caribbean are notable for the high proportion of the countries that are approximately partitioned between a majority group and a single minority group, usually "mestizos" (or "whites") and "indigenous peoples." "Indigenous peoples" is of course a catch-all, often combining groups that were historically divided among many smaller tribes speaking diverse languages. A long history of assimilation and the numerical and political dominance of the settler populations has blurred these distinctions in many of these countries and made the common-sense ethnic categories "indigenous" versus "white/mestizo." Exceptions are Guatemala and the Andean countries Bolivia, Peru, and Ecuador, which are coded as having large indigenous populations, along with noteworthy distinctions between whites and mestizos (in the Andean countries). For Bolivia, the sources suggested a distinction between Quechua and Aymara speaking indigenous peoples. Along with Trinidad and Tobago, Guatemala, Ecuador, and Peru look approximately bipolar by this rendering. Structurally similar to Bosnia, Bolivia is divided



between three (if whites and mestizos are combined) or four fairly equal sized groups. The coding of Colombia toward the middle of the triangle depends on distinguishing between white and mestizo. In the cultural diversity measure discussed below, Colombia will look much more homogeneous.

Finally, Asia and North Africa/Middle East show similar patterns of ethnic structure. Both regions' countries mostly have ethnic majorities, but in both there are a number with a sometimes slim majority that faces a large number of small ethnic groups. For Asia this often reflects a configuration of a large lowland majority that is ringed or edged by more fragmented mountain peoples (Burma, Vietnam, Laos, Thailand, Pakistan (slightly)). For the Middle East, it reflects the political economy of oil production in the Persian Gulf. Saudi Arabia, Bahrain, United Arab Emirates, Oman, and Kuwait have ethnically homogeneous groups of citizens who are either a bare majority or a mere plurality; the rest of the population is typically made up of ethnically diverse noncitizen workers. Iran comes by this structure more honestly, as it were, with a bare majority of Persians, 24 percent Azeris, and seven other quite small groups. North Africa/Middle East is also notable for the number of countries that, by my codings, are almost strictly divided by two ethnic or ethnoreligious groups:17 Arabs and Berbers in Morocco, Libya, Algeria, and Tunisia; Muslims and Copts in Egypt; Turks and Kurds in Turkey; Greeks and Turks in Cyprus; and Palestinians and TransJordan Arabs in Jordan.

7. Ethnic Fractionalization

The most commonly employed aggregate measure of ethnic diversity is fractionalization, defined as the probability that two individuals selected at random from a country will be from different ethnic groups. If the population shares of the ethnic groups in a country are denoted $p_1, p_2, p_3, \ldots, p_n$, then fractionalization is $F = 1 - \sum_{i=1}^{n} p_i^2$. Table 2 gives a few examples of how the measure works.

In line with the discussion about ethnic structures above, notice that the fractionalization scores for countries E and F are not that different, even though one might expect their ethnic politics to differ markedly given that there is an absolute majority in F but not in E. As a continuous measure, F is not sensitive to discontinuities
Table 2. Fractionalization examples.

Country  Structure                        F
A        Perfectly homogeneous            0
B        2 groups (0.95, 0.05)            0.10
C        2 groups (0.8, 0.2)              0.32
D        2 groups (0.5, 0.5)              0.50
E        3 groups (0.33, 0.33, 0.33)      0.67
F        3 groups (0.55, 0.30, 0.15)      0.59
G        3 groups (0.75, 0.20, 0.05)      0.40
H        (0.48, 0.01, 0.01, ...)          0.76
I        (0.25, 0.25, 0.25, 0.25)         0.75
J        n groups, (1/n, 1/n, ...)        1 - (1/n)
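The entries in Table 2 follow directly from the definition of F as one minus the sum of squared population shares. The sketch below recomputes several of them; the function name `fractionalization` is an illustrative choice, and the reading of country H as one group of 0.48 plus 52 groups of 0.01 is an assumption made so that the shares sum to one.

```python
def fractionalization(shares):
    """F = 1 - sum of squared shares: the probability that two people
    drawn at random from the country belong to different groups."""
    return 1.0 - sum(p * p for p in shares)


# Structures taken from Table 2. Country H is assumed to be one group
# at 0.48 plus 52 groups at 0.01 each (an assumption; the table only
# shows "0.48, 0.01, 0.01, ...").
table2 = {
    "A": [1.0],
    "B": [0.95, 0.05],
    "C": [0.8, 0.2],
    "D": [0.5, 0.5],
    "F": [0.55, 0.30, 0.15],
    "G": [0.75, 0.20, 0.05],
    "H": [0.48] + [0.01] * 52,
    "I": [0.25] * 4,
}

for country, shares in table2.items():
    print(country, fractionalization(shares))
```

The unrounded values agree with the table to two decimal places. Countries H and I illustrate the point made in the text: a country with a near-majority group and many tiny ones scores almost the same as one split evenly among four groups.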



Figure 2. The filled circles denote median values. The ends of the boxes mark the 25-th and 75-th percentiles, and the "whiskers" mark the smallest and largest observations within 1.5 times the interquartile range of the 25-th and 75-th percentiles, respectively. This is the minimum and maximum in all cases except for subSaharan Africa, where the four "outliers" are Rwanda, Lesotho, Swaziland, and Burundi, in order of increasing fractionalization.

related to the idea of majority rule. The comparison between countries H and I makes a different point. As a one-dimensional measure, F cannot fully capture differences in ethnic structures that may seem intuitively significant.

Still, as an index of overall ethnic diversity F has much to recommend it. It has a natural intuitive interpretation. It is far superior to the number of ethnic groups because it takes account of population shares. It encodes more information than would using the population share of the largest group (though these measures are quite close). And its empirical distribution, summarized in Figure 2, is not highly skewed.18 The average value of 0.48 for all countries implies that if one were to select a country at random, then randomly select two people from it, there is about a 50-50 chance that they would come from different ethnic groups. The Appendix lists the fractionalization score computed using these data for each of the 160 countries in the sample.

For each region, Figure 3 plots F as measured using the Atlas Narodov Mira against F computed using the data discussed above. The agreement between the two codings is quite high, except in North Africa/Middle East and Latin America/Caribbean, where my constructions show considerably more diversity for a number of countries. The bivariate



Figure 3. Two measures of ethnic fractionalization.

correlation for the whole sample, which consists of only 135 states because of new countries (mainly in the FSU) not coded by the Soviet ethnographers, is 0.76.

Different conceptions of ethnicity explain some differences between the two measures. The Soviet geographers code ethnolinguistic groups, adopting the common Eastern European assumption that native language marks ethnicity. As discussed above, I allow for other cultural criteria distinguishing groups, provided that the groups are locally understood (primarily) as descent groups and are locally viewed as socially or politically consequential. In Eastern Europe/FSU and to a slightly lesser extent in the West, language does indeed tend to mark ethnicity in "my" sense, so the correlation between the two measures is nearly perfect. In Latin America, however, Soviet ethnographers code all Spanish speakers as one ethnolinguistic group, and tend to break out the "indigenous peoples" by tribal language. On net, this makes for considerably greater homogeneity by their measure for this region. This consideration also explains a number of prominent outliers in other regions. The Soviets code Burundi as ethnically homogenous, since both Hutus and Tutsis speak Kirundi!19 The common languages of Somali and Malagasy make Somalia and Madagascar appear nearly perfectly homogenous in the Soviet coding (which could be argued as plausible in each case). They draw no distinction between White and Black Moors in Mauritania because both speak Arabic. In an exception to their normal practice, they code PNG by racial categories (Papuans and Melanesians) rather than by



language groups, which leads to a much less fractionalized estimate for PNG in their data.20

Several countries in the Middle East are coded quite differently by the two measures for this same reason. I distinguish between Palestinians and TransJordan Arabs in Jordan whereas the Soviet team sees them all as Arabs because they all speak Arabic. Likewise, I code the ethnoreligious groups in Lebanon whereas the Soviet team sees this country as almost all "Arab;" and similarly for Alawi and Christians in Syria, and Sunni and Shia Arabs in Iraq. But there is another reason for the low correlation (0.22) between the two measures in this region. I code the large noncitizen populations in the Gulf states, who comprised much smaller groups in these countries in the early 1960s (it appears that the Soviet team did try to include them). Ironically, the states that show by far the greatest increase in ethnic diversity due to "globalization" over the last 40 years are the Gulf monarchies.21

8. A Measure of Cultural Diversity

With a fractionalization score of 0.37, Belarus happens to fall at the 38-th percentile on ethnic diversity from a cross-national perspective. This reflects a division between a Byelorussian majority group (78 percent), and Russians (13 percent), Poles (4 percent), and Ukrainians (3 percent). Cyprus, coded as 78 percent Greek and 18 percent Turkish, is assessed as practically the same as Belarus in terms of ethnic fractionalization, at 0.36.

If one has a theory that says that ethnic diversity matters because ethnic differences make it harder for people to cooperate and coordinate, then one might be interested in the cultural distance between ethnic groups rather than just some notion of fractionalization. Intuitively, even though their F scores are about the same, Belarus is much less culturally divided than Cyprus. Byelorussians, Russians, Ukrainians, and Poles speak Slavic languages, are quite similar in terms of religion, and share many of the same customs. By contrast, Greeks and Turks speak languages that come from completely different families (Indo-European and Altaic), subscribe to two different world religions (Orthodox Christianity and Islam), and have very different customs. In this section I construct a measure of cultural distance that modifies fractionalization so as to take some account of cultural distances between groups.

To continue the above example, by this "cultural fractionalization" measure, Belarus moves down to 0.23, about the 40th percentile on cultural fractionalization, while Cyprus stays at 0.36, which is now at the 56th percentile of the new measure. The Appendix lists the measure of cultural fractionalization, along with each country's rank within its region on this score, to facilitate comparison with its rank on ethnic fractionalization.

Linguists classify languages and represent the structural relationships between them with the help of tree diagrams. Laitin (2000) and Fearon and Laitin (1999, 2000) propose using the distance between the "tree branches" of two languages as a measure, albeit a noisy one, of the cultural distance between groups that speak them as a first language. For example, Greek and Turkish diverge at the first branch or level, since they come from structurally unrelated language families. By contrast, Byelorussian, Russian, and Ukrainian share their



first three classifications as Indo-European, Slavic, East Branch languages. Polish shares only the first two levels with these, since it is Indo-European, Slavic, West Branch. The idea is that the number of common classifications in the language tree can be used as a measure of cultural proximity.22

For two ethnic groups i and j, consider defining a resemblance factor $r_{ij}$ (Greenberg, 1956) that works as follows. $r_{ij}$ is zero when the two groups' languages come from completely different families (like Indo-European and Altaic), and $r_{ij}$ is 1 when the two groups speak exactly the same language. In between, we let $r_{ij}$ be some increasing function of the number of shared classifications between i's and j's languages. Since early divergence in a language tree probably signifies much more cultural difference on average than later divergence, the function should be concave as well. (For example, coming from structurally unrelated families, such as Bantu and Indo-European, denotes more cultural difference on average than does the difference between, say, Slavic East Branch and Slavic West Branch.)23

To construct a measure of "cultural fractionalization" analogous to the ethnic fractionalization measure F discussed above, consider drawing two people at random from a country and then computing their expected cultural resemblance, using $r_{ij}$ as defined above. In a country with one language group or a set of ethnic groups that all speak highly similar languages, the expected resemblance will be close to 1. In a country with a large number of groups that speak structurally unrelated languages, the expected resemblance will be closer to zero. To get a fractionalization measure analogous to ethnic fractionalization, simply subtract expected cultural resemblance from 1.24 If the groups in the country speak structurally unrelated languages, their cultural fractionalization index will be the same as the ethnic fractionalization index F.
The more similar are the languages spoken by the different ethnic groups, the more will the cultural measure be reduced below the value of F for the country.25

Using the linguistic classifications given by Grimes and Grimes (1996), I calculated cultural fractionalization as defined above; call it C. As shown in Table 3, its average value of 0.31 is much smaller than the average value of ethnic fractionalization (0.48 when computed using my data). This indicates that taking linguistic similarities into account has a large effect for a significant number of countries. Even so, C is correlated fairly strongly
Table 3. Cultural versus ethnic fractionalization.

Region    N     C     F     ELF   N_ELF
World     160   0.31  0.48  0.43  129
West      21    0.19  0.24  0.22  21
LA/Ca     23    0.19  0.41  0.27  22
NA/ME     19    0.29  0.45  0.25  17
EE/FSU    31    0.30  0.41  0.36  7
Asia      23    0.33  0.44  0.53  20
SSA       43    0.43  0.71  0.65  42

Notes: C = Avg. cultural fractionalization, F = Avg. ethnic fractionalization using my data, ELF = Avg. ethnolinguistic fractionalization using the Soviet Atlas data, N_ELF = the size of the sample available from the Soviet data.
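The construction of C described in Section 8 (one minus the expected resemblance of two randomly drawn people) can be sketched as follows. The text requires only that the resemblance factor be 0 for unrelated languages, 1 for the same language, and increasing and concave in between; the specific form used here, the square root of shared levels divided by an assumed maximum of 15 levels, is an assumption for illustration, as are the names `resemblance` and `cultural_fractionalization`. Group shares and shared classification counts for Belarus and Cyprus are taken from the text.

```python
MAX_LEVELS = 15  # assumed maximum number of classification levels (hypothetical)
ALPHA = 0.5      # assumed exponent; any value in (0, 1) gives a concave function


def resemblance(shared_levels):
    """Resemblance between two different languages: 0 when no
    classifications are shared, rising concavely with shared levels."""
    return (shared_levels / MAX_LEVELS) ** ALPHA


def cultural_fractionalization(shares, shared):
    """C = 1 - sum_i sum_j p_i * p_j * r_ij, with r_ii = 1.

    `shared[i][j]` is the number of language-tree classifications
    that the languages of groups i and j have in common.
    """
    expected = 0.0
    for i, p_i in enumerate(shares):
        for j, p_j in enumerate(shares):
            r = 1.0 if i == j else resemblance(shared[i][j])
            expected += p_i * p_j * r
    return 1.0 - expected


# Belarus: Byelorussian, Russian, Polish, Ukrainian (shares from the text).
# East Slavic pairs share 3 levels; pairs involving Polish share 2.
belarus_shares = [0.78, 0.13, 0.04, 0.03]
belarus_shared = [
    [0, 3, 2, 3],
    [3, 0, 2, 3],
    [2, 2, 0, 2],
    [3, 3, 2, 0],
]

# Cyprus: Greek and Turkish come from unrelated families, so 0 shared levels.
cyprus_shares = [0.78, 0.18]
cyprus_shared = [[0, 0], [0, 0]]

belarus_c = cultural_fractionalization(belarus_shares, belarus_shared)
cyprus_c = cultural_fractionalization(cyprus_shares, cyprus_shared)
print(belarus_c, cyprus_c)
```

Under these assumptions the sketch gives roughly 0.23 for Belarus and 0.36 for Cyprus, in line with the scores reported in the text. Because the Cypriot groups share no classifications, C there equals F, as the text states for structurally unrelated languages.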



Figure 4. Cultural versus ethnic fractionalization.

with ethnic fractionalization, at 0.79 with my measure F, and 0.82 with fractionalization based on the Soviet Atlas (ELF). So by these measures, ethnic fractionalization is a reasonable if hardly perfect proxy for cultural fractionalization in a broad cross-section.26

When cultural/linguistic similarity is taken into account, Latin America looks much more homogeneous than it does by the ethnic fractionalization measure. This is due mainly to the use of Spanish across "white" and "mestizo" groups, which indeed reflects considerable (some might say near total) cultural similarity. This is one example of an attractive feature of the measure C. In many cases where there is a question about where to "draw the line" between ethnic groups, C in effect makes a principled decision. Another example is Somalia, which will have a low C regardless of where one thinks the "ethnic groups" should be located.

After Latin America, the subSaharan countries show the greatest average change when we take account of cultural/linguistic similarities. The great ethnic and linguistic diversity of Africa is represented by a fairly small number of highly articulated language trees. For example, most of Tanzania's many small groups share eight common levels (Niger-Congo, Atlantic Congo, Volta Congo, Benue Congo, Bantoid, Southern, Narrow Bantu, Central). As a result, and plausibly if arguably, the measure judges some African countries significantly less diverse culturally than they are ethnically diverse.

Not surprisingly, cultural diversity measure C tends to be closer on average to the ethnolinguistic fractionalization computed using the Soviet Atlas data (ELF). As noted,



Soviet ethnographers defined ethnicity in terms of language and national origin, so that Latin America and North Africa/Middle East come out more homogeneous by ELF than F, while subSaharan Africa is judged highly heterogeneous by both measures. One implication is that the ELF measure may be particularly favorable to the thesis that low economic growth in Africa is due to ethnic diversity (Easterly and Levine, 1997), since it represents Africa as more ethnically diverse compared to the rest of the world than C or F does.

Figure 4 plots the cultural fractionalization measure C against the ethnic fractionalization measure F by region. Under C, the Latin American countries are essentially partitioned into two sets, those with substantial indigenous populations and those without. The cultural measure shows much greater variation within subSaharan Africa than F does, as a number of countries that appear highly diverse ethnically appear much less so when we take into account language proximities. Angola, Somalia, Zambia, and Madagascar are most affected in this respect.

9. Conclusion

Several active research programs in economics and political science require, for empirical evaluation, data on ethnic groups across countries. The research reported here tries to do a better job of conceptually grounding, operationalizing, and constructing a list of ethnic groups across countries than is available in the literature. As shown, the list of ethnic groups presented here can be used to produce cross-national measures of ethnic diversity, ethnic structures, and cultural diversity. Another use, not illustrated, would be "cross-group" research on the factors that distinguish groups with members involved in secessionist struggles (Fearon and Laitin, 1999).

The concept of an "ethnic group" is inherently slippery. There are often multiple plausible ways of partitioning the "ethnic groups" of a country. For example, the 11 largest groups listed for the United States by the Atlas Narodov Mira are "Americans (including blacks), Germans, Italians, Mexicans, Poles, Irish, Swedes, Austrians, Jews, Puerto Ricans, and Anglo-Canadians." I do not know that I would say that this is a highly plausible way of rendering the United States' ethnic groups, but at any rate it is quite different from White, Black, Hispanic, and Asian, the groups that appear in my list.

It is interesting to learn, then, that despite sharply different formulations of "ethnic group," the aggregate measures of ethnic fractionalization based on the Atlas Narodov Mira data and the data presented here are moderately well correlated, at 0.75. Very similar correlations obtain between the Soviet ELF and the "ethnic" and "linguistic" fractionalization measures produced by Alesina et al. (2002). Roeder's (2002) several measures correlate at around 0.8 with my measure F and at about 0.88 with the Soviet ELF. So as a measure of aggregate ethnic diversity across countries, fractionalization appears to be fairly robust to the looseness of the concept of "ethnic group."27

Still, a correlation of 0.75 means that only a little more than half of the variation in the two measures is "shared." The analysis above showed that there are some systematic regional differences in how my measure and the Soviet ELF assess ethnic diversity, so that certainly not all of the unshared variation is pure noise. In addition, there is some reason to be concerned that perceptions of what the ethnic groups are in a country can be caused by



the dependent variables that ethnic fractionalization is supposed to predict (like growth and conflict). Researchers should therefore check to see whether their results concerning the effect of ethnic fractionalization on economic growth, political conflict, political party structure, etc., depend on the specific measure used, and if they do, why. Relatedly, if a researcher's theory is that ethnic fractionalization matters because it makes for diverse preferences and consequent difficulties cooperating, then the measure of cultural fractionalization introduced in Section 8 may be more appropriate.

Finally, the partial robustness of ethnic fractionalization measures is of no help if one's research project is at the level of ethnic groups (e.g., a study of determinants of group oppression or rebellion). In this case, it matters that the groups listed be the "right" groups in some defensible sense. I have argued that in principle the right list must depend on the views of people in the country in question, so that in the end contemporary survey data is required. In lieu of such data, the best we can do is to consult country experts who have a sense of how citizens "map" ethnicity in the country. Thus, the list discussed here is offered as provisional and to be amended and corrected, and not as a definitive statement of an objective, unchanging reality.

Many possibilities for further, related research could be noted. In concluding I will mention just one. The cultural diversity measure constructed here used structural relationships between languages as a proxy for cultural similarity. This obviously leaves out other dimensions of cultural resemblance, most notably religion. While a variety of measures of religious fractionalization have been constructed (Alesina et al., 2002; Barro and McCleary, 2002; Fearon and Laitin, 2003), so far no cross-national data examines whether cross-cutting or overlapping cleavages between language/ethnicity and religion matter for dependent variables of interest. It should be relatively straightforward to use the group list discussed here to categorize countries by the interaction of religious and linguistic cleavage structures.

Appendix
Ethnic fractionalization and cultural diversity scores, by region and ethnic fractionalization.

    Country                       Ethnic Frac.  Cultural Frac.  Rank of Cultural Frac. Within Region

Western Europe and Japan
 1  Canada                        0.596         0.499           1
 2  Switzerland                   0.575         0.418           3
 3  Belgium                       0.567         0.462           2
 4  Spain                         0.502         0.263           6
 5  USA                           0.491         0.271           5
 6  New Zealand                   0.363         0.363           4
 7  UK                            0.324         0.184           9
 8  France                        0.272         0.251           7
 9  Sweden                        0.189         0.189           8
10  Ireland                       0.171         0.157           10
11  Australia                     0.149         0.147           11
12  Finland                       0.132         0.132           12
13  Denmark                       0.128         0.128           13
14  Austria                       0.126         0.1             14
15  Norway                        0.098         0.098           15
16  Germany Federal Republic      0.095         0.09            16
17  Netherlands                   0.077         0.077           17
18  Greece                        0.059         0.05            18
19  Portugal                      0.04          0.04            19.5
20  Italy                         0.04          0.04            19.5
21  Japan                         0.012         0.012           21

Eastern Europe and the Former Soviet Union
 1  Yugoslav                      0.801         0.385           12
 2  USSR                          0.711         0.596           3
 3  Bosnia                        0.681         0.146           26
 4  Kyrgyzstan                    0.679         0.624           1
 5  Kazakhstan                    0.664         0.62            2
 6  Latvia                        0.585         0.441           7
 7  Yugoslavia                    0.575         0.392           11
 8  Macedonia                     0.535         0.432           8
 9  Tajikistan                    0.513         0.492           5
10  Estonia                       0.511         0.492           4
11  Moldova                       0.51          0.401           10
12  Czechoslovakia                0.505         0.29            16
13  Georgia                       0.49          0.404           9
14  Uzbekistan                    0.485         0.442           6
15  Ukraine                       0.419         0.258           19
16  Turkmenistan                  0.392         0.328           13
17  Croatia                       0.375         0.185           24
18  Belarus                       0.372         0.228           21
19  Lithuania                     0.338         0.259           18
20  Russia                        0.333         0.311           14
21  Slovakia                      0.332         0.293           15
22  Czech Republic                0.322         0.064           29
23  Romania                       0.3           0.265           17
24  Bulgaria                      0.299         0.25            20
25  Slovenia                      0.231         0.17            25
26  Azerbaijan                    0.188         0.187           22
27  Hungary                       0.186         0.185           23
28  Armenia                       0.134         0.124           27
29  Albania                       0.097         0.082           28
30  Poland                        0.047         0.041           30
31  German Democratic Republic    0.006         0.006           31

Asia
 1  Papua New Guinea              1
 2  India                         0.811         0.667           2
 3  Indonesia                     0.766         0.522           6
 4  Afghanistan                   0.751         0.679           1
 5  Nepal                         0.677         0.542           5
 6  Bhutan                        0.605         0.518           7
 7  Malaysia                      0.596         0.564           3
 8  Fiji                          0.566         0.553           4
 9  Pakistan                      0.532         0.289           12
10  Burma                         0.522         0.419           9
11  Laos                          0.481         0.02            20
12  Thailand                      0.431         0.431           8
13  Sri Lanka                     0.428         0.386           11
14  Singapore                     0.388         0.388           10
15  Taiwan                        0.274         0.169           15
16  Mongolia                      0.272         0.227           13
17  Vietnam                       0.233         0.21            14
18  Bangladesh                    0.223         0.141           18
19  Cambodia                      0.186         0.15            17
20  Philippines                   0.161         0.116           19
21  China                         0.154         0.154           16
22  South Korea                   0.004         0.004           21
23  North Korea                   0.002         0.002           22

North Africa and the Middle East
 1  Lebanon                       0.78          0.195           14
 2  United Arab Emirates          0.737         0.65            1
 3  Kuwait                        0.708         0.54            3
 4  Iran                          0.669         0.542           2
 5  Syria                         0.581         0.235           13
 6  Saudi Arabia                  0.553         0.413           5
 7  Bahrain                       0.551         0.46            4
 8  Iraq                          0.549         0.355           9
 9  Israel                        0.526         0.246           11
10  Jordan                        0.509         0.049           17
11  Morocco                       0.479         0.36            7
12  Oman                          0.439         0.404           6
13  Cyprus                        0.359         0.359           8
14  Algeria                       0.32          0.237           12
15  Turkey                        0.299         0.299           10
16  Egypt                         0.164         0               19
17  Libya                         0.151         0.127           15
18  Yemen                         0.078         0.078           16
19  Tunisia                       0.039         0.033           18

SubSaharan Africa
 1  Tanzania                      0.953         0.564           14
 2  Democratic Republic Congo     0.933         0.628           7
 3  Uganda                        0.93          0.647           5
 4  Liberia                       0.899         0.644           6
 5  Cameroon                      0.887         0.733           1
 6  Togo                          0.883         0.602           8
 7  South Africa                  0.88          0.53            20
 8  Congo                         0.878         0.562           15
 9  Madagascar                    0.861         0.192           36
10  Gabon                         0.857         0.382           29
11  Kenya                         0.852         0.601           9
12  Ghana                         0.846         0.388           28
13  Malawi                        0.829         0.294           31
14  Guinea Bissau                 0.818         0.568           13
15  Somalia                       0.812         0.29            32
16  Nigeria                       0.805         0.66            4
17  Central African Republic      0.791         0.511           21
18  Ivory Coast                   0.784         0.557           17
19  Chad                          0.772         0.727           2
20  Mozambique                    0.765         0.285           33
21  Gambia                        0.764         0.548           18
22  Sierra Leone                  0.764         0.534           19
23  Ethiopia                      0.76          0.562           16
24  Angola                        0.756         0.242           35
25  Mali                          0.754         0.59            11
26  Senegal                       0.727         0.402           25
27  Zambia                        0.726         0.189           37
28  Namibia                       0.724         0.589           12
29  Sudan                         0.708         0.698           3
30  Burkina Faso                  0.704         0.354           30
31  Guinea                        0.669         0.49            22
32  Eritrea                       0.647         0.398           27
33  Niger                         0.637         0.6             10
34  Mauritius                     0.632         0.448           23
35  Mauritania                    0.625         0.272           34
36  Benin                         0.622         0.4             26
37  Djibouti                      0.606         0.404           24
38  Zimbabwe                      0.366         0.141           40
39  Botswana                      0.351         0.161           38
40  Burundi                       0.328         0.04            42
41  Swaziland                     0.28          0.143           39
42  Lesotho                       0.255         0.057           41
43  Rwanda                        0.18          0               43

Latin America and the Caribbean
 1  Bolivia                       0.743         0.662           1
 2  Colombia                      0.656         0.02            17.5
 3  Ecuador                       0.655         0.48            4
 4  Trinidad and Tobago           0.647         0.38            7
 5  Peru                          0.638         0.506           2
 6  Guyana                        0.62          0.46            5
 7  Brazil                        0.549         0.02            16
 8  Mexico                        0.542         0.434           6
 9  Panama                        0.507         0.168           9
10  Chile                         0.497         0.167           11
11  Guatemala                     0.493         0.493           3
12  Venezuela                     0.483         0.02            19
13  Nicaragua                     0.402         0.095           12
14  Dominican Republic            0.387         0               21.5
15  Argentina                     0.255         0               21.5
16  Costa Rica                    0.238         0.078           13
17  Uruguay                       0.218         0               21.5
18  Cuba                          0.213         0.02            17.5
19  El Salvador                   0.198         0.18            8
20  Honduras                      0.185         0.167           10
21  Jamaica                       0.166         0.027           15
22  Paraguay                      0.132         0.039           14
23  Haiti                         0.095         0               21.5

1. On the relationship (or lack thereof) between ethnic divisions and civil conflict, see among others Hibbs (1973), Horowitz (1985), Powell (1982), Vanhanen (1999), Fearon and Laitin (2003), Collier and Hoeffler (2001), and Huntington (1996). Przeworski et al. (2001) consider ethnic diversity as a possible predictor of the likelihood of democratic transitions. Cox (1997) considers the effect of ethnic diversity on party systems in democracies. Dudley and Miller (1998), Fearon and Laitin (1999), Gurr (1993), Gurr and Moore (1997), and Lindstrom and Moore (1995) examine the determinants of ethnic group rebellion and protest.
2. Hibbs's (1973) cross-national study of causes of political violence is an early example in political science. Taylor and Jodice's (1983) handbook included the version of the measure that is widely cited. See Easterly and Levine (1997) and Alesina et al. (2002) for references in economics.
3. Some political scientists and sociologists argue that the very idea of listing ethnic groups is "primordialist," because it presumes or implies that these groups exist in the wrong sort of way. I see no contradiction in seeing ethnic groups as purely social facts, and trying to enumerate them.
4. I suspect that without this restriction or some other low threshold it will be impossible to enumerate all "ethnic groups" in all countries.
5. With the 1 percent threshold, this could mean dropping Asian from the list.
6. Economists tend to forget that countries far from the equator were highly ethnically and linguistically fractionalized anywhere from 100 to 500 years ago (see, for example, Weber (1976) on the diversity of early 19th century France). From a historical perspective, the causal arrow runs not from ethnic diversity to poor economic performance (Easterly and Levine, 1997), but from strong states to both ethnic/linguistic homogenization and economic growth. Strong states in Europe deliberately and actively homogenized their diverse premodern populations to a common national identity and culture (e.g., Gellner, 1983).
7. For some theoretical arguments and examples on this question, see Bates (1983) and Fearon (1999).
8. That is, "randomly" select a set of potential but not actual (or non-ethnic) groups, making no pretense of getting the whole population (or even defining it, conceptually). Then use techniques such as those discussed in King and Zeng (2001) to analyze the resulting sample.
9. Note that estimates for Saudi Arabia and some other Gulf states are fairly uncertain and probably too low; the kingdoms appear to be overstating their citizen populations.



10. Using Scarritt and Mozaffar's second level of aggregation, on which see below.
11. Although it is difficult to do because the language groups listed in Ethnologue are highly disaggregated, we often constructed population estimates for African groups identified in Morrison and other sources by searching the Ethnologue list for languages closely related to the group's name (or some variant of it). In general, we found that the group population proportions based on Ethnologue language-speaker estimates, which are typically dated in the early 1990s and presumably come from linguists and missionaries, were remarkably close to the population proportion estimates from Morrison, who references sources going back to the 1950s and 1960s (often the last colonial census). When there were significant differences and some other source (especially the Factbook) tended to corroborate the more recent estimate based on Ethnologue, we adjusted the figures accordingly.
12. The group list and fractionalization measures discussed below are available at http://www.stanford.edu/group/ethnic/. Because of large changes in their ethnic compositions following their break ups, the Soviet Union and Russia are entered as two different countries, as are Yugoslavia and Yugoslavia/Serbia.
13. "Background Note: Papua New Guinea," Bureau of East Asian and Pacific Affairs, US Department of State, October 2001. See Reilly (2000/01) for a similar assessment and references to the anthropological literature.
14. I coded India using language groups, of which Hindi is the largest. Certainly there are other plausible renderings of India's "ethnic groups," but most likely all of them would imply a high level of diversity, as is true for language groups.
15. The sources I consulted stressed that identity as a Tswana is generally more important than identity as a member of one of the subtribes of the Tswana (though no doubt this is highly context specific). As noted above, this could well be a result of Botswana's strong economic and political performance rather than a cause.
16. White and Black Moors in Mauritania, Afars and Issas in Djibouti.
17. That is, right on the downward-sloping line of the triangle.
18. Cox (1997) and others sometimes prefer to use the "effective number of ethnic groups" (or political parties' vote or seat shares), which is $1/(1 - F)$. Thus, a country with $n$ equal-sized groups has an "effective number" of $n$ groups, with departures from equal shares shrinking the effective number continuously. Although the interpretation is "nice," this measure is highly skewed, at least for ethnic fractionalization, so that it tends to exaggerate the influence of very diverse countries like Tanzania when used as an explanatory variable.
19. Interestingly, we differ very little for Rwanda even though we identify different ethnic groups; the Soviets code a moderately large number of Kirundi speakers in Rwanda as a distinct group, next to a large majority of Kinyarwanda (Hutu and Tutsi) speakers.
20. The Soviets coded the Philippines by a combination of language and islands (e.g., "Visayans"), which makes for a much larger fractionalization estimate than I have (I code "lowland Christian Malays" as the main group, in line with the Factbook and the discussion in LCCS).
21. Which they have managed, of course, in the same way as "the West": by keeping the newcomers largely as noncitizens who come and go.
22. For some cases, there is a question about whether to use the "historical language" of the group (for example, Gaelic for Catholics in Northern Ireland or for Scots in Britain) or the language currently spoken as a first language by most members of the group. Ideally, I would like to take into account how many generations have been speaking the "new" language. For the data discussed below, this issue is handled in a somewhat more ad hoc fashion at present. For discussion on this point, see Fearon and Laitin (2000a).
23. A function that fits the bill is $r_{ij} = (l/m)^{\alpha}$, where $l$ is the number of shared classifications between $i$ and $j$, $m$ is the highest number of classifications for any language in the data set ($m = 15$ here), and $\alpha$ is a positive number less than 1. When $i$ and $j$ speak the same language, I set $l = m$. For the measure constructed below, I use $\alpha = 1/2$. In these data, the number of pairs $(i,j)$ within the 160 countries is 2,678. Almost 40 percent are from completely different families, and 13 percent share only one classification. The rest show a slightly decreasing pattern up to 8 and 9 common classifications, where there is a notable spike. Six percent of group pairs within the same country speak the same language.
24. Formally, cultural fractionalization is $1 - \sum_{i=1}^{n} \sum_{j=1}^{n} p_i p_j r_{ij}$, where $p_i$ is the proportion of group $i$ and $n$ is the number of groups.
25. This measure was first proposed by linguist Greenberg (1956) in a paper on ways of gauging linguistic diversity in a region, though he had a different proposal for assessing the resemblance between two languages. He termed it the "B index" (the "A index" in his 1956 paper was just $F$, where groups referred to groups of first language speakers). Laitin (2000) discusses Greenberg's three measures and uses the language-tree approach to measuring resemblance to compute the B index for six Soviet republics.
26. Of course, one can lower these correlations by choosing a higher $\alpha$ in constructing $C$, which in substantive terms means attributing more cultural difference to more minor differences in linguistic structure. With $\alpha = 1$, the correlation between $C$ and $F$ is 0.67, and that between $C$ and ELF is 0.70.
27. Another way that cross-national fractionalization measures could be misleading is if the estimates of group proportions are systematically wrong. For many countries, especially in the developing world, it seems likely that the group proportion estimates found in the CIA Factbook and other such sources derive ultimately from the last colonial era census, since very few post-colonial censuses ask questions about ethnicity. My experience trying to match the recent population estimates based on Ethnologue with the much older, usually census based estimates in Morrison et al. (1989) showed a remarkable degree of consistency, which is reassuring. Also, by construction fractionalization is not very sensitive to small changes in group proportions, and its value is determined mainly by the share of the largest group. So I doubt that there are major problems being caused by errors in the estimated population shares.
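The quantities defined in notes 18, 23, and 24 can be illustrated with a short Python sketch. It assumes the standard definition of fractionalization, F = 1 - sum of squared group shares, which these notes use but do not restate. The function names and the `shared` matrix (counts of shared language-tree classifications between each pair of groups) are hypothetical, introduced only for illustration, and this is not the author's own code.

```python
def fractionalization(shares):
    """F = 1 - sum_i p_i^2 (assumed standard definition)."""
    return 1.0 - sum(p * p for p in shares)


def effective_number(shares):
    """Effective number of groups, 1 / (1 - F), as in note 18."""
    return 1.0 / (1.0 - fractionalization(shares))


def resemblance(l, m=15, alpha=0.5):
    """Note 23: r_ij = (l / m) ** alpha, with l shared classifications."""
    if not 0 <= l <= m:
        raise ValueError("l must lie between 0 and m")
    return (l / m) ** alpha


def cultural_fractionalization(shares, shared, m=15, alpha=0.5):
    """Note 24: 1 - sum_i sum_j p_i p_j r_ij.

    shared[i][j] is the (hypothetical) number of classifications that the
    languages of groups i and j have in common. A group always fully
    resembles itself, so l = m on the diagonal.
    """
    n = len(shares)
    total = 0.0
    for i in range(n):
        for j in range(n):
            l = m if i == j else shared[i][j]
            total += shares[i] * shares[j] * resemblance(l, m, alpha)
    return 1.0 - total
```

Two limiting cases are a useful check on the definitions. If no pair of groups shares any classification, the double sum reduces to the sum of squared shares and cultural fractionalization equals F. If every group speaks the same language, every r_ij is 1 and cultural fractionalization is 0. Raising alpha toward 1 lowers the resemblance of partially related languages and so raises the index, which matches the direction described in note 26.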

