1 Introduction

Social divisions, particularly those defined in ethnic terms, can have major impacts on quality of life (Banerjee et al. 2005). Inequalities between groups, in turn, are linked in the research literature to underdevelopment, poor public goods provision, and violent conflict (Alesina et al. 2016; Baldwin and Huber 2010; Cederman et al. 2011; Stewart 2008). Inequality and social exclusion receive considerable contemporary policy attention. In the field of international development, inequality—both vertical (between individuals and households) and horizontal (between groups)—is a core concern in the 2030 Agenda for Sustainable Development (Brinkman et al. 2013; Camfield et al. 2013; UN-OHCHR 2015; Unterhalter and Dorward 2013). Goal 10 deals with reducing inequality within and among countries, including promotion of ‘social, economic and political inclusion of all, irrespective of age, sex, disability, race, ethnicity, origin, religion or economic or other status’.

Despite considerable attention to horizontal inequality in both research and policy, there are notable gaps and weaknesses in our empirical knowledge about how it manifests within and across countries and over time. This has implications not only for the rigour with which we can build and test theories in this area, but also for informing policy, monitoring trends, and evaluating the impact of interventions.

The studies in this special issue—and in the broader research initiative of which it is a part—probe what more can be learned from existing survey and census data to address empirical gaps about horizontal inequality in countries of the Global South.Footnote 1 Each of the six studies in this collection focuses on a particular country for which the existing literature on horizontal inequality is relatively limited: Ecuador (Gachet et al. 2017), Iran (Majbouri and Fesharaki 2017), Pakistan (Majid and Memon 2017), the Philippines (McDoom et al. 2018), Tanzania (Maliti 2018), and Vietnam (Dang 2018). Collectively, these articles provide insight into patterns, trends, correlates, and implications of horizontal inequality in countries across multiple world regions—from Asia to the Middle East, Latin America, and sub-Saharan Africa. While each of these articles advances a distinct argument, each also speaks to our collective effort to address and consider empirical gaps. This article frames the contributions to this special issue while drawing on findings from these studies and our broader research initiative to advance an argument about the limits in this area to the ‘data revolution for sustainable development’.Footnote 2 In particular, we argue that methodological, conceptual, and—especially—political issues pose challenges for survey and census data on topics relating to ‘ethnicity’ broadly defined. These challenges need to be more fully taken into account in considering the limits of the ‘data revolution for sustainable development’ and the ways in which over-reliance on available quantitative data can be problematic for policy.

Section 2 of this article provides a brief introduction to horizontal inequality and its measurement. It frames this collection in particular by introducing the core set of horizontal inequality measures employed in each of the studies. Building on this discussion, Sect. 3 turns to issues of data and data gaps. It introduces the countries explored in this collection in the context of the wider research initiative, as well as the cross-national data used in this article to situate these countries. Section 4 considers regional and national trends in horizontal inequality measured in terms of educational attainment, providing a comparative global context within which the studies in this collection can be considered. Section 5 speaks to how re-examination of existing data sources does—and does not—fill empirical gaps on horizontal inequality, and discusses persistent methodological, conceptual, and political challenges for survey and census data in this area. Section 6 concludes with a discussion of broader implications for the ‘data revolution’ and evidence-based policy making.

2 Horizontal Inequality and Its Measurement

Stewart (2008, 3) defines horizontal inequalities as ‘inequalities in economic, social or political dimensions or cultural status between culturally defined groups’. Consistent with much of the literature, we treat ‘culturally defined groups’ here as generally equivalent to ‘ethnic’ groups, broadly defined. ‘Ethnic’ as understood here includes categories based on ascriptive attributes such as skin colour, native language, tribe, caste, religion, and sometimes region (Chandra 2004; Horowitz 1985; Htun 2004). We also consider ascriptive groups in a broader sense to include gender.

Recent analyses link horizontal inequalities with violent conflict (Cederman et al. 2011; Stewart 2008), poor growth and underdevelopment (Alesina et al. 2016), and the underprovision of public goods (Baldwin and Huber 2010). These analyses in turn have links to long traditions of research in the social sciences on the relationships among ethnicity, class, and other social cleavages, and their consequences. Classic work in political science, for example, considers social cleavage structures and democratic governance (Lijphart 1979; Lipset and Rokkan 1967) and the links from ethnic discrimination and ‘ranking’ to violent conflict (Gurr 1993; Horowitz 1985).

The measurement of inequality, and of vertical inequality in particular, occupies a considerable literature. The Gini coefficient is by far the most widely used measure of inequality. Based on the Lorenz curve, it measures the extent to which the distribution of a given variable deviates from a perfectly equal distribution. Possible values range between 0 and 1, with 0 meaning a perfectly equal distribution and 1 a perfectly unequal distribution. Other measures, such as the generalized entropy measures, \(GE(\alpha )\), and Atkinson’s inequality measures, are also quite popular in the economic inequality literature. They are less used than the Gini coefficient, mainly because their interpretation is less intuitive; the generalized entropy measures, for instance, have no fixed upper bound.
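As an illustration of how the Gini coefficient can be computed from individual-level data, the following minimal Python sketch uses the standard sorted-rank formula. The function name and the example values are hypothetical and are not drawn from the studies in this collection.

```python
def gini(values):
    """Gini coefficient of a list of non-negative values (sorted-rank formula)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    # G = 2 * sum_i(i * x_i) / (n * sum_i x_i) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical years of schooling for four individuals.
equal = gini([5, 5, 5, 5])        # everyone has the same value
unequal = gini([0, 0, 0, 10])     # one individual holds everything
```

In a finite sample of n observations, the largest value this formula can return is (n − 1)/n, which approaches 1 as n grows.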

While all these measures capture different aspects of vertical inequality, other inequality measures more properly capture differences between groups. Work by Stewart and colleagues on horizontal inequality offers arguably the most extensive consideration of such measures (see Stewart 2008). Stewart et al. (2010) consider principles of good measures and make the case for three, namely:

$$\begin{aligned}&{\text {GGini}}= \frac{1}{2\bar{y}} \sum _{r=1}^{R}\sum _{s=1}^{R}p_{r} p_{s}\; |\bar{y}_r-\bar{y}_s| \end{aligned}$$
(1)
$$\begin{aligned}& {\text {GTheil}}= \sum _{r=1}^{R}p_{r}\frac{\bar{y}_r}{\bar{y}} \;\log \frac{\bar{y}_r}{\bar{y}} \end{aligned}$$
(2)
$$\begin{aligned}& {\text {GCOV}}= \frac{1}{\bar{y}}\Big ( \sum _{r=1}^{R}p_{r}(\bar{y}_r-\bar{y})^2\Big )^\frac{1}{2} \end{aligned}$$
(3)

where y is the variable of interest, \(\bar{y}\) its overall mean value, R the number of groups, \(\bar{y}_r\) the mean of y in group r, and \(p_r\) the population share of group r.

These measures correspond to the classical Gini coefficient, Theil index, and coefficient of variation weighted by the size of the population subgroups. The GGini compares the mean in the outcome variable of every group with that of every other group. The GTheil compares each group’s mean in the outcome variable with the national mean.Footnote 3 The GCOV is a measure of overall dispersion and therefore changes on this index can be interpreted as occurring at all levels of the distribution, not only at the tails or near the mean.
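The three measures in Eqs. (1) to (3) can be computed directly from group means and group population shares. The sketch below is a minimal Python illustration, assuming that the shares sum to one and that the overall mean equals the share-weighted average of the group means. The function name, variable names, and example figures are hypothetical.

```python
import math


def group_measures(shares, means):
    """Return (GGini, GTheil, GCOV) for groups with given population shares and means."""
    if len(shares) != len(means):
        raise ValueError("shares and means must have the same length")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    groups = list(zip(shares, means))
    # Overall mean as the share-weighted average of group means (assumption).
    ybar = sum(p * y for p, y in groups)
    # Eq. (1): every group's mean is compared with every other group's mean.
    ggini = sum(
        p_r * p_s * abs(y_r - y_s) for p_r, y_r in groups for p_s, y_s in groups
    ) / (2 * ybar)
    # Eq. (2): each group's mean is compared with the overall mean.
    # A group mean of zero contributes nothing (limit of x * log x as x -> 0).
    gtheil = sum(p * (y / ybar) * math.log(y / ybar) for p, y in groups if y > 0)
    # Eq. (3): share-weighted standard deviation of group means over the overall mean.
    gcov = math.sqrt(sum(p * (y - ybar) ** 2 for p, y in groups)) / ybar
    return ggini, gtheil, gcov


# Hypothetical example: two equally sized groups with 4 and 8 mean years of schooling.
ggini, gtheil, gcov = group_measures([0.5, 0.5], [4.0, 8.0])
```

All three measures equal zero when every group has the same mean, and all three are computed from group means only, so inequality within groups does not enter.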

A variety of other group inequality measures are explored in the literature (e.g., Atkinson 1970; Das and Parikh 1982; Deutsch and Silber 2013; Zhang and Kanbur 2005). Two common measures are the mean differences in the outcome variable between groups and ratios between the most and least deprived groups. Both are simple to compute, easy to understand, and widely used. While they do not provide as much information as the measures described above, when used in conjunction with them, they also provide useful information for the study of inequality between groups. Take, for example, the case of Nigeria, where in 2008 the GGini coefficient of 0.306 translated into a mean difference in educational attainment between the most and least deprived ethnic groups of 8.95 years of schooling.Footnote 4

To facilitate comparisons across the studies in this research initiative, and to allow learning to cumulate through comparison with the country studies developed in Stewart et al.’s work in particular, the contributors to this special issue were asked to take as a starting point the GGini, GTheil, and GCOV measures defined above. Contributors were also asked to calculate and consider in their initial analyses two related measures of ‘cross-cuttingness’ from the political science literature (Rae and Taylor 1970; Selway 2011),Footnote 5 along with several common indicators of ethnic diversity (the ethnic fractionalization index (Taylor and Hudson 1972) and ethnic polarization (Montalvo and Reynal-Querol 2005)) and vertical inequality (Gini and Theil). In lieu of presenting exhaustive tables of numbers in their articles, most contributors focus on just a few of these measures in the final versions of the studies presented here, usually one or more of the three core measures highlighted above (GGini, GTheil, and GCOV). In general, we have found, broadly consistent with Stewart (2008), that these latter three measures tend to move together empirically, and thus discussing all three is not necessary for descriptive purposes.
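For readers less familiar with the diversity indicators mentioned above, the following Python sketch shows how the fractionalization and polarization indices are commonly written in terms of group population shares. It is an illustration only: the function names and example shares are hypothetical, and the shares are assumed to sum to one.

```python
def fractionalization(shares):
    """Probability that two randomly drawn individuals belong to different groups."""
    return 1.0 - sum(p * p for p in shares)


def polarization(shares):
    """Polarization index in its common form, 4 * sum of p^2 * (1 - p)."""
    return 4.0 * sum(p * p * (1.0 - p) for p in shares)


# Hypothetical societies: two equal groups versus three equal groups.
two_groups = [0.5, 0.5]
three_groups = [1 / 3, 1 / 3, 1 / 3]
frac_two, pol_two = fractionalization(two_groups), polarization(two_groups)
frac_three, pol_three = fractionalization(three_groups), polarization(three_groups)
```

The comparison illustrates why the two indicators are reported separately: moving from two to three equal groups raises fractionalization but lowers polarization, which peaks when a society is split into two groups of equal size.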

It is worth highlighting that our focus in this project was not the development of new measures or the evaluation of existing ones, but probing what more could be learned from existing empirical data. The contributors thus were tasked with using a common ‘toolkit’ of relatively well-established measures. Extensive discussion of measures in each paper would surely have been repetitive in the context of this special issue. Moreover, use of this common toolkit helps us to explore comparative findings.

3 Data and Case Studies

Although gaps and weaknesses even in basic poverty data for many developing countries are well noted in the literature (Beegle et al. 2015; Jerven 2013; Sandefur and Glassman 2015), a rise in the production of development indicators in a number of countries is generally seen to have strengthened international comparison. In terms of horizontal inequality, however, it is striking that information is still quite limited, especially when compared with figures on vertical inequality, for which long time-series exist for most countries in the world. By comparison, large-N studies of horizontal inequality such as Østby (2008) and, more recently, Tetteh-Baah et al. (2018) each cover (just) 36 countries. Both rely on Demographic and Health Survey (DHS) data, given their coverage and standardized data collection.

The measurement of inequality in general is constrained by issues of data availability, data quality, non-standardization of data, and differences in definitions, classifications, and methodologies. A major constraint on the production of horizontal inequality indicators in particular—beyond those that influence the production of vertical inequality indicators—is the need to have information on ethnic groups, which is unavailable or significantly incomplete or problematic for many countries. We return to the reasons for this in Sect. 5. In light of gaps and weaknesses in survey- and census-based data on horizontal inequality, a number of recent studies have worked with various proxy measures that, for instance, combine geocoded night-light data with historical maps of ethnic territories or homelands (Alesina et al. 2016; Cederman et al. 2011; Wucherpfennig et al. 2011).Footnote 6 While such data are useful, they are not a substitute for regularly collected and up-to-date survey- and census-based measures. For one, without the latter, we have weak bases to show the validity—or lack of validity—of such proxies. Moreover, and particularly in a policy context, using (unproven and possibly inaccurate) proxy measures to monitor progress, design policy, or hold countries to account would appear to be at odds with basic norms of accountability and transparency in governance.

The research initiative of which this special issue is a part was developed to speak to such data gaps. It has involved both investigation into available large-N datasets on horizontal inequalities across countries around the world, and focused studies on a set of 15 countries. Among the 15 countries are the ten largest developing countries in the world with a minimal degree of ethnic diversity at the national level (India, Indonesia, Brazil, Pakistan, Nigeria, Mexico, the Philippines, Ethiopia, Vietnam, and Iran),Footnote 7 plus the next three largest sub-Saharan African countries that met the same criteria (Democratic Republic of the Congo, South Africa, and Tanzania).Footnote 8 Two additional (smaller) Latin American countries (Ecuador and Guatemala) were subsequently added to complement the larger research effort.

To help situate the countries studied in this special issue empirically, this article draws on the Education Inequality and Conflict (EIC) dataset of the FHI 360 Education Policy and Data Center,Footnote 9 commissioned by the UNICEF Peacebuilding, Education and Advocacy Programme. The EIC is an unbalanced panel of countries that combines data from national censuses, DHSs, and household consumption and expenditure surveys. It contains measures of horizontal inequality based on the educational attainment (HI-E) of young people (ages 15–24) across ‘ethnic’, religious, and subnational divisions for groups comprising at least 5% of the population.Footnote 10 It covers up to 111 countries for the period 1960–2010. While major harmonization efforts were undertaken to obtain HI-E measures that allow comparison across countries, the data rely heavily on back projections and interpolations.Footnote 11

4 Global Patterns and Trends: Situating the Cases

In this section, drawing on the EIC data, we provide a brief comparative overview of trends in HI-E for developing countries. We start by presenting trends for regions and then turn to more focused consideration of some of the countries in our research initiative.

In the last decades, developing countries have made great progress towards reducing poverty and vertical inequality. The EIC data suggest this is also the case for horizontal inequality. As Fig. 1 shows, HI-E among ethnic groups (as measured with the GGini in the EIC dataset) decreased in all four regions, with the steepest downward sloping curve in Latin America and the Caribbean. Further, changes in HI-E according to these data have been quite heterogeneous across regions. The mean GGini coefficient in Latin America decreased by 71% between 1965 and 2005, compared with 23% in sub-Saharan Africa over the same period.Footnote 12

The EIC data also suggest significant variation at the country level, from a 5% change in HI-E in the Republic of Congo between 1965 and 2005, to 91% in Vietnam over the same period. Table 1 summarizes available data from the EIC dataset on the GGini coefficient and on mean years of schooling for 1965 and 2005 (five-year average) for countries studied in our wider research initiative. Of the six countries explored in this special issue, this source includes data on three (Pakistan, the Philippines, and Vietnam), but not on Ecuador, Iran, or Tanzania.

Fig. 1 Regional trends in educational inequality. Source: Authors’ calculation based on the EIC dataset

As Table 1 shows, during this period the smallest change in HI-E among this set of countries was in Ethiopia, where it decreased by 6%. Nigeria and the Philippines also show comparatively small decreases. In this special issue, McDoom et al. (2018) focus on the period 2000 to 2010 in the Philippines and consider horizontal inequality assessed in terms of both differential educational attainment and access to basic public services. They show that while the Philippines made notable progress in reducing horizontal inequality at the national level over this period, the picture is less positive when inequality is measured at the subnational level and when socio-politically salient divisions are taken into account.

Mexico, Vietnam, the Democratic Republic of the Congo (DRC), Brazil, and South Africa occupy the other end of the spectrum. In these countries, the GGini decreased by more than 70% between 1965 and 2005, while average years of schooling increased by around four years during the same period. According to the EIC estimates, in Vietnam between 1965 and 2005, the GGini decreased by 91% for the population aged 15–24 years. In this special issue, Dang (2018) focuses on the period 1989 to 2012 in Vietnam, drawing on three census rounds (1989, 1999, and 2009) and three Household Living Standard Surveys (1998, 2008, and 2012). Considering horizontal inequality for both ethnic and regional groups, assessed in terms of four welfare indicators, Dang finds that while HI-E generally improved, there was little change in horizontal inequality in other domains.

Table 1 Horizontal inequality measures.

More in the middle is Pakistan, where the GGini in the EIC dataset declined 47% between 1965 and 2005. In this special issue, Majid and Memon (2017) provide more nuanced consideration of horizontal inequality trends in Pakistan in terms of both educational attainment and income, drawing on data from 1990, 1996, 2006, and 2013. They find overall that ‘inequality across the cleavages shows a great deal of persistence over the 1996–2013 period, and in fact is showing evidence of increasing when we consider the 2013 figures’ (p. 17).

In summary, the measures of HI-E compiled in the EIC dataset provide a useful guidepost, not only for thinking about global trends, but also for thinking comparatively across the countries explored in this collection. While this dataset does not include information on Ecuador, Iran, or Tanzania, the data in Table 1 can help us to think about these countries comparatively. For instance, for Ecuador, Gachet et al. (2017) report a GGini (calculated between indigenous/non-indigenous for years of education) of 0.015 in 2006. Perhaps not surprisingly, this figure is similar to the 2005 figures for other Latin American countries shown in Table 1 (i.e. Brazil and Mexico). However, it is also close to the figures for South Africa and Vietnam.

In addition, considering the nuanced patterns and trends in horizontal inequality discussed in this article against the blunter EIC figures can offer important insight into what such cross-national datasets may miss. In particular, the studies in this collection show that trends in HI-E do not necessarily match up with trends in horizontal inequality measures based on other indicators (such as poverty). The articles in this special issue further underscore how different subnational and national experiences can be, and thus the importance of considering both. If we are worried about the relationship between horizontal inequality and violent conflict, for instance, looking only at horizontal inequality at the national level may obscure more worrying patterns at subnational levels (see McDoom et al. 2018).

5 Building Empirical Knowledge: Persistent Data Challenges

In terms of filling empirical gaps on horizontal inequality using survey and census data, two points are clear from our research. First, more can be learned from rigorous re-examination of existing datasets and this can help us to fill (some) empirical gaps. The EIC dataset (EIC 2015) is a good example of relatively new survey- and census-based efforts that provide new insight into empirical patterns and trends in horizontal inequality (in educational attainment) at the cross-national level. The country-focused studies prepared under our initiative go deeper into these and related sources, documenting what more can be learned about horizontal inequality at subnational as well as national levels, and considering multiple ascriptive groups and multiple dimensions of inequality (e.g. educational attainment, poverty, access to public services, and land). As discussed above, these country-focused studies thus complement, and can inform, cross-national efforts such as the EIC dataset.

Second, our country-focused studies show clearly that, even with such focused analysis, it is not uncommon for significant gaps to remain because the information on ascriptive groups that is needed for the production of valid and reliable quantitative measures of horizontal inequality is unavailable in survey and census data.Footnote 13 We focus in the rest of this section on this second point, drawing country examples from the articles in this special issue.

Here, we give two examples of the sort of significant gaps in data that we are talking about. Consider, for one, Tanzania. Standard quantitative measures of ethnic diversity around the world list Tanzania as among the most (if not the most) ethnically diverse countries in the world (see Alesina et al. 2003).Footnote 14 However, limited data on ‘ethnicity’ and religion are available from the Tanzanian census and standard surveys, such as the DHS.Footnote 15 As the second example, consider Iran. Some data on ascriptive identities—such as language group—are available in survey and census data, but information is not publicly available on other salient identities that are relevant to full consideration of horizontal inequalities in the country. In particular, data are unavailable on religious sect—and the distinction between Shi’a and Sunni is politically salient in Iran. We return to these and other countries below.

Some gaps in the extant data on ‘ethnic’ groups can be traced to issues common across multiple areas,Footnote 16 but we find three sets of challenges in particular problematic in terms of survey and census data for ‘ethnic’ groups broadly defined.

The first set of challenges is ‘methodological’ and largely particular to small minority populations. With normal sampling procedures, nationally representative surveys could miss such populations altogether, or include too few of their members to support reliable group-level estimates. Moreover, it is not unusual for small minority populations to be located in remote areas, or to be unable to speak the dominant national language, which can add expense to data collection and pose methodological and practical challenges for survey and census enumerators.
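A simple calculation illustrates the scale of this sampling problem. The Python sketch below assumes simple random sampling, which is a simplification since most household surveys use clustered, multi-stage designs, and treats the number of sampled members of a group as binomially distributed. The function names, survey size, and group share are hypothetical.

```python
import math


def expected_group_count(sample_size, group_share):
    """Expected number of respondents from a group under simple random sampling."""
    return sample_size * group_share


def prob_fewer_than(sample_size, group_share, threshold):
    """Probability that fewer than `threshold` group members are sampled (binomial)."""
    return sum(
        math.comb(sample_size, k)
        * group_share ** k
        * (1 - group_share) ** (sample_size - k)
        for k in range(threshold)
    )


# Hypothetical survey of 2,000 respondents and a group making up 1% of the population.
expected = expected_group_count(2000, 0.01)
risk_under_ten = prob_fewer_than(2000, 0.01, 10)
```

With only around 20 expected respondents, any group-level mean would rest on very few observations, and geographic clustering of the group would typically make matters worse than this simple calculation suggests.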

In practice, multiple ‘minority’ groups are often considered together in descriptive statistics and analysis. For instance, Vietnam officially has 54 ethnic groups, 53 of which each comprise less than 2% of the population (see Dang 2018). Many analyses, including Dang (2018) in this issue, thus consider ethnic divisions in terms of two groups, Kinh and ‘other’ (or ‘non-Kinh’). Likewise, in Ecuador, Gachet et al. (2017) focus on ‘indigenous’ compared to ‘non-indigenous’ individuals, basing the ‘indigenous’ category on whether an indigenous language is spoken at home. Treating such multiple ‘minority’ or ‘other’ groups as a single group is often the only practical option, and indeed is defensible in terms of group ‘salience’ in many circumstances. For instance, major political divides in many countries are defined by minority status, or distinctions between ‘indigenous’ and ‘non-indigenous’ individuals. It is indeed horizontal inequalities between such politically salient groups that we often expect to be divisive and want to measure and monitor. However, it is also worth keeping in mind that such aggregate categories obscure the diversity that may exist within them, and that such diversity may also be significant. For instance, in Guatemala, where much of the quantitative literature focuses on the indigenous/non-indigenous divide, Canelas and Gisselquist (2018) find that indigenous ethnic groups show distinct patterns of wages and wage gaps, which they hypothesize may have implications for politics.

In general, resources, recognition, and technical savvy can go a long way towards addressing this first set of methodological challenges. For instance, if there is a particular need to have representative data on small minority groups, appropriately designed sampling procedures, such as stratified sampling that oversamples these groups, can help considerably. (Still, it is important to keep in mind, for reasons discussed below, that stratified sampling on the basis of ethnicity is not necessarily straightforward.)

A second set of issues stems from the conceptual challenge of capturing ‘ethnic’ identities and groups. For one, there are multiple such groups in a society that have political, social, and economic salience (see Laitin 1986; Posner 2005). How do we decide where our attention should be focused? For instance, in this issue, McDoom et al. (2018) discuss how the groups that are salient in socio-political terms in the Philippines in fact differ from those in the census. The Philippine census in 2000 identified 147 ethno-linguistic groups and 93 religions (in 2010, 182 and 97 respectively). However, McDoom et al. (2018) make the case on the basis of socio-political salience for analysing horizontal inequality with reference to only three ethno-religious groups (Muslims, indigenous persons, and everyone else), thus reclassifying the categories listed in the census into these groupings for their analysis. Such detailed consideration of group salience is clearly within the scope of a country-focused study, but it can pose greater challenges for cross-national data efforts.

A related issue concerns how to interpret the data available—which can require considerable familiarity with the country context. One example is provided in Majid and Memon’s (2017) discussion of language data in the Pakistan Integrated Household Survey (PIHS). As Majid and Memon note, linguistic ethnicity has emerged as ‘one of the dominant cleavages in society’ and six major ethno-linguistic groups can be identified: Baloch, Muhajir, Pashtun, Punjabi, Seraiki, and Sindhi. They use PIHS data on the ‘language in which the interview was conducted’ in the identification of these groups, noting: ‘if the interview was conducted in a regional language one can be confident that the respondent identifies with that language ethnically. If, on the other hand, the interview is conducted in Urdu, the national language of Pakistan, the questionnaire does not provide any additional identifiers of ethnicity, and we cannot be certain if the respondent’s ethnicity is indeed “Urdu-speaking”’. (While Urdu is the national language of Pakistan and is widely spoken as a second language, those who speak it as their first language are in fact a minority.) Based on their contextual knowledge, and taking into account the collection protocol of the Federal Bureau of Statistics, they then identify as Muhajirs Urdu-speaking people in Sindh province only (pp. 9–10). This is of course an imperfect measure of who is Muhajir or who self-identifies as ‘Muhajir’, but it offers a contextually reasonable estimation.
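The classification rule described above can be sketched as a simple recoding step. The Python example below is an illustration only and is not Majid and Memon's code: the language labels, the province labels, and the decision to leave Urdu-language interviews outside Sindh unclassified are all assumptions made for the sake of the example.

```python
# Hypothetical mapping from interview language to ethno-linguistic group.
REGIONAL_LANGUAGE_TO_GROUP = {
    "Balochi": "Baloch",
    "Pashto": "Pashtun",
    "Punjabi": "Punjabi",
    "Seraiki": "Seraiki",
    "Sindhi": "Sindhi",
}


def classify_group(interview_language, province):
    """Assign an ethno-linguistic group from interview language and province."""
    # An interview in a regional language is taken to signal ethnic identification.
    if interview_language in REGIONAL_LANGUAGE_TO_GROUP:
        return REGIONAL_LANGUAGE_TO_GROUP[interview_language]
    # Urdu-language interviews are treated as Muhajir in Sindh province only.
    if interview_language == "Urdu" and province == "Sindh":
        return "Muhajir"
    # Assumption: all other cases are left unclassified rather than guessed.
    return None


# Hypothetical respondents.
examples = [
    classify_group("Sindhi", "Sindh"),
    classify_group("Urdu", "Sindh"),
    classify_group("Urdu", "Punjab"),
]
```

Writing the rule out in this way makes its limits visible: respondents interviewed in Urdu outside Sindh cannot be assigned to any of the six groups from the interview language alone.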

Another challenge relates to changes over time in the salience, significance, or meaning of particular ethnic identities and groups (e.g., Chandra 2012; Flesken 2014; Nobles 2000; Waters 2000). For instance, the rise of indigenous social movements in a number of Latin American countries in the 1990s and 2000s contributed in some contexts to a rise in the political salience of ‘indigenous’ identities (Van Cott 2007; Yashar 2005). In Ecuador, for instance, this shift was arguably reflected in census and Living Standards Measurement Survey (LSMS) questionnaires, which began compiling information on indigenous self-identification in 2001 and 2006 respectively. As Gachet et al. (2017) discuss, a challenge comes then in considering horizontal inequality between indigenous and non-indigenous groups over time. In previous census and LSMS rounds, language data were compiled, and Gachet et al. (2017) use these data to identify as ‘indigenous’ those belonging to households in which an indigenous language was spoken by at least one member. As we discussed for Pakistan, this is a reasonable estimation, but it is also an imperfect measure of who self-identifies as ‘indigenous’. If our aim is to consider horizontal inequality between politically salient groups, self-identification is arguably the better indicator.

To add further complexity, it is not simply that ethnic groups and boundaries shift over time or that different groups are salient in different spheres of life, but also that individual ethnic identifications vary across contexts (e.g. Okamura 1981; Posner 2017). In other words, in responding to surveys or the census, individuals may self-identify in ethnic terms differently depending, for instance, on the way in which questions are asked and the choice of options they are given (Burton et al. 2010; Williams and Husk 2013). Broader happenings around the time that they took the survey—such as the timing of elections—can also influence identification (Eifert et al. 2010). All this is not to say that surveys and censuses cannot provide useful information on ethnicity, but that particular care needs to be taken in the design of questionnaires, as well as in the use and interpretation of data.

A third set of challenges stems from the political salience of ethnicity and the fact that data are political—and ethnic data can be especially so. This relates, for one, to use of census and survey data by governments in the distribution of public resources. For instance, population figures may be taken into account in budget allocations and ethnic data in particular may be used when transfers are linked to ethnicity (e.g. support for disadvantaged minorities).

Moreover, simple ethnic statistics can have implications for perceptions of power and for political manoeuvring among groups in and outside of government. In a majoritarian system, for instance, whether a group is 49 or 51% of the population can be extremely important. Census data also have implications for political representation. For example, the number of legislative seats allocated to particular regions is regularly tied to the number of people who live there, which is determined by the census, and ethnic groups are often concentrated in particular administrative regions. In some countries, political representation is directly linked to the relative population shares of ethnic groups, such as via ethnic quotas.

Likewise, census and survey data, when publicly available, can be a powerful tool for social movement groups and other non-government actors and individuals. For instance, the data compiled in an official survey could be used by disadvantaged groups to document their disadvantage and to advance claims for greater support from government. Such data could also be interpreted to justify and solidify group-based grievances, a potentially powerful mobilization tool for ethnic leaders. Governments might see such activity as a major risk in countries with significant histories of ethnic tension and conflict, and thus restrict the collection and/or public release of ethnicity data. In contexts in which such major ethnic tensions exist alongside real threats of ethnic mobilization, this is arguably a wise approach for governments to take. The benefits of better information and the rights of citizens to that information need to be balanced against the risks and potential costs of major ethnic violence.

Likewise, particularly in situations of profound ethnic tension, respondents may be justifiably concerned when asked questions about ethnicity. In the United States during the Second World War, for instance, the Census Bureau provided the Secret Service with names and addresses of Japanese-Americans so they could be rounded up in internment camps (Minkel 2007). We should consider carefully the ethics and risks of collecting and making available ethnic data. Risks to respondents could be pronounced in countries where governments are dominated by particular ethnic groups and the rule of law is weak. A failure to protect data can also be unintentional, linked for instance to gaps in digital infrastructure (Bhatia 2018). Further, even when respondents’ concerns may be unfounded, they could arouse distrust and have implications for response rates.

Finally, the act of compiling such information, especially in official sources such as the census, could be—or could be seen as—nationally divisive (Lieberman and Singh 2016). In Rwanda, for instance, where official ethnic identification was a major component of the genocide, the official line has become: ‘There is no ethnicity here. We are all Rwandan’ (Lacey 2004). More broadly, ‘official’ projects like the census lend such subnational identities official legitimacy and could have an impact on their continued salience (see, e.g. Hochschild and Powell 2008; Mazumder 2018). In many countries, such as Tanzania, the building of a national identity—one that supersedes subnational ethnic identities—has been seen as a government priority and one with important links to national development (Campbell 1999).

In other words, given the political salience of ethnicity, it is not uncommon for gaps in ethnic data to be intentional. Official actors may intentionally neglect to include questions on ethnicity, or to publicly release such data, or they may compile and release ethnic data on some types of groups but not others. Governments may likewise place restrictions on organizations and researchers operating in the country that prevent them from compiling and analysing such data. While such actions sometimes reflect illiberal efforts by state and government actors to suppress ethnic groups that might threaten their authority, this is not always the case. In situations of high ethnic tension, for instance, intentional gaps in ethnic data may arguably be in the public interest and have significant popular support. Given this third set of challenges, therefore, ‘data revolution for sustainable development’ or not, we can expect a lack of political will in some countries for filling data gaps in this area. While discussion about the data revolution for sustainable development tends to highlight technical and financial constraints to better data, the hardest constraints—for ethnic data at least—are political.

Another implication of this discussion is that it is precisely in contexts in which ethnicity is politically salient that quantitative data on ethnicity may be unavailable or incomplete in important ways. It is of course in such contexts that horizontal inequalities may pose the greatest risks and are thus most important for well-intentioned practitioners and policy-makers to monitor.

6 Conclusion

This article frames the contributions to this special issue by (1) providing an introduction to horizontal inequality and its measurement, with particular attention to the ‘toolkit’ of horizontal inequality measures used in the studies in this collection and (2) exploring empirical trends in available cross-national data, providing a comparative context for consideration of the countries explored in this collection. The article also builds on the studies in this collection and the broader research initiative of which it is a part to advance an argument about the key methodological, conceptual, and political challenges for survey and census data on topics relating to ‘ethnicity’ broadly defined. These challenges imply real limits in the so-called data revolution for sustainable development.

More broadly, these challenges also suggest risks to theory building and ‘evidence-based’ policy making in this area when they rely too heavily on quantitative data. Collectively, the studies in this collection show not only that a lot can be learned from rigorous re-examination of existing datasets, but also that significant gaps remain. Such gaps stem from methodological and conceptual challenges, as well as—in particular—political challenges. Political challenges are likely to be especially pronounced in contexts in which ethnic tensions and risks are high.

Thus, survey and census data on ethnic issues may provide—intentionally or unintentionally—incomplete and biased views of on-the-ground realities. Furthermore, regardless of how much researchers want better data on ethnicity, it is worth considering whether collecting ethnic data in some contexts is in fact in the broader public interest and whether it presents risks for individuals and for societies.

Returning to the objective of this research project to address empirical gaps in our understanding of patterns and trends in horizontal inequality across and within countries of the Global South, we thus propose four ways forward.

First, more can be gleaned from focused analysis of existing sources to strengthen datasets in this area. Cross-national survey- and census-based databases on horizontal inequality in educational attainment are especially developed. Are the data sufficient to extend such databases beyond HI-E to other indicators such as poverty, income, or consumption?

Second, future data collection on ethnicity in many instances can be improved through careful consideration of the methodological, conceptual, and political issues outlined above. This might involve changes, for instance, in sampling methodology, questionnaire design, and survey implementation.

Third, more attention should be paid to comparing proxy measures of horizontal inequality against measures based on survey and census data. Such comparison is a useful way to explore construct validity (Thomas 2009). Do the proxy measures really measure what they claim to measure?

Fourth, other types of systematic data should be explored and developed. In particular, efforts that rely on systematic coding of qualitative reports or assessment by independent experts, such as the Ethnic Power Relations (EPR), Minorities at Risk (MAR), and All Minorities at Risk (AMAR) databases (Birnir et al. 2018; Cederman et al. 2010; Gurr 1993), may be informative. Such sources have strengths and weaknesses, but in situations where major data gaps are due to the political challenges outlined above, they can also be invaluable.