1 Introduction

In June 2003, in a special section of the journal American Anthropologist entitled “Did Boas Get It Right or Wrong?,” a debate played out between two teams of researchers under the journal’s rubric “Exchange Across Differences”. The subject of the debate was a large-scale anthropometric study carried out by the German-American anthropologist Franz Boas (1858–1942) on a cohort of European immigrants to the US and their American-born children (Gravlee et al. 2003b; Sparks and Jantz 2003). That Boas’s study, by then more than 90 years old, should still spark debate after such a long time is not surprising. Boas had found statistical evidence for slight but significant changes in physical traits such as head-form among descendants of immigrants pointing to changes in “type”. This finding formed one of the empirical cornerstones of his sustained critique of racial typologies (Boas 1911, p. 53–58), and this critique, in turn, has been framing debates among anthropologists and population geneticists about the biological and political meaningfulness of the concept of race to this very day (Jackson and Depew 2017). What is more surprising is that each of the two opposing teams of researchers reached their divergent assessments by an independent reanalysis of the very same data that Boas had collected in his original study.

What are the conditions that make such re-use of data across time and changing disciplinary contexts possible? Or to ask the same question in the terms favored by this volume: What enabled Boas’s data to journey from their original site of production in early twentieth-century New York, across the changing landscape of twentieth-century physical anthropology and human population genetics, and into the electronic databases of twenty-first century researchers? Historians and philosophers of science as well as STS scholars have emphasized in recent years the key role that metadata play in enabling data to travel. But metadata is a deceptively simple concept; it is usually understood to refer to information that helps to evaluate and analyze data by describing the circumstances of their production (Leonelli 2014, 4–5; for a more detailed discussion, see Leonelli 2016, ch. 4). Complexities arise, however, from the fact that, in any given study, it is not obvious what counts as relevant metadata, and what standards should be followed in annotating them. While there exist regimes of data-production that can rely on notions of metadata that have remained stable for centuries – e.g. in bibliography or taxonomy – ongoing research often involves improvised and shifting sets of metadata (Edwards et al. 2011). What counts as data, and what counts as metadata – or what counts as product, and what counts as circumstance of a given experiment or observation – is hardly down to an analytical distinction, but depends on the theoretical perspective of, and questions being asked by, researchers.

In this chapter, I am going to explore these complexities by taking a close look at the data collected by Boas in his statistical studies of physical variation among different human “races” and “tribes,” and how these data were reused not only by Boas himself, but also by later researchers. Boas conducted a whole suite of anthropometric studies between 1890 and 1911, which all in all generated data from body measurements carried out on about 27,000 individuals. These anthropometric campaigns were funded by various organizations, including the British Association for the Advancement of Science, the Bureau of American Ethnology and the US Immigration Commission, and peaked twice: once, in 1891 and 1892, when Boas and about 50 field observers collected data on c. 12,000 persons of Native American origin; and a second time in 1909, when Boas took measurements on c. 10,000 immigrants to the United States and their children (for a succinct overview and assessment of Boas’s anthropometric surveys, see Jantz 2003).

I am going to approach this case study, first, by analyzing early programmatic statements by Boas that cast light on his statistical outlook on human diversity, which placed emphasis on individuals, not types. In the second section, I will zoom in on a sample of the data sheets that Boas used in his surveys in order to provide a detailed reconstruction of how the original data was coproduced by Boas, his field observers, and their informants. The final section will then look at how Boas, but also various anthropologists in the twentieth century, used this data to draw out a variety of general conclusions about the evolution of human populations. Sustained data journeys in human population studies, I will conclude, are not only made possible by “fixing” data once and for all in some durable numerical and tabular form, but more importantly by including qualitative “pattern data” in the study. “Pattern data” sit uneasily within the distinction of data and metadata, referring to structures within a population such as genealogical or geographic origin that can both be seen as data about the study subjects and describing the circumstances under which data are produced. They play a crucial role, however, in mobilizing data for re-use by allowing for the flexible re-arrangement of data in order to employ new statistical methods or address new questions.

2 Boas’s Statistical Outlook

George Stocking has assigned Boas the role of a founding father of the “modern anthropological culture concept” characterized by “historicity, plurality, behavioral determinism, integration, and relativism” (Stocking Jr 1983, 230). At the same time, Stocking has portrayed Boas’s work in physical anthropology as instrumental in the “passing of a romantic conception of race – of the ideas of racial ‘essence,’ of racial ‘genius,’ of racial ‘soul,’ of race as a supra-individual organic identity.” In particular it was Boas’s statistical approach that was, as Stocking put it, “subversive of traditional racial assumptions” (Ibid., 192–94; see also Xie 1988). And this “critique of racial formalism”, as he dubbed it, was not just theoretical. Boas, as we will see in the next two sections of this chapter, was an ardent and up-to-date practitioner of physical anthropology and biometry, highly aware of the intricate problems of the “personal equation” involved in anthropometric measurement, innovative in the design of anthropometric surveys, and creating new mathematical and visual tools for studying statistical correlations. But in order to understand his statistical approach, it is useful to leave anthropometry aside and turn to some early programmatic statements in which Boas advocated the use of statistical methods for the study of culture.

In 1887, Boas became involved in a debate about museum displays (Jacknis 1985; Jenkins 1994). Otis Tufton Mason, curator of ethnology at the Smithsonian Institution, had suggested arranging ethnological displays at the United States National Museum according to a classification of the objects displayed; exemplars of different varieties of artifacts, he maintained, should be arranged in series, each representing a stage in the evolution of its kind; the rationale on which this presentation rested was borrowed from evolutionary biology. As Boas quoted Mason (without specifying his source):

[Human inventions] may be divided into families, genera, and species. They may be studied in their several ontogenies (that is we may watch the unfolding of each individual thing from its raw material to its finished production). They may be regarded as the products of specific evolution out of natural objects serving human wants and up to the most delicate machine performing the same function. They may be modified by their relationship, one to another, in sets, outfits, apparatus, just as the insect and flower are co-ordinately transformed. They observe the law of change under environment and geographical distribution. (Boas 1887a, 485)

The alternative Boas proposed was to arrange collections “according to tribes, in order to teach the peculiar style of each group.” The reasons he adduced for this position were epistemological:

In regarding the technological phenomenon as a biological specimen, and trying to classify it, [Mason] introduces the rigid abstractions species, genus, and family into ethnology, the true meaning of which it took so long to understand. It is only since the development of the evolutional [sic] theory that it became clear that the object of study is the individual, not abstractions from the individual under observation. We have to study each ethnological specimen individually in its history and in its medium […]. Our objection to Mason’s idea is, that classification is not explanation. (Ibid. 485)

This seems to be a strange way of reasoning: first of all, “studying each ethnological specimen individually in its history and in its medium” would, taken literally, be an endless task, and both Mason and other participants in the debate pointed out the practical difficulties that an arrangement by tribes would imply (Dall 1887, 587; Powell 1887, 612–13). Secondly, an arrangement according to tribes seems to involve as much classification as that proposed by Mason. What, one can ask, defines a “tribe,” especially since tribal identity is highly fluid over time? This criticism, too, was raised in the debate, accompanied by the remarkable observation that “a museum collected to represent the tribes of America … to be properly representative, would have to be collected as the census of the native inhabitants of India has been taken, all in one day, by an army of collectors” (Powell 1887, 612). To this criticism, Boas only had a short, categorical reply: “Such groups [i.e. tribes, and groups of tribes] are not at all intended to be classifications” (Boas 1887b, 614).

Boas’s studies of native myths along the North-Pacific coast carried out between 1888 and 1895 can serve as an example to elucidate what he had in mind with this strange assertion. In these studies, Boas broke down the myths into constituent “elements” and recorded their distribution within a group of geographically contiguous “tribes”. “We can in this manner,” as Boas explained in a paper summarizing the results of his mythological studies, “trace what we might call a dwindling down of an elaborate cyclus [sic] of myths to mere adventures, or even to incidents of adventures, and we can follow the process step by step.” In more detail, he described this method as follows:

If we have a full collection of the tales and myths of all the tribes of a certain region, and then tabulate the number of incidents which all the collections from each tribe have in common with any selected tribe, the number of common incidents will be larger the more intimate the relation of the two tribes and the nearer they live together. This is what we observe in a tabulation of the material collected at the North Pacific Coast. On the whole, the nearer the people, the greater the number of common elements; the farther apart, the less the number. (Boas 1896, 2–3)

The article from which this quote is taken does not contain any “tabulation,” but a German monograph to which it refers does. Boas had put this monograph together in 1895 from earlier reports documenting North Western myths in the Proceedings of the Berlin Society for Anthropology, Ethnology and Prehistory (on Boas’s early publication strategy, which relied on German academic periodicals, see L. Müller-Wille 2014). From the tables included in the final chapter of this monograph, it becomes clear that, for Boas, it was the unequal distribution of “incidents of adventures” that defined them as constituent elements of myths in the first place. The table arranges the data – page references to the preceding collection of tales that Boas had “recorded from the mouth of Indians” (Boas 1895a, v: aus dem Munde der Indianer aufgezeichnet) during field research in the late 1880s – in such a way that one can immediately see how the full cycle of a particular myth is present in a small group of neighboring tribes while it “dwindles down ... to mere adventures, or even to incidents of adventures” to the left and the right of the table occupied by more distantly related tribal groups (see Fig. 1).

Fig. 1
A table with German text. There are nine entries in the table, with data spread over eight columns.

Table from Franz Boas, Indianische Sagen von der Nord-Pacifischen Küste Amerikas (Berlin: A. Asher, 1895a), pp. 338–39. The columns relate to groups of tribes, the rows to narrative elements of the myth in question. The full suite of incidents making up the myth is only prevalent among the Kwakiutl, while individual elements can be found in more distant tribes. The fields of the table contain page references to the preceding collection of mythical material collected and documented by Boas
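The tabulation Boas describes, counting the incidents that each tribe’s collection shares with a selected reference tribe, can be illustrated with a minimal sketch. The tribe labels and incident sets below are invented placeholders and are not taken from Boas’s tables; the sketch only shows the counting logic under that assumption.

```python
# Hypothetical incident sets per tribe. Labels and contents are invented
# for illustration and do not reproduce Boas's 1895 tabulation.
incidents = {
    "Tribe A": {"i1", "i2", "i3", "i4"},   # reference tribe with the full cycle
    "Tribe B": {"i1", "i2", "i3"},         # close neighbor
    "Tribe C": {"i1", "i4"},               # more distant
    "Tribe D": {"i4"},                     # most distant
}

def shared_incidents(collections, reference):
    """Count the incidents each tribe has in common with the reference tribe."""
    ref = collections[reference]
    return {
        tribe: len(items & ref)
        for tribe, items in collections.items()
        if tribe != reference
    }

counts = shared_incidents(incidents, "Tribe A")
for tribe, n in counts.items():
    print(tribe, n)
```

In this invented example the counts fall from three to one as one moves away from the reference tribe, which is the pattern of “dwindling down” that Boas reads off his table.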

Such a “statistical inquiry”, as Boas called his investigation of Northwestern myths (Boas 1896, 3), rested on a “fundamental condition”, which “differentiates our method from other investigators […], who see a proof of dissemination or even blood relationship in each similarity that is found between a certain tribe and any other tribe of the globe.” The material on which an investigation was based had to be “collected in contiguous areas” (Ibid., p. 6). This contiguity was largely, but not necessarily, a geographical one, as Boas emphasized; in addition, marriage, kinship, and social structure entered the picture. “The social customs of the Kwakiutl” – the ethnic group most intensely studied by Boas during several field trips – are, he maintained, “based entirely upon the division into clans and the ranking of each individual is the higher – at least to a certain extent – the more important the legend of the clan.” Moreover, “the customs of the tribe are such that by means of a marriage the young husband acquires the clan legends of his wife, and the warrior who slays an enemy those of the person whom he has slain. By this means a large number of traditions of the neighboring tribes have been incorporated in the mythology of the Kwakiutl” (Ibid., p. 8–9). The clan system that Boas had detected among the Kwakiutl was actually even more complex than described in this quote; through marriage, the husband did not personally acquire the clan status of his wife, but he acquired it “for his son” (Boas 1897, 334–35).

By relating the distribution of mythical elements to a space whose contiguity could be ascertained in terms of geographic and socio-political relations among individuals – alliances as well as antagonisms – Boas wanted to circumvent the pitfalls of analogical reasoning in anthropology that he warned his colleagues of in the museum debate. Ironically, however, the grand picture that Boas came up with on the basis of this approach was disconcertingly fractional, and Boas would eventually give up his initial attempt to reduce the data he was presented with to some universal transmission pattern (Levi-Strauss 1988). Rationalizations of myths, whether proposed by anthropologist observers, or by the observed informants themselves, were not to be trusted:

A great many [...] important legends prove to be of foreign origin, being grafted upon mythologies of various tribes. This being the case, I draw the conclusion that the mythologies of the various tribes as we can find them now are not organic growths, but have gradually developed and obtained their present form by accretion of foreign material. Much of this material must have been adopted ready-made […]. We are, therefore, led to the conclusion that from mythologies in their present form it is impossible to derive the conclusion that they are mythological explanations of phenomena of nature […], but that many of them, at the place where we find them now, never had such a meaning. If we acknowledge this conclusion as correct, we must […] admit that, also, explanations given by the Indians themselves are often secondary, and do not reflect the true origin of the myths. (Boas 1896, 5)

What is remarkable about Boas’s “statistical inquiry” into myth is that it did not rest content with just collecting and reproducing mythical material. In order to be useful for the kind of comparative and critical analysis that Boas accomplished, this material had to be accompanied by information on how myths were produced and communicated in the places and communities from which they were originally recorded. As Stocking Jr (1974, 8) has argued, for Boas, integration of data “was not a matter of necessary or logical relations of elements.” He favored “historical integration” instead, notwithstanding, or rather, precisely because of his statistical approach.

3 Boas’s Data Sheets

From Boas’s anthropometric surveys, a large number of original data sheets have been preserved in the archives of the American Museum of Natural History and the American Philosophical Society in Philadelphia (Jantz et al. 1992, 437). Many of these pertain to Native American tribes, and were produced in anthropometric campaigns carried out in 1891 and 1892 by Boas in preparation for an exhibition on physical anthropology he had been commissioned to organize for the World’s Columbian Exposition, which was held in Chicago in 1893 (Jacknis 1985).

In this section, I am going to offer a description and detailed analysis of a small sub-set of these data sheets in order to reconstruct not only what data Boas collected, but also how he did it, paying particular attention to the role that field observers, but also informants, played in the generation of the data. The subset in question is preserved at the American Philosophical Society, and pertains to Chickasaw individuals living in Stonewall and Tishomingo in the Indian Territory, now Oklahoma.¹ The Chickasaw had been forced to remove from the Southeastern Woodlands in 1832, and decided to settle with a closely related tribe, the Choctaw, in the Indian Territory. By the 1850s they had established settlements, including Tishomingo as the capital, and successfully resisted subsequent attempts to merge them with the Choctaw, forming a polity of their own to this day (St. Jean 2011).

The data sheets consist of forms printed on both sides that were filled out by hand (see Fig. 2). At the front top of the form, the field observer is asked to “[n]umber each record and write your name after number.” From this, we know that one “Richard T. Buchanan,” who indeed entered a serial number for each record taken, collected the data for the Chickasaw.² The form consists of three sections: a first one providing metadata in the form of “Place” and “Date of observation” as well as information on the person observed: “Name of individual recorded,” “Age,” “Tribe,” “Tribe of father,” “Tribe of mother,” relationships to other persons recorded (“Mother of … Daughter of … Sister of …”), “Mode of Life” and finally, “Number of sons” and “daughters”. This is followed by a section that records a large number of qualitative physical traits, such as hair form or eye color, by offering default descriptive categories for selection. In the case of skin color, a color chart seems to have been used, as indicated by the numerals used to describe this parameter. Only then, on the verso side of the sheet, follow anthropometric variables in a separate section entitled “Measurements.” The first six of these refer to overall stature, followed by six further measurements taken on head, face and nose. A third and final section is entitled “Indices,” and separated from the rest of the form by a horizontal line above which it states that the field observer is not to pay “attention … to lines below this rule.” As we will see further below, this section was reserved for Boas to process the data given by the measurements. The form ends with the prompt “This form when filled to be returned to Franz Boas, Worcester, Mass.” There are separate forms for males and females, since some of the kin designations used, as well as some of the qualitative traits, differ by gender (male forms ask for detailed information on “Beard,” for example).

Fig. 2
Two medical forms. Form 1 has the basic information, and form 2 has the measurements.

Recto and verso of the data collection forms used by Franz Boas in anthropometric surveys in 1891 and 1892. Franz Boas field notebooks and anthropometric data, American Philosophical Society, Box 2, Anthropometric Data Sheets Recorded at Stonewall and Tishomingo, Indian Territory (Oklahoma). The name has been blackened by the author for anonymization. With kind permission of the American Philosophical Society

What is particularly striking in the series of filled-out forms is that a lot of effort was spent on ascertaining the genealogical relationships between recorded individuals. Alongside relatively straightforward parental and sibling relationships, the categories of “Tribe of father” and “Tribe of mother” provide the most intriguing information in this respect. As one might expect, for each and every one of the individuals measured, “Chickasaw” is stated for the tribe he or she belongs to. Yet, answers to the questions relating to the “tribe” of their mother and father reveal very complex, mixed ancestral backgrounds. “Half-breed” is a frequently recurring designation. On the sheet reproduced in Fig. 2, for example, it is given as an answer for “Tribe of father” while the mother is stated as being “Chickasaw.” If one looks at the records of the children of the female in question (sheets no. 8, 9 and 10), which record “¾ Chickasaw ¼ white” for the tribe of mother, “half-breed” reveals itself as referring to individuals with one parent of Native American descent and one parent of European descent. Many sheets also record mixed Chickasaw and Choctaw, or “Choctaw half-breed,” ancestry (sheet no. 14). The Chickasaw had been a slave-owning tribe, and while in contrast to the Choctaw they did not adopt their freedmen after emancipation in 1863 (St. Jean 2011, ch. 3), one does find quite a number of sheets where the tribe of father or mother is stated as “Chickasaw and negro” (e.g. sheets no. 2–3 and 18–19). Such assessments of mixed ancestry could reach considerable complexity. On one sheet, the tribe of mother is recorded as “⅔ Chickasaw ⅓ white” (sheet 35). The only way to make sense of this proportion is to note that cousin marriage in the parental generation reduces the number of grandparents to six individuals, and to assume that ancestry may have been described with reference to the grandparents.
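The arithmetic behind this inference can be made explicit. The following minimal sketch assumes, for illustration only, that a recorded fraction is either the mean of the two parental fractions or a count of ancestors of a given descent; the reading of “⅔” as four out of six ancestors is an assumption made here, not something stated on the sheet.

```python
from fractions import Fraction

def child_fraction(father, mother):
    """Ancestry fraction of a child as the mean of the parents' fractions."""
    return (father + mother) / 2

def is_dyadic(fraction):
    """True if the denominator is a power of two (1, 2, 4, 8, ...)."""
    d = fraction.denominator
    return d & (d - 1) == 0

# Repeated halving starting from "full" (1) and "none" (0) yields only
# fractions with power-of-two denominators, such as 1/2 or 3/4.
half = child_fraction(Fraction(1), Fraction(0))
three_quarters = child_fraction(Fraction(1), half)

# Assumed reading of sheet 35: four of six ancestors counted as Chickasaw.
four_of_six = Fraction(4, 6)

print(three_quarters, is_dyadic(three_quarters))
print(four_of_six, is_dyadic(four_of_six))
```

On these assumptions, “¾” is reachable by halving while “⅔” is not, which is why a count over six ancestors is needed to account for the entry.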

Although relatively little genealogical information is asked for by the form itself, the answers thus allow one to carry out an almost complete analysis of kin relations within the cohort studied that reaches back to the grandparental, and in some cases, great-grandparental generation. Husband-wife relationships are not recorded, but can be inferred, though slightly tediously, by comparing number of children and ancestry. Conveniently, the data seems to have been recorded household by household, so that sheets for parents and their children often follow each other consecutively (e.g. sheets no. 7–10). Boas’s anthropometric field campaigns were probably modeled on the 1890 US Census, which also asked for genealogical (“relationship to head of family”) and racial information (“whether white, black, mulatto, quadroon, octeroon, Chinese, Japanese, or Indian”). He could thus assume that both his field observers and their informants were used to this kind of question, the logic of genealogical analysis it presupposed, and the procedure of filling in a questionnaire.³

The apparent discrepancy between assigning “Chickasaw” as the tribe of persons recorded, and the mixed ancestry of their parents, is easily explained. The Chickasaw were organized by exogamous matrilineal clans, and both tribal affiliation and belongings were passed on along maternal lines (Champagne 1992, 40–41). One can therefore safely assume that the persons recorded regarded not only themselves, but also their parents as Chickasaw, as long as their parents’ mothers in turn were Chickasaw. This raises the suspicion that the information on mixed ancestry might actually not have been provided by them, but by the observer. That this is not the case, however, is evident from occasional notes that the observer jotted down in the space left in the form between “Measurements” and the section “Indices,” which was reserved for Boas’s calculations. In these notes, Buchanan expressed doubts about the information filled in under “Tribe of father” and “Tribe of mother”. Sheet no. 26, for example, where tribe of father is given as “½ Chickasaw ½ Choctaw” and tribe of mother as “Chickasaw” carries the following statement on its verso side: “The gentleman says he has heard two parents say that they had some French blood in them. He shows white blood.” In another case, where tribe of both father and mother is stated as “Chickasaw,” Buchanan added the note “Negro blood appears in complexion … and in shape of face” (sheet no. 53). Tribal (or racial) affiliation seems occasionally to have been a matter of negotiation between observer and informant, but in the end, the latter seems to have had the last word when it came to filling in the answers on the top front of the sheet.

This reflects the role of mere “recorders” that Boas assigned to his field assistants. The sections of the forms dedicated to qualitative traits and anthropometric measurements leave no freedom to add personal observations. It is well known that Boas specially trained the 50 or so assistants that collected data for him in preparation for the World’s Columbian Exposition. He also modified the instruments used for measurements (Boas 1890), restricted himself to measurements where “the starting points are easily ascertained,” and had the assistants perform measurements on each other, or let two observers take measurements on the same set of persons; all this in order to “reduce the personal equation, as far as possible, to a minimum” (die persönliche Gleichung möglichst auf ein Minimum zu reduzieren). In addition, he restricted his survey to measurements that could be performed “without disrobement” (ohne Entkleidung), since this would “necessarily limit the number of measured individuals” (Boas 1895b, 367). Minimizing the personal equation and maximizing population number thus actually led to a very impoverished ontology. Observers were reduced to deleting descriptive categories prescribed on the form, and filling in a small number of mechanical measurements in generating data on the survey subjects.

Yet Boas was able, as we will see in the next section, to make a lot out of his data. Hints at how he proceeded in this can be found in the subset of data sheets on the Chickasaw. Only about a third of these show entries in Boas’s hand in the section headed “Indices.” Many of these entries calculate the cephalic index, i.e. the ratio of breadth of head to the length of head, and some of them the facial index in addition. What is striking about these entries is that they exclusively appear on data sheets on which the tribe of both father and mother is stated as “Chickasaw” and/or “Choctaw.” In processing data, Boas apparently proceeded by grouping the data sheets, in this case separating sheets on individuals of “pure” Native American descent from those on individuals with mixed racial backgrounds. And in making this decision, Boas exclusively trusted the genealogical information that informants provided. The data sheets mentioned above, on which Buchanan had expressed his doubts about possible admixture of “French” and “negro blood,” were included in the set of sheets that Boas processed in order to calculate the cephalic index. Even here, the observer’s personal expertise was erased in favor of “self-identified” tribe or race, as we would say today.
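The two steps described here, selecting sheets by stated parental tribe and then computing the cephalic index, can be sketched as follows. The sheet numbers and measurements are invented for illustration, the field names are hypothetical, and the exact-match selection rule is a simplification of the “Chickasaw and/or Choctaw” criterion (it would not match mixed entries such as “½ Chickasaw ½ Choctaw”). The index is expressed as a percentage, as is conventional.

```python
from dataclasses import dataclass

@dataclass
class Sheet:
    # Hypothetical field names; the values used below are invented.
    number: int
    tribe_of_father: str
    tribe_of_mother: str
    head_length_mm: float
    head_breadth_mm: float

def cephalic_index(sheet):
    """Breadth of head as a percentage of length of head."""
    return 100 * sheet.head_breadth_mm / sheet.head_length_mm

# Simplified stand-in for the grouping criterion: both parents' tribes
# are stated as exactly "Chickasaw" or "Choctaw".
UNMIXED = {"Chickasaw", "Choctaw"}

def selected_for_indices(sheet):
    return sheet.tribe_of_father in UNMIXED and sheet.tribe_of_mother in UNMIXED

sheets = [
    Sheet(1, "Chickasaw", "Chickasaw", 190.0, 152.0),
    Sheet(2, "half-breed", "Chickasaw", 185.0, 148.0),
    Sheet(3, "Choctaw", "Chickasaw", 200.0, 150.0),
]

indices = {
    s.number: round(cephalic_index(s), 1)
    for s in sheets
    if selected_for_indices(s)
}
print(indices)
```

Note that the selection uses only the stated parental tribes; any marginal note by the observer plays no part in it, mirroring the procedure inferred from the sheets.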

4 Use and Re-use of Data

Forensic anthropologist Richard L. Jantz, who has probably done more than anyone else to recover Boas’s data from the obscurity of historical archives, expresses great admiration for the “incredible computational feat” that Boas achieved by computing “the means of height and cranial index for some 4000 individuals distributed over 60 tribes, all with pencil and paper” (Jantz 2003, 279). In this, he is referring to the only article in which Boas summarized results from the anthropometric survey carried out in preparation for the World’s Columbian Exposition. It appeared in the German Journal for Ethnology (Zeitschrift für Ethnologie) in 1895, and made liberal use of tables and curve diagrams to synthesize the findings, some of which had probably already been presented in the Physical Anthropology Department of the Exposition.⁴

Boas admitted right away in this article that the qualitative data he had collected varied too much between observers to deliver comparable results. The article therefore focused exclusively on body stature and head form, but considered these two variables not only in populations of Native American adults, but in addition in children and in “mixed bloods between Indians and other races, especially whites” (Boas 1895b, 367). The first table spanned four pages, and showed the number of individuals measured, averages, and percentage distribution of stature in steps of 1 cm for 62 “tribes” in columns that were roughly arranged by geographic location on an East to West axis. This table was complemented by “curve plates” (Kurventafeln) that showed the distribution for each individual “tribe” (see Fig. 3). Boas then proceeded to break down his overall population, as well as populations constituting individual “tribes,” by age, gender and racial descent (“full-blooded” [Vollblut-] vs. “half-blooded Indians” [Halbblut-Indianer], ibid. p. 381) in ever more complex ways. The same procedure was then repeated for head form (cephalic index) and breadth of face. He found evidence that both parameters were influenced by environmental and hereditary factors (Ibid., 376). Their distribution at the West Coast in particular showed complex geographic patterns similar to those he observed in his linguistic and mythological studies, but notably without simply reproducing the latter (Ibid., p. 402). One feature that particularly fascinated Boas was that distribution curves of “half-blooded” individuals did not show simple blending of parental types, but usually two maxima indicating a “law of inheritance” according to which a “reversion (Rückkehr) generally occurs towards the parental form” (Ibid., 406). He pursued this topic by looking at the distribution of breadth of face. By “classifying mixed-bloods in such a way, that one group includes individuals which have more than half of Indian blood, and other individuals, which have half or less Indian blood” he even tried to demonstrate that the “Indian type” was characterized by a “stronger hereditary force” (grössere Vererbungskraft) with regard to this trait (see Fig. 4).
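The kind of table described above, giving the number of individuals, the average, and the percentage distribution of stature in steps of 1 cm, can be sketched in a few lines. The stature values are invented, and assigning each value to its whole-centimeter class by truncation is an assumption; the article’s own class boundaries are not reproduced here.

```python
from collections import Counter

def percent_distribution(statures_cm):
    """Percentage of individuals per 1 cm class (values truncated to whole cm)."""
    classes = Counter(int(s) for s in statures_cm)
    n = len(statures_cm)
    return {cm: 100 * count / n for cm, count in sorted(classes.items())}

# Invented statures for one hypothetical "tribe".
statures = [165.2, 165.9, 166.4, 170.0, 170.7, 171.3, 171.8, 172.5]

distribution = percent_distribution(statures)
average = sum(statures) / len(statures)

print(len(statures), round(average, 1))
for cm, pct in distribution.items():
    print(cm, pct)
```

Even this toy distribution has two local maxima, at 165 cm and at 170 to 171 cm, which is the kind of feature that would only become visible once the sheets are grouped and tabulated class by class.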

Fig. 3
Ten line-graphs. The graph plots height, in centimeters versus percentages. Each curve peaks at a point and dips as the graph progresses.

Curve diagrams showing distribution of body height in several North American “tribes”. The vertical axis gives percentage, the horizontal axis body height in centimeters. From Franz Boas, “Zur Anthropologie der nordamerikanischen Indianer.” Zeitschrift für Ethnologie 27 (1895b), p. 373

Fig. 4
A table. There are 22 entries in the table, with data spread over 4 columns. The column headers are: m m, Vollblut, 3 by 4 Blut, 3 by 8 Blut.

Table showing distribution of breadth of face for Ojibwa men of different racial ancestry (“full-blooded”, “3/4-blooded” and “3/8-blooded”). From Franz Boas, “Zur Anthropologie der nordamerikanischen Indianer.” Zeitschrift für Ethnologie 27 (1895b), p. 410

Jantz, like many others, has claimed that the 1895 article is the only one in which Boas presented and analyzed his data in detail (Jantz 2003, 279). This is not quite true. Rather, Boas seems to have re-used the data quite opportunistically in a number of publications to make particular points. As early as 1891, he published a very concise paper in the journal Science in which he argued for reversion, rather than blending, in human inheritance based on cephalic index data from “Oregonian Athapascans,” “Northern Californians” and their “crosses” (Boas 1891). In 1894, he published an article entitled “The Half-Blood Indian: An Anthropometric Study” that made many of the points, and contained many of the illustrations, of the 1895 article, but started off with a curve diagram showing “number of children of Indian Women and of Half-Blood Women” in order to disprove the common belief that “hybrid races show a decrease in fertility” (Boas 1894, 2). And what is perhaps Boas’s most important anthropometric paper, a critique of the significance of the cephalic index for indicating human types, was also based on data he had gathered in his field campaigns, now of course relating to “full-blooded” individuals, because the critique could otherwise easily have been fended off by maintaining that stability of type is generally compromised by mixture. In all of these cases, the relatively impoverished base of data was compensated for by the myriad ways in which it could be classified with respect to information collected on the measured individuals – place of birth, age, gender, and ancestry, in particular. While the data on physical traits covered only a few properties, they revealed that the populations under scrutiny were rich in structures that could be deployed again and again to answer different research questions relating to the role of environment and inheritance.

After 1900, Boas’s interest in the physical anthropology of Native Americans seems to have dwindled. The number of preserved observations drops abruptly to only two in 1901, and there are none for the years thereafter. The reasons for this may have been political: With the Jim Crow laws and legislation increasingly enforcing allotment of tribal land to individual tribe members who could prove their “purity of blood” (Curtis Act 1898), having one’s ancestry “questioned” was increasingly becoming a highly delicate matter (see St Jean 2011, 55, for the Chickasaw). The fraught relationship between Native Americans and their “scientific observers” that this new situation must have created continues to this day and was exacerbated by the highly publicized conflicts around large-scale human genetics projects such as the Human Genome Diversity Project and the Genographic Project in the early 1990s (Reardon and TallBear 2012). It is therefore not surprising that the data from Boas’s anthropometric survey have been eagerly taken up by anthropologists in past decades. While I have not found any direct mention of these conflicts in the sources I have worked with, it is revealing that one of them mentions in passing that “Boas’s data offer the only opportunity for systematic examination of anthropometric variation among North American Indians” (Jantz et al. 1992, 456; my emphasis).

A team of postgraduate students and researchers around Jantz was the first to convert Boas’s data into a “computerized database”, retaining data only for “individuals who could be considered full-blooded”, and replacing “obvious outlying values” with values “predicted from all others” following accepted statistical procedures not available to Boas (Jantz et al. 1992, 439, 442). Their analysis revealed that the data showed “strong geographic patterning” supporting “climate-morphology correlations”, with the exception of head shape, which showed “considerable intertribal variation” (ibid., 457). Lyle W. Konigsberg and Stephen D. Ousley – noting their gratefulness to Boas for “what, for the time, was an unusual inclusion of pedigree data” (1995, 481; cf. Jantz 1995, 351) and to Jantz for granting them access to this data in its electronic form – used a small subset of the data to test an important assumption in quantitative genetics about the proportionality between phenotypic and genetic co-variation. Their mathematically sophisticated paper, which draws on data for five “tribes” normalized for sex and age, provides an impressive example of the degree to which Boas’s data lent themselves to the application of complex genealogical matrices (ibid., 484–485). Yet another research agenda was pursued by economic historians who drew conclusions about the historical development of nutritional status and living standards among nineteenth-century First Nations by looking at the variation of body height across time and across tribes, again thanking Jantz for granting them access to the data (Steckel 2010, 267; Carlson and Komlos 2014, 158).

The end of studies on Native Americans in 1901 did not mean the end of Boas’s interest in physical anthropology. Instead, he changed his subject. Reducing the number of anthropometric variables even further, he carried out an anthropometric survey of some 16,000 immigrants from Eastern Europe and Italy and their children in order to determine whether the new environment they had entered resulted in a change of physical type (Boas 1912). With their obvious political significance – Boas’s conclusions became part of the idea of America as a “melting pot” – these studies, too, have invited reanalysis again and again, especially since Boas took the unusual step of publishing his raw data (Boas 1928). R. A. Fisher was among those who re-used these data to cast doubt on Boas’s conclusions. Part of the argument pertained to the quality of Boas’s data; Fisher and his collaborator Horace Gray, a medical doctor from Stanford University Hospital, suspected that they were compromised by wrongly reported paternity and inter-observer variability. This did not keep them, however, from subjecting the data to Fisher’s “method of analysis of variance” in order to make this very point, by demonstrating that variability and regressions within families did not meet expectations informed by “previous biometrical work” (Fisher and Gray 1937, 92). An earlier study by Geoffrey Mackay Morant, a student of Karl Pearson, and Otto Samson had followed a similar strategy, arguing that Boas’s results had been confounded by variation in age and sex while using his published raw data as evidence in favor of precisely this claim (Morant and Samson 1936).

Fisher and Gray’s doubts lead me back to the recent debate about whether Boas got it “right or wrong” with which I opened this chapter. The debate was sparked by another paper by Jantz, co-authored with a former graduate student of his department, Corey S. Sparks, in which the authors tested what they took to be the central conclusion of Boas’s immigrant study, namely that it demonstrated “the plastic nature of the human body in response to changes in the environment.” They did so by reassessing his data “within a modern statistical and quantitative genetic framework”, in particular “using pedigree information contained in Boas’ data [to estimate] narrow sense heritability”. The outcome was negative, with results indicating “very small and insignificant differences between European- and American-born offspring, and no effect of exposure to the American environment on the cranial index” (Sparks and Jantz 2002, 14, 636).

Unbeknownst to Sparks and Jantz, three other researchers had been carrying out a similar re-analysis of Boas’s published data, the results of which they published in the March 2003 issue of American Anthropologist. “Using methods unavailable to Boas,” just as Sparks and Jantz had done, medical anthropologist Clarence C. Gravlee and his co-authors were led to the opposite conclusion, namely that “modern analytical methods provide stronger support for Boas’s conclusion than did the tools at his disposal” (Gravlee et al. 2003a, 125). In the ensuing exchange between the two sets of authors, which was published in the June issue of American Anthropologist, some degree of reconciliation was reached by agreeing that Boas’s claim that human head form changed with immigration was generally confirmed by his data, but that doubts remained regarding the biological significance of these changes and the nature of the causes responsible for them (Gravlee et al. 2003b, 331; Sparks and Jantz 2003, 335). What is notable about this reconciliation is that it hinged not so much on the data used as on the questions being asked of them. Gravlee et al. had set out to test claims that Boas had expressly made, whereas Sparks and Jantz questioned a common assumption about these claims that, over the 90 years that had passed since Boas’s study, had become part and parcel of disciplinary lore and that they considered “a burr in our bed for 90 years” (Holden 2002).

5 Conclusion

If we consider the “journey” of Boas’s data as a unit of analysis, as suggested by Sabina Leonelli in the introduction to this volume, it is a journey in which the body of Boas’s data as a whole did not remain untouched. Quite to the contrary, that body of data was variously partitioned, cleansed of outliers, adjusted for confounding variables like age or sex, and processed by a bewildering range of statistical procedures to produce ever new numerical and visual representations.[Footnote 5] In part, as evidenced by the re-use of Boas’s immigrant data, these renewed analyses of historical data sets were motivated by the impact that his critique of the race concept had on the disciplinary self-understanding of American anthropologists. The continuing relevance of Boas’s data may hence be seen to be partly due to their direct bearing on a framework of concepts and theories that had been travelling alongside them through the twentieth century. But even those who disagreed with this framework, like Fisher or Jantz, made use of the data, and they were also used by researchers in other disciplines, like quantitative genetics or economic history. What gave Boas’s data this astounding surplus of usability across a variety of theoretical and disciplinary contexts?

My case study suggests two points in response to this question. The first concerns the importance of what I propose to call “pattern data” for making data relevant to a variety of contexts. These are data that do not describe single, manifest properties of the individual entities under scrutiny, but rather relationships among them, and hence occupy precisely the middle ground between data and metadata that I outlined at the outset of this chapter. Typically, they relate to categories that allow researchers to group the subjects under study, and hence the data produced about them, in a variety of ways that are believed to be of causal relevance for similarities exhibited among these subjects.[Footnote 6] Thus, Boas’s anthropometric surveys are renowned today not only for their sheer scale, but also for their careful design, which included the collection of basic geographical and genealogical information on observed individuals (Jantz 2003, 280).[Footnote 7] This information allowed Boas to classify the data that he had collected on a modest number of anthropometric variables in ways that enabled him to address an array of questions in his publications, and also explains why later researchers could turn to his data again and again to carry out new research. The poverty of data on physical traits collected by Boas, that is, was compensated for by the richness of pattern data that allowed for meaningful classification (on the significance of data classification, see also Leonelli 2012; Müller-Wille 2018).

However, this richness – and this is my second point – depended on information that was provided in situational contexts in which the measured individuals themselves took on an active role, rather than simply being the passive subjects of measurement procedures. The “pattern data” Boas used in his anthropometric survey, that is, were “given” in a literal sense; in contrast to the data on stature and head form, which were extracted from individuals in a more or less mechanical manner, information on age, sex, birth place, next-of-kin, as well as tribal and racial affiliation had to rely on interviews, and was hence partly informed by the common-sense notions of the persons observed. While these categories proved to be an extremely versatile tool for classifying the data in ever new ways, they also irretrievably tied the data to the historical context of their production. Tribal and racial affiliation, especially, are categories whose meaning, at any given point in time, has been molded by centuries of political struggle and whose application will continue to be of political relevance.

The tens of thousands of datasheets that Boas took care to preserve in his papers, and that are still accessible to researchers, thus do not only form a repository of data to be explored scientifically for what they tell us about the physical appearance and genetic constitution of historic populations. Every single sheet also gives us glimpses of the life story of an individual person, and the collection of datasheets as a whole therefore forms a historical archive in its own right that can also be used to reconstruct the power relations that informed the original surveys. It is therefore unlikely that any answer to the question whether Boas got it “right or wrong” will ever bring the journey of his data to an end. They will remain relevant as long as the historical circumstances under which they were produced, and the intervention that Boas and his collaborators made in these circumstances through their surveys, have historical bearing on the present situation.