Reconstructing Historical Populations from Genealogical Data Files

Population Reconstruction

Abstract

Over the past two decades, a huge number of historical documents have been digitised and made available online. At the same time, numerous software packages and websites have encouraged people to research their family trees, leading to a surge in the availability of genealogical data. From a scientific research perspective, a major advantage of genealogical data is that it combines information from many sources into a format structured by family relations and descendancy, which is very useful for studying the dynamics of population change over the generations. A critical issue for researchers who want to use genealogical data is how to assess its quality and put in place measures to correct the errors found in it. In this chapter, I present some of the methods that are being used to filter, clean and aggregate genealogical data to create large datasets that may be used across a diverse range of academic research disciplines.

Notes

  1. According to Louis Kessler, an expert on the GEDCOM format, speaking at Gaenovium 2014, a genealogy technology conference held on 7 October 2014 in Leiden, The Netherlands.

  2. Dutch NWO funded project conducted at University of Utrecht: Nature or nurture? A search for the institutional and biological determinants of life expectancy in Europe during the early modern period (276-53-008).

Corresponding author

Correspondence to Corry Gellatly.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Gellatly, C. (2015). Reconstructing Historical Populations from Genealogical Data Files. In: Bloothooft, G., Christen, P., Mandemakers, K., Schraagen, M. (eds) Population Reconstruction. Springer, Cham. https://doi.org/10.1007/978-3-319-19884-2_6
