Limits to the Pursuit of Reproducibility: Emergent Data-Scarce Domains of Science

  • Peter T. Darch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10766)


Recommendations and interventions to promote reproducibility in science have so far largely been formulated in the context of well-established domains characterized by data- and computationally-intensive methods. However, much promising research occurs in little data domains that are emergent and experience data scarcity. This paper presents a longitudinal study of such a domain, deep subseafloor biosphere research. Two important challenges this domain faces in establishing itself are increasing production and circulation of data, and strengthening relationships between domain researchers. Some potential interventions to promote reproducibility may also help the domain to establish itself. However, other potential interventions could profoundly damage the domain’s long-term prospects of maturation by impeding production of new data and undermining critical relationships between researchers. This paper challenges the dominant framing of the pursuit of reproducible science as identifying, and overcoming, barriers to reproducibility. Instead, those interested in pursuing reproducibility in a domain should take into account multiple aspects of that domain’s epistemic culture to avoid negative unintended consequences. Further, pursuing reproducibility is premature for emergent, data-scarce domains: scarce resources should instead be invested to help these domains to mature, for instance by addressing data scarcity.


Reproducibility · Data reuse · Little data · Open code · Open data



This work is funded by the Alfred P. Sloan Foundation (Awards #20113194, #201514001). Thank you to current members of UCLA Center for Knowledge Infrastructures (CKI) for comments on earlier drafts of this paper (Christine L. Borgman, Bernie Boscoe, Milena S. Golshan, Irene Pasquetto, and Michael J. Scroggins), to past members of CKI (Ashley E. Sands and Sharon Traweek) for discussion of ideas, and to Rebekah L. Cummings for assistance with data collection. Thank you also to the C-DEBI and IODP personnel who were observed and interviewed.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. School of Information Sciences, University of Illinois at Urbana-Champaign, Urbana, USA
