Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Provenance and Reproducibility

  • Fernando Chirigati
  • Juliana Freire
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_80747


Computational reproducibility


A computational experiment composed of a sequence of steps S, created at time T on environment E (hardware and operating system) using data D, is reproducible if it can be executed with consistent results using a sequence of steps S′ (modified from or equal to S) at time T′ > T, on environment E′ (potentially different from E), with data D′ that is similar to (or the same as) D [5]. Replication is a special case of reproducibility in which S′ = S and D′ = D. Although there is substantial disagreement on how to define reproducibility [1], particularly across domains, this entry focuses on computational reproducibility, i.e., reproducibility for computational experiments or processes.
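The definition can be read as a pair of predicates over two runs of an experiment. The following Python sketch is illustrative only: the `Run` record, the function names, and the tolerance-based notion of "consistent results" are assumptions introduced here, since the definition itself does not fix a consistency criterion or a measure of data similarity.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Run:
    """One execution of an experiment, following the S, T, E, D notation."""
    steps: Tuple[str, ...]   # S: ordered sequence of steps
    time: float              # T: when the run happened
    environment: str         # E: hardware and operating system label
    data: Tuple[float, ...]  # D: input data
    result: float            # outcome of the run


def results_consistent(a: float, b: float, rel_tol: float = 1e-6) -> bool:
    # Assumption: "consistent" is taken to mean numerically close.
    return math.isclose(a, b, rel_tol=rel_tol)


def is_reproduction(
    original: Run,
    rerun: Run,
    consistent: Callable[[float, float], bool] = results_consistent,
) -> bool:
    # Requires T' > T and consistent results; S', E', and D' may differ.
    # Similarity of D' to D is domain-specific and is not checked here.
    return rerun.time > original.time and consistent(original.result, rerun.result)


def is_replication(
    original: Run,
    rerun: Run,
    consistent: Callable[[float, float], bool] = results_consistent,
) -> bool:
    # Special case of reproduction in which S' = S and D' = D.
    return (
        is_reproduction(original, rerun, consistent)
        and rerun.steps == original.steps
        and rerun.data == original.data
    )


original = Run(("load", "mean"), 1.0, "linux-x86_64", (1.0, 2.0, 3.0), 2.0)
same_inputs = Run(("load", "mean"), 2.0, "macos-arm64", (1.0, 2.0, 3.0), 2.0)
new_data = Run(("load", "mean"), 3.0, "linux-x86_64", (1.5, 2.0, 2.5), 2.0)
```

Here `same_inputs` counts as a replication of `original` even though the environment changed, while `new_data` is a reproduction but not a replication because D′ differs from D.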

The information needed to reproduce an experiment can be obtained from its provenance: the details of how the experiment was carried out and the results it derived. For computational experiments, provenance can be systematically and transparently...
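As a rough illustration of what such provenance might contain, the sketch below records the steps, the environment, a fingerprint of the input data, and the result of a toy computation. The `capture_provenance` function and its field names are hypothetical; the entry does not prescribe a provenance format, and real systems capture far more detail.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone


def capture_provenance(steps, data: bytes, result) -> dict:
    """Build a minimal provenance record for one run (hypothetical layout)."""
    return {
        "steps": list(steps),                                  # S
        "created_at": datetime.now(timezone.utc).isoformat(),  # T
        "environment": {                                       # E
            "python": sys.version.split()[0],
            "os": platform.system(),
            "machine": platform.machine(),
        },
        "data_sha256": hashlib.sha256(data).hexdigest(),       # fingerprint of D
        "result": result,
    }


# Toy experiment: parse a line of comma-separated integers and sum them.
data = b"1,2,3\n"
values = [int(v) for v in data.decode().strip().split(",")]
record = capture_provenance(["parse csv", "sum values"], data, sum(values))
print(json.dumps(record, indent=2))
```

Storing a hash rather than the data itself keeps the record small while still allowing a later run to check whether D′ is byte-identical to D.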


Recommended Reading

  1. Baker M. Muddled meanings hamper efforts to fix reproducibility crisis. Nature News & Comment. 14 Jun 2016.
  2. Bonnet P, Manegold S, Bjørling M, Cao W, Gonzalez J, Granados J, Hall N, Idreos S, Ivanova M, Johnson R, Koop D, Kraska T, Müller R, Olteanu D, Papotti P, Reilly C, Tsirogiannis D, Yu C, Freire J, Shasha D. Repeatability and workability evaluation of SIGMOD’2011. ACM SIGMOD Rec. 2011;40(2):45–8.
  3. Claerbout J, Karrenbach M. Electronic documents give reproducible research a new meaning. In: Proceedings of the 62nd Annual International Meeting of the Society of Exploration Geophysicists; 1992. p. 601–4.
  4. Collberg C, Proebsting T, Warren AM. Repeatability and benefaction in computer systems research. Technical report. TR 14-04, University of Arizona; 2015.
  5. Freire J, Bonnet P, Shasha D. Computational reproducibility: state-of-the-art, challenges, and database research opportunities. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data; 2012. p. 593–6.
  6. Knuth DE. Literate programming. Comput J. 1984;27(2):97–111.
  7. Kovacevic J. How to encourage and publish reproducible research. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing; 2007. p. IV-1273–6.
  8. LeVeque R. Python tools for reproducible research on hyperbolic problems. Comput Sci Eng. 2009;11(1):19–27.
  9. Nuzzo R. How scientists fool themselves, and how they can stop. Nature. 2015;526(7572):182–5.
  10. Piwowar HA, Day RS, Fridsma DB. Sharing detailed research data is associated with increased citation rate. PLoS One. 2007;2(3):e308.
  11. Vandewalle P, Kovacevic J, Vetterli M. Reproducible research in signal processing – what, why, and how. IEEE Signal Process Mag. 2009;26(3):37–47.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. NYU Tandon School of Engineering, Brooklyn, USA
  2. NYU Center for Data Science, New York, USA
  3. New York University, New York, USA