Implementing a Reliable Digital Object Archive

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1923)

Abstract

An Archival Repository reliably stores digital objects for long periods of time (decades or centuries). The archival nature of the system requires new techniques for storing, indexing, and replicating digital objects. In this paper we discuss the specialized indexing needs of a write-once archive. We also present a reliability algorithm for effectively replicating sets of related objects. We describe a data import utility for archival repositories. Finally, we discuss and evaluate a prototype repository we have built, the Stanford Archival Vault (SAV).

This material is based upon work supported by the National Science Foundation under Award 9811992.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cooper, B., Crespo, A., Garcia-Molina, H. (2000). Implementing a Reliable Digital Object Archive. In: Borbinha, J., Baker, T. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2000. Lecture Notes in Computer Science, vol 1923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45268-0_13

  • DOI: https://doi.org/10.1007/3-540-45268-0_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41023-2

  • Online ISBN: 978-3-540-45268-3

  • eBook Packages: Springer Book Archive
