Accessing the History of the Web: A Web Way-Back Machine

  • Joachim Feise
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1903)

Abstract

One of the deficiencies of the World Wide Web is that the Web does not have a memory. Web resources always display one revision only, namely the latest one. In addition, once a Web resource moves from one location (i.e., URL) to another, the resource at the original location ceases to exist.

Since the Web does not provide a mechanism to allow access to the revision history of resources, services like the Internet Archive have sprung into action to collect the history of important Web resources. The Web way-back machine described here, currently under development by the author, utilizes collections of historical Web resources like the ones provided by the Internet Archive to allow online, read-only access to the revisioned resources.
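The kind of read-only access to revisioned resources described above can be illustrated with a minimal sketch. It is not the author's implementation: the in-memory `archive` mapping, the `lookup_revision` helper, and the example URL are all hypothetical, and a real collection such as the Internet Archive's would be stored and indexed quite differently. The sketch only shows the core idea of answering "what did this URL look like on a given date" by choosing the latest stored revision captured at or before that date.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory archive: URL -> list of (capture time, content),
# with each list sorted by capture time.
Archive = Dict[str, List[Tuple[datetime, str]]]


def lookup_revision(archive: Archive, url: str, as_of: datetime) -> Optional[str]:
    """Return the latest archived content for `url` captured at or before `as_of`.

    Returns None if the URL is unknown or has no revision that old.
    """
    revisions = archive.get(url, [])
    capture_times = [captured for captured, _ in revisions]
    # bisect_right gives the number of revisions captured at or before `as_of`.
    index = bisect_right(capture_times, as_of)
    if index == 0:
        return None
    return revisions[index - 1][1]


# Illustrative data only; these revisions are made up.
archive: Archive = {
    "http://example.org/": [
        (datetime(1997, 1, 1), "first revision"),
        (datetime(1999, 6, 1), "second revision"),
    ]
}

print(lookup_revision(archive, "http://example.org/", datetime(1998, 3, 15)))
# prints "first revision"
```

A lookup of this kind never modifies the collection, which matches the read-only access described in the abstract.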

Keywords

Proxy Server, Online Access, Proxy Cache, Collection Archive, Internet Archive
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Joachim Feise (1)
  1. Irvine Information and Computer Science, University of California, Irvine, USA
