Accessing the History of the Web: A Web Way-Back Machine
One deficiency of the World Wide Web is that it has no memory. A Web resource displays only one revision, namely the latest one. In addition, once a Web resource moves from one location (i.e., URL) to another, the resource ceases to exist at the original location.
Because the Web provides no mechanism for accessing the revision history of resources, services such as the Internet Archive have emerged to collect the history of important Web resources. The Web way-back machine described here, currently under development by the author, uses collections of historical Web resources, such as those provided by the Internet Archive, to allow online, read-only access to revisioned resources.
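To make the idea of read-only access to revisioned resources concrete, the following is a minimal sketch of a date-based lookup. It is not the author's design: the in-memory `archive` mapping, the `revision_at` function, and the example URL and contents are all hypothetical, and a real collection would be stored and indexed quite differently.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory archive: URL -> snapshots sorted by capture time.
Snapshot = Tuple[datetime, str]
Archive = Dict[str, List[Snapshot]]


def revision_at(archive: Archive, url: str, when: datetime) -> Optional[str]:
    """Return the content of the latest snapshot captured at or before `when`."""
    snapshots = archive.get(url, [])
    times = [captured for captured, _ in snapshots]
    # Index of the first snapshot captured strictly after `when`.
    index = bisect_right(times, when)
    if index == 0:
        return None  # unknown URL, or no snapshot that early
    return snapshots[index - 1][1]


# Illustrative data only (assumed, not taken from any real collection).
archive: Archive = {
    "http://example.org/": [
        (datetime(1998, 5, 1), "<html>first revision</html>"),
        (datetime(2000, 1, 15), "<html>second revision</html>"),
    ]
}
```

Under these assumptions, a request for `http://example.org/` as of early 1999 would be answered with the first revision, while a request dated before any snapshot yields no result. The lookup never modifies the collection, which matches the read-only access described above.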
Keywords: Proxy Server, Online Access, Proxy Cache, Collection Archive, Internet Archive