Abstract
We propose robust hyperlinks as a solution to the problem of broken hyperlinks. A robust hyperlink is a URL augmented with a small “signature” consisting of carefully chosen words taken from the referenced document. If the address-based portion of the URL fails, this content-based signature can be submitted as a query to web search engines to locate the document. It turns out that very small signatures are sufficient to readily locate individual documents out of the billion on the web, even if the document has been modified. Robust hyperlinks exhibit a number of desirable qualities: they can be computed and exploited automatically, are small and cheap to compute (so that it is practical to make all hyperlinks robust), do not require new server or infrastructure support, can be rolled out reasonably well in the existing URL syntax (so existing links can be retrofitted to make them robust), and are easy to understand. One can start using robust hyperlinks now, as servers and web pages are mostly compatible as is, while clients can increase their support in the future. Robust hyperlinks are one example of using the web to bootstrap new features onto itself.
Lexical Signature: cnrp macskassy hypercafe shklar multivalent belongie blobworld bregler cityquilt cyberbelt
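The mechanism described in the abstract (try the address first, then fall back to a content-based query) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the query parameter name `lexical-signature`, the function names, and the `fetch` and `search` callables are all assumptions introduced here, and the abstract does not specify how the signature words are chosen, so the sketch takes them as given.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed parameter name for carrying the signature in existing URL syntax.
SIGNATURE_PARAM = "lexical-signature"


def make_robust_url(url, words):
    """Append the signature words to a URL as one extra query parameter."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((SIGNATURE_PARAM, " ".join(words)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def split_robust_url(robust_url):
    """Return (plain_url, words): the address part and the signature part."""
    parts = urlsplit(robust_url)
    words, kept = [], []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == SIGNATURE_PARAM:
            words = value.split()
        else:
            kept.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(kept))), words


def resolve(robust_url, fetch, search):
    """Try the address first; if it fails, query a search engine.

    fetch(url) is a hypothetical callable returning the document or None.
    search(query) is a hypothetical callable returning candidate URLs.
    """
    plain_url, words = split_robust_url(robust_url)
    document = fetch(plain_url)
    if document is not None:
        return plain_url, document
    # Address-based lookup failed: fall back to the content-based signature.
    for candidate in search(" ".join(words)):
        document = fetch(candidate)
        if document is not None:
            return candidate, document
    return None, None
```

Passing `fetch` and `search` in as parameters keeps the sketch independent of any particular HTTP client or search engine. A client that does not know about the signature would simply send the extra query parameter along with the request, which is one way to read the abstract's remark that servers and web pages are mostly compatible as is.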
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Phelps, T.A., Wilensky, R. (2004). Robust Hyperlinks: Cheap, Everywhere, Now. In: King, P., Munson, E.V. (eds) Digital Documents: Systems and Principles. PODDP 2000. Lecture Notes in Computer Science, vol 2023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39916-2_3
DOI: https://doi.org/10.1007/978-3-540-39916-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21070-2
Online ISBN: 978-3-540-39916-2
eBook Packages: Springer Book Archive