Abstract
Ranking is perhaps the most important feature of a search engine, as it orders the huge number of pages matching a query according to their relevance to the user’s information need. Unlike traditional textual search engines, Web information retrieval systems build a ranking by combining at least two sources of relevance evidence: the degree to which a page matches the query (the content score) and the degree of importance of the page (the popularity score). While the content score can be calculated using one of the information retrieval models described in Chap. 3, the popularity score can be calculated by analyzing the hyperlink structure of the indexed pages with one or more link analysis models. In this chapter we introduce the two best-known link analysis models, PageRank and HITS.
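To make the idea of a link-based popularity score concrete, the following is a minimal sketch of PageRank computed by power iteration on a toy graph. It is an illustration under stated assumptions, not the chapter's own formulation: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from pages without out-links are all choices made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping=0.85 and iterations=50 are assumed defaults for this sketch.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Rank held by pages with no out-links is spread evenly (assumption).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page passes an equal share of its rank along every out-link.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" receives links from three other pages.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
```

In this toy graph the page with the most in-links, "c", ends up with the highest score, and the scores sum to one, so they can be read as a probability distribution over pages.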
Notes
- 1. The diameter of the graph is logarithmically proportional to the number of its nodes.
- 2. More precisely, a power law distribution with a≈2.1.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Ceri, S., Bozzon, A., Brambilla, M., Della Valle, E., Fraternali, P., Quarteroni, S. (2013). Link Analysis. In: Web Information Retrieval. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39314-3_7
DOI: https://doi.org/10.1007/978-3-642-39314-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39313-6
Online ISBN: 978-3-642-39314-3
eBook Packages: Computer Science (R0)