Abstract
Traditional link-based web ranking algorithms are applied to web snapshots in the form of webgraphs, in which pages are vertices and links are edges. When constructing a webgraph, researchers rarely pay attention to exactly how links are taken into account, although certain details may significantly affect the contribution of link-based factors to ranking. Furthermore, researchers use small subgraphs of the webgraph for more efficient evaluation of new algorithms. They usually consider a graph induced by a set of pages, for example, the pages of a certain first-level domain. In this paper we reveal a significant dependence of PageRank on the method of accounting for redirects while constructing the webgraph. We evaluate several natural ways of redirect accounting on a large-scale domain and find an optimal case, which turns out to be non-trivial. Moreover, we experimentally compare different ways of extracting a small subgraph for multiple evaluations and reveal some essential shortcomings of traditional approaches.
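The abstract does not spell out which redirect accounting variants the paper evaluates. As a rough illustration of why the choice can matter, the sketch below compares two hypothetical options on a toy graph: treating each redirect as an ordinary link, and collapsing each redirecting page into its final target. The page names, the `collapse_redirects` helper, and the damping value of 0.85 are assumptions made for this example, not details taken from the paper.

```python
def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank; dangling mass is spread uniformly."""
    nodes = sorted(nodes)
    out = {u: [] for u in nodes}
    for u, v in edges:
        out[u].append(v)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is shared among all pages.
        dangling = sum(rank[u] for u in nodes if not out[u])
        nxt = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            if out[u]:
                share = damping * rank[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
        rank = nxt
    return rank


def collapse_redirects(nodes, edges, redirects):
    """Hypothetical variant: merge every redirecting page into its target."""
    def resolve(u):
        seen = set()
        # Follow redirect chains; stop if a loop is detected.
        while u in redirects and u not in seen:
            seen.add(u)
            u = redirects[u]
        return u

    new_nodes = {resolve(u) for u in nodes}
    # Links pointing at a redirecting page are rewired to its target;
    # links originating from a redirecting page are dropped.
    new_edges = [(resolve(u), resolve(v)) for u, v in edges if u not in redirects]
    return new_nodes, new_edges


# Toy snapshot (all names are illustrative): "old" redirects to "new".
pages = {"home", "blog", "old", "new"}
links = [("home", "old"), ("blog", "old"), ("home", "blog"), ("new", "home")]
redirects = {"old": "new"}

# Variant A: each redirect is counted as an ordinary link.
as_links = pagerank(pages, links + list(redirects.items()))

# Variant B: redirecting pages are collapsed into their targets.
merged_pages, merged_links = collapse_redirects(pages, links, redirects)
collapsed = pagerank(merged_pages, merged_links)

for name, scores in (("redirect as link", as_links), ("collapsed", collapsed)):
    print(name, {page: round(score, 3) for page, score in scores.items()})
```

In variant A the redirecting page keeps a score of its own and passes only a damped share on to its target, while in variant B it disappears from the graph entirely, so the two constructions generally yield different score distributions for the same crawl.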
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhukovskii, M., Gusev, G., Serdyukov, P. (2013). URL Redirection Accounting for Improving Link-Based Ranking Methods. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_55
DOI: https://doi.org/10.1007/978-3-642-36973-5_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36972-8
Online ISBN: 978-3-642-36973-5
eBook Packages: Computer Science, Computer Science (R0)