Literal-Matching-Biased Link Analysis

Xu, Yinghui; Umemura, Kyoji

doi:10.1007/978-3-540-31871-2_14

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3411)

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

The PageRank algorithm, used in the Google search engine, improves the quality of search results by exploiting the explicit hyperlink structure among Web pages. The prestige that PageRank assigns to a Web page is derived solely from a surfer's random walk on the Web graph, without any consideration of textual content. In practice, however, users do not surf by jumping at random. In this paper, we propose a link analysis that takes the textual information of Web pages into account. The results show that our proposed ranking algorithms perform better than the original PageRank.
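
To make the contrast in the abstract concrete, the following is a minimal sketch of power-iteration PageRank in which the surfer's choice among out-links can optionally be biased by a per-link score. The abstract does not give the authors' formula, so everything specific here is an assumption for illustration: the function name `pagerank`, the `weights` dictionary (a hypothetical stand-in for some text-matching score between a link and its target page), the damping factor of 0.85, and the toy three-page graph.

```python
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, List[str]]            # page -> pages it links to
Weights = Dict[Tuple[str, str], float]  # (source, target) -> hypothetical text-match score


def pagerank(graph: Graph,
             weights: Optional[Weights] = None,
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank.

    With weights=None the surfer picks an out-link uniformly at random
    (the classic model). With weights, each out-link is followed with
    probability proportional to its score (an illustrative bias, not
    the paper's actual formula).
    """
    nodes = sorted(set(graph) | {v for targets in graph.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(iterations):
        # Random-jump component, shared equally by all pages.
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = graph.get(u, [])
            if not targets:
                # Dangling page: spread its rank uniformly over all pages.
                for v in nodes:
                    new_rank[v] += damping * rank[u] / n
                continue
            if weights is None:
                scores = [1.0] * len(targets)
            else:
                scores = [weights.get((u, v), 0.0) for v in targets]
            total = sum(scores)
            if total == 0.0:
                # No usable scores for this page: fall back to uniform choice.
                scores = [1.0] * len(targets)
                total = float(len(targets))
            for v, s in zip(targets, scores):
                new_rank[v] += damping * rank[u] * s / total
        rank = new_rank
    return rank


# Toy graph (hypothetical): page "a" links to "b" and "c".
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}

# Assumed scores: the link a -> b matches its target's text far better than a -> c.
toy_weights = {("a", "b"): 0.9, ("a", "c"): 0.1, ("b", "c"): 1.0, ("c", "a"): 1.0}

uniform = pagerank(toy_graph)
biased = pagerank(toy_graph, toy_weights)
```

In this toy setting, shifting probability mass toward the link `a -> b` raises the rank of `b` relative to the uniform random walk, which is the general kind of effect a content-aware link analysis aims for. How the scores are actually computed from page text is left open here.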

Author information

Authors and Affiliations

Information and Computer Science Department, Software System Lab., Toyohashi University of Technology, 1-1, Hibarigaoka, Tempaku, Toyohashi, 441-8580, Aichi, Japan
Yinghui Xu & Kyoji Umemura

Editor information

Editors and Affiliations

School of Engineering, Information and Communications University, 119, Munjiro, Yuseong-gu, 305-732, Daejeon, Korea
Sung Hyon Myaeng
The Key Laboratory of Power System Protection and Dynamic Security Monitoring and Control under Ministry of Education, North China Electric Power University, Zhuxinzhuang Dewai, 102206, Beijing, China
Ming Zhou
Department of Systems Engineering and Engineering Management, Shatin, The Chinese University of Hong Kong, Hong Kong, N.T.
Kam-Fai Wong
5F, Beijing Sigma Center, Microsoft Research Asia, No. 49 Zhichun Road Haidian District, 100080, Beijing, China
Hong-Jiang Zhang

Cite this paper

Xu, Y., Umemura, K. (2005). Literal-Matching-Biased Link Analysis. In: Myaeng, S.H., Zhou, M., Wong, KF., Zhang, HJ. (eds) Information Retrieval Technology. AIRS 2004. Lecture Notes in Computer Science, vol 3411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31871-2_14

DOI: https://doi.org/10.1007/978-3-540-31871-2_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25065-4
Online ISBN: 978-3-540-31871-2
eBook Packages: Computer Science, Computer Science (R0)
