Literal-Matching-Biased Link Analysis

  • Conference paper
Information Retrieval Technology (AIRS 2004)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3411)

Abstract

The PageRank algorithm, used in the Google search engine, plays an important role in improving the quality of results by exploiting the explicit hyperlink structure among Web pages. The prestige of a Web page as defined by PageRank is derived solely from a surfer's random walk on the Web graph, without any consideration of textual content. In practice, however, user surfing behavior is far from random jumping. In this paper, we propose a link analysis that takes the textual information of Web pages into account. The results show that our proposed ranking algorithms perform better than the original PageRank.
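
The abstract does not say how textual information enters the random walk, so the following is only an illustrative sketch and not the authors' algorithm. It implements standard power-iteration PageRank and adds one assumption: a hypothetical `match` map that assigns each link a non-negative weight (for example, a text-matching score between the link and its target page), so that the surfer follows links in proportion to those weights rather than uniformly. The names `biased_pagerank`, `links`, `match`, and `damping` are invented for this example.

```python
def biased_pagerank(links, match=None, damping=0.85, iters=100):
    """Power-iteration PageRank over links: {page: [pages it links to]}.

    match is a hypothetical {(src, dst): weight} map (an assumption of this
    sketch). Links missing from it keep weight 1.0; with match=None the
    function reduces to the usual uniform random-surfer PageRank.
    """
    pages = sorted(set(links) | {d for outs in links.values() for d in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Every page receives the random-jump share first.
        new = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            outs = links.get(src, [])
            if match is None:
                weights = [1.0] * len(outs)
            else:
                weights = [match.get((src, d), 1.0) for d in outs]
            total = sum(weights)
            if total == 0.0:
                # No usable outlink: spread this page's rank over all pages.
                for p in pages:
                    new[p] += damping * rank[src] / n
            else:
                # Follow each outlink in proportion to its weight.
                for d, w in zip(outs, weights):
                    new[d] += damping * rank[src] * w / total
        rank = new
    return rank


links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
uniform = biased_pagerank(links)
biased = biased_pagerank(links, match={("a", "b"): 3.0, ("a", "c"): 1.0})
print(uniform)
print(biased)
```

In this toy graph, raising the weight of the link from `a` to `b` shifts rank toward `b` compared with the uniform walk, which is the kind of effect a content-aware transition model is meant to produce.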

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, Y., Umemura, K. (2005). Literal-Matching-Biased Link Analysis. In: Myaeng, S.H., Zhou, M., Wong, KF., Zhang, HJ. (eds) Information Retrieval Technology. AIRS 2004. Lecture Notes in Computer Science, vol 3411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31871-2_14

  • DOI: https://doi.org/10.1007/978-3-540-31871-2_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25065-4

  • Online ISBN: 978-3-540-31871-2

  • eBook Packages: Computer Science, Computer Science (R0)
