Link Analysis

  • Krishna Raj P. M.
  • Ankith Mohan
  • K. G. Srinivasa
Chapter
Part of the Computer Communications and Networks book series (CCN)

Abstract

This chapter is mainly about search engines. A search engine is an application that takes a search query as input and returns a list of relevant Webpages. We start with the general architecture of a search engine and examine each of its components. We then concentrate on search engine algorithms, covering in detail prominent ones such as PageRank, HITS and random walk, along with other algorithms. Finally, we look at the architecture of the Google search engine as it was in 1998.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Krishna Raj P. M. (1)
  • Ankith Mohan (1)
  • K. G. Srinivasa (2)

  1. Department of ISE, Ramaiah Institute of Technology, Bangalore, India
  2. Department of Information Technology, C.B.P. Government Engineering College, Jaffarpur, India
