A Study on Different Types of Web Crawlers

  • P. G. Chaitra (email author)
  • V. Deepthi
  • K. P. Vidyashree
  • S. Rajini
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 989)

Abstract

The World Wide Web is a global information medium through which people around the world explore information. A search engine is where internet users look for the content they need, and its results are returned as web pages, images, or videos. Web crawlers emerged to support this: they browse the web to gather and download pages relevant to user topics and store them in a large repository, which makes the search engine more efficient. These web crawlers are becoming more important every day. This paper presents the various types of web crawlers and their architectures, and compares them.
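The gather, download, and store loop described in the abstract can be illustrated with a minimal sketch. This is not the architecture of any crawler surveyed in the paper; it is a generic breadth-first crawler written under stated assumptions. The names `LinkExtractor`, `crawl`, `fetch`, and `max_pages` are hypothetical, and the download step is passed in as a function so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl starting from seed_urls (illustrative only).

    fetch(url) is a caller-supplied function (an assumption of this sketch)
    that returns the page HTML as a string, or None if the download fails.
    Returns the repository as a dict mapping URL -> HTML.
    """
    frontier = deque(seed_urls)   # URLs waiting to be downloaded
    seen = set(seed_urls)         # URLs already queued, to avoid repeats
    repository = {}               # downloaded pages, keyed by URL

    while frontier and len(repository) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        repository[url] = html

        # Extract outgoing links and queue the ones not seen before.
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)

    return repository
```

A real crawler would also need politeness delays, robots.txt handling, and error handling, all of which this sketch omits. For a quick trial, `fetch` can be the `get` method of a dictionary that maps URLs to HTML strings.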

Keywords

Web crawler; Focused crawler; Incremental crawler; Distributed crawler; Parallel crawler; Hidden web crawler

Notes

Acknowledgements

The authors express their gratitude for the assistance provided by Accendere Knowledge Management Services Pvt. Ltd. in preparing the manuscript. We also thank our mentors and faculty members, who guided us throughout the research and helped us achieve the desired results.

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • P. G. Chaitra (1), email author
  • V. Deepthi (1)
  • K. P. Vidyashree (1)
  • S. Rajini (1)
  1. Department of Information Science and Engineering, Vidyavardhaka College of Engineering, Mysuru, India
