Design and Implementation of an Efficient Web Crawling Using Neural Network

  • Conference paper
Advances in Computer Science and Ubiquitous Computing (CUTE 2018, CSA 2018)

Abstract

The number of internet users, and their usage of the internet, is growing day by day. Research on neural-network retrieval models for information retrieval and document classification is currently very active. Various algorithms have been applied to identify and quantify word weights in documents. As information technology advances, it becomes necessary to understand the exact meaning of documents by analyzing their words with more advanced methods. In this paper, Word2vec is applied to specific keywords to identify word frequencies, semantic relationships, and directional text-ranks in a naturally integrated way. The neural network thus serves as an advanced mechanism for verifying the semantic relationships between words and texts in a particular document. Our approach uses Word2vec to capture the semantic features between words in the selected text while naturally integrating word frequency and semantic relations.
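
The abstract does not spell out how word frequency, semantic relations, and text-rank are combined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It runs a weighted, undirected TextRank over a co-occurrence graph in which each edge weight is the cosine similarity of the two word vectors, and uses relative word frequency as the restart distribution. The function name `rank_keywords`, the window size, the damping factor, and the tiny hand-made vectors (standing in for trained Word2vec embeddings) are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_keywords(tokens, vectors, window=2, damping=0.85, iterations=50):
    """Hypothetical helper: frequency-biased TextRank with similarity-weighted edges."""
    # Keep only tokens that have an embedding (assumption: others are stop words).
    words = [t for t in tokens if t in vectors]
    freq = Counter(words)
    vocab = sorted(freq)

    # Symmetric edge weights: sum of non-negative cosine similarities over
    # every co-occurrence of two distinct words within the window.
    weights = {w: {} for w in vocab}
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + 1 + window, len(words))):
            x = words[j]
            if x == w:
                continue
            sim = max(cosine(vectors[w], vectors[x]), 0.0)
            weights[w][x] = weights[w].get(x, 0.0) + sim
            weights[x][w] = weights[x].get(w, 0.0) + sim

    # Relative word frequency acts as the restart (prior) distribution.
    total = sum(freq.values())
    prior = {w: freq[w] / total for w in vocab}
    out_sum = {w: sum(weights[w].values()) for w in vocab}

    score = dict(prior)
    for _ in range(iterations):
        new_score = {}
        for w in vocab:
            incoming = sum(
                score[x] * weights[x][w] / out_sum[x]
                for x in weights[w]
                if out_sum[x] > 0
            )
            new_score[w] = (1 - damping) * prior[w] + damping * incoming
        score = new_score
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


# Toy input: these 2-d vectors are invented for illustration only.
tokens = "web crawler fetches web pages and the crawler ranks pages".split()
vectors = {
    "web": [0.9, 0.1],
    "crawler": [0.8, 0.3],
    "pages": [0.7, 0.2],
    "fetches": [0.2, 0.9],
    "ranks": [0.1, 0.8],
}
ranked = rank_keywords(tokens, vectors)
for word, value in ranked:
    print(f"{word}\t{value:.4f}")
```

In practice the toy `vectors` dictionary would be replaced by embeddings from a trained Word2vec model, and a directed graph could be used if word order matters, as the phrase "directional text-ranks" suggests.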

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF2017R1D1A1B03030033).

Author information

Corresponding author

Correspondence to Mokdong Chung.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Tanvir, A.M., Kim, Y., Chung, M. (2020). Design and Implementation of an Efficient Web Crawling Using Neural Network. In: Park, J., Park, DS., Jeong, YS., Pan, Y. (eds) Advances in Computer Science and Ubiquitous Computing. CUTE CSA 2018 2018. Lecture Notes in Electrical Engineering, vol 536. Springer, Singapore. https://doi.org/10.1007/978-981-13-9341-9_20

  • DOI: https://doi.org/10.1007/978-981-13-9341-9_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-9340-2

  • Online ISBN: 978-981-13-9341-9

  • eBook Packages: Engineering, Engineering (R0)
