Phrase Based Information Retrieval Analysis in Various Search Engines Using Machine Learning Algorithms

  • Conference paper
Data Management, Analytics and Innovation

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1016)

Abstract

Query-based information retrieval is an essential part of a web search engine. Many researchers have applied different web mining techniques to find more relevant information for a keyword, but these techniques cannot determine the correct meaning of the term (keyword), whether it is a single word, a multiword expression, or a phrase. In this paper we address the problem of searching for phrases. The phrase searching process is three-fold: the whole phrase, the sequence of terms in the phrase, and the mingle of the terms in the phrase. The user enters a query as a phrase, which is passed to various search engines to retrieve the top 'n' web pages. First, preprocessing is performed on the sequence of keywords in the phrase and the mingle of the keywords in the phrase. Feature extraction is then performed on the web pages from the various search engines using the term frequency-inverse document frequency (TF-IDF) method. After feature extraction, the top 'n' web pages from the various search engines are grouped using the LBG clustering algorithm, based on title-based, snippet-based, content-based, address-based, link-based, uniform resource locator-based, and co-occurrence-based calculations. Unique links are then identified from these groups using an SVM classifier, and a rank value is assigned to each unique-link web page using the proposed ranking algorithm. Finally, the experiment shows that precision, recall, F-measure, accuracy, speed, and error rate improve significantly over traditional search engines.
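
The feature extraction step named in the abstract can be illustrated with a minimal sketch of term frequency-inverse document frequency weighting. This is not the authors' implementation: the whitespace tokenizer, the unsmoothed formula tf × log(N/df), the function names, and the sample snippets are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(N / df), where df is the number of documents containing the term
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Count each term once per document to get its document frequency.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical snippets standing in for pages returned by a search engine.
snippets = [
    "phrase based information retrieval",
    "information retrieval with machine learning",
    "support vector machine classifier",
]
weights = tf_idf(snippets)
```

In this sketch a term that occurs in only one snippet, such as "phrase", receives a higher weight than a term shared by several snippets, such as "information", and a term present in every document receives a weight of zero.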

Author information

Corresponding author

Correspondence to S. Amudha.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Amudha, S., Elizabeth Shanthi, I. (2020). Phrase Based Information Retrieval Analysis in Various Search Engines Using Machine Learning Algorithms. In: Sharma, N., Chakrabarti, A., Balas, V. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 1016. Springer, Singapore. https://doi.org/10.1007/978-981-13-9364-8_21

  • DOI: https://doi.org/10.1007/978-981-13-9364-8_21

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-9363-1

  • Online ISBN: 978-981-13-9364-8

  • eBook Packages: Engineering, Engineering (R0)
