Abstract
Query-based information retrieval is an essential part of a web search engine. Many researchers have applied different web mining techniques to find more relevant information for a keyword, but these techniques cannot determine the correct meaning of the query term, whether it is a single word, a multiword expression, or a phrase. This paper addresses the problem of searching for phrases. The phrase searching process is three-fold: the whole phrase, the sequence of terms in the phrase, and the mingle of the terms in the phrase. The user enters a query as a phrase, which is passed to various search engines to retrieve the top 'n' web pages. First, preprocessing is performed on the sequence of keywords in the phrase and the mingle of the keywords in the phrase. Feature extraction is then performed on the web pages returned by the various search engines using the Term Frequency-Inverse Document Frequency (TF-IDF) method. Following feature extraction, the top 'n' web pages from the various search engines are grouped using the LBG clustering algorithm, based on title-based, snippet-based, content-based, address-based, link-based, uniform resource locator-based, and co-occurrence-based calculations. Unique links are then identified from these groups of web pages using an SVM classifier, and a rank value is assigned to each unique-link web page using the proposed ranking algorithm. Finally, the experiment shows that precision, recall, F-measure, accuracy, speed, and error rate improve significantly over traditional search engines.
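The three-fold phrase search and the TF-IDF feature step can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`tokenize`, `match_level`, `tf_idf`), the sample pages, and the reading of the three levels are assumptions made for illustration: "whole" is taken to mean the phrase occurs contiguously, "sequence" that its terms occur in order but not necessarily adjacent, and "mingle" that all terms occur in any order. The TF-IDF variant shown (relative term frequency times the natural log of N/df) is one common formulation; the abstract does not say which variant the paper uses.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_level(phrase, document):
    """Classify how a phrase matches a document (assumed definitions).

    Returns "whole", "sequence", "mingle", or "none".
    """
    q, d = tokenize(phrase), tokenize(document)
    if not q:
        return "none"
    n = len(q)
    # Whole phrase: the query tokens appear as one contiguous run.
    if any(d[i:i + n] == q for i in range(len(d) - n + 1)):
        return "whole"
    # Sequence: the query tokens appear in order, gaps allowed.
    remaining = iter(d)
    if all(term in remaining for term in q):
        return "sequence"
    # Mingle: every query token appears somewhere, in any order.
    if set(q) <= set(d):
        return "mingle"
    return "none"


def tf_idf(term, doc_tokens, corpus_tokens):
    """TF-IDF with relative term frequency and idf = ln(N / df)."""
    if not doc_tokens:
        return 0.0
    tf = doc_tokens.count(term) / len(doc_tokens)
    df = sum(1 for tokens in corpus_tokens if term in tokens)
    return tf * math.log(len(corpus_tokens) / df) if df else 0.0


# Hypothetical snippets standing in for pages returned by search engines.
pages = [
    "Phrase based information retrieval is studied widely.",
    "Information about phrase structure and retrieval.",
    "Retrieval of information by phrase.",
    "Cooking recipes.",
]
corpus = [tokenize(p) for p in pages]
levels = [match_level("information retrieval", p) for p in pages]
```

Here `levels` holds one match level per page, and `tf_idf("phrase", corpus[0], corpus)` gives the weight of one term in the first page. In a fuller pipeline these values would feed the clustering and classification stages, which this sketch leaves out.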
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Amudha, S., Elizabeth Shanthi, I. (2020). Phrase Based Information Retrieval Analysis in Various Search Engines Using Machine Learning Algorithms. In: Sharma, N., Chakrabarti, A., Balas, V. (eds) Data Management, Analytics and Innovation. Advances in Intelligent Systems and Computing, vol 1016. Springer, Singapore. https://doi.org/10.1007/978-981-13-9364-8_21
DOI: https://doi.org/10.1007/978-981-13-9364-8_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9363-1
Online ISBN: 978-981-13-9364-8
eBook Packages: Engineering, Engineering (R0)