Abstract
Information retrieval models have been an important subject of study since 1992. These models compare a user query against a collection of documents, taking into account the co-occurrence of terms, in order to rank the relevant documents and return them to the user according to the evaluation criteria. Several metrics rank documents by their degree of similarity to the query, among them cosine similarity and the soft cosine measure. In this paper, we perform a comparative study of these two similarity metrics. The Vector Space Model (VSM) was implemented as the retrieval model. A sample of the CACM (Communications of the ACM) test collection, in the domain of computer science, was used in the evaluation. The experimental results show that recall is 96 % with both metrics, but the soft cosine measure achieves a mean average precision 2 % higher.
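To make the difference between the two metrics concrete, the following is a minimal sketch in Python, not the chapter's implementation. It uses the standard definition of the soft cosine measure, in which a term-to-term similarity matrix weights every pair of vector components. The three-term vocabulary, the vectors, and the similarity values in `related` are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Plain cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def _bilinear(a, b, sim):
    """Sum of sim[i][j] * a[i] * b[j] over all pairs of terms (i, j)."""
    return sum(
        sim[i][j] * a[i] * b[j]
        for i in range(len(a))
        for j in range(len(b))
    )


def soft_cosine(a, b, sim):
    """Soft cosine: like cosine, but related terms also contribute to the score."""
    den = math.sqrt(_bilinear(a, a, sim)) * math.sqrt(_bilinear(b, b, sim))
    if den == 0:
        return 0.0
    return _bilinear(a, b, sim) / den


# Hypothetical vocabulary: ["retrieval", "search", "compiler"]
query = [1.0, 0.0, 0.0]   # the query mentions only "retrieval"
doc = [0.0, 1.0, 0.0]     # the document mentions only "search"

# With the identity matrix, no two distinct terms are considered related.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Assumed similarity values: "retrieval" and "search" are treated as related.
related = [[1.0, 0.8, 0.0], [0.8, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(cosine(query, doc))                  # 0.0
print(soft_cosine(query, doc, identity))   # 0.0
print(soft_cosine(query, doc, related))    # 0.8
```

With the identity matrix the soft cosine reduces to the plain cosine, so any difference in ranking between the two metrics comes entirely from the off-diagonal term similarities.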
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Barbosa, J.J.G. et al. (2017). Implementation of an Information Retrieval System Using the Soft Cosine Measure. In: Melin, P., Castillo, O., Kacprzyk, J. (eds) Nature-Inspired Design of Hybrid Intelligent Systems. Studies in Computational Intelligence, vol 667. Springer, Cham. https://doi.org/10.1007/978-3-319-47054-2_50
DOI: https://doi.org/10.1007/978-3-319-47054-2_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47053-5
Online ISBN: 978-3-319-47054-2
eBook Packages: Engineering, Engineering (R0)