Implementation of an Information Retrieval System Using the Soft Cosine Measure

Nature-Inspired Design of Hybrid Intelligent Systems

Abstract

Information retrieval models have been an important subject of study since 1992. These models compare a user query against a collection of documents, taking into account the co-occurrence of terms, in order to identify a set of relevant documents and return them to the user according to an evaluation criterion. Documents can be ranked by their degree of similarity to the query using metrics such as cosine similarity and the soft cosine measure. In this paper, we perform a comparative study of these two similarity metrics. The Vector Space Model (VSM) was implemented for retrieving information. A sample of the Collection of the Association for Computing Machinery (CACM), in the domain of Computer Science, was used in the evaluation. The experimental results show that recall is 96 % for both metrics, but the soft cosine measure achieves a mean average precision 2 % higher.
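To make the difference between the two metrics concrete, the following is a minimal sketch, not the chapter's implementation. It uses the soft cosine definition from Sidorov et al. (reference 1), in which a term-similarity matrix S weights every pair of features; plain cosine similarity is the special case where S is the identity matrix. The three-term vocabulary, the vectors, and the similarity value 0.8 are invented for illustration, and the abstract does not say how the chapter derives term similarities.

```python
import math


def soft_dot(a, b, sim):
    """Return sum over i, j of sim[i][j] * a[i] * b[j]."""
    return sum(
        sim[i][j] * a[i] * b[j]
        for i in range(len(a))
        for j in range(len(b))
    )


def soft_cosine(a, b, sim):
    """Soft cosine measure of vectors a and b under term-similarity matrix sim."""
    denom = math.sqrt(soft_dot(a, a, sim)) * math.sqrt(soft_dot(b, b, sim))
    return soft_dot(a, b, sim) / denom if denom else 0.0


def cosine(a, b):
    """Ordinary cosine similarity: soft cosine with the identity matrix."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return soft_cosine(a, b, identity)


# Hypothetical toy vocabulary (an assumption): ["retrieval", "search", "compiler"]
query = [1.0, 0.0, 0.0]  # the query mentions only "retrieval"
doc = [0.0, 1.0, 0.0]    # the document mentions only "search"

# Assumed term similarities: "retrieval" and "search" are treated as related.
S = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(cosine(query, doc))          # 0.0: no term in common
print(soft_cosine(query, doc, S))  # 0.8: related terms still contribute
```

In this toy case the query and the document share no term, so cosine similarity scores the pair as 0, while the soft cosine measure gives a nonzero score because the two terms are marked as similar.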

References

  1. Sidorov, G., Gelbukh, A., Gómez-Adorno, H., & Pinto, D. (2014). Soft Similarity and Soft Cosine Measure: Similarity of Features in Vector Space Model. Computación y Sistemas, 18(3), 491–504. DOI:10.13053/CyS-18-3-2043.

  2. Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern Information Retrieval (Vol. 463). New York, The United States of America: Addison-Wesley, ACM Press.

  3. Farrús, M., & Costa-jussà, M. R. (2013). Presencia de IRRODL evaluación automática del aprendizaje electrónico utilizando el análisis semántico latente: caso de uso. Revista mexicana de bachillerato a distancia, 153–165.

  4. La Serna Palomino, N., Pró Concepción, L., & Román Concha, U. (2013). Diseño de un sistema de recuperación de imágenes de individuos malhechores para seguridad ciudadana. Revista de Investigación de Sistemas e Informática, 25–32.

  5. Monsalve, L. S. (2012). Experimento de Recuperación de Información usando las medidas de similitud coseno, Jaccard and Dice. Revista de Investigación: TECCIENCIA, 14–24.

  6. La Serna Palomino, N., Román Concha, U., & Osorio, N. (2009). Implementación de un Sistema de Recuperación de Información. Revista de Ingeniería de Sistemas e Informática, 57–64.

  7. The JNT Association. (October 1, 2015). CACM collection. Retrieved from CACM collection: http://ir.dcs.gla.ac.uk/resources/test_collections/cacm

Author information

Corresponding author

Correspondence to Juan Javier González Barbosa.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Barbosa, J.J.G. et al. (2017). Implementation of an Information Retrieval System Using the Soft Cosine Measure. In: Melin, P., Castillo, O., Kacprzyk, J. (eds) Nature-Inspired Design of Hybrid Intelligent Systems. Studies in Computational Intelligence, vol 667. Springer, Cham. https://doi.org/10.1007/978-3-319-47054-2_50

  • DOI: https://doi.org/10.1007/978-3-319-47054-2_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47053-5

  • Online ISBN: 978-3-319-47054-2

  • eBook Packages: Engineering, Engineering (R0)
