Abstract
This paper presents a text mining-based search approach to information retrieval in Spanish. A tool has been developed to facilitate and automate analysis and retrieval: it lets the user apply different analyzers when running a query, index and delete documents stored in the system, and evaluate the retrieval process. To this end, a dataset consisting of 27 songs has been used as a case study. Different queries have been run to investigate which approaches best fit the Spanish language and how their suitability depends on the query text.
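To make the workflow in the abstract concrete, the following is a minimal sketch and not the authors' tool. It assumes a toy in-memory inverted index, two hypothetical analyzers (`standard_analyzer` and `spanish_like_analyzer`), three invented song titles, and an invented relevance judgment. It illustrates how the choice of analyzer can change which documents a Spanish query retrieves, and how that difference could be measured with precision and recall.

```python
import unicodedata
from collections import defaultdict

# Tiny illustrative stopword list (assumption); real Spanish lists are much longer.
STOPWORDS = {"de", "la", "el", "en", "y", "a", "que", "los", "las", "un", "una", "mi"}


def standard_analyzer(text):
    """Lowercase the text and split it on non-letter characters."""
    text = unicodedata.normalize("NFC", text).lower()
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text)
    return cleaned.split()


def spanish_like_analyzer(text):
    """Standard analysis plus stopword removal and accent folding (no stemming)."""
    tokens = []
    for tok in standard_analyzer(text):
        if tok in STOPWORDS:
            continue
        decomposed = unicodedata.normalize("NFD", tok)
        # Dropping combining marks maps "corazón" to "corazon" (and also "ñ" to "n").
        tokens.append("".join(ch for ch in decomposed if not unicodedata.combining(ch)))
    return tokens


class TinyIndex:
    """Toy inverted index whose analyzer is applied at index time and query time."""

    def __init__(self, analyzer):
        self.analyzer = analyzer
        self.docs = {}
        self.postings = defaultdict(set)  # term -> ids of documents containing it

    def index(self, doc_id, text):
        self.delete(doc_id)  # re-indexing replaces any previous version
        self.docs[doc_id] = text
        for term in self.analyzer(text):
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        if self.docs.pop(doc_id, None) is None:
            return
        for ids in self.postings.values():
            ids.discard(doc_id)

    def search(self, query):
        hits = set()
        for term in self.analyzer(query):
            hits |= self.postings.get(term, set())
        return hits


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a set of relevant ids."""
    true_pos = len(retrieved & relevant)
    precision = true_pos / len(retrieved) if retrieved else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example documents and relevance judgment (not the paper's dataset).
SONGS = {1: "Canción del corazón", 2: "La noche de mi corazon", 3: "Bailando en la playa"}
RELEVANT = {1, 2}


def run(analyzer, query="corazon"):
    idx = TinyIndex(analyzer)
    for doc_id, text in SONGS.items():
        idx.index(doc_id, text)
    return idx.search(query)


if __name__ == "__main__":
    for analyzer in (standard_analyzer, spanish_like_analyzer):
        hits = run(analyzer)
        print(analyzer.__name__, sorted(hits), precision_recall(hits, RELEVANT))
```

In this sketch the unaccented query `corazon` misses the accented title under the standard analyzer but matches it once accents are folded, which is the kind of query-dependent difference the case study sets out to examine.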
Acknowledgments
This work has been supported by project MOVIURBAN: Máquina social para la gestión sostenible de ciudades inteligentes: movilidad urbana, datos abiertos, sensores móviles (SA070U 16). The project is co-financed by Junta de Castilla y León, Consejería de Educación, and FEDER funds. In addition, the research of Juan Ramos González has been co-financed by the European Social Fund and Junta de Castilla y León (Operational Programme 2014-2020 for Castilla y León, BOCYL EDU/602/2016).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ramos-González, J., Martín-Gómez, L. (2019). A Text Mining-Based Approach for Analyzing Information Retrieval in Spanish: Music Data Collection as a Case Study. In: Rodríguez, S., et al. Distributed Computing and Artificial Intelligence, Special Sessions, 15th International Conference. DCAI 2018. Advances in Intelligent Systems and Computing, vol 801. Springer, Cham. https://doi.org/10.1007/978-3-319-99608-0_29
DOI: https://doi.org/10.1007/978-3-319-99608-0_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99607-3
Online ISBN: 978-3-319-99608-0
eBook Packages: Intelligent Technologies and Robotics (R0)