Abstract
Nasa Yuwe, the language of the Paez people of Colombia, is currently endangered [1]. The Nasa community has therefore been reviewing strategies that use computational tools to encourage 1) the visibility of the language and 2) awareness of its use. To contribute to both of these areas, this work proposes building an information retrieval system (IRS) for texts written in Nasa Yuwe, which is expected to encourage writing in Nasa Yuwe and the retrieval of documents written in the language. Assessing such a system requires a test collection, so the first step, prior to IRS development, is to build a test collection specifically for Nasa Yuwe texts, which is not currently available. This paper therefore presents the first test collection in Nasa Yuwe, together with its construction process and results. The results cover: 1) the process of building the Nasa Yuwe test collection; 2) the queries, expert opinions, and documents; and 3) a statistical analysis of the data, including an analysis of Zipf's law [2].
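As an illustration of the kind of Zipf's law check mentioned in the abstract, the following is a minimal Python sketch and not the authors' procedure. The function names `rank_frequency` and `zipf_slope` and the naive tokenizer are assumptions made for this example; a real analysis would need tokenization rules suited to Nasa Yuwe orthography. Under Zipf's law, term frequency is roughly inversely proportional to rank, so the slope of log(frequency) against log(rank) should be close to -1.

```python
import math
import re
from collections import Counter


def rank_frequency(texts):
    """Return term frequencies sorted in descending order (rank 1 first)."""
    counts = Counter()
    for text in texts:
        # Naive tokenizer (an assumption): runs of word characters and
        # apostrophes, lowercased. Not tuned for Nasa Yuwe orthography.
        counts.update(re.findall(r"[\w'’]+", text.lower()))
    return sorted(counts.values(), reverse=True)


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two ranks; a slope near -1 is consistent with Zipf's law.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Idealized frequencies proportional to 1/rank (synthetic, not corpus data).
    ideal = [600, 300, 200, 150, 120, 100]
    print(zipf_slope(ideal))
```

To apply the sketch to a collection, pass the document texts to `rank_frequency` and the resulting list to `zipf_slope`; the fitted slope is only a rough indicator and does not replace a proper goodness-of-fit analysis.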
References
Rojas Curieux, T.: Por los caminos de la recuperación de la lengua Paéz (nasa yuwe), Popayán Letrarte editores (2006)
Manning, C., Raghavan, P., Schütze, H.: An Introduction to Information Retrieval. Cambridge University Press (2009)
Moseley, C.: Atlas de las lenguas del mundo en peligro. Ediciones UNESCO, Popayán (2010), online version: http://www.unesco.org/culture/en/endangeredlanguages/atlas (accessed March 2013)
Instituto Colombiano de Cultura Hispánica: Geografía Humana de Colombia, Región Andina Central Tomo IV Volumen II, Bogotá: Banco de la República (2000)
Rojas Curieux, T.: Esbozo gramatical de la lengua nasa (lengua Paéz). In: El Lenguaje en Colombia, Tomo I: Realidad Lingüística de Colombia, Bogotá, Academia Colombiana de la Lengua e Instituto Caro y Cuervo, pp. 479–495 (2009)
Universidad del Cauca, CRIC-PEBI-Comisión General de Lenguas: Estudio Sociolingüistico Fase preliminar. Base de datos - CRIC 01/2007 Lengua Nasa Yuwe y Namtrik. Popayán, Cauca, Colombia (2008)
Farfán Martínez, M., Rojas Curieux, T.: Zuy Luuçxkwe kwe’kwe’sx ipx kwetuy piyaaka. Cartilla de aprendizaje de nasa yuwe como segunda lengua, Buenos Aires (2010)
Jung, I.: Gramática del Páez o nasa yuwe. Descripción de una Lengua Indígena de Colombia. LINOM GmbH (1984, 2008)
CRIC y el Programa de Dllo Rural en la Región de Tierra Dentro Cxhab Wala -PT/CW, Diccionario Nasa Yuwe - Castellano, Primera ed., Popayán: Litografía San José (2005)
Rojas Curieux, T., Perdomo Dizu, A., Corrales Carvaja, M.H.: Una Mirada al nasa yuwe de Novirao, Primera ed., Popayán: Sello Editorial Universidad del Cauca (2009)
Rojas Curieux, T.E.: La lengua paéz una visión de su gramática, primera ed., M. d. Cultura, Ed., Bogotá: Panamericana Formas e Impresos S.A (1998)
Carterette, B., Voorhees, E.M.: Overview of Information Retrieval Evaluation. In: Current Challenges in Patent Information Retrieval, pp. 69–85. Springer (2011)
Jadidinejad, A.H., Mahmoudi, F., Dehdari, J.: Evaluation of Perstem: A Simple and Efficient Stemming Algorithm for Persian. In: Peters, C., Di Nunzio, G.M., Kurimo, M., Mandl, T., Mostefa, D., Peñas, A., Roda, G. (eds.) CLEF 2009. LNCS, vol. 6241, pp. 98–101. Springer, Heidelberg (2010)
Agosti, M., Bacchin, M., Ferro, N., Melucci, M.: Improving the Automatic Retrieval of Text Documents. In: Peters, C., Braschler, M., Gonzalo, J. (eds.) CLEF 2002. LNCS, vol. 2785, pp. 279–290. Springer, Heidelberg (2003)
Peters, C., Braschler, M., Clough, P.: Evaluation for Multilingual Information Retrieval Systems. In: Multilingual Information Retrieval, pp. 129–169. Springer (2012)
NTCIR Project, NTCIR Project 2007 (online), http://research.nii.ac.jp/ntcir/permission/ntcir-4/perm-en-PATENT.html (last accessed: December 5, 2014)
Ribeiro-Neto, B., Baeza-Yates, R.: Modern Information Retrieval: The Concepts and Technology behind Search, 2nd edn. Addison Wesley, Harlow (2011)
Sheykh Esmaili, K., Salavati, S., Yosefi, S.: Building A Test Collection For Sorani Kurdish. In: ACS International Conference on Computer Systems and Applications (AICCSA), Ifrane (2013)
Esmaili, K., Abolhassani, H., Neshati, M., Behrangi, E.: Mahak: A Test Collection for Evaluation of Farsi Information Retrieval Systems. In: IEEE/ACS International Conference on Computer Systems and Applications, AICCSA 2007, pp. 639–644. IEEE (2007)
Armenska, J., Tomovski, A., Zdravkova, K., Pehcevski, J.: Information Retrieval Using a Macedonian Test Collection for Question Answering. In: Gusev, M., Mitrevski, P. (eds.) ICT Innovations 2010. CCIS, vol. 83, pp. 205–214. Springer, Heidelberg (2011)
AleAhmad, A., Amiri, H., Darrudi, E., Rahgozar, M., Oroumchian, F.: Hamshahri: A standard Persian text collection. Knowledge-Based Systems 22(5), 382–387 (2009)
Kuriyama, K., Kando, N., Nozue, T., Eguchi, K.: Pooling for a Large-Scale Test Collection: An Analysis of the Search Results from the First NTCIR Workshop. Information Retrieval 5(1), 41–59 (2002)
Consejo Regional Indígena del Cauca – Programa de Educación Bilingüe e Intercultural (PEBI - CRIC): Universidad Autónoma Indígena Intercultural – UAIIN (2015), http://www.pebi-cric.org/uaiin.html (accessed March 2015)
Consejo Regional Indígena del Cauca – Programa de Educación Bilingüe e Intercultural (PEBI - CRIC): Cuentos y Cosmovisión Nasa. Area Nasawe’sx Fxinzenxi, Segunda ed., Popayán (2010)
Consejo Regional Indígena del Cauca – Programa de Educación Bilingüe e Intercultural (PEBI - CRIC): Te invitamos a leer. Eç thegya’ ipi’ki’ tha’w, Primera ed., Cali: Grafitextos (2007)
Asociación de Cabildos Ukawe’sx Nasa Çxhab, Consejo Regional Indígena del Cauca – Programa de Educación Bilingüe e Intercultural (PEBI - CRIC): NASAWE’SX KIWAKA FXI’ZENXI ẼEN, Primera ed., Cali: Grafitextos (2006)
Yule Yatacue, M., Vitonas Pavi, C.: Pees kupx fxi’zenxi. La metamorfosis de la vida, Tercera ed., Toribio, Cauca: Grafitextos (2012)
Consejo Regional Indígena del Cauca – CRIC, Programa de Educación Bilingüe e Intercultural.: Sistema Educativo Indígena Propio -SEIP. Primer Documento de Trabajo (2011)
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Sierra, L.M., Cobos, C.A., Corrales, J.C., Curieux, T.R. (2015). Building a Nasa Yuwe Language Test Collection. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_9
DOI: https://doi.org/10.1007/978-3-319-18111-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18110-3
Online ISBN: 978-3-319-18111-0
eBook Packages: Computer Science (R0)