Building a Nasa Yuwe Language Test Collection

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2015)

Abstract

Nasa Yuwe, the language of the Paez people of Colombia, is currently endangered [1]. The Nasa community has therefore been reviewing strategies that use computational tools to 1) make the language more visible and 2) raise awareness of its use. To contribute to both aims, this work proposes building an information retrieval system (IRS) for texts written in Nasa Yuwe, which is expected to encourage both writing in the language and the retrieval of documents written in it. Assessing such a system requires a test collection, and none is currently available for Nasa Yuwe, so the first step, prior to IRS development, is to build one. This paper presents the first test collection in Nasa Yuwe, together with its construction process and results. The results cover 1) the process of building the Nasa Yuwe test collection; 2) the queries, expert opinions and documents; and 3) a statistical analysis of the data, including an analysis of Zipf’s Law [2].

References

  1. Rojas Curieux, T.: Por los caminos de la recuperación de la lengua Paéz (nasa yuwe). Popayán: Letrarte Editores (2006)

  2. Manning, C., Raghavan, P., Schütze, H.: An Introduction to Information Retrieval. Cambridge University Press (2009)

  3. Moseley, C.: Atlas de las lenguas del mundo en peligro. Ediciones UNESCO, Popayán (2010), online version: http://www.unesco.org/culture/en/endangeredlanguages/atlas (accessed March 2013)

  4. Instituto Colombiano de Cultura Hispánica: Geografía Humana de Colombia, Región Andina Central, Tomo IV, Volumen II. Bogotá: Banco de la República (2000)

  5. Rojas C., T.: Esbozo Gramatical de la lengua nasa (lengua Paéz). In: El Lenguaje en Colombia, Tomo I: Realidad Lingüística de Colombia, Bogotá, Academia Colombiana de la Lengua e Instituto Caro y Cuervo, pp. 479–495 (2009)

  6. Universidad del Cauca, CRIC-PEBI-Comisión General de Lenguas: Estudio Sociolingüístico Fase preliminar. Base de datos - CRIC 01/2007 Lengua Nasa Yuwe y Namtrik. Popayán, Cauca, Colombia (2008)

  7. Farfán Martínez, M., Rojas Curieux, T.: Zuy Luuçxkwe kwe’kwe’sx ipx kwetuy piyaaka. Cartilla de aprendizaje de nasa yuwe como segunda lengua, Buenos Aires (2010)

  8. Jung, I.: Gramática del Páez o nasa yuwe. Descripción de una Lengua Indígena de Colombia. LINOM GmbH (1984, 2008)

  9. CRIC y el Programa de Dllo Rural en la Región de Tierra Dentro Cxhab Wala - PT/CW: Diccionario Nasa Yuwe - Castellano, 1st edn., Popayán: Litografía San José (2005)

  10. Rojas Curieux, T., Perdomo Dizu, A., Corrales Carvajal, M.H.: Una Mirada al nasa yuwe de Novirao, 1st edn., Popayán: Sello Editorial Universidad del Cauca (2009)

  11. Rojas Curieux, T.E.: La lengua paéz una visión de su gramática, 1st edn., M. d. Cultura, Ed., Bogotá: Panamericana Formas e Impresos S.A. (1998)

  12. Carterette, B., Voorhees, E.M.: Overview of Information Retrieval Evaluation. In: Current Challenges in Patent Information Retrieval, pp. 69–85. Springer (2011)

  13. Jadidinejad, A.H., Mahmoudi, F., Dehdari, J.: Evaluation of Perstem: A Simple and Efficient Stemming Algorithm for Persian. In: Peters, C., Di Nunzio, G.M., Kurimo, M., Mandl, T., Mostefa, D., Peñas, A., Roda, G. (eds.) CLEF 2009. LNCS, vol. 6241, pp. 98–101. Springer, Heidelberg (2010)

  14. Agosti, M., Bacchin, M., Ferro, N., Melucci, M.: Improving the Automatic Retrieval of Text Documents. In: Peters, C., Braschler, M., Gonzalo, J. (eds.) CLEF 2002. LNCS, vol. 2785, pp. 279–290. Springer, Heidelberg (2003)

  15. Peters, C., Braschler, M., Clough, P.: Evaluation for Multilingual Information Retrieval Systems. In: Multilingual Information Retrieval, pp. 129–169. Springer (2012)

  16. NTCIR Project: NTCIR Project 2007 (online), http://research.nii.ac.jp/ntcir/permission/ntcir-4/perm-en-PATENT.html (last accessed December 5, 2014)

  17. Ribeiro-Neto, B., Baeza-Yates, R.: Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd edn. Addison Wesley, Harlow (2011)

  18. Sheykh Esmaili, K., Salavati, S., Yosefi, S.: Building A Test Collection For Sorani Kurdish. In: ACS International Conference on Computer Systems and Applications (AICCSA), Ifrane (2013)

  19. Esmaili, K., Abolhassani, H., Neshati, M., Behrangi, E.: Mahak: A Test Collection for Evaluation of Farsi Information Retrieval Systems. In: IEEE/ACS International Conference on Computer Systems and Applications, AICCSA 2007, pp. 639–644. IEEE (2007)

  20. Armenska, J., Tomovski, A., Zdravkova, K., Pehcevski, J.: Information Retrieval Using a Macedonian Test Collection for Question Answering. In: Gusev, M., Mitrevski, P. (eds.) ICT Innovations 2010. CCIS, vol. 83, pp. 205–214. Springer, Heidelberg (2011)

  21. AleAhmad, A., Amiri, H., Darrudi, E., Rahgozar, M., Oroumchian, F.: Hamshahri: A standard Persian text collection. Knowledge-Based Systems 22(5), 382–387 (2009)

  22. Kuriyama, K., Kando, N., Nozue, T., Eguchi, K.: Pooling for a Large-Scale Test Collection: An Analysis of the Search Results from the First NTCIR Workshop. Information Retrieval 5(1), 41–59 (2002)

  23. Consejo Regional Indígena del Cauca – Programa de Educación Bilingüe e Intercultural (PEBI - CRIC): Universidad Autónoma Indígena Intercultural – UAIIN (2015), http://www.pebi-cric.org/uaiin.html (accessed March 2015)

  24. Consejo Regional Indígena del Cauca – Programa de Educación Bilingüe e Intercultural (PEBI - CRIC): Cuentos y Cosmovisión Nasa. Area Nasawe’sx Fxinzenxi, 2nd edn., Popayán (2010)

  25. Consejo Regional Indígena del Cauca – Programa de Educación Bilingüe e Intercultural (PEBI - CRIC): Te invitamos a leer. Eç thegya’ ipi’ki’ tha’w, 1st edn., Cali: Grafitextos (2007)

  26. Asociación de Cabildos Ukawe’sx Nasa Çxhab, Consejo Regional Indígena del Cauca – Programa de Educación Bilingüe e Intercultural (PEBI - CRIC): NASAWE’SX KIWAKA FXI’ZENXI ẼEN, 1st edn., Cali: Grafitextos (2006)

  27. Yule Yatacue, M., Vitonas Pavi, C.: Pees kupx fxi’zenxi. La metamorfosis de la vida, 3rd edn., Toribio, Cauca: Grafitextos (2012)

  28. Consejo Regional Indígena del Cauca – CRIC, Programa de Educación Bilingüe e Intercultural: Sistema Educativo Indígena Propio - SEIP. Primer Documento de Trabajo (2011)

Author information

Corresponding author

Correspondence to Luz Marina Sierra.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sierra, L.M., Cobos, C.A., Corrales, J.C., Curieux, T.R. (2015). Building a Nasa Yuwe Language Test Collection. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_9

  • DOI: https://doi.org/10.1007/978-3-319-18111-0_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18110-3

  • Online ISBN: 978-3-319-18111-0

  • eBook Packages: Computer Science, Computer Science (R0)
