Towards a Portable SLU System Applied to MSA and Low-resourced Algerian Dialects

  • Mohamed Lichouri (corresponding author)
  • Rachida Djeradi
  • Amar Djeradi
  • Mourad Abbas
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 921)

Abstract

Machine translation is the most widely used approach for extending a Spoken Language Understanding (SLU) system from one language to another. It achieves high performance for English domains, but not for other languages, especially low-resourced ones such as Arabic and its dialects. To avoid the machine translation approach, which requires huge parallel corpora, we investigate in this paper the problem of interpreting a user’s intent, that is, mapping natural language queries to a system’s semantic representation format, across languages and dialects, namely English, Modern Standard Arabic (MSA) and four vernacular Algerian dialects from different regions: Blida, Djelfa, Tenes and Tizi-Ouzou. The domain chosen for our experiments is a school management application. We use three classifiers, kNN, Gaussian Naive Bayes and Bernoulli Naive Bayes, which led to an average accuracy of 90%.
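
To make the classification setting concrete, the following is a minimal sketch of one of the three classifier families named above, Bernoulli Naive Bayes, applied to binary word-presence features. It is not the authors' implementation: the toy utterances, the intent labels ("grades", "schedule"), and the names TRAIN, tokenize and BernoulliNB are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy utterances for a school-management domain.
# These are illustrative only and do not come from the paper's corpus.
TRAIN = [
    ("what is my exam grade", "grades"),
    ("show me the grade for math", "grades"),
    ("when is the next class", "schedule"),
    ("show the class schedule for monday", "schedule"),
]


def tokenize(text):
    """Return the set of lowercase whitespace-separated tokens (binary features)."""
    return set(text.lower().split())


class BernoulliNB:
    """Bernoulli Naive Bayes over word presence/absence with Laplace smoothing."""

    def fit(self, samples, alpha=1.0):
        self.vocab = sorted({w for text, _ in samples for w in tokenize(text)})
        docs_per_class = defaultdict(int)
        word_docs = defaultdict(lambda: defaultdict(int))
        for text, label in samples:
            docs_per_class[label] += 1
            for w in tokenize(text):
                word_docs[label][w] += 1
        total = len(samples)
        # Log prior of each intent.
        self.log_prior = {c: math.log(n / total) for c, n in docs_per_class.items()}
        # Smoothed probability that word w appears in an utterance of intent c.
        self.p = {
            c: {w: (word_docs[c][w] + alpha) / (n + 2 * alpha) for w in self.vocab}
            for c, n in docs_per_class.items()
        }
        return self

    def predict(self, text):
        present = tokenize(text)

        def score(c):
            s = self.log_prior[c]
            # Every vocabulary word contributes, whether present or absent.
            for w in self.vocab:
                p = self.p[c][w]
                s += math.log(p if w in present else 1.0 - p)
            return s

        return max(self.log_prior, key=score)


model = BernoulliNB().fit(TRAIN)
print(model.predict("what grade did i get in the exam"))
print(model.predict("when is the class on monday"))
```

Because the features are only word presence and absence, the same sketch applies unchanged to utterances in any of the languages or dialects, provided they are tokenized into words; no translation step is involved.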

Keywords

Spoken Language Understanding, Multilingual, Dialects, Portability, Human-machine dialog, Utterance, Thematic approach

Notes

Acknowledgment

Special thanks to Dhia El Hak Megtouf, Amel Elbachir and Karima Mahdjane for their contribution in corpus enrichment.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Mohamed Lichouri (1, 2)
  • Rachida Djeradi (2)
  • Amar Djeradi (2)
  • Mourad Abbas (1)
  1. Computational Linguistics Department, CRSTDLA, Bouzaréah, Algeria
  2. University of Science and Technology Houari Boumediene, Bab Ezzouar, Algeria
