Kurdish Spoken Dialect Recognition Using X-Vector Speaker Embedding

  • Conference paper
Speech and Computer (SPECOM 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12997)

Abstract

This paper presents a dialect recognition system for the Kurdish language based on speaker embeddings. The research pursues two goals. First, we investigate whether speaker embeddings carry dialect information and use that information for spoken dialect recognition in Kurdish. Second, we introduce Zar, a public dataset for Kurdish spoken dialect recognition. Zar comprises 16,385 utterances totaling 49 h 36 min across five dialects of the Kurdish language (Northern Kurdish, Central Kurdish, Southern Kurdish, Hawrami, and Zazaki). Dialect recognition relies on an x-vector speaker embedding model trained for speaker recognition on the VoxCeleb1 and VoxCeleb2 datasets. The extracted x-vectors are then used to train support vector machine (SVM) and decision tree classifiers for dialect recognition. The results are compared with an i-vector system trained specifically for Kurdish spoken dialect recognition. In both systems (i-vector and x-vector), the SVM classifier gives the better performance, with a precision of 87%. Our results show that the information preserved in speaker embeddings can be used for automatic dialect recognition.
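
The abstract reports precision for a five-way dialect classifier, but this preview does not say how precision is averaged over the dialects. As a minimal sketch, assuming macro averaging (an assumption, not confirmed by the source), precision could be computed from predicted and reference labels as follows. The function names and the toy labels are hypothetical and serve only as illustration; they are not taken from the Zar dataset or from the authors' code.

```python
from collections import Counter

# The five dialects named in the abstract.
DIALECTS = [
    "Northern Kurdish",
    "Central Kurdish",
    "Southern Kurdish",
    "Hawrami",
    "Zazaki",
]


def per_class_precision(y_true, y_pred, labels):
    """Precision per label: correct predictions of a label / all predictions of it."""
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    # A label that is never predicted has undefined precision;
    # this sketch reports 0.0 for it by convention.
    return {
        lab: (correct[lab] / predicted[lab] if predicted[lab] else 0.0)
        for lab in labels
    }


def macro_precision(y_true, y_pred, labels):
    """Unweighted mean of the per-label precisions (macro averaging, assumed)."""
    scores = per_class_precision(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Made-up reference labels and classifier outputs, for illustration only.
y_true = ["Hawrami", "Hawrami", "Zazaki",
          "Central Kurdish", "Northern Kurdish", "Southern Kurdish"]
y_pred = ["Hawrami", "Zazaki", "Zazaki",
          "Central Kurdish", "Northern Kurdish", "Central Kurdish"]

scores = per_class_precision(y_true, y_pred, DIALECTS)
overall = macro_precision(y_true, y_pred, DIALECTS)
```

Macro averaging weights every dialect equally regardless of how many utterances it has; a weighted or micro average would give different numbers when the classes are unbalanced, so the averaging scheme should be checked against the full paper before comparing figures.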

Notes

  1. https://github.com/ArashAmani/Kurdish-Dialect-Recognition

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Amani, A., Mohammadamini, M., Veisi, H. (2021). Kurdish Spoken Dialect Recognition Using X-Vector Speaker Embedding. In: Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2021. Lecture Notes in Computer Science, vol 12997. Springer, Cham. https://doi.org/10.1007/978-3-030-87802-3_5

  • DOI: https://doi.org/10.1007/978-3-030-87802-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87801-6

  • Online ISBN: 978-3-030-87802-3

  • eBook Packages: Computer Science, Computer Science (R0)
