Towards a High-Quality Lemma-Based Text to Speech System for the Arabic Language

  • Conference paper
Arabic Language Processing: From Theory to Practice (ICALP 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 782)

Abstract

Recent figures put Arabic at around 250 million native speakers, making it the fifth most spoken language by number of speakers. It has therefore attracted the interest of researchers in speech technologies, particularly speech recognition and speech synthesis. Many researchers are still working on Arabic Text To Speech (TTS) with the aim of delivering intelligible, close-to-natural systems. Nevertheless, most of the available free and semi-free Arabic TTS systems still fall short of the naturalness of a human voice, and generating smooth speech remains difficult. The primary aim of this work is to improve the quality of the speech produced by the sub-segment-based approach proposed in our previous work. To this end, a lemma-based approach to concatenative TTS synthesis is adopted and presented in this paper. In this context, a study of Arabic lemma frequency was conducted to identify the lemmas that occur most often in written and spoken Classical Arabic and Modern Standard Arabic (MSA). The study analyzes roughly 65 million fully vocalized words obtained from the Tashkila corpus, Nemlar, and Al Jazeera. Together, these sources cover both Modern Standard and Classical Arabic. As a result, a lemmatized Arabic frequency list was generated. The 1,000 most frequent lemmas were found to cover approximately 79% of Arabic words, and these lemmas were therefore used as the basic acoustic units of our TTS system. Finally, we demonstrate that this approach improves the intelligibility and naturalness of a TTS system, with an overall rating of 4.5 out of 5.
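The coverage figure reported in the abstract can be illustrated with a minimal sketch. It assumes the corpus has already been lemmatized into one lemma per token; the function names and the transliterated toy lemmas below are hypothetical and are not taken from the paper, which does not describe its implementation here.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def lemma_frequency_list(lemmas: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (lemma, count) pairs sorted from most to least frequent."""
    return Counter(lemmas).most_common()


def coverage_of_top_k(freq_list: List[Tuple[str, int]], k: int) -> float:
    """Fraction of all tokens accounted for by the k most frequent lemmas."""
    total = sum(count for _, count in freq_list)
    if total == 0:
        return 0.0
    covered = sum(count for _, count in freq_list[:k])
    return covered / total


# Toy input (assumption): one lemma per corpus token, transliterated.
toy_lemmas = [
    "qaala", "fii", "min", "qaala", "fii",
    "qaala", "kitaab", "ealaa", "fii", "qaala",
]
freq = lemma_frequency_list(toy_lemmas)
top2_coverage = coverage_of_top_k(freq, 2)
print(f"top-2 lemma coverage: {top2_coverage:.0%}")
```

On a real lemmatized corpus, calling `coverage_of_top_k(freq, 1000)` would give the kind of figure the abstract reports for the top 1,000 lemmas; the toy data here only shows the arithmetic.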

Notes

  1. http://shamela.ws/

  2. http://www.nemlar.org

Acknowledgment

The authors gratefully thank the Masmoo3 team for providing the Arabic audio files used to build our speech corpus.

Author information

Corresponding author

Correspondence to Oumaima Zine.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Zine, O., Meziane, A., Boudchiche, M. (2018). Towards a High-Quality Lemma-Based Text to Speech System for the Arabic Language. In: Lachkar, A., Bouzoubaa, K., Mazroui, A., Hamdani, A., Lekhouaja, A. (eds) Arabic Language Processing: From Theory to Practice. ICALP 2017. Communications in Computer and Information Science, vol 782. Springer, Cham. https://doi.org/10.1007/978-3-319-73500-9_4

  • DOI: https://doi.org/10.1007/978-3-319-73500-9_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73499-6

  • Online ISBN: 978-3-319-73500-9

  • eBook Packages: Computer Science, Computer Science (R0)
