Design of a Yoruba Language Speech Corpus for the Purposes of Text-to-Speech (TTS) Synthesis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9621)

Abstract

This paper deals with the design of a speech corpus for corpus-based Text-To-Speech (TTS) synthesis. It has two purposes: first, to provide enough speech to develop a Yoruba corpus-based TTS system and, second, to provide a simple corpus-design methodology that can be applied to other languages. The paper focuses on text analysis, selection of reliable sentences, selection of the reader, and recording of the sentences. The text analysis is performed to ensure that the corpus is well balanced. Then, 2,415 sentences (essentially affirmative sentences) are gathered. These sentences were read by a Yoruba-language journalist who is a native speaker, so the whole corpus has a single speaker.
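The abstract does not say how sentences were chosen to keep the corpus balanced. Purely as an illustration, the sketch below shows greedy coverage selection, a technique often used in TTS corpus design: at each step it picks the candidate sentence that adds the most units not yet covered. It is not the authors' procedure. The function names `units` and `greedy_select`, the use of character bigrams as a stand-in for phonetic units, and the toy input strings are all assumptions made for this example.

```python
def units(sentence):
    """Return the character bigrams of a sentence that do not span a space.

    Assumption: bigrams stand in for phonetic units such as diphones; a real
    system would use a phonetic transcription of the text instead.
    """
    text = sentence.lower()
    return {text[i:i + 2] for i in range(len(text) - 1) if " " not in text[i:i + 2]}


def greedy_select(candidates, max_sentences):
    """Greedily pick sentences that add the most not-yet-covered units."""
    covered = set()
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < max_sentences:
        # Candidate whose units add the largest number of new items.
        best = max(remaining, key=lambda s: len(units(s) - covered))
        gain = units(best) - covered
        if not gain:
            break  # no remaining sentence adds anything new
        selected.append(best)
        covered |= gain
        remaining.remove(best)
    return selected, covered


# Toy input (hypothetical strings, not Yoruba text).
toy_candidates = ["ab cd", "ab", "cd ef", "ab cd"]
toy_selected, toy_covered = greedy_select(toy_candidates, max_sentences=10)
print(toy_selected)
```

On the toy input, the selection stops once no remaining sentence contributes a new bigram. A real design would also weigh unit frequency and sentence length, which this sketch ignores.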

Notes

  1. http://www.africa.uga.edu/Yoruba/phonology.html.

  2. http://espeak.sourceforge.net.

  3. http://www.jw.org/yo.

  4. https://github.com/johnaoga/marytts.

Acknowledgments

The authors acknowledge the contribution of Vincent AWE, a Yoruba-language radio journalist at the Office of Radio and Television of Benin (ORTB), who recorded the corpus.

Author information

Corresponding author

Correspondence to Théophile K. Dagba.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dagba, T.K., Aoga, J.O.R., Fanou, C.C. (2016). Design of a Yoruba Language Speech Corpus for the Purposes of Text-to-Speech (TTS) Synthesis. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, T.-P. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49381-6_16

  • DOI: https://doi.org/10.1007/978-3-662-49381-6_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-49380-9

  • Online ISBN: 978-3-662-49381-6

  • eBook Packages: Computer Science, Computer Science (R0)
