A Multi-criteria Text Selection Approach for Building a Speech Corpus

  • Conference paper
Text, Speech, and Dialogue (TSD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9302)

Abstract

A speech corpus is an important and primary requirement for several speech tasks. Building a speech corpus is a lengthy, time-consuming, and expensive process. It typically involves collecting a large set of text utterances and then selectively distributing these utterances among a set of speakers; the resulting per-speaker lists are called speaker sheets. The speakers read their speaker sheets aloud to generate the speech corpus. Depending on the task at hand, the speech corpus needs to satisfy certain criteria. For example, a phonetically balanced speech corpus is essential for building an automatic speech recognition (ASR) engine, while a text-dependent speaker recognition engine needs several spoken repetitions of the same text by several speakers. In this paper, we formulate a method that enables the creation of speaker sheets from a predetermined set of text utterances such that the speech corpus satisfies the desired requirements.
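
To make the idea of a speaker sheet concrete, the following is a minimal illustrative sketch and not the method formulated in the paper, whose details are not given in this abstract. It assumes two toy criteria: every utterance must be read by a fixed number of distinct speakers (the repetition requirement mentioned for text-dependent speaker recognition), and the reading load, measured in characters, should be roughly even across sheets. The function name `build_speaker_sheets` and its parameters are hypothetical.

```python
def build_speaker_sheets(utterances, num_speakers, repetitions):
    """Greedy toy assignment of utterances to speaker sheets (hypothetical helper).

    Each utterance is placed on `repetitions` distinct sheets, always choosing
    the sheets that currently carry the smallest reading load in characters.
    """
    if not 1 <= repetitions <= num_speakers:
        raise ValueError("repetitions must be between 1 and num_speakers")

    sheets = [[] for _ in range(num_speakers)]
    loads = [0] * num_speakers  # total characters assigned to each sheet

    # Placing long utterances first tends to give a more even final load.
    for utt in sorted(utterances, key=len, reverse=True):
        # Rank sheets by current load; ties are broken by speaker index.
        ranked = sorted(range(num_speakers), key=lambda s: (loads[s], s))
        for s in ranked[:repetitions]:
            sheets[s].append(utt)
            loads[s] += len(utt)
    return sheets


# Example: four utterances, three speakers, each utterance read by two speakers.
sheets = build_speaker_sheets(["aa", "bbbb", "c", "ddd"], num_speakers=3, repetitions=2)
for speaker, sheet in enumerate(sheets):
    print(speaker, sheet)
```

A criterion such as phonetic balance would replace the character-count load with a score over phone counts, which requires a pronunciation lexicon and is outside the scope of this sketch.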

Corresponding author

Correspondence to Chiragkumar Patel.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Patel, C., Kopparapu, S.K. (2015). A Multi-criteria Text Selection Approach for Building a Speech Corpus. In: Král, P., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2015. Lecture Notes in Computer Science, vol 9302. Springer, Cham. https://doi.org/10.1007/978-3-319-24033-6_2

  • DOI: https://doi.org/10.1007/978-3-319-24033-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24032-9

  • Online ISBN: 978-3-319-24033-6

  • eBook Packages: Computer Science, Computer Science (R0)
