A Multi-criteria Text Selection Approach for Building a Speech Corpus

Patel, Chiragkumar; Kopparapu, Sunil Kumar

doi:10.1007/978-3-319-24033-6_2


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9302)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

A speech corpus is an important and primary requirement for several speech tasks. Building a speech corpus is a lengthy, time-consuming, and expensive process. It typically involves collecting a large set of text utterances and then selectively distributing them among a set of speakers; the resulting per-speaker lists are called speaker sheets. Speakers articulate these speaker sheets to generate the speech corpus. Depending on the task at hand, the speech corpus needs to satisfy certain criteria. For example, a phonetically balanced speech corpus is essential for building an automatic speech recognition (ASR) engine, while a text-dependent speaker recognition engine needs several spoken repetitions of the same text by several speakers. In this paper, we formulate a method that enables the creation of speaker sheets from a predetermined set of text utterances such that the speech corpus satisfies the desired requirements.
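
To make the idea of speaker sheets concrete, the following is a minimal illustrative sketch and not the method formulated in the paper, whose details are not given in the abstract. It assumes a simple greedy heuristic, uses letters as a stand-in for phones, and introduces the hypothetical names `units`, `build_speaker_sheets`, and `shared`. Utterances in `shared` are placed on every sheet (the repetition criterion), and the remaining slots are filled to widen each sheet's unit coverage (a rough proxy for phonetic balance).

```python
from collections import Counter


def units(utterance):
    """Stand-in for phones: the letters of the utterance (an assumption)."""
    return [ch for ch in utterance.lower() if ch.isalpha()]


def build_speaker_sheets(utterances, num_speakers, sheet_size, shared=()):
    """Greedily build one sheet (list of utterances) per speaker.

    Hypothetical heuristic, not the authors' algorithm:
    - every utterance in `shared` goes on every sheet;
    - remaining slots take the utterance adding the most unit types
      not yet on that sheet, with ties broken in favour of utterances
      used on fewer sheets so far.
    """
    usage = Counter()  # number of sheets each non-shared utterance is on
    sheets = []
    for _ in range(num_speakers):
        sheet = list(shared)
        covered = Counter(u for text in sheet for u in units(text))
        candidates = [u for u in utterances if u not in shared]
        while len(sheet) < sheet_size and candidates:

            def score(text):
                new_types = len(set(units(text)) - set(covered))
                return (new_types, -usage[text])

            best = max(candidates, key=score)
            sheet.append(best)
            candidates.remove(best)
            covered.update(units(best))
            usage[best] += 1
        sheets.append(sheet)
    return sheets


if __name__ == "__main__":
    pool = ["the cat sat", "jump high", "quick fox", "my dog"]
    for i, sheet in enumerate(
        build_speaker_sheets(pool, num_speakers=2, sheet_size=3,
                             shared=("open the door",)), start=1
    ):
        print(f"speaker {i}: {sheet}")
```

A real system would replace `units` with a pronunciation lexicon or grapheme-to-phoneme step and would score sheets against a target phone distribution; the tuple returned by `score` is one simple way to combine more than one criterion.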

Author information

Authors and Affiliations

TCS Innovation Labs - Mumbai, Thane (West), 400601, Maharashtra, India
Chiragkumar Patel & Sunil Kumar Kopparapu

Corresponding author

Correspondence to Chiragkumar Patel.

Editor information

Editors and Affiliations

University of West Bohemia, Pilsen, Czech Republic
Pavel Král
University of West Bohemia, Pilsen, Czech Republic
Václav Matoušek

About this paper

Cite this paper

Patel, C., Kopparapu, S.K. (2015). A Multi-criteria Text Selection Approach for Building a Speech Corpus. In: Král, P., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2015. Lecture Notes in Computer Science, vol 9302. Springer, Cham. https://doi.org/10.1007/978-3-319-24033-6_2

DOI: https://doi.org/10.1007/978-3-319-24033-6_2
Published: 11 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24032-9
Online ISBN: 978-3-319-24033-6
eBook Packages: Computer Science, Computer Science (R0)
