Abstract
This paper introduces a neural-network-based approach for selecting the vocabulary of a speech transcription system. Large sets of text data can now be collected from web sources and used alongside more traditional text sources to build language models for speech transcription systems. However, web sources yield large amounts of heterogeneous data, and as a consequence standard vocabulary selection procedures based on unigram approaches tend to select undesirable items as new words. As an alternative to unigram-based and empirical manual selection approaches, this paper proposes a selection procedure that relies on a machine learning technique, namely neural networks. The paper presents and discusses the results obtained with the various selection procedures. The results of the neural-network-based selection are promising, and the approach can automatically take various kinds of detailed information into account in the selection process.
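To make the unigram baseline mentioned in the abstract concrete, the following is a minimal sketch of frequency-based vocabulary selection over several text sources. It is not the authors' procedure, which the abstract does not detail: the function name `select_vocabulary`, the toy corpora, and the equal interpolation weights are all assumptions chosen for illustration.

```python
from collections import Counter

def select_vocabulary(corpora, weights, vocab_size):
    """Keep the vocab_size words with the highest weighted unigram probability.

    corpora: dict mapping a source name to a list of tokens (hypothetical data).
    weights: dict mapping a source name to its interpolation weight (assumed).
    """
    scores = Counter()
    for name, tokens in corpora.items():
        counts = Counter(tokens)
        total = sum(counts.values())
        for word, count in counts.items():
            # Relative frequency in this source, scaled by the source weight.
            scores[word] += weights[name] * count / total
    # Rank by score (descending); ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:vocab_size]]

# Toy example: a clean "news" source and a noisy "web" source.
corpora = {
    "news": "le ministre a parle le soir".split(),
    "web": "le lol lol lol ministre http".split(),
}
weights = {"news": 0.5, "web": 0.5}
vocab = select_vocabulary(corpora, weights, 3)
```

In this toy setting the noisy token `lol` enters the selected vocabulary purely because it is frequent in the web source, which illustrates the kind of undesirable selection that motivates using a learned selector with richer information than frequency alone.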
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jouvet, D., Langlois, D. (2013). A Machine Learning Based Approach for Vocabulary Selection for Speech Transcription. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_9
DOI: https://doi.org/10.1007/978-3-642-40585-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3
eBook Packages: Computer Science (R0)