Topic-Dependent Language Model Switching for Embedded Automatic Speech Recognition

Santos-Pérez, Marcos; González-Parada, Eva; Cano-García, José Manuel

doi:10.1007/978-3-642-28783-1_30


Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 153)


Abstract

Thanks to their increasing computational power, embedded devices gain new applications in different domains every day. Many of these applications have a voice interface that uses Automatic Speech Recognition (ASR). When the language model is highly complex, it is common to run recognition on an external server, at the cost of certain limitations (network availability, latency, etc.). This paper presents a new proposal for using the language model more efficiently in a recognizer that covers multiple domains. The idea is to select an appropriate language model for each domain within the ASR system.
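The abstract does not say how the language model for a domain is chosen, so the following is only a minimal sketch of the general idea under stated assumptions: each domain has its own small language model, a first-pass hypothesis is scored against every model, and the best-scoring domain's model is then used to rank candidate transcriptions. The toy corpora, the add-one-smoothed unigram models, and the names `train_unigram`, `select_topic`, and `rescore` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical in-domain training sentences (illustrative only, not the paper's data).
CORPORA = {
    "weather": [
        "what is the weather today",
        "will it rain tomorrow",
        "how cold is it outside",
    ],
    "travel": [
        "book a flight to boston",
        "i need a hotel tomorrow",
        "what time does the flight leave",
    ],
}


def train_unigram(sentences):
    """Return (word counts, total token count) for one domain."""
    counts = Counter(word for s in sentences for word in s.split())
    return counts, sum(counts.values())


# One small language model per domain (assumption: unigram models for brevity).
MODELS = {topic: train_unigram(sents) for topic, sents in CORPORA.items()}
VOCAB = {word for counts, _ in MODELS.values() for word in counts}


def log_prob(sentence, model):
    """Add-one-smoothed unigram log-probability of a sentence under one model."""
    counts, total = model
    vocab_size = len(VOCAB) + 1  # one extra slot for unseen words
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in sentence.split()
    )


def select_topic(first_pass_text):
    """Pick the domain whose model assigns the first-pass text the highest score."""
    return max(MODELS, key=lambda topic: log_prob(first_pass_text, MODELS[topic]))


def rescore(candidates, topic):
    """Rank candidate transcriptions with the selected domain's model only."""
    return max(candidates, key=lambda c: log_prob(c, MODELS[topic]))


topic = select_topic("will it rain tomorrow")
best = rescore(["will it rein tomorrow", "will it rain tomorrow"], topic)
```

Only the selected domain's model is consulted in the second step, which is the efficiency argument the abstract makes for a multi-domain recognizer on a constrained device. A real system would use n-gram models and a trained topic classifier rather than this likelihood comparison.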



Author information

Authors and Affiliations

Electronic Technology Department, School of Telecommunications Engineering, University of Malaga, Teatinos Campus, 29071, Malaga, Spain
Marcos Santos-Pérez, Eva González-Parada & José Manuel Cano-García


Corresponding author

Correspondence to Marcos Santos-Pérez .

Editor information

Editors and Affiliations

Departamento de Informática, Universidade do Minho, Campus de Gualtar, Braga, 4710-057, Portugal
Paulo Novais
The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Campusvej 55, Odense M, DK-5230, Denmark
Kasper Hallenborg
Facultad de Ciencias, Departamento de Informática, Universidad de Salamanca, Plaza de la Merced S/N, Salamanca, 37008, Spain
Dante I. Tapia
Faculty of Science, Department of Computing Science, University of Salamanca, Plaza de la Merced S/N, Salamanca, 37008, Spain
Juan M. Corchado Rodríguez


About this paper

Cite this paper

Santos-Pérez, M., González-Parada, E., Cano-García, J.M. (2012). Topic-Dependent Language Model Switching for Embedded Automatic Speech Recognition. In: Novais, P., Hallenborg, K., Tapia, D., Rodríguez, J. (eds) Ambient Intelligence - Software and Applications. Advances in Intelligent and Soft Computing, vol 153. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28783-1_30


DOI: https://doi.org/10.1007/978-3-642-28783-1_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28782-4
Online ISBN: 978-3-642-28783-1
eBook Packages: Engineering, Engineering (R0)
