Training HMM/ANN Hybrid Speech Recognizers by Probabilistic Sampling

Tóth, László; Kocsor, A.

doi:10.1007/11550822_93


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3696)

Abstract

Most machine learning algorithms are sensitive to class imbalances of the training data and tend to behave inaccurately on classes represented by only a few examples. The case of neural nets applied to speech recognition is no exception, but this situation is unusual in the sense that the neural nets here act as posterior probability estimators and not as classifiers. Most remedies designed to handle the class imbalance problem in classification invalidate the proof that justifies the use of neural nets as posterior probability models. In this paper we examine one of these, the training scheme called probabilistic sampling, and show that it is fortunately still applicable. First, we argue that theoretically it makes the net estimate scaled class-conditionals instead of class posteriors, but for the hidden Markov model speech recognition framework it causes no problems, and in fact fits it even better. Second, we will carry out experiments to show the feasibility of this training scheme. In the experiments we create and examine a transition between the conventional and the class-based sampling, knowing that in practice the conditions of the mathematical proofs are unrealistic. The results show that the optimal performance can indeed be attained somewhere in between, and is slightly better than the scores obtained in the traditional way.
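The transition between conventional and class-based sampling mentioned in the abstract can be illustrated with a small sketch. It assumes, as one plausible reading that the abstract itself does not spell out, that the two schemes are joined by a linear interpolation weight (called `lam` here): `lam = 0` draws classes according to their empirical priors, which mimics conventional training, while `lam = 1` draws every class with equal probability. By Bayes' rule, a net trained on uniformly sampled classes estimates values proportional to p(x|c) for a given input x, which corresponds to the "scaled class-conditionals" of the abstract. All function and parameter names below are hypothetical.

```python
import random


def sampling_distribution(class_counts, lam):
    """Class-selection probabilities for probabilistic sampling.

    Assumed form (not taken from the abstract): a linear mix of the
    uniform distribution (weight lam) and the empirical class priors
    (weight 1 - lam).
    """
    total = sum(class_counts.values())
    k = len(class_counts)
    return {c: lam / k + (1.0 - lam) * n / total
            for c, n in class_counts.items()}


def probabilistic_sample(examples_by_class, lam, n_draws, rng):
    """Draw (example, label) pairs in two steps: class first, then example."""
    counts = {c: len(xs) for c, xs in examples_by_class.items()}
    dist = sampling_distribution(counts, lam)
    classes = list(dist)
    weights = [dist[c] for c in classes]
    batch = []
    for _ in range(n_draws):
        # Step 1: pick a class according to the interpolated distribution.
        c = rng.choices(classes, weights=weights, k=1)[0]
        # Step 2: pick one training example uniformly from that class.
        batch.append((rng.choice(examples_by_class[c]), c))
    return batch


if __name__ == "__main__":
    # Toy imbalanced data: 90 examples of class "a", 10 of class "b".
    data = {"a": list(range(90)), "b": list(range(100, 110))}
    rng = random.Random(0)
    for lam in (0.0, 0.5, 1.0):
        batch = probabilistic_sample(data, lam, 1000, rng)
        share_b = sum(1 for _, c in batch if c == "b") / len(batch)
        print(f"lam={lam}: share of rare class in sample = {share_b:.2f}")
```

Intermediate values of `lam` raise the share of rare classes in the training stream without flattening the priors completely, which is the region where the abstract reports the best performance.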



Author information

Authors and Affiliations

Research Group on Artificial Intelligence, H-6720, Szeged, Aradi vértanúk tere 1, Hungary
László Tóth & A. Kocsor


Editor information

Editors and Affiliations

Department of Informatics, Nicolaus Copernicus University, Toruń, Poland
Włodzisław Duch
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447, Warsaw, Poland
Janusz Kacprzyk
Adaptive Informatics Research Centre, Helsinki University of Technology, P.O. Box 5400, 02015 HUT, Finland
Erkki Oja
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447, Warsaw, Poland
Sławomir Zadrożny


About this paper

Cite this paper

Tóth, L., Kocsor, A. (2005). Training HMM/ANN Hybrid Speech Recognizers by Probabilistic Sampling. In: Duch, W., Kacprzyk, J., Oja, E., Zadrożny, S. (eds) Artificial Neural Networks: Biological Inspirations – ICANN 2005. ICANN 2005. Lecture Notes in Computer Science, vol 3696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550822_93


DOI: https://doi.org/10.1007/11550822_93
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28752-0
Online ISBN: 978-3-540-28754-4
eBook Packages: Computer Science, Computer Science (R0)
