Machine Learning for Ambient Intelligence: Boosting in Automatic Speech Recognition

Meyer, Carsten; Beyerlein, Peter

doi:10.1007/978-94-017-0703-9_9

Part of the book series: Philips Research (PRBS, volume 2)

Abstract

An important aspect of Ambient Intelligence is a convenient user interface that supports several user-friendly input modalities. Speech is one of the most natural modalities for man-machine interaction. Numerous applications in the context of Ambient Intelligence, whether they rely on a single input modality or combine several, involve some pattern classification task. Experience shows that building successful and reliable real-life applications requires advanced classification algorithms that provide maximal accuracy for the underlying task. In this chapter, we investigate whether a generic machine learning technique, the boosting algorithm, can be applied successfully to increase accuracy in a ‘large-scale’ classification problem, namely large vocabulary automatic speech recognition. Specifically, we outline an approach to implementing the AdaBoost.M2 algorithm for training acoustic models in a state-of-the-art automatic speech recognizer. Detailed evaluations on a large vocabulary name recognition task show that this ‘utterance approach’ improves on the best test error rates obtained with standard training paradigms. In particular, we obtain additive performance gains when combining boosting with discriminative training, one of the most powerful training algorithms in speech recognition. Our findings motivate further applications of boosting to other classification tasks relevant for Ambient Intelligence.
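
For readers unfamiliar with AdaBoost.M2, the following is a minimal sketch of one boosting round in its generic multiclass form (as introduced by Freund and Schapire), not the chapter's utterance-level implementation for acoustic models, which is not described on this page. The function names `adaboost_m2_round` and `combine`, the list-of-lists data layout, and the toy numbers in the usage note are all assumptions made for illustration.

```python
import math

def adaboost_m2_round(D, h, labels):
    """One generic AdaBoost.M2 round (illustrative sketch, hypothetical API).

    D[i][y]   : mislabel weight for example i and wrong label y
                (0 where y == labels[i]); all weights sum to 1
    h[i][y]   : weak-hypothesis confidence in [0, 1] that example i has label y
    labels[i] : index of the correct label of example i
    Returns (pseudo_loss, beta, new_D).
    """
    n, k = len(D), len(D[0])
    # Pseudo-loss: 1/2 * sum over wrong labels of D * (1 - h(correct) + h(wrong))
    eps = 0.5 * sum(
        D[i][y] * (1.0 - h[i][labels[i]] + h[i][y])
        for i in range(n) for y in range(k) if y != labels[i]
    )
    if not 0.0 < eps < 0.5:
        raise ValueError("sketch assumes a pseudo-loss strictly between 0 and 1/2")
    beta = eps / (1.0 - eps)
    # Since beta < 1, mislabels the hypothesis already separates well
    # (large exponent) lose weight relative to the confusable ones.
    new_D = [[0.0] * k for _ in range(n)]
    for i in range(n):
        for y in range(k):
            if y != labels[i]:
                exponent = 0.5 * (1.0 + h[i][labels[i]] - h[i][y])
                new_D[i][y] = D[i][y] * beta ** exponent
    z = sum(map(sum, new_D))  # normalization constant
    new_D = [[w / z for w in row] for row in new_D]
    return eps, beta, new_D

def combine(hs, betas, i):
    """Final vote for example i: argmax_y sum_t log(1/beta_t) * h_t[i][y]."""
    k = len(hs[0][i])
    scores = [
        sum(math.log(1.0 / b) * h[i][y] for h, b in zip(hs, betas))
        for y in range(k)
    ]
    return max(range(k), key=scores.__getitem__)
```

As a usage example with made-up numbers: for two examples with correct labels `[0, 1]`, uniform mislabel weights of 0.25, and confidences `[[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]]`, the weak hypothesis gets the second example wrong, so after one round the weight on the pair (example 1, label 0) becomes the largest and the next weak learner is pushed to resolve that confusion.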

Authors

Carsten Meyer
Peter Beyerlein

Editor information

Editors and Affiliations

Philips Research Laboratories Eindhoven, Prof. Holstlaan 4, 5656 AA, Eindhoven, The Netherlands
Wim F. J. Verhaegh, Emile Aarts & Jan Korst

About this chapter

Cite this chapter

Meyer, C., Beyerlein, P. (2004). Machine Learning for Ambient Intelligence: Boosting in Automatic Speech Recognition. In: Verhaegh, W.F.J., Aarts, E., Korst, J. (eds) Algorithms in Ambient Intelligence. Philips Research, vol 2. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-0703-9_9

DOI: https://doi.org/10.1007/978-94-017-0703-9_9
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-6490-5
Online ISBN: 978-94-017-0703-9
eBook Packages: Springer Book Archive
