Machine Learning for Ambient Intelligence: Boosting in Automatic Speech Recognition

  • Chapter
Algorithms in Ambient Intelligence

Part of the book series: Philips Research (PRBS, volume 2)

Abstract

An important aspect of Ambient Intelligence is a convenient user interface that supports several user-friendly input modalities. Speech is one of the most natural modalities for man-machine interaction. Numerous applications in the context of Ambient Intelligence, whether they rely on a single input modality or combine several, involve some pattern classification task. Experience shows that successful and reliable real-life applications require advanced classification algorithms that maximize accuracy on the underlying task. In this chapter, we investigate whether a generic machine learning technique, the boosting algorithm, can be applied successfully to increase accuracy in a ‘large-scale’ classification problem, namely large vocabulary automatic speech recognition. Specifically, we outline an approach to implementing the AdaBoost.M2 algorithm for training the acoustic models of a state-of-the-art automatic speech recognizer. Detailed evaluations on a large vocabulary name recognition task show that this ‘utterance approach’ improves on the best test error rates obtained with standard training paradigms. In particular, we obtain additive performance gains when combining boosting with discriminative training, one of the most powerful training algorithms in speech recognition. Our findings motivate further applications of boosting to other classification tasks relevant for Ambient Intelligence.
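For readers unfamiliar with AdaBoost.M2, the following is a minimal sketch of one boosting round in its generic form, as introduced by Freund and Schapire. It is not the chapter's ‘utterance approach’, whose details are not part of this preview. The function names, the list-of-lists data layout, and the toy scores are assumptions made for illustration; in a speech recognizer the scores would presumably be derived from the acoustic models, which this sketch does not attempt to model.

```python
import math

def init_mislabel_weights(labels, num_classes):
    """Uniform weights over all (example, wrong label) pairs."""
    w = 1.0 / (len(labels) * (num_classes - 1))
    return [[0.0 if y == label else w for y in range(num_classes)]
            for label in labels]

def adaboost_m2_round(weights, scores, labels):
    """One AdaBoost.M2 round.

    scores[i][y] is the weak classifier's confidence in [0, 1] that
    example i has label y. Assumes 0 < pseudo-loss < 0.5.
    Returns (pseudo_loss, beta, new_weights).
    """
    n, k = len(labels), len(scores[0])
    # Pseudo-loss: weighted penalty for favouring wrong labels over the correct one.
    pseudo_loss = 0.0
    for i in range(n):
        correct = scores[i][labels[i]]
        for y in range(k):
            if y != labels[i]:
                pseudo_loss += 0.5 * weights[i][y] * (1.0 - correct + scores[i][y])
    beta = pseudo_loss / (1.0 - pseudo_loss)
    # Shrink the weight of well-separated pairs; confusable pairs keep more weight.
    new_weights = [[0.0] * k for _ in range(n)]
    for i in range(n):
        correct = scores[i][labels[i]]
        for y in range(k):
            if y != labels[i]:
                exponent = 0.5 * (1.0 + correct - scores[i][y])
                new_weights[i][y] = weights[i][y] * beta ** exponent
    total = sum(sum(row) for row in new_weights)
    new_weights = [[w / total for w in row] for row in new_weights]
    return pseudo_loss, beta, new_weights

def classify(per_round_scores, betas):
    """Final decision for one input: argmax_y of sum_t log(1/beta_t) * h_t(x, y)."""
    num_classes = len(per_round_scores[0])
    votes = [sum(math.log(1.0 / b) * s[y] for s, b in zip(per_round_scores, betas))
             for y in range(num_classes)]
    return max(range(num_classes), key=votes.__getitem__)

# Toy usage: two examples, three classes; the second example is misrecognized.
labels = [0, 1]
scores = [[0.9, 0.1, 0.0], [0.6, 0.3, 0.1]]
weights = init_mislabel_weights(labels, 3)
pseudo_loss, beta, weights = adaboost_m2_round(weights, scores, labels)
```

After this round, the largest weight sits on the pair (second example, wrong label 0), so the next weak classifier is pushed to resolve exactly that confusion.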




© 2004 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Meyer, C., Beyerlein, P. (2004). Machine Learning for Ambient Intelligence: Boosting in Automatic Speech Recognition. In: Verhaegh, W.F.J., Aarts, E., Korst, J. (eds) Algorithms in Ambient Intelligence. Philips Research, vol 2. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-0703-9_9

  • DOI: https://doi.org/10.1007/978-94-017-0703-9_9

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-6490-5

  • Online ISBN: 978-94-017-0703-9

  • eBook Packages: Springer Book Archive
