Iterative Optimization of the Data Driven Analysis in Continuous Speech

  • T. Kuhn
  • S. Kunzmann
  • E. Nöth
  • S. Rieck
  • E. Schukat-Talamazzini
Conference paper
Part of the NATO ASI Series book series (volume 75)

Abstract

We present an iterative method for optimizing the word recognition rate of a data-driven analysis of continuous speech, using a large set of speech samples. After a short description of our system environment, we discuss a bootstrapping method for iterative parameter estimation. The bootstrapping procedure is initialized with a limited amount of hand-labeled training data, from which the statistical parameters are estimated roughly. In the second step, the statistical parameters are estimated more precisely on the basis of unlabeled training data. We give experimental results for the bootstrapping method on unlabeled training data and compare them with results achieved by parameter estimation on labeled training data.
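
The two-step idea in the abstract (a rough estimate from a small labeled set, then re-estimation on automatically labeled data) can be illustrated with a minimal sketch. This is not the authors' system: it assumes one-dimensional features and one Gaussian per class instead of the statistical models used in the paper, and every name in it (`estimate`, `classify`, `bootstrap`, the `iterations` parameter) is hypothetical.

```python
import math
from collections import defaultdict


def estimate(samples):
    """Estimate a (mean, variance) pair per class from (value, label) pairs."""
    grouped = defaultdict(list)
    for value, label in samples:
        grouped[label].append(value)
    params = {}
    for label, values in grouped.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        # Assumed variance floor so a class with identical samples stays usable.
        params[label] = (mean, max(var, 1e-3))
    return params


def log_likelihood(value, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (value - mean) ** 2 / var)


def classify(value, params):
    """Return the class whose Gaussian gives the value the highest likelihood."""
    return max(params, key=lambda label: log_likelihood(value, *params[label]))


def bootstrap(labeled, unlabeled, iterations=5):
    """Step 1: rough estimate from labeled data. Step 2: iterative refinement."""
    params = estimate(labeled)
    for _ in range(iterations):
        # Label the unlabeled samples with the current parameters ...
        auto_labeled = [(v, classify(v, params)) for v in unlabeled]
        # ... then re-estimate from hand-labeled plus automatically labeled data.
        params = estimate(labeled + auto_labeled)
    return params
```

Keeping the hand-labeled samples in every re-estimation step is a design choice of this sketch: it guarantees that each class retains at least its initial samples even if no unlabeled sample is assigned to it.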

Copyright information

© Springer-Verlag Berlin Heidelberg 1992

Authors and Affiliations

  • T. Kuhn
    • 1
  • S. Kunzmann
    • 1
  • E. Nöth
    • 1
  • S. Rieck
  • E. Schukat-Talamazzini
    • 1
  1. Lehrstuhl für Informatik 5 (Mustererkennung), Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, F.R. of Germany