Fast Discriminative Training

Speaker Authentication

Part of the book series: Signals and Communication Technology ((SCT))

Abstract

A good training algorithm for pattern recognition needs to satisfy two criteria. First, the objective function must be tied to the desired performance; second, the parameter estimation process derived from that objective must be easy to compute with the available computational resources and must converge in the required time. For example, the expectation-maximization (EM) algorithm guarantees convergence, but its objective is not to minimize the error rate, which is what most applications want. On the other hand, many newer objective functions are defined to relate directly to the desired performance, but they are often too computationally complex to yield the desired results in a reasonable amount of time. Therefore, in real applications, defining an objective and deriving its estimation algorithm is a joint design process. This chapter presents an example in which a discriminative objective was defined together with its fast training algorithm.

Author information

Correspondence to Qi (Peter) Li.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Li, Q. (2012). Fast Discriminative Training. In: Speaker Authentication. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23731-7_13

  • DOI: https://doi.org/10.1007/978-3-642-23731-7_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23730-0

  • Online ISBN: 978-3-642-23731-7

  • eBook Packages: Engineering, Engineering (R0)
