Abstract
A good training algorithm for pattern recognition must satisfy two criteria. First, its objective function must be tied to the desired performance. Second, the parameter estimation procedure derived from that objective must be easy to compute with the available resources and must converge within the required time. For example, the expectation-maximization (EM) algorithm guarantees convergence, but its objective is not to minimize the error rate, which is what most applications want. Conversely, many newer objective functions are defined to reflect the desired performance directly, but they are often too computationally expensive to produce the desired results in a reasonable amount of time. For real applications, therefore, defining an objective and deriving its estimation algorithm is a joint design process. This chapter presents an example in which a discriminative objective was defined together with a fast training algorithm.
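The gap between a likelihood objective and an error-rate objective can be illustrated with a small sketch. This is not the chapter's algorithm. It assumes a hypothetical two-class problem with 1-D unit-variance Gaussian models, made-up toy data, and a sigmoid-smoothed error count as a stand-in for a discriminative objective. All function names and the smoothing parameter `alpha` are illustrative.

```python
import math

def log_gauss(x, mean, var=1.0):
    """Log density of a 1-D Gaussian (unit variance by default)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def ml_means(data):
    """Maximum-likelihood class means: the per-class sample averages."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in data if y == label]
        means[label] = sum(xs) / len(xs)
    return means

def ml_objective(data, means):
    """Average log-likelihood of each sample under its own class model."""
    return sum(log_gauss(x, means[y]) for x, y in data) / len(data)

def hard_error(data, means):
    """Fraction of samples for which the competing class scores higher."""
    wrong = sum(
        1 for x, y in data
        if log_gauss(x, means[1 - y]) > log_gauss(x, means[y])
    )
    return wrong / len(data)

def smoothed_error(data, means, alpha=1.0):
    """Sigmoid-smoothed error: a differentiable surrogate for hard_error."""
    total = 0.0
    for x, y in data:
        # d > 0 means the competing class outscores the true class.
        d = log_gauss(x, means[1 - y]) - log_gauss(x, means[y])
        total += 1.0 / (1.0 + math.exp(-alpha * d))
    return total / len(data)

# Hypothetical toy data: (value, class label). Class 0 contains one outlier.
data = [(-2.0, 0), (-1.0, 0), (0.0, 0), (4.0, 0),
        (1.0, 1), (2.0, 1), (3.0, 1)]

ml = ml_means(data)        # maximizes the likelihood objective
alt = {0: -1.0, 1: 2.0}    # hand-picked means that ignore the outlier

for name, means in (("ML", ml), ("alternative", alt)):
    print(name,
          "log-lik:", round(ml_objective(data, means), 4),
          "hard error:", round(hard_error(data, means), 4),
          "smoothed error:", round(smoothed_error(data, means), 4))
```

On this toy data the maximum-likelihood means score higher on likelihood yet misclassify more samples than the hand-picked alternative, which is the mismatch the abstract describes. The smoothed error is included only to show that an error-oriented objective can be made differentiable. How to optimize such an objective quickly is the design question the chapter addresses.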
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Li, Q. (2012). Fast Discriminative Training. In: Speaker Authentication. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23731-7_13
DOI: https://doi.org/10.1007/978-3-642-23731-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23730-0
Online ISBN: 978-3-642-23731-7
eBook Packages: Engineering, Engineering (R0)