Abstract
We present a discriminative learning framework for Gaussian mixture models (GMMs) applied to classification, based on the extended Baum-Welch (EBW) algorithm [1]. We suggest two criteria for discriminative optimization, namely the class conditional likelihood (CL) and the maximum margin (MM). In the experiments, we present results for synthetic data, broad phonetic classification, and a remote sensing application. The experiments show that CL-optimized GMMs (CL-GMMs) achieve lower classification performance than MM-optimized GMMs (MM-GMMs), whereas both discriminative GMMs (DGMMs) perform significantly better than generatively learned GMMs. We also show that the discriminatively parameterized GMM classifiers retain the generative ability to marginalize over missing features, a case where generative classifiers have an advantage over purely discriminative classifiers such as support vector machines or neural networks.
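The marginalization property mentioned above can be illustrated with a minimal sketch (not the authors' code; all names are illustrative): with diagonal covariances, marginalizing a Gaussian over missing features simply drops the missing dimensions, so a GMM classifier can score a partially observed vector using only the observed dimensions.

```python
import numpy as np

def _log_gauss(x, mu, sd):
    # Elementwise log N(x | mu, sd^2) for a diagonal-covariance Gaussian.
    return -0.5 * np.log(2.0 * np.pi * sd**2) - 0.5 * ((x - mu) / sd) ** 2

def gmm_log_likelihood(x, weights, means, stds):
    # x may contain np.nan for missing features; marginalization with a
    # diagonal covariance reduces to summing over observed dimensions only.
    obs = ~np.isnan(x)
    comp = [np.log(w) + _log_gauss(x[obs], mu[obs], sd[obs]).sum()
            for w, mu, sd in zip(weights, means, stds)]
    return np.logaddexp.reduce(comp)  # log-sum-exp over mixture components

def classify(x, class_models, log_priors):
    # Bayes decision rule: argmax_c log p(c) + log p(x_observed | c).
    scores = {c: log_priors[c] + gmm_log_likelihood(x, *class_models[c])
              for c in class_models}
    return max(scores, key=scores.get)

# Two toy classes, one component each; the second feature is missing.
models = {
    0: ([1.0], [np.zeros(2)], [np.ones(2)]),
    1: ([1.0], [np.full(2, 5.0)], [np.ones(2)]),
}
log_priors = {0: np.log(0.5), 1: np.log(0.5)}
print(classify(np.array([4.8, np.nan]), models, log_priors))  # → 1
```

A purely discriminative classifier (e.g. an SVM) would instead need imputation or a dedicated missing-data mechanism to score such an input.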
This work was supported by the Austrian Science Fund (project numbers P22488-N23 and S10604-N13).
References
Gopalakrishnan, P.S., Kanevsky, D., Nádas, A., Nahamoo, D.: An inequality for rational functions with applications to some statistical estimation problems. IEEE Transactions on Information Theory 37(1), 107–113 (1991)
Vapnik, V.: Statistical learning theory. Wiley & Sons, Chichester (1998)
Schölkopf, B., Smola, A.: Learning with kernels: Support Vector Machines, regularization, optimization, and beyond. MIT Press, Cambridge (2001)
Taskar, B., Guestrin, C., Koller, D.: Max-margin Markov networks. In: Advances in Neural Information Processing Systems, NIPS (2003)
Guo, Y., Wilkinson, D., Schuurmans, D.: Maximum margin Bayesian networks. In: International Conference on Uncertainty in Artificial Intelligence, UAI (2005)
Roos, T., Wettig, H., Grünwald, P., Myllymäki, P., Tirri, H.: On discriminative Bayesian network classifiers and logistic regression. Machine Learning 59, 267–296 (2005)
Sha, F., Saul, L.: Large margin Gaussian mixture modeling for phonetic classification and recognition. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP (2006)
Sha, F., Saul, L.: Comparison of large margin training to other discriminative methods for phonetic recognition by hidden Markov models. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 313–316 (2007)
Heigold, G., Deselaers, T., Schlüter, R., Ney, H.: Modified MMI/MPE: A direct evaluation of the margin in speech recognition. In: International Conference on Machine Learning (ICML), pp. 384–391 (2008)
Collobert, R., Sinz, F., Weston, J., Bottou, L.: Trading convexity for scalability. In: International Conference on Machine Learning (ICML), pp. 201–208 (2006)
Schlüter, R., Macherey, W., Müller, B., Ney, H.: Comparison of discriminative training criteria and optimization methods for speech recognition. Speech Communication 34, 287–310 (2001)
Bahl, L., Brown, P., de Souza, P., Mercer, R.: Maximum Mutual Information estimation of HMM parameters for speech recognition. In: IEEE Conf. on Acoustics, Speech, and Signal Proc., pp. 49–52 (1986)
Woodland, P., Povey, D.: Large scale discriminative training of hidden Markov models for speech recognition. Computer Speech and Language 16, 25–47 (2002)
Klautau, A., Jevtić, N., Orlitsky, A.: Discriminative Gaussian mixture models: A comparison with kernel classifiers. In: Inter. Conf. on Machine Learning (ICML), pp. 353–360 (2003)
Pernkopf, F., Van Pham, T., Bilmes, J.: Broad phonetic classification using discriminative Bayesian networks. Speech Communication 143(1), 123–138 (2008)
Bishop, C.M.: Pattern recognition and machine learning. Springer, Heidelberg (2006)
Pernkopf, F., Bouchaffra, D.: Genetic-based EM algorithm for learning Gaussian mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(8), 1344–1348 (2005)
Merialdo, B.: Phonetic recognition using hidden Markov models and maximum mutual information training. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 111–114 (1988)
Normandin, Y., Morgera, S.: An improved MMIE training algorithm for speaker-independent small vocabulary, continuous speech recognition. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 537–540 (1991)
Normandin, Y., Cardin, R., De Mori, R.: High-performance connected digit recognition using maximum mutual information estimation. IEEE Trans. on Speech and Audio Proc. 2(2), 299–311 (1994)
Lamel, L., Kassel, R., Seneff, S.: Speech database development: Design and analysis of the acoustic-phonetic corpus. In: DARPA Speech Recognition Workshop, Report No. SAIC-86/1546 (1986)
Crammer, K., Singer, Y.: On the algorithmic interpretation of multiclass kernel-based vector machines. Journal of Machine Learning Research 2, 265–292 (2001)
Jain, A., Chandrasekaran, B.: Dimensionality and sample size considerations in pattern recognition in practice. Handbook of Statistics, vol. 2. North-Holland, Amsterdam (1982)
Baum, L., Eagon, J.: An inequality with applications to statistical prediction for functions of Markov processes and to a model of ecology. Bull. Amer. Math. Soc. 73, 360–363 (1967)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Pernkopf, F., Wohlmayr, M. (2010). Large Margin Learning of Bayesian Classifiers Based on Gaussian Mixture Models. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol. 6323. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15939-8_4
Print ISBN: 978-3-642-15938-1
Online ISBN: 978-3-642-15939-8