The kernelHMM: Learning Kernel Combinations in Structured Output Domains

Roth, Volker; Fischer, Bernd

doi:10.1007/978-3-540-74936-3_44

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4713)

Abstract

We present a model for learning convex kernel combinations in classification problems with structured output domains. The main ingredient is a hidden Markov model which forms a layered directed graph. Each individual layer represents a multilabel version of nonlinear kernel discriminant analysis for estimating the emission probabilities. These kernel learning machines are equipped with a mechanism for finding convex combinations of kernel matrices. The resulting kernelHMM can handle multiple partial paths through the label hierarchy in a consistent way. Efficient approximation algorithms allow us to train the model on large-scale learning problems. Applied to the problem of document categorization, the method exhibits excellent predictive performance.
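As a rough illustration of what a convex combination of kernel matrices means, the following minimal sketch mixes two Gram matrices with nonnegative weights that sum to one. The choice of a linear and an RBF kernel, the toy data points, the hand-picked weights, and all function names are assumptions made for this example only; the paper learns the weights, and this sketch does not reproduce that learning procedure.

```python
import math

def linear_kernel(xs):
    """Gram matrix of the linear kernel k(x, y) = <x, y>."""
    return [[sum(a * b for a, b in zip(x, y)) for y in xs] for x in xs]

def rbf_kernel(xs, gamma=1.0):
    """Gram matrix of the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y))) for y in xs]
        for x in xs
    ]

def combine_kernels(kernels, weights):
    """Return sum_m weights[m] * kernels[m] for weights on the simplex.

    The weights must be nonnegative and sum to one, which makes the
    result a convex combination of the input Gram matrices.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel matrix")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to one")
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Toy data and hand-picked weights (illustrative assumptions, not learned).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
K_lin = linear_kernel(points)
K_rbf = rbf_kernel(points, gamma=0.5)
K = combine_kernels([K_lin, K_rbf], [0.25, 0.75])
```

Because each input is a symmetric positive semidefinite Gram matrix and the weights are nonnegative, the combined matrix `K` is again a valid kernel matrix, which is why the weights can be treated as parameters to be learned.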

Author information

Authors and Affiliations

ETH Zurich, Institute of Computational Science, Universität-Str. 6, CH-8092 Zurich
Volker Roth & Bernd Fischer

Editor information

Fred A. Hamprecht, Christoph Schnörr, Bernd Jähne

Cite this paper

Roth, V., Fischer, B. (2007). The kernelHMM: Learning Kernel Combinations in Structured Output Domains. In: Hamprecht, F.A., Schnörr, C., Jähne, B. (eds) Pattern Recognition. DAGM 2007. Lecture Notes in Computer Science, vol 4713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74936-3_44

DOI: https://doi.org/10.1007/978-3-540-74936-3_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74933-2
Online ISBN: 978-3-540-74936-3
eBook Packages: Computer Science, Computer Science (R0)
