The kernelHMM: Learning Kernel Combinations in Structured Output Domains

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4713)

Abstract

We present a model for learning convex kernel combinations in classification problems with structured output domains. The main ingredient is a hidden Markov model that forms a layered directed graph. Each individual layer represents a multilabel version of nonlinear kernel discriminant analysis for estimating the emission probabilities. These kernel learning machines are equipped with a mechanism for finding convex combinations of kernel matrices. The resulting kernelHMM can handle multiple partial paths through the label hierarchy in a consistent way. Efficient approximation algorithms allow us to train the model on large-scale learning problems. Applied to the problem of document categorization, the method exhibits excellent predictive performance.
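As a rough illustration of what a convex combination of kernel matrices means, the sketch below mixes two Gram matrices with non-negative weights that sum to one. It is not the paper's learning procedure: the weights are fixed by hand rather than learned, and the helper names (`linear_kernel`, `rbf_kernel`, `convex_combination`), the choice of base kernels, and the toy data are all assumptions made for this example.

```python
import numpy as np


def linear_kernel(X):
    """Linear Gram matrix for the rows of X."""
    return X @ X.T


def rbf_kernel(X, gamma):
    """Gaussian (RBF) Gram matrix for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def convex_combination(kernels, weights):
    """Return sum_m weights[m] * kernels[m] for weights on the simplex.

    Hypothetical helper: the weights are supplied by the caller here,
    whereas the paper describes a mechanism that finds them during training.
    """
    weights = np.asarray(weights, dtype=float)
    if len(weights) != len(kernels):
        raise ValueError("need exactly one weight per kernel matrix")
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * K for w, K in zip(weights, kernels))


# Toy data (assumed): 5 samples with 3 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

kernels = [linear_kernel(X), rbf_kernel(X, gamma=0.5)]
K = convex_combination(kernels, [0.3, 0.7])
```

Because each base matrix is symmetric positive semidefinite and the weights are non-negative, the combined matrix `K` is again a valid kernel matrix, which is what allows it to be passed to any kernel method in place of a single kernel.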

Editor information

Fred A. Hamprecht, Christoph Schnörr, Bernd Jähne

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Roth, V., Fischer, B. (2007). The kernelHMM: Learning Kernel Combinations in Structured Output Domains. In: Hamprecht, F.A., Schnörr, C., Jähne, B. (eds) Pattern Recognition. DAGM 2007. Lecture Notes in Computer Science, vol 4713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74936-3_44

  • DOI: https://doi.org/10.1007/978-3-540-74936-3_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74933-2

  • Online ISBN: 978-3-540-74936-3

  • eBook Packages: Computer Science, Computer Science (R0)
