Skip to main content

Regularized Linear Models in Stacked Generalization

  • Conference paper
Multiple Classifier Systems (MCS 2009)

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 5519))

Included in the following conference series:

Abstract

Stacked generalization is a flexible method for multiple classifier combination; however, it tends to overfit unless the combiner function is sufficiently smooth. Previous studies attempt to avoid overfitting by using a linear function at the combiner level. This paper demonstrates experimentally that even with a linear combination function, regularization is necessary to reduce overfitting and increase predictive accuracy. The standard linear least squares regression can be regularized with an L2 penalty (Ridge regression), an L1 penalty (lasso regression) or a combination of the two (elastic net regression). In multi-class classification, sparse linear models select and combine individual predicted probabilities instead of using complete probability distributions, allowing base classifiers to specialize in subproblems corresponding to different classes.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Dietterich, T.G.: Ensemble methods in machine learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  2. Roli, F., Giacinto, G., Vernazza, G.: Methods for designing multiple classifier systems. In: Kittler, J., Roli, F. (eds.) MCS 2001. LNCS, vol. 2096, pp. 78–87. Springer, Heidelberg (2001)

    Google Scholar 

  3. Kuncheva, L.I.: Combining Pattern Classifiers: Methods and Algorithms. Wiley-Interscience, Hoboken (2004)

    Book  MATH  Google Scholar 

  4. Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)

    Article  MATH  Google Scholar 

  5. Caruana, R., Niculescu-Mizil, A., Crew, G., Ksikes, A.: Ensemble selection from libraries of models. In: ICML 2004: Proceedings of the twenty-first international conference on Machine learning, p. 18. ACM, New York (2004)

    Google Scholar 

  6. Wolpert, D.H.: Stacked generalization. Neural Netw. 5, 241–259 (1992)

    Article  Google Scholar 

  7. Ting, K.M., Witten, I.H.: Issues in stacked generalization. Journal of Artificial Intelligence Research 10, 271–289 (1999)

    MATH  Google Scholar 

  8. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, Heidelberg (2003)

    MATH  Google Scholar 

  9. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society B 67, 301–320 (2005)

    Article  MathSciNet  MATH  Google Scholar 

  10. Caruana, R., Munson, A., Niculescu-Mizil, A.: Getting the most out of ensemble selection. In: ICDM 2006: Proceedings of the Sixth International Conference on Data Mining, Washington, DC, USA, pp. 828–833. IEEE Computer Society, Los Alamitos (2006)

    Google Scholar 

  11. Seewald, A.K.: How to make stacking better and faster while also taking care of an unknown weakness. In: ICML 2002: Proceedings of the Nineteenth International Conference on Machine Learning, pp. 554–561. Morgan Kaufmann Publishers Inc., San Francisco (2002)

    Google Scholar 

  12. Dietterich, T.G.: Approximate statistical test for comparing supervised classification learning algorithms. Neural Computation 10, 1895–1923 (1998)

    Article  Google Scholar 

  13. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 7, 1–30 (2006)

    MathSciNet  MATH  Google Scholar 

  14. Efron, B., Hastie, T., Johnstone, L., Tibshirani, R.: Least angle regression. Annals of Statistics 32, 407–499 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  15. Friedman, J., Hastie, T., Tibshirani, R.: Regularized paths for generalized linear models via coordinate descent. Technical report, Stanford (2008)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Reid, S., Grudic, G. (2009). Regularized Linear Models in Stacked Generalization. In: Benediktsson, J.A., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2009. Lecture Notes in Computer Science, vol 5519. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02326-2_12

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-02326-2_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02325-5

  • Online ISBN: 978-3-642-02326-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics