Input Decimation Ensembles: Decorrelation through Dimensionality Reduction

Oza, Nikunj C.; Tumer, Kagan

doi:10.1007/3-540-48219-9_24

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2096)

Included in the following conference series:

International Workshop on Multiple Classifier Systems

Abstract

Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many machine learning problems [4, 16]. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers [1, 14]. As such, reducing those correlations while keeping the base classifiers’ performance levels high is a promising research topic. In this paper, we describe input decimation, a method that decouples the base classifiers by training them with different subsets of the input features. In past work [15], we showed the theoretical benefits of input decimation and presented its application to a handful of real data sets. Here, we provide a systematic study of input decimation on synthetic data sets and analyze how the interaction between correlation and performance in base classifiers affects ensemble performance.
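
The idea of training each base classifier on a different subset of the input features can be sketched in a few lines of Python. The abstract does not say how the subsets are chosen or which base classifiers are used, so everything specific below is an assumption made for illustration: one base classifier per class, features ranked by absolute correlation with that class's indicator, a nearest-centroid base classifier, and summed scores as the combiner. The names `make_toy_data`, `select_features`, `CentroidClassifier`, `train_ensemble`, and `predict` are hypothetical.

```python
import random
from statistics import mean


def make_toy_data(seed=0, n_per_class=60, n_classes=3):
    """Synthetic data (assumed setup): class c shifts features 2c and 2c+1."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(n_classes):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(2 * n_classes)]
            row[2 * c] += 3.0
            row[2 * c + 1] += 3.0
            X.append(row)
            y.append(c)
    return X, y


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either sequence is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def select_features(X, y, target_class, k):
    """Assumed criterion: keep the k features most correlated with the class indicator."""
    indicator = [1.0 if label == target_class else 0.0 for label in y]
    n_features = len(X[0])
    scores = [abs_correlation([row[j] for row in X], indicator)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


class CentroidClassifier:
    """Nearest-centroid base classifier restricted to a feature subset."""

    def __init__(self, features):
        self.features = features
        self.centroids = {}

    def fit(self, X, y):
        for c in sorted(set(y)):
            rows = [[row[j] for j in self.features]
                    for row, label in zip(X, y) if label == c]
            self.centroids[c] = [mean(col) for col in zip(*rows)]
        return self

    def scores(self, row):
        # Negative squared distance to each class centroid: higher is better.
        x = [row[j] for j in self.features]
        return {c: -sum((a - b) ** 2 for a, b in zip(x, cen))
                for c, cen in self.centroids.items()}


def train_ensemble(X, y, k):
    """One base classifier per class, each seeing only its own k features."""
    return [CentroidClassifier(select_features(X, y, c, k)).fit(X, y)
            for c in sorted(set(y))]


def predict(ensemble, row):
    """Combine base classifiers by summing their per-class scores."""
    totals = {}
    for clf in ensemble:
        for c, s in clf.scores(row).items():
            totals[c] = totals.get(c, 0.0) + s
    return max(totals, key=totals.get)


if __name__ == "__main__":
    X, y = make_toy_data()
    ensemble = train_ensemble(X, y, k=2)
    accuracy = sum(predict(ensemble, r) == l for r, l in zip(X, y)) / len(y)
    print("features per base classifier:", [clf.features for clf in ensemble])
    print("training accuracy:", accuracy)
```

Because each base classifier sees a different slice of the input, their errors have less opportunity to coincide, which is the decorrelation effect the abstract refers to. The subset size `k` controls the trade-off: smaller subsets decorrelate more but may weaken each base classifier.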

References

1. K.M. Ali and M.J. Pazzani. On the link between error correlation and error reduction in decision tree ensembles. Technical Report 95-38, Department of Information and Computer Science, University of California, Irvine, 1995.
2. S.D. Bay. Combining nearest neighbor classifiers through multiple feature subsets. In Proc. 15th ICML, pages 415–425. Morgan Kaufmann, 1998.
3. J.O. Berger. Statistical Decision Theory and Bayesian Analysis. 2nd edition. Springer, New York, 1985.
4. L. Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.
5. T.G. Dietterich and G. Bakiri. Solving multiclass learning problems via error-correcting output codes. Journal of AI Research, 2:263–286, 1995.
6. T.G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40:139–158, Aug. 2000.
7. Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In Proc. 13th ICML, pages 148–156, Bari, Italy, 1996. Morgan Kaufmann.
8. I.T. Jolliffe. Principal Component Analysis. Springer-Verlag, 1986.
9. A. Krogh and J. Vedelsby. Neural network ensembles, cross validation and active learning. In G. Tesauro, D.S. Touretzky, and T.K. Leen, editors, Advances in Neural Information Processing Systems-7, pages 231–238. M.I.T. Press, 1995.
10. C.J. Merz. Classification and Regression by Combining Models. PhD thesis, University of California, Irvine, Irvine, CA, May 1998.
11. N.C. Oza and K. Tumer. Dimensionality reduction through classifier ensembles. Technical Report NASA-ARC-IC-1999-126, NASA Ames Research Center, 1999.
12. M.D. Richard and R.P. Lippmann. Neural network classifiers estimate Bayesian a posteriori probabilities. Neural Computation, 3(4):461–483, 1991.
13. K. Tumer and J. Ghosh. Analysis of decision boundaries in linearly combined neural classifiers. Pattern Recognition, 29(2):341–348, February 1996.
14. K. Tumer and J. Ghosh. Error correlation and error reduction in ensemble classifiers. Connection Science, Special Issue on Combining Artificial Neural Networks: Ensemble Approaches, 8(3 & 4):385–404, 1996.
15. K. Tumer and N.C. Oza. Decimated input ensembles for improved generalization. In Proc. of the Int. Joint Conf. on Neural Networks (IJCNN-99), 1999.
16. D.H. Wolpert. Stacked generalization. Neural Networks, 5:241–259, 1992.
17. Z. Zheng and G.I. Webb. Stochastic attribute selection committees. In Proc. of the 11th Australian Joint Conf. on AI (AI’98), pages 321–332, 1998.

Author information

Authors and Affiliations

Computer Science Division, University of California, Berkeley, CA, 94720-1776, USA
Nikunj C. Oza
Computational Sciences Division, NASA Ames Research Center, Mail Stop 269-3, Moffett Field, CA, 94035-1000, USA
Kagan Tumer

Editor information

Editors and Affiliations

Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, Surrey, GU2 7XH, UK
Josef Kittler
Department of Electrical and Electronic Engineering, University of Cagliari, Piazza d’Armi, 09123, Cagliari, Italy
Fabio Roli

Cite this paper

Oza, N.C., Tumer, K. (2001). Input Decimation Ensembles: Decorrelation through Dimensionality Reduction. In: Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2001. Lecture Notes in Computer Science, vol 2096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48219-9_24

DOI: https://doi.org/10.1007/3-540-48219-9_24
Published: 22 June 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42284-6
Online ISBN: 978-3-540-48219-2
eBook Packages: Springer Book Archive
