Abstract
Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many machine learning problems [4, 16]. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers [1, 14]. Reducing those correlations while keeping the base classifiers’ performance levels high is therefore a promising research topic. In this paper, we describe input decimation, a method that decouples the base classifiers by training them with different subsets of the input features. In past work [15], we showed the theoretical benefits of input decimation and presented its application to a handful of real data sets. Here, we provide a systematic study of input decimation on synthetic data sets and analyze how the interaction between correlation and performance in base classifiers affects ensemble performance.
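The core idea of training each base classifier on a different subset of the input features can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not say how the feature subsets are chosen, so the sketch picks them at random. The nearest-centroid base classifier, the majority-vote combiner, the synthetic data generator, and all names below are hypothetical choices made only for this example.

```python
import random
from collections import Counter


def make_data(n, n_features, seed):
    """Two-class synthetic data: class 1 has every feature mean shifted by 1.0."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(float(label), 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


class CentroidClassifier:
    """Nearest-centroid classifier that only looks at the given feature indices."""

    def __init__(self, features):
        self.features = list(features)
        self.centroids = {}

    def fit(self, X, y):
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [
                sum(r[f] for r in rows) / len(rows) for f in self.features
            ]
        return self

    def predict_one(self, x):
        def sq_dist(label):
            centroid = self.centroids[label]
            return sum((x[f] - c) ** 2 for f, c in zip(self.features, centroid))

        return min(self.centroids, key=sq_dist)


def train_ensemble(X, y, n_members, subset_size, seed=0):
    """Train each member on its own feature subset (random here: an assumption)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    return [
        CentroidClassifier(rng.sample(range(n_features), subset_size)).fit(X, y)
        for _ in range(n_members)
    ]


def ensemble_predict(members, x):
    """Combine the members by simple majority vote."""
    votes = Counter(m.predict_one(x) for m in members)
    return votes.most_common(1)[0][0]


def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


X_train, y_train = make_data(200, 10, seed=1)
X_test, y_test = make_data(200, 10, seed=2)

# An odd number of members avoids tied votes in the two-class case.
members = train_ensemble(X_train, y_train, n_members=5, subset_size=4)
member_accs = [accuracy(m.predict_one, X_test, y_test) for m in members]
ens_acc = accuracy(lambda x: ensemble_predict(members, x), X_test, y_test)
```

Comparing `member_accs` with `ens_acc` for different values of `subset_size` gives a rough feel for the trade-off named in the abstract: smaller subsets tend to make the members' errors less correlated, but each member also sees less information.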
References
1. K.M. Ali and M.J. Pazzani. On the link between error correlation and error reduction in decision tree ensembles. Technical Report 95-38, Department of Information and Computer Science, University of California, Irvine, 1995.
2. S.D. Bay. Combining nearest neighbor classifiers through multiple feature subsets. In Proc. 15th ICML, pages 415–425. Morgan Kaufmann, 1998.
3. J.O. Berger. Statistical Decision Theory and Bayesian Analysis. (2nd Ed.), Springer, New York, 1985.
4. L. Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.
5. T.G. Dietterich and G. Bakiri. Solving multiclass learning problems via error-correcting output codes. Journal of AI Research, 2:263–286, 1995.
6. T.G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40:139–158, Aug. 2000.
7. Y. Freund and R. Schapire. Experiments with a new boosting algorithm. In Proc. 13th ICML, pages 148–156, Bari, Italy, 1996. Morgan Kaufmann.
8. I.T. Jolliffe. Principal Component Analysis. Springer-Verlag, 1986.
9. A. Krogh and J. Vedelsby. Neural network ensembles, cross validation and active learning. In G. Tesauro, D.S. Touretzky, and T.K. Leen, editors, Advances in Neural Information Processing Systems-7, pages 231–238. M.I.T. Press, 1995.
10. C.J. Merz. Classification and Regression by Combining Models. PhD thesis, University of California, Irvine, Irvine, CA, May 1998.
11. N.C. Oza and K. Tumer. Dimensionality reduction through classifier ensembles. Technical Report NASA-ARC-IC-1999-126, NASA Ames Research Center, 1999.
12. M.D. Richard and R.P. Lippmann. Neural network classifiers estimate Bayesian a posteriori probabilities. Neural Computation, 3(4):461–483, 1991.
13. K. Tumer and J. Ghosh. Analysis of decision boundaries in linearly combined neural classifiers. Pattern Recognition, 29(2):341–348, February 1996.
14. K. Tumer and J. Ghosh. Error correlation and error reduction in ensemble classifiers. Connection Science, Special Issue on Combining Artificial Neural Networks: Ensemble Approaches, 8(3 & 4):385–404, 1996.
15. K. Tumer and N.C. Oza. Decimated input ensembles for improved generalization. In Proc. of the Int. Joint Conf. on Neural Networks (IJCNN-99), 1999.
16. D.H. Wolpert. Stacked generalization. Neural Networks, 5:241–259, 1992.
17. Z. Zheng and G.I. Webb. Stochastic attribute selection committees. In Proc. of the 11th Australian Joint Conf. on AI (AI’98), pages 321–332, 1998.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oza, N.C., Tumer, K. (2001). Input Decimation Ensembles: Decorrelation through Dimensionality Reduction. In: Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2001. Lecture Notes in Computer Science, vol 2096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48219-9_24
DOI: https://doi.org/10.1007/3-540-48219-9_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42284-6
Online ISBN: 978-3-540-48219-2
eBook Packages: Springer Book Archive