Abstract
We have investigated the performance of a generalisation error predictor, Gest, in the context of error-correcting output coding (ECOC) ensembles based on multi-layer perceptrons. Each dichotomy associated with a column of an ECOC code matrix is trained on a bootstrap sample of the training set. Gest uses the out-of-bootstrap samples to efficiently estimate the mean column error on an independent test set, and hence the test error. This estimate can then be used to select a suitable complexity for the base classifiers in the ensemble. An experimental evaluation on benchmark datasets with added classification noise shows that over-fitting can be detected, and a comparison is made with the Q measure of ensemble diversity.
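The out-of-bootstrap estimate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: a nearest-mean classifier stands in for the paper's multi-layer perceptrons, and the code matrix, data, and function names are hypothetical.

```python
import numpy as np

def train_nearest_mean(X, y):
    # Class-conditional means for one two-class dichotomy (stand-in base learner).
    return {c: X[y == c].mean(axis=0) for c in (0, 1)}

def predict_nearest_mean(model, X):
    # Assign each point to the nearer class mean.
    d0 = np.linalg.norm(X - model[0], axis=1)
    d1 = np.linalg.norm(X - model[1], axis=1)
    return (d1 < d0).astype(int)

def gest_estimate(X, y, code_matrix, rng):
    """Average out-of-bootstrap error over the ECOC columns.

    Each column of the code matrix defines a two-class dichotomy; its base
    classifier is trained on a bootstrap sample of the training set and
    scored on the points left out of that sample."""
    n = len(X)
    column_errors = []
    for col in code_matrix.T:                  # one dichotomy per column
        y_bin = col[y]                         # relabel the classes as 0/1
        idx = rng.integers(0, n, size=n)       # bootstrap sample (with replacement)
        oob = np.setdiff1d(np.arange(n), idx)  # out-of-bootstrap points (~36.8%)
        model = train_nearest_mean(X[idx], y_bin[idx])
        preds = predict_nearest_mean(model, X[oob])
        column_errors.append(np.mean(preds != y_bin[oob]))
    return float(np.mean(column_errors))       # estimated mean column error

# Toy 3-class data and a 3x4 code matrix (rows: classes, columns: dichotomies).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.5, size=(30, 2)) for c in range(3)])
y = np.repeat(np.arange(3), 30)
code = np.array([[0, 0, 1, 1],
                 [0, 1, 0, 1],
                 [1, 1, 1, 0]])
err = gest_estimate(X, y, code, rng)
```

Because roughly a third of the training points fall outside each bootstrap sample, the estimate comes essentially for free with training; repeating it at several base-classifier complexities gives the model-selection signal the abstract describes.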
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Prior, M., Windeatt, T. (2005). Over-Fitting in Ensembles of Neural Network Classifiers Within ECOC Frameworks. In: Oza, N.C., Polikar, R., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2005. Lecture Notes in Computer Science, vol 3541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11494683_29
DOI: https://doi.org/10.1007/11494683_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26306-7
Online ISBN: 978-3-540-31578-0