Abstract
We propose and study a new technique for aggregating an ensemble of bootstrapped classifiers. The method seeks a linear combination of the base-classifiers whose weights are optimized to reduce variance; these minimum-variance combinations are computed by quadratic programming. The optimization technique is borrowed from mathematical finance, where it is known as Markowitz mean-variance portfolio optimization. We test the new method on a number of binary classification problems from the UCI repository, using a Support Vector Machine (SVM) as the base-classifier learning algorithm. Our results indicate that the proposed technique can consistently outperform Bagging and can dramatically improve SVM performance even in cases where Bagging fails to improve the base-classifier.
The research was supported by the fund for promotion of research at the Technion and by the Ollendorff center.
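To make the aggregation step concrete, the following Python sketch trains an ensemble of bootstrapped SVMs, estimates the covariance of their real-valued outputs on a held-out set, and solves the minimum-variance quadratic program over the weight simplex. This is a minimal illustration under stated assumptions: the helper name variance_optimized_bagging, the use of held-out decision values for the covariance estimate, and the non-negativity ("no short sales") constraint are illustrative choices, not necessarily the exact formulation used in the paper.

import numpy as np
from scipy.optimize import minimize
from sklearn.svm import SVC
from sklearn.utils import resample


def variance_optimized_bagging(X_train, y_train, X_val, n_estimators=25, seed=0):
    """Bootstrapped SVMs combined with minimum-variance weights (labels assumed in {-1, +1})."""
    rng = np.random.RandomState(seed)
    models, val_scores = [], []
    for _ in range(n_estimators):
        # Bootstrap resample of the training set, one base SVM per sample.
        Xb, yb = resample(X_train, y_train, random_state=rng)
        clf = SVC(kernel="rbf", gamma="scale").fit(Xb, yb)
        models.append(clf)
        # Real-valued SVM margins on a held-out set, used to estimate the covariance.
        val_scores.append(clf.decision_function(X_val))
    S = np.column_stack(val_scores)          # shape (n_val, B): one column per base classifier
    Sigma = np.cov(S, rowvar=False)          # B x B covariance of the base-classifier outputs

    # Markowitz-style minimum-variance weights: minimize w' Sigma w over the simplex.
    B = n_estimators
    w0 = np.full(B, 1.0 / B)                 # uniform weights correspond to plain Bagging
    res = minimize(
        lambda w: w @ Sigma @ w,
        w0,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * B,             # no "short selling" of classifiers
        constraints=({"type": "eq", "fun": lambda w: np.sum(w) - 1.0},),
    )
    w = res.x

    def predict(X):
        scores = np.column_stack([m.decision_function(X) for m in models])
        return np.where(scores @ w >= 0.0, 1, -1)   # sign of the weighted margin
    return predict, w

Calling predict(X_test) then returns the sign of the variance-optimized weighted margin; setting the weights back to uniform recovers ordinary Bagging, which makes the comparison between the two aggregation rules direct.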