Abstract
Bayesian and decision tree classifiers are among the most popular classifiers in the data mining community, and numerous researchers have recently examined their suitability for ensembles. Although many methods of ensemble creation have been proposed, there is as yet no clear picture of which method is best. In this work, we propose Bagged Voting, which trains on different bootstrap subsets of the same training dataset while combining a Bayesian and a decision tree algorithm through a voting methodology. In our algorithm, voters express the degree of their preference by using the probabilities of the classifiers' predictions as confidence scores; the confidence values are then summed for each candidate class, and the candidate with the highest sum wins the election. We compared the presented ensemble with other ensembles that use either the Bayesian or the decision tree classifier as the base learner, and we obtained better accuracy in most cases.
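The following is a minimal sketch of the voting scheme described above, written against scikit-learn. GaussianNB and DecisionTreeClassifier are stand-ins for the Bayesian and decision tree base learners; the bag count and other settings are illustrative assumptions, not values taken from the paper.

```python
# Sketch of bagged voting: train a Bayesian and a decision tree learner on
# each bootstrap sample, then combine all voters by summing their predicted
# class probabilities (graded votes); the class with the highest sum wins.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

def bagged_voting_fit(X, y, n_bags=10, random_state=0):
    """Train one Bayesian and one decision tree voter per bootstrap sample."""
    rng = np.random.RandomState(random_state)
    voters = []
    for _ in range(n_bags):
        # Stratified bootstrap keeps every class present in each bag,
        # so the probability columns of all voters line up.
        Xb, yb = resample(X, y, stratify=y,
                          random_state=rng.randint(2**31 - 1))
        voters.append(GaussianNB().fit(Xb, yb))
        voters.append(DecisionTreeClassifier().fit(Xb, yb))
    return voters

def bagged_voting_predict(voters, X):
    # Each voter casts its class probabilities as confidence scores;
    # sum them per candidate class and elect the class with the highest sum.
    scores = sum(clf.predict_proba(X) for clf in voters)
    return voters[0].classes_[np.argmax(scores, axis=1)]
```

Usage follows the usual fit/predict pattern, e.g. voters = bagged_voting_fit(X_train, y_train) followed by y_pred = bagged_voting_predict(voters, X_test).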
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kotsiantis, S.B., Pintelas, P.E. (2004). Bagged Voting Ensembles. In: Bussler, C., Fensel, D. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2004. Lecture Notes in Computer Science, vol. 3192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30106-6_17
DOI: https://doi.org/10.1007/978-3-540-30106-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22959-9
Online ISBN: 978-3-540-30106-6