Abstract
Multiple classifier systems have proven superior to individual classifiers for solving classification tasks. One of the main issues with such solutions lies in data size, when the amount of data to be analyzed becomes huge. In this paper, the ability of ensemble systems to succeed using only portions of the available data is analyzed. For this, extensive experimentation with homogeneous ensemble systems trained on 50% of the instances and 50% of the features is performed, using a bagging sampling scheme. Simple and weighted majority voting schemes are implemented to combine the classifier outputs. Experimental results over 25 datasets show the benefit of using multiple classifiers trained on limited data. The ensemble size and accuracy are compared against those of an individual model trained on the entire dataset.
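The scheme described in the abstract can be sketched in a few lines: each base learner sees a bootstrap sample of 50% of the instances (horizontal partition) and a random 50% of the features (vertical partition), and predictions are combined by simple or weighted majority voting. The sketch below is illustrative only, using a toy two-class dataset and a single-feature threshold "stump" as the base learner; the paper itself relies on R implementations (e1071, partykit, nnet), so all names here are assumptions.

```python
import random
from collections import Counter

random.seed(0)

def make_data(n):
    """Toy two-class data: class 0 near (0, 0), class 1 near (4, 4)."""
    X, y = [], []
    for _ in range(n):
        label = random.randint(0, 1)
        center = 0.0 if label == 0 else 4.0
        X.append([center + random.gauss(0, 1), center + random.gauss(0, 1)])
        y.append(label)
    return X, y

class Stump:
    """Threshold classifier on a single feature (illustrative base learner)."""
    def fit(self, X, y, feature):
        self.feature = feature
        m0 = [x[feature] for x, t in zip(X, y) if t == 0]
        m1 = [x[feature] for x, t in zip(X, y) if t == 1]
        # Threshold halfway between the two class means on this feature.
        self.threshold = (sum(m0) / len(m0) + sum(m1) / len(m1)) / 2
        return self
    def predict(self, x):
        return 1 if x[self.feature] > self.threshold else 0

def train_ensemble(X, y, n_models=11):
    models, weights = [], []
    n = len(X)
    for _ in range(n_models):
        # Horizontal partition: bootstrap sample of 50% of the instances.
        idx = [random.randrange(n) for _ in range(n // 2)]
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        # Vertical partition: random 50% of the features (1 of 2 here).
        feature = random.randrange(2)
        model = Stump().fit(Xs, ys, feature)
        # Training accuracy serves as the weight for weighted voting.
        acc = sum(model.predict(x) == t for x, t in zip(Xs, ys)) / len(Xs)
        models.append(model)
        weights.append(acc)
    return models, weights

def majority_vote(models, x):
    """Simple majority: each model casts one equal vote."""
    return Counter(m.predict(x) for m in models).most_common(1)[0][0]

def weighted_vote(models, weights, x):
    """Weighted majority: votes scaled by each model's weight."""
    score = {0: 0.0, 1: 0.0}
    for m, w in zip(models, weights):
        score[m.predict(x)] += w
    return max(score, key=score.get)

X, y = make_data(200)
models, weights = train_ensemble(X, y)
simple = sum(majority_vote(models, x) == t for x, t in zip(X, y)) / len(X)
weighted = sum(weighted_vote(models, weights, x) == t for x, t in zip(X, y)) / len(X)
print(simple, weighted)
```

Despite each base learner seeing only half of the rows and half of the columns, the combined vote recovers a strong decision on this separable toy problem, which is the effect the paper measures on real datasets.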
Notes
1. Package e1071: http://cran.r-project.org/web/packages/e1071/index.html.
2. Package partykit: http://cran.r-project.org/package=partykit.
3. Package nnet: http://cran.r-project.org/web/packages/nnet/index.html.
4. UCI Machine Learning Database Repository: http://archive.ics.uci.edu/ml/.
5. Statistical inference procedures: http://sci2s.ugr.es/sicidm/.
Acknowledgements
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 665959. In addition, Michał Woźniak was supported by the statutory funds of the Department of Systems and Computer Networks, Faculty of Electronics, Wrocław University of Science and Technology.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Mohammed, A.M., Onieva, E., Woźniak, M. (2020). Vertical and Horizontal Data Partitioning for Classifier Ensemble Learning. In: Burduk, R., Kurzynski, M., Wozniak, M. (eds) Progress in Computer Recognition Systems. CORES 2019. Advances in Intelligent Systems and Computing, vol 977. Springer, Cham. https://doi.org/10.1007/978-3-030-19738-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19737-7
Online ISBN: 978-3-030-19738-4
eBook Packages: Intelligent Technologies and Robotics