
Vertical and Horizontal Data Partitioning for Classifier Ensemble Learning

Conference paper in Progress in Computer Recognition Systems (CORES 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 977)

Abstract

Multiple classifier systems have proven superior to individual classifiers on classification tasks. One of the main issues with such solutions lies in data size: the amount of data to be analyzed can become huge. In this paper, the ability of an ensemble system to succeed using only portions of the available data is analyzed. To this end, extensive experimentation is performed with homogeneous ensemble systems trained on 50% of the instances and 50% of the features, using a bagging sampling scheme. Simple and weighted majority voting schemes are implemented to combine the classifier outputs. Experimental results over 25 datasets show the benefit of using multiple classifiers trained on limited data. The ensemble size and accuracy are compared against those of an individual model trained on the entire dataset.
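
The training scheme described above lends itself to a compact illustration. Since the Notes below point to R packages, R is assumed as the working language. The following is a minimal sketch, under stated assumptions, of a homogeneous ensemble trained on bagged samples of 50% of the instances (horizontal partitioning) and random 50% subsets of the features (vertical partitioning), combined by simple and weighted majority voting; the function names, the choice of partykit::ctree as the base learner, and the default ensemble size are illustrative, not the authors' exact implementation.

    # Minimal R sketch; assumes 'data' is a data.frame whose 'target'
    # column is a factor, as ctree() requires for classification.
    library(partykit)

    train_partitioned_ensemble <- function(data, target, n_models = 25) {
      n     <- nrow(data)
      feats <- setdiff(names(data), target)
      lapply(seq_len(n_models), function(i) {
        # Horizontal partition: bagging sample of 50% of the instances.
        rows <- sample(n, size = floor(0.5 * n), replace = TRUE)
        # Vertical partition: a random 50% of the features per model.
        cols <- sample(feats, size = floor(0.5 * length(feats)))
        list(model = ctree(reformulate(cols, response = target),
                           data = data[rows, c(cols, target)]),
             features = cols)
      })
    }

    # Simple majority vote over the members' predicted class labels
    # (assumes newdata has more than one row, so sapply yields a matrix).
    predict_majority <- function(ensemble, newdata) {
      votes <- sapply(ensemble, function(m)
        as.character(predict(m$model, newdata = newdata)))
      apply(votes, 1, function(v) names(which.max(table(v))))
    }

    # Weighted majority vote: member i's label counts with weight w[i],
    # e.g. its validation accuracy.
    predict_weighted <- function(ensemble, newdata, w) {
      votes <- sapply(ensemble, function(m)
        as.character(predict(m$model, newdata = newdata)))
      apply(votes, 1, function(v) names(which.max(tapply(w, v, sum))))
    }

Under these assumptions, ens <- train_partitioned_ensemble(df, "class") followed by predict_majority(ens, df) reproduces the simple-vote pipeline, while passing per-model validation accuracies as w switches to the weighted scheme.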

Notes

  1. Package e1071: http://cran.r-project.org/web/packages/e1071/index.html.

  2. Package partykit: http://cran.r-project.org/package=partykit.

  3. Package nnet: http://cran.r-project.org/web/packages/nnet/index.html.

  4. UCI Machine Learning Database Repository: http://archive.ics.uci.edu/ml/.

  5. Statistical inference procedures: http://sci2s.ugr.es/sicidm/.
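
The first three notes name the R packages behind the base learners used in the experiments. A hedged sketch of how each family might be instantiated follows; the data frame train, the factor target Species, and all hyperparameters are assumptions for illustration, not the paper's settings.

    library(e1071)    # svm(): support vector machines
    library(partykit) # ctree(): conditional inference trees
    library(nnet)     # nnet(): single-hidden-layer neural networks

    # Illustrative fits on an assumed data frame 'train' whose factor
    # column 'Species' is the class label.
    fit_base_learners <- function(train) {
      list(
        svm  = svm(Species ~ ., data = train, kernel = "radial"),
        tree = ctree(Species ~ ., data = train),
        net  = nnet(Species ~ ., data = train, size = 5, trace = FALSE)
      )
    }

The statistical comparisons pointed to in note 5 can be approximated in base R with wilcox.test(acc_a, acc_b, paired = TRUE), the Wilcoxon signed-rank test commonly used for pairwise classifier comparison.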

Acknowledgements

This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 665959. In addition, Michał Woźniak was supported by the statutory funds of the Department of Systems and Computer Networks, Faculty of Electronics, Wrocław University of Science and Technology.

Author information

Correspondence to Amgad M. Mohammed.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Mohammed, A.M., Onieva, E., Woźniak, M. (2020). Vertical and Horizontal Data Partitioning for Classifier Ensemble Learning. In: Burduk, R., Kurzynski, M., Wozniak, M. (eds) Progress in Computer Recognition Systems. CORES 2019. Advances in Intelligent Systems and Computing, vol 977. Springer, Cham. https://doi.org/10.1007/978-3-030-19738-4_10
