Ensembled Accuracies

Less Noise but More Risk of Overfit
  • Ton J. Cleophas
  • Aeilko H. Zwinderman


Ensemble learning refers to the simultaneous use of multiple learning algorithms to obtain better predictive power. Using SPSS Modeler, a 200-patient data file with 11 variables, mainly patients’ laboratory values and their subsequent outcome (death or survival), was analyzed with decision trees, logistic regression, Bayesian networks, discriminant analysis, nearest neighbors clustering, support vector machines, chi-square automatic interaction detection, and neural networks. The overall accuracies of the four best-fit models were used to compute an average accuracy and its errors. This average accuracy was considerably better than those of the separate algorithms.
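The averaging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS Modeler procedure used in the chapter: the function name `ensembled_accuracy`, the parameter `top_k`, the use of the standard error of the mean as the "error", and all accuracy values are hypothetical placeholders and are not taken from the chapter's data.

```python
from math import sqrt
from statistics import mean, stdev


def ensembled_accuracy(accuracies, top_k=4):
    """Average the top_k highest accuracies.

    Returns (mean accuracy, standard error of that mean).
    Assumption: the 'error' is taken to be the standard error of the mean.
    """
    if top_k < 2 or len(accuracies) < top_k:
        raise ValueError("need at least top_k >= 2 model accuracies")
    # Keep only the top_k best-performing models.
    best = sorted(accuracies.values(), reverse=True)[:top_k]
    avg = mean(best)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    se = stdev(best) / sqrt(top_k)
    return avg, se


# Placeholder accuracies for illustration only (NOT results from the chapter).
example = {
    "decision_tree": 0.80,
    "logistic_regression": 0.70,
    "bayesian_network": 0.90,
    "discriminant_analysis": 0.60,
    "neural_network": 0.85,
    "support_vector_machine": 0.75,
}

avg, se = ensembled_accuracy(example)
print(f"ensembled accuracy: {avg:.3f} (standard error {se:.3f})")
```

With real data, the dictionary would hold the overall accuracy that each fitted model achieves on the same patient file, and `top_k=4` would mirror the choice of the four best-fit models.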


  1. In this chapter SPSS Modeler, a workbench for automatic data mining and data modeling from IBM, was used. It is an analytics software application entirely distinct from SPSS statistical software, though it uses most if not all of its computational methods. It is a standard software package used particularly by market analysts but, as shown, it can perfectly well be applied for exploratory purposes in medical research. Alternatively, R statistical software, Knime (Konstanz Information Miner, machine learning software), the packages for support vector machines (LIBSVM) and ensembled support vector machines (ESVM), and many more software programs can be used for ensembled analyses.

Copyright information

© Springer International Publishing Switzerland 2017

Authors and Affiliations

  • Ton J. Cleophas (1)
  • Aeilko H. Zwinderman (2)
  1. Department of Medicine, Albert Schweitzer Hospital, Sliedrecht, The Netherlands
  2. Department of Epidemiology and Biostatistics, Academic Medical Center, Amsterdam, The Netherlands
