
Ensembles of Learning Machines

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2486)

Abstract

Ensembles of learning machines constitute one of the main current directions in machine learning research, and have been applied to a wide range of real-world problems. Despite the absence of a unified theory of ensembles, there are many theoretical reasons for combining multiple learners, as well as empirical evidence of the effectiveness of this approach. In this paper we present a brief overview of ensemble methods, explaining the main reasons why they are able to outperform any single classifier within the ensemble, and proposing a taxonomy based on the main ways base classifiers can be generated or combined.
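
As a purely illustrative aside (not part of the original paper), the sketch below shows two of the basic ingredients such a taxonomy organizes: base classifiers generated by bagging (bootstrap resampling of the training data) and combined by unweighted majority voting. The decision-stump base learner, the function names, and the parameters are hypothetical choices made here for compactness; class labels are assumed to be in {-1, +1}.

```python
# Illustrative sketch only (not from the paper): bagging + majority voting
# with one-level decision stumps as base classifiers, in plain NumPy.
import numpy as np

def train_stump(X, y):
    """Fit a decision stump: the (feature, threshold, sign) with fewest training errors."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] > t, sign, -sign)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best[1:]  # (feature index, threshold, sign)

def predict_stump(stump, X):
    j, t, sign = stump
    return np.where(X[:, j] > t, sign, -sign)

def bagged_ensemble(X, y, n_estimators=25, seed=0):
    """Generate base classifiers by training each one on a bootstrap resample (bagging)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stumps = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)  # sample n points with replacement
        stumps.append(train_stump(X[idx], y[idx]))
    return stumps

def predict_ensemble(stumps, X):
    """Combine the base classifiers by unweighted majority vote (labels in {-1, +1})."""
    votes = np.sum([predict_stump(s, X) for s in stumps], axis=0)
    return np.where(votes >= 0, 1, -1)
```

Calling `predict_ensemble(bagged_ensemble(X, y), X_test)` returns the majority-vote prediction; under the usual assumption that the base classifiers are reasonably accurate and not fully correlated, the combined vote tends to be more accurate than a typical single stump.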

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Valentini, G., Masulli, F. (2002). Ensembles of Learning Machines. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets. WIRN 2002. Lecture Notes in Computer Science, vol 2486. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45808-5_1

  • DOI: https://doi.org/10.1007/3-540-45808-5_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44265-3

  • Online ISBN: 978-3-540-45808-1

  • eBook Packages: Springer Book Archive
