Supervised Neural Networks and Ensemble Methods

  • Francesco Camastra
  • Alessandro Vinciarelli
Chapter
Part of the Advanced Information and Knowledge Processing book series (AI&KP)

Abstract

What the reader should know to understand this chapter:

  • Fundamentals of machine learning (Chap. 4).
  • Statistics (Appendix A).

Keywords

Hidden Layer · Activation Function · Recognition Rate · Hidden Node · Output Node


Copyright information

© Springer-Verlag London 2015

Authors and Affiliations

  1. Department of Science and Technology, Parthenope University of Naples, Naples, Italy
  2. School of Computing Science and the Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, UK
