Multilayer Perceptrons: Other Learning Techniques

Chapter in Neural Networks and Statistical Learning
Abstract

This chapter continues the treatment of the multilayer perceptron, focusing on second-order learning methods that speed up the training process. Complex-valued multilayer perceptrons and spiking neural networks are also introduced.
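
For orientation, the following minimal NumPy sketch shows what a second-order update of this kind can look like: Levenberg-Marquardt steps for a tiny one-hidden-layer perceptron on a regression task. It is an illustration only, not the chapter's specific formulation; the network size, the data, and the finite-difference Jacobian are assumptions made for brevity (practical implementations compute the Jacobian by backpropagation).

    # Illustrative Levenberg-Marquardt training of a tiny one-hidden-layer MLP.
    # Hypothetical example; not the algorithm developed in the chapter.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(50, 1))      # inputs
    y = np.sin(np.pi * X[:, 0])               # regression targets

    H = 5                                     # number of hidden units

    def unpack(w):
        """Split the flat parameter vector into layer weights."""
        W1 = w[:H].reshape(H, 1); b1 = w[H:2*H]
        W2 = w[2*H:3*H];          b2 = w[3*H]
        return W1, b1, W2, b2

    def forward(w, X):
        W1, b1, W2, b2 = unpack(w)
        h = np.tanh(X @ W1.T + b1)            # hidden layer
        return h @ W2 + b2                    # linear output

    w = rng.normal(scale=0.5, size=3*H + 1)   # flat parameter vector
    mu = 1e-2                                 # LM damping factor
    for it in range(200):
        r = forward(w, X) - y                 # residual vector
        # Finite-difference Jacobian of the residuals w.r.t. the parameters
        eps = 1e-6
        J = np.empty((len(r), len(w)))
        for j in range(len(w)):
            wp = w.copy(); wp[j] += eps
            J[:, j] = (forward(wp, X) - forward(w, X)) / eps
        # Levenberg-Marquardt step: (J^T J + mu I) dw = -J^T r
        A = J.T @ J + mu * np.eye(len(w))
        dw = np.linalg.solve(A, -J.T @ r)
        w_new = w + dw
        if np.sum((forward(w_new, X) - y)**2) < np.sum(r**2):
            w, mu = w_new, mu * 0.5           # accept step, reduce damping
        else:
            mu *= 2.0                         # reject step, increase damping

    print("final sum of squared errors:", np.sum((forward(w, X) - y)**2))

Near a minimum the damping term shrinks and the update approaches a Gauss-Newton step, which is the source of the speedup over plain gradient descent.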

Author information

Correspondence to Ke-Lin Du.

Copyright information

© 2019 Springer-Verlag London Ltd., part of Springer Nature

About this chapter

Cite this chapter

Du, KL., Swamy, M.N.S. (2019). Multilayer Perceptrons: Other Learning Techniques. In: Neural Networks and Statistical Learning. Springer, London. https://doi.org/10.1007/978-1-4471-7452-3_6
