Abstract
This chapter continues the treatment of the multilayer perceptron, focusing on second-order learning methods that accelerate training. Complex-valued multilayer perceptrons and spiking neural networks are also introduced.
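To make the flavor of a second-order method concrete, here is a minimal sketch of Levenberg-Marquardt training for a one-hidden-layer MLP. It is illustrative only: the function names, the finite-difference Jacobian, and the toy sine-fitting task are assumptions of this sketch, not the chapter's implementation (in practice the Jacobian is computed by backpropagation, as in Hagan & Menhaj, 1994).

```python
import numpy as np

def mlp_residuals(w, X, y, n_hidden):
    """Residuals e = y_hat - y for a tanh-hidden-layer MLP with linear output.
    The flat parameter vector w packs W1, b1, w2, b2 (a layout assumed here)."""
    n_in = X.shape[1]
    k = n_hidden * n_in
    W1 = w[:k].reshape(n_hidden, n_in)
    b1 = w[k:k + n_hidden]
    w2 = w[k + n_hidden:k + 2 * n_hidden]
    b2 = w[-1]
    h = np.tanh(X @ W1.T + b1)          # hidden-layer activations
    return h @ w2 + b2 - y              # output errors

def lm_step(w, X, y, n_hidden, mu):
    """One Levenberg-Marquardt update: solve (J^T J + mu*I) dw = -J^T e."""
    e = mlp_residuals(w, X, y, n_hidden)
    # Finite-difference Jacobian de/dw; kept for brevity, though analytic
    # backpropagation of the Jacobian is what a real implementation uses.
    eps = 1e-6
    J = np.empty((e.size, w.size))
    for j in range(w.size):
        wp = w.copy()
        wp[j] += eps
        J[:, j] = (mlp_residuals(wp, X, y, n_hidden) - e) / eps
    A = J.T @ J + mu * np.eye(w.size)   # damped Gauss-Newton Hessian
    return w + np.linalg.solve(A, -J.T @ e)

# Toy usage: fit y = sin(x) on a few points.
rng = np.random.default_rng(0)
X = np.linspace(-2, 2, 40).reshape(-1, 1)
y = np.sin(X).ravel()
n_hidden = 5
w = 0.5 * rng.standard_normal(n_hidden + n_hidden + n_hidden + 1)
mu = 1e-2
for _ in range(50):
    w_new = lm_step(w, X, y, n_hidden, mu)
    sse_new = np.sum(mlp_residuals(w_new, X, y, n_hidden) ** 2)
    sse_old = np.sum(mlp_residuals(w, X, y, n_hidden) ** 2)
    if sse_new < sse_old:
        w, mu = w_new, mu * 0.5   # accept step; trust the quadratic model more
    else:
        mu *= 2.0                 # reject step; fall back toward gradient descent
print("final SSE:", np.sum(mlp_residuals(w, X, y, n_hidden) ** 2))
```

The damping parameter mu interpolates between Gauss-Newton (small mu, fast near a minimum) and gradient descent (large mu, robust far from one), which is what distinguishes Levenberg-Marquardt from plain backpropagation.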
Copyright information
© 2019 Springer-Verlag London Ltd., part of Springer Nature
Cite this chapter
Du, K.-L., & Swamy, M. N. S. (2019). Multilayer Perceptrons: Other Learning Techniques. In: Neural Networks and Statistical Learning. Springer, London. https://doi.org/10.1007/978-1-4471-7452-3_6
DOI: https://doi.org/10.1007/978-1-4471-7452-3_6
Publisher Name: Springer, London
Print ISBN: 978-1-4471-7451-6
Online ISBN: 978-1-4471-7452-3
eBook Packages: Mathematics and Statistics (R0)