Abstract
Multilayer perceptrons (MLPs) are feedforward networks with one or more layers of units between the input and output layers. Each output unit represents a hyperplane in the space of the input patterns. The MLP architecture is illustrated in Fig. 4.1.
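To make the architecture and the error backpropagation rule concrete, here is a minimal NumPy sketch of a one-hidden-layer MLP trained by gradient descent on a squared-error cost. It is illustrative only: the logistic activation, the four hidden units, the learning rate eta = 0.5, and the XOR task are assumptions chosen for clarity, not material from the chapter.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# XOR: a classic task a single-layer perceptron cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Small random initial weights, separate bias vectors (illustrative sizes).
W1 = rng.normal(scale=0.5, size=(2, 4))   # input -> hidden
b1 = np.zeros(4)
W2 = rng.normal(scale=0.5, size=(4, 1))   # hidden -> output
b2 = np.zeros(1)
eta = 0.5                                 # learning rate (assumed value)

for epoch in range(5000):
    # Forward pass: each unit computes a weighted sum followed by a
    # sigmoidal squashing function.
    h = sigmoid(X @ W1 + b1)              # hidden activations
    o = sigmoid(h @ W2 + b2)              # network outputs

    # Backward pass: propagate the output error toward the input,
    # using sigma'(x) = sigma(x) * (1 - sigma(x)).
    delta_o = (o - y) * o * (1 - o)           # output-layer error term
    delta_h = (delta_o @ W2.T) * h * (1 - h)  # hidden-layer error term

    # Batch gradient-descent weight updates.
    W2 -= eta * h.T @ delta_o
    b2 -= eta * delta_o.sum(axis=0)
    W1 -= eta * X.T @ delta_h
    b1 -= eta * delta_h.sum(axis=0)

print(np.round(o, 2))  # should approach [[0], [1], [1], [0]]
```

With this seed and these hyperparameters the network typically converges to the XOR mapping; other initializations may settle in a poor local minimum, which is exactly the sensitivity to initial conditions that motivates much of the chapter's material on initialization and learning-rate adaptation.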
© 2014 Springer-Verlag London
Cite this chapter
Du, K.-L., & Swamy, M. N. S. (2014). Multilayer perceptrons: Architecture and error backpropagation. In Neural Networks and Statistical Learning. Springer, London. https://doi.org/10.1007/978-1-4471-5571-3_4
Print ISBN: 978-1-4471-5570-6
Online ISBN: 978-1-4471-5571-3