Abstract
We investigate and demonstrate the benefits of applying interior point methods (IPMs) to the supervised training of artificial neural networks. Specifically, three IPM algorithms are presented: a deterministic logarithmic barrier method (LB), a stochastic logarithmic barrier method (SB), and a quadratic trust region method. All three are applied to the training of supervised feedforward artificial neural networks. We treat neural network training as a nonlinear constrained optimization problem; in particular, we place bounds on the weights to avoid network paralysis. In the LB method, the search direction is derived using a recursive prediction error method (RPEM) that iteratively approximates the inverse of the Hessian of a logarithmic error function. The weight iterates move along a central trajectory in the interior of the feasible weight space, which yields good convergence properties. In the stochastic version (SB), a stochastic optimization procedure adds random fluctuations to the RPEM direction at each iteration in order to escape local minima; this technique can be viewed as a hybrid of the barrier function method and simulated annealing. In the third algorithm, we approximate the objective function by a convex quadratic and use a trust region method to find the optimal weights. Computational experiments on the approximation of discrete dynamical systems and on medical diagnosis problems are also reported.
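To make the setting concrete, the following is a minimal sketch of the bound-constrained training problem and its logarithmic barrier reformulation, on which the LB and SB methods operate. The notation (error function E, bounds l and u, barrier parameter mu) is introduced here for illustration and is not taken verbatim from the paper.

% Training with bounded weights: E(w) is the network error over the
% training set, and the box constraints keep the weights away from the
% saturation region of the activations (network paralysis).
\[
  \min_{w \in \mathbb{R}^n} \; E(w)
  \quad \text{subject to} \quad
  l_i \le w_i \le u_i, \qquad i = 1, \dots, n.
\]
% Logarithmic barrier reformulation: for a barrier parameter \mu > 0,
% minimize the unconstrained function
\[
  B_\mu(w) \;=\; E(w)
  \;-\; \mu \sum_{i=1}^{n} \log\!\bigl(w_i - l_i\bigr)
  \;-\; \mu \sum_{i=1}^{n} \log\!\bigl(u_i - w_i\bigr).
\]
% Driving \mu \to 0, the minimizers w(\mu) trace a central trajectory
% strictly inside the feasible box, matching the interior trajectory
% described in the abstract.

Each decrease of mu tightens the approximation to the original constrained problem, while the barrier terms keep every iterate strictly inside the weight bounds.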