Abstract
We investigate and demonstrate the benefits of applying interior point methods (IPMs) to the supervised training of artificial neural networks. Specifically, three IPM algorithms are presented: a deterministic logarithmic barrier method (LB), a stochastic logarithmic barrier method (SB), and a quadratic trust region method. All three are applied to the training of supervised feedforward artificial neural networks. We treat neural network training as a nonlinear constrained optimization problem; in particular, we place bounds on the weights to avoid network paralysis. In the LB method, the search direction is derived using a recursive prediction error method (RPEM) that iteratively approximates the inverse of the Hessian of a logarithmic error function. The weight iterates move along a central trajectory in the interior of the feasible weight space, which yields good convergence properties. In the stochastic version (SB), a stochastic optimization procedure adds random fluctuations to the RPEM direction at each iteration in order to escape local minima; this technique can be viewed as a hybrid of the barrier function method and simulated annealing. In the third algorithm, we approximate the objective function by a convex quadratic and use a trust region method to find the optimal weights. Computational experiments on the approximation of discrete dynamical systems and on medical diagnosis problems are also reported.
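To make the setting concrete, the following is a minimal sketch of the bound-constrained training problem and its logarithmic barrier reformulation, on which the LB and SB methods operate. The notation (error function E, bounds l and u, barrier parameter mu) is introduced here for illustration and is not taken verbatim from the paper.

% Training with bounded weights: E(w) is the network error over the
% training set, and the box constraints keep the weights away from the
% saturation region of the activations (network paralysis).
\[
  \min_{w \in \mathbb{R}^n} \; E(w)
  \quad \text{subject to} \quad
  l_i \le w_i \le u_i, \qquad i = 1, \dots, n.
\]
% Logarithmic barrier reformulation: for a barrier parameter \mu > 0,
% minimize the unconstrained function
\[
  B_\mu(w) \;=\; E(w)
  \;-\; \mu \sum_{i=1}^{n} \log\!\bigl(w_i - l_i\bigr)
  \;-\; \mu \sum_{i=1}^{n} \log\!\bigl(u_i - w_i\bigr).
\]
% Driving \mu \to 0, the minimizers w(\mu) trace a central trajectory
% strictly inside the feasible box, matching the interior trajectory
% described in the abstract.

Each decrease of mu tightens the approximation to the original constrained problem, while the barrier terms keep every iterate strictly inside the weight bounds.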