Interior Point Methods for Supervised Training of Artificial Neural Networks with Bounded Weights

Conference paper, in: Network Optimization

Abstract

We investigate and demonstrate the benefits of applying interior point methods (IPM) to the supervised training of artificial neural networks. Three IPM algorithms are presented in this paper: a deterministic logarithmic barrier method (LB), a stochastic logarithmic barrier method (SB), and a quadratic trust region method. These are applied to the training of supervised feedforward artificial neural networks. We treat neural network training as a nonlinearly constrained optimization problem, placing bounds on the weights to avoid network paralysis. In the LB method, the search direction is derived using a recursive prediction error method (RPEM) that iteratively approximates the inverse of the Hessian of a logarithmic error function. The weights move along a central trajectory in the interior of the feasible weight space, which yields good convergence properties. In the stochastic version, a stochastic optimization procedure adds random fluctuations to the RPEM direction at each iteration in order to escape local minima; this technique can be viewed as a hybrid of the barrier function method and a simulated annealing procedure. In the third algorithm, we approximate the objective function by a convex quadratic function and use a trust region method to find the optimal weights. Computational experiments on the approximation of discrete dynamical systems and on medical diagnosis problems are also reported.
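To make the constrained training problem concrete, here is a minimal sketch (an illustration only, not the authors' implementation): it trains a toy one-layer tanh network under box constraints |w_i| < B by adding a logarithmic barrier on the weights to a squared-error loss. Plain gradient steps stand in for the paper's RPEM search direction, and an annealing-style noise term mimics the stochastic variant's random fluctuations; the network, step size, and cooling schedules are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def barrier_loss(w, X, y, mu, B):
    """Squared error plus a log barrier keeping each weight inside (-B, B)."""
    pred = np.tanh(X @ w)                          # toy one-layer network (assumption)
    err = 0.5 * np.sum((pred - y) ** 2)
    barrier = -np.sum(np.log(B - w) + np.log(B + w))
    return err + mu * barrier

def barrier_grad(w, X, y, mu, B):
    """Gradient of barrier_loss with respect to the weights."""
    pred = np.tanh(X @ w)
    derr = X.T @ ((pred - y) * (1.0 - pred ** 2))  # chain rule through tanh
    dbar = 1.0 / (B - w) - 1.0 / (B + w)           # derivative of the barrier term
    return derr + mu * dbar

def train(X, y, B=5.0, mu=1.0, lr=0.05, temp=0.1, iters=2000):
    w = np.zeros(X.shape[1])                       # start strictly inside the box
    for _ in range(iters):
        g = barrier_grad(w, X, y, mu, B)
        noise = temp * rng.standard_normal(w.shape)  # SA-style fluctuation (SB variant)
        w = np.clip(w - lr * (g + noise), -B + 1e-6, B - 1e-6)  # stay strictly interior
        mu *= 0.999                                # shrink the barrier parameter
        temp *= 0.999                              # cool the random fluctuations
    return w

# tiny usage example on synthetic data
X = rng.standard_normal((200, 3))
y = np.tanh(X @ np.array([0.5, -1.0, 2.0]))
w_hat = train(X, y)
```

In the paper's LB method, the descent direction instead comes from an RPEM approximation of the inverse Hessian of the logarithmic error function, and the third algorithm would replace the gradient step with the minimizer of a convex quadratic model over a trust region.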






Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Trafalis, T.B., Tutunji, T.A., Couëllan, N.P. (1997). Interior Point Methods for Supervised Training of Artificial Neural Networks with Bounded Weights. In: Pardalos, P.M., Hearn, D.W., Hager, W.W. (eds) Network Optimization. Lecture Notes in Economics and Mathematical Systems, vol 450. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-59179-2_22

  • DOI: https://doi.org/10.1007/978-3-642-59179-2_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62541-4

  • Online ISBN: 978-3-642-59179-2
