
Determining Regularization Parameters for Derivative Free Neural Learning

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3587)

Abstract

Derivative free optimization methods have recently attracted considerable interest for neural learning. The curse of dimensionality in the neural learning problem makes local optimization methods attractive; however, the error surface contains many local minima. The discrete gradient method is a derivative free method based on bundle methods and has the ability to jump over many local minima. Two types of problems arise when local optimization methods are used for neural learning. The first is sensitivity to the initial point, which is commonly addressed with a hybrid model. Our earlier research has shown that combining the discrete gradient method with global methods such as evolutionary algorithms makes it even more attractive, and similar hybrid models have been studied by other researchers. A second, less frequently discussed problem is that of large weight values on the synaptic connections of the network. Large synaptic weights often lead to network paralysis and convergence problems, especially when a hybrid model is used to fine tune the learning task. In this paper we study and analyse the effect of different regularization parameters in our objective function, restricting the weight values without compromising classification accuracy.
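To make the idea concrete, the sketch below shows a generic weight-decay style penalty added to a neural learning objective. It is only an illustration under assumptions: the function and parameter names (regularized_error, lam) and the single-hidden-layer setup are hypothetical, and the specific penalty forms and parameter values studied in the paper may differ.

```python
import numpy as np

def regularized_error(weights, inputs, targets, lam=0.01):
    """Sum-of-squares error plus an L2 penalty on the weights.

    A generic weight-decay sketch, E(w) + lam * ||w||^2, not the paper's
    exact objective; `lam` stands in for the regularization parameter
    whose effect on weight size and accuracy the paper analyses.
    """
    w1, b1, w2, b2 = weights                 # single hidden layer, for illustration
    hidden = np.tanh(inputs @ w1 + b1)       # hidden-layer activations
    outputs = hidden @ w2 + b2               # network outputs
    data_error = 0.5 * np.sum((outputs - targets) ** 2)
    penalty = np.sum(w1 ** 2) + np.sum(w2 ** 2)   # discourages large synaptic weights
    return data_error + lam * penalty
```

Larger values of lam push the optimizer toward smaller synaptic weights, which is the trade-off against classification accuracy that the paper examines.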





Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ghosh, R., Ghosh, M., Yearwood, J., Bagirov, A. (2005). Determining Regularization Parameters for Derivative Free Neural Learning. In: Perner, P., Imiya, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2005. Lecture Notes in Computer Science, vol 3587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11510888_8


  • DOI: https://doi.org/10.1007/11510888_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26923-6

  • Online ISBN: 978-3-540-31891-0

  • eBook Packages: Computer Science (R0)
