
Determining Regularization Parameters for Derivative Free Neural Learning

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3587)

Abstract

Derivative free optimization methods have recently attracted considerable interest for neural learning. The curse of dimensionality in the neural learning problem makes local optimization methods attractive; however, the error surface contains many local minima. The discrete gradient method is a derivative free method based on bundle methods and has the ability to jump over many local minima. Two types of problems arise when local optimization methods are used for neural learning. The first is sensitivity to the initial point, which is commonly addressed with a hybrid model. Our earlier research has shown that combining the discrete gradient method with global methods such as evolutionary algorithms makes it even more attractive, and similar hybrid models have been studied by other researchers. A second, less frequently discussed problem is that of large weight values on the synaptic connections of the network. Large synaptic weights often lead to network paralysis and convergence problems, especially when a hybrid model is used to fine tune the learning task. In this paper we study and analyse the effect of different regularization parameters in our objective function, restricting the weight values without compromising classification accuracy.
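To make the idea concrete, the sketch below shows a generic weight-decay style penalty added to a neural learning objective. It is only an illustration under assumptions: the function and parameter names (regularized_error, lam) and the single-hidden-layer setup are hypothetical, and the specific penalty forms and parameter values studied in the paper may differ.

```python
import numpy as np

def regularized_error(weights, inputs, targets, lam=0.01):
    """Sum-of-squares error plus an L2 penalty on the weights.

    A generic weight-decay sketch, E(w) + lam * ||w||^2, not the paper's
    exact objective; `lam` stands in for the regularization parameter
    whose effect on weight size and accuracy the paper analyses.
    """
    w1, b1, w2, b2 = weights                 # single hidden layer, for illustration
    hidden = np.tanh(inputs @ w1 + b1)       # hidden-layer activations
    outputs = hidden @ w2 + b2               # network outputs
    data_error = 0.5 * np.sum((outputs - targets) ** 2)
    penalty = np.sum(w1 ** 2) + np.sum(w2 ** 2)   # discourages large synaptic weights
    return data_error + lam * penalty
```

Larger values of lam push the optimizer toward smaller synaptic weights, which is the trade-off against classification accuracy that the paper examines.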





Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ghosh, R., Ghosh, M., Yearwood, J., Bagirov, A. (2005). Determining Regularization Parameters for Derivative Free Neural Learning. In: Perner, P., Imiya, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2005. Lecture Notes in Computer Science, vol 3587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11510888_8


  • DOI: https://doi.org/10.1007/11510888_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26923-6

  • Online ISBN: 978-3-540-31891-0

  • eBook Packages: Computer Science (R0)
