Abstract
Derivative-free optimization methods have recently attracted considerable attention for neural learning. The curse of dimensionality in the neural learning problem makes local optimization methods very attractive; however, the error surface contains many local minima. The discrete gradient method is a derivative-free method based on bundle methods, and it has the ability to jump over many local minima. Two types of problems arise when local optimization methods are used for neural learning. The first is sensitivity to the initial weights, which is commonly addressed with a hybrid model. Our earlier research has shown that combining the discrete gradient method with global methods such as evolutionary algorithms makes them even more attractive, and such hybrid models have also been studied by other researchers. A second, less frequently discussed problem is that of large weight values on the synaptic connections of the network. Large synaptic weights often lead to network paralysis and convergence problems, especially when a hybrid model is used for fine-tuning the learning task. In this paper we study and analyse the effect of different regularization parameters in our objective function, aiming to restrict the weight values without compromising classification accuracy.
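The central idea of the paper, penalising large synaptic weights through a regularization term added to the training objective, can be sketched as follows. This is a generic illustration rather than the authors' exact objective: the quadratic (weight-decay) penalty, the parameter `lam`, and the mean-squared-error data term are assumptions for the sake of the example.

```python
import numpy as np

def mse(y_pred, y_true):
    # Plain data-fit term: mean squared error over the outputs.
    return np.mean((y_pred - y_true) ** 2)

def regularized_objective(weights, y_pred, y_true, lam):
    # Objective = data error + lam * (sum of squared weights).
    # A larger lam drives the optimiser toward smaller synaptic
    # weights, which helps avoid network paralysis during fine tuning;
    # lam = 0 recovers the unregularized error.
    penalty = lam * sum(np.sum(w ** 2) for w in weights)
    return mse(y_pred, y_true) + penalty
```

Sweeping `lam` over a range of values and recording both the resulting weight magnitudes and the classification accuracy is one way to locate a regularization setting that restricts the weights without hurting accuracy, which is the trade-off the paper investigates.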
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Ghosh, R., Ghosh, M., Yearwood, J., Bagirov, A. (2005). Determining Regularization Parameters for Derivative Free Neural Learning. In: Perner, P., Imiya, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2005. Lecture Notes in Computer Science(), vol 3587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11510888_8
Print ISBN: 978-3-540-26923-6
Online ISBN: 978-3-540-31891-0