Abstract
An important problem in learning with gradient descent algorithms (such as backpropagation) is the slowdown incurred by temporary minima (TM). We consider this problem for an artificial neural network trained to solve the XOR problem. The network is transformed into the equivalent all-permutations fuzzy rule-base, which provides a symbolic representation of the knowledge embedded in the network. We develop a mathematical model for the evolution of the fuzzy rule-base parameters during learning in the vicinity of TM. We show that the rule-base becomes singular and tends to remain singular in the vicinity of TM.
Our analysis suggests a simple remedy for overcoming the slowdown in the learning process incurred by TM. This is based on slightly perturbing the values of the training examples, so that they are no longer symmetric. Simulations demonstrate the usefulness of this approach.
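The remedy described above can be sketched in code. The snippet below trains a small sigmoid network on XOR with plain batch gradient descent, after adding a small random perturbation to the training inputs so the data set is no longer symmetric. The network size, learning rate, and perturbation magnitude `eps` are illustrative assumptions, not the authors' exact experimental values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard (symmetric) XOR training set.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# Break the symmetry with a small random perturbation of the inputs.
eps = 0.05  # illustrative magnitude, not taken from the paper
X_pert = X + eps * rng.standard_normal(X.shape)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, hidden=2, lr=0.5, epochs=5000, seed=1):
    """Batch gradient descent (backprop) on a 2-hidden-1 sigmoid network."""
    r = np.random.default_rng(seed)
    W1 = r.standard_normal((2, hidden)); b1 = np.zeros(hidden)
    W2 = r.standard_normal((hidden, 1)); b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)      # hidden-layer activations
        out = sigmoid(h @ W2 + b2)    # network output
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate mean-squared-error gradients.
        d_out = 2 * err * out * (1 - out) / len(X)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
    return losses

losses = train(X_pert, y)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

In this sketch the perturbation is applied once, before training; comparing the loss curves for `X` and `X_pert` over several random seeds would illustrate whether breaking the symmetry shortens the plateau near a temporary minimum.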
Research supported in part by research grants from the Israel Science Foundation (ISF) and the Israeli Ministry of Science and Technology.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Roth, I., Margaliot, M. (2009). Improving Training in the Vicinity of Temporary Minima. In: Cabestany, J., Sandoval, F., Prieto, A., Corchado, J.M. (eds) Bio-Inspired Systems: Computational and Ambient Intelligence. IWANN 2009. Lecture Notes in Computer Science, vol 5517. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02478-8_17
DOI: https://doi.org/10.1007/978-3-642-02478-8_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02477-1
Online ISBN: 978-3-642-02478-8