Abstract
Recursive Neural Networks are non-linear adaptive models that are able to learn deeply structured information. However, these models have not yet been broadly adopted, mainly because of their inherent complexity: not only are they intricate information-processing models, but their learning phase is also computationally expensive. The most popular training method for these models is back-propagation through structure, an algorithm that has proven ill-suited to structured processing because of convergence problems, while more sophisticated training methods improve the speed of convergence at the cost of a significantly higher computational burden. In this paper, we first analyse the underlying principles of these models in order to understand their computational power. Second, we propose an approximate second-order stochastic learning algorithm that dynamically adapts the learning rate throughout the training phase of the network without incurring excessive computational cost. The algorithm operates in both on-line and batch modes, and the resulting learning scheme is robust against the vanishing-gradient problem. The advantages of the proposed algorithm are demonstrated with a real-world application example.
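The abstract's key idea, an approximate second-order stochastic update with a per-parameter adaptive learning rate, can be sketched as follows. This is only an illustrative example, not the paper's exact algorithm: it approximates the diagonal of the curvature with a running average of squared gradients (in the spirit of stochastic diagonal Levenberg-Marquardt rules), and all names, hyper-parameters, and the toy objective are assumptions made for the sketch.

```python
import numpy as np

def approx_second_order_sgd(grad_fn, w, eta=0.1, mu=0.1, decay=0.9, steps=100):
    """Illustrative sketch of an adaptive-rate stochastic update.

    Not the paper's algorithm: the diagonal curvature is approximated
    cheaply by a running average of squared gradients (assumption).
    """
    h = np.zeros_like(w)  # running estimate of the diagonal curvature
    for _ in range(steps):
        g = grad_fn(w)
        # Squared gradients act as an inexpensive curvature proxy.
        h = decay * h + (1.0 - decay) * g * g
        # mu regularizes small curvature estimates: it prevents the step
        # from blowing up and keeps updates alive when gradients vanish.
        w = w - eta / (np.sqrt(h) + mu) * g
    return w

# Usage: minimize the toy quadratic f(w) = 0.5 * ||w||^2, whose gradient is w.
w_final = approx_second_order_sgd(lambda w: w, np.array([3.0, -2.0]))
```

Because the effective step `eta / (sqrt(h) + mu)` grows as the gradient magnitude shrinks, updates do not die out with the gradient, which is one plausible reading of the robustness against vanishing gradients claimed in the abstract.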
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Chinea, A. (2009). Understanding the Principles of Recursive Neural Networks: A Generative Approach to Tackle Model Complexity. In: Alippi, C., Polycarpou, M., Panayiotou, C., Ellinas, G. (eds) Artificial Neural Networks – ICANN 2009. ICANN 2009. Lecture Notes in Computer Science, vol 5768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04274-4_98
Print ISBN: 978-3-642-04273-7
Online ISBN: 978-3-642-04274-4