Abstract
The Multilayer Perceptron (MLP) is a neural network architecture that is widely used for regression, classification, and time series forecasting. One often-cited disadvantage of the MLP, however, is the difficulty associated with human understanding of a particular MLP’s function. This so-called black box limitation is due to the fact that the weights of the network reveal little about the structure of the function they implement. This paper proposes a method for understanding the structure of the function learned by MLPs that model functions of the class \(f:\{-1,1\}^n \rightarrow \mathbb {R}^m\). This includes regression and classification models. A Walsh decomposition of the function implemented by a trained MLP is performed and the coefficients analysed. The advantage of a Walsh decomposition is that it explicitly separates the contribution to the function made by each subset of input neurons. It also allows networks to be compared in terms of their structure and complexity. The method is demonstrated on some small toy functions and on the larger problem of the MNIST handwritten digit classification data set.
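The Walsh decomposition the abstract refers to can be sketched in a few lines. The following is a generic illustration of computing Walsh coefficients for a function on \(\{-1,1\}^n\) by exhaustive enumeration, not the paper's implementation; each coefficient \(\hat{f}(S)\) isolates the contribution of the subset \(S\) of inputs, which is what makes the structure of a trained model visible. (Enumeration over all \(2^n\) inputs is only feasible for small \(n\).)

```python
from itertools import combinations, product


def walsh_coefficients(f, n):
    """Walsh coefficients of f: {-1,1}^n -> R.

    Returns a dict mapping each subset S of input indices to
    f_hat(S) = 2^{-n} * sum_x f(x) * prod_{i in S} x_i.
    A nonzero f_hat(S) means the inputs in S jointly influence f;
    the size of S is the order of that interaction.
    """
    points = list(product([-1, 1], repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            total = 0.0
            for x in points:
                parity = 1
                for i in S:
                    parity *= x[i]  # Walsh basis function chi_S(x)
                total += f(x) * parity
            coeffs[S] = total / len(points)
    return coeffs


# Example: the XOR-like product x0 * x1 is a pure second-order
# interaction, so only the coefficient for S = (0, 1) is nonzero.
c = walsh_coefficients(lambda x: x[0] * x[1], 2)
```

Here `c[(0, 1)]` equals 1 while all other coefficients are 0, showing how the decomposition separates interaction orders.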
Notes
1. Complexity has a specific meaning in this context. It describes the number and order of the interactions between inputs that produce a function’s output.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Swingler, K. (2016). Opening the Black Box: Analysing MLP Functionality Using Walsh Functions. In: Merelo, J.J., Rosa, A., Cadenas, J.M., Dourado, A., Madani, K., Filipe, J. (eds) Computational Intelligence. IJCCI 2014. Studies in Computational Intelligence, vol 620. Springer, Cham. https://doi.org/10.1007/978-3-319-26393-9_18
Print ISBN: 978-3-319-26391-5
Online ISBN: 978-3-319-26393-9