Improving Generalisation Using Modular Neural Networks
This paper deals with improving the generalisation performance of feed forward neural networks (FFNNs) on real world data domains by using more complex architectures for modelling. The convention in neural networks is to use as small an architecture as possible, forcing better generalisation by modelling the underlying distribution and ignoring the details. This practice discards information from the training data which, in real world domains, may represent important though poorly represented decision regions. The problem with introducing extra free parameters (more neurons and weights) into a network is that over-fitting can occur, causing the network to model the training data too closely and generalise badly on new data from the same domain. This problem is overcome by combining a number of FFNNs (with small architectures) that have been trained on the same data, though generalise differently, to produce more complex decision regions and improved generalisation. Committee decision theory is used to produce the combined model and has been shown to give promising results in the past.
A real world medical data set consisting of non-discrete attribute values, and FFNNs trained using Back Propagation (BP), were used to test the validity of the concepts presented.
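The combination scheme described above can be sketched in code. The following is a minimal, hypothetical illustration of a committee decision over component networks: each small FFNN contributes its output vector, and the committee averages them before taking the class decision. Simple output averaging is only one common committee rule; the paper's exact combination method, data, and network outputs are not shown here, so all names and values below are illustrative assumptions.

```python
import numpy as np

def committee_predict(member_outputs):
    """Combine component-network outputs by committee averaging.

    member_outputs: array of shape (n_members, n_samples, n_classes),
    one softmax-style output vector per network per sample.
    Returns the committee's predicted class index per sample.
    """
    # Average the member outputs to form a more complex combined
    # decision surface than any single small network provides.
    combined = np.mean(member_outputs, axis=0)
    # The committee decision is the class with the highest average output.
    return np.argmax(combined, axis=1)

# Toy example: three hypothetical component networks scoring two samples.
outputs = np.array([
    [[0.9, 0.1], [0.4, 0.6]],   # network 1
    [[0.8, 0.2], [0.6, 0.4]],   # network 2
    [[0.7, 0.3], [0.3, 0.7]],   # network 3
])
print(committee_predict(outputs))  # -> [0 1]
```

On the second sample the individual networks disagree (network 2 prefers class 0), but the averaged outputs (0.43 vs. 0.57) resolve the vote, which is the intuition behind combining networks that generalise differently.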
Keywords: Back Propagation, Feed Forward Neural Network, Network Output, Neural Information Processing System, Component Network
- Hertz J., Krogh A., Palmer R.: Introduction to the Theory of Neural Computation, Santa Fe Institute, Addison-Wesley, 1991.
- LeBlanc M., Tibshirani R.: Combining Estimates in Regression and Classification, Univ. of Toronto Statistics Dept., Technical Report, 1993.
- McLean D., Bandar Z., O'Shea J.: Improved Interpolation and Extrapolation from Continuous Training Examples Using a New Neuronal Model with an Adaptive Steepness, 2nd Australian and New Zealand Conference on Intelligent Information Systems, IEEE, pp. 125–129, 1994.
- Martin G., Pittman J.: Recognizing Hand-Printed Letters and Digits, Advances in Neural Information Processing Systems II, pp. 405–414, 1990.
- Tesauro G., Sejnowski T.J.: A Parallel Network that Learns to Play Backgammon, Artificial Intelligence, No. 39, pp. 357–390, 1988.
- Morgan N., Bourlard H.: Generalization and Parameter Estimation in Feed Forward Nets: Some Experiments, Advances in Neural Information Processing Systems II, pp. 405–414, 1990.
- McLean D., Bandar Z., O'Shea J.: A Constructive Decision Boundary Modelling Algorithm, IASTED '98, Mexico, 1998.
- McLean D., Bandar Z., O'Shea J.: The Evolution of a Feed Forward Neural Network Trained under Back Propagation, ICANNGA '97, Springer-Verlag, 1997.
- Michie D., Spiegelhalter D.J., Taylor C.C.: Machine Learning, Neural and Statistical Classification, Ellis Horwood Series in Artificial Intelligence, Ellis Horwood, 1994.