Bayesian Regularization of Neural Networks
Bayesian regularized artificial neural networks (BRANNs) are more robust than standard back-propagation networks and can reduce or eliminate the need for lengthy cross-validation. Bayesian regularization is a mathematical process that converts a nonlinear regression into a “well-posed” statistical problem in the manner of ridge regression. The advantage of BRANNs is that the models are robust and that the validation process, which scales as O(N²) in conventional regression methods such as back-propagation, is unnecessary. These networks provide solutions to a number of problems that arise in QSAR modeling, such as choice of model, robustness of model, choice of validation set, size of validation effort, and optimization of network architecture. They are difficult to overtrain, since evidence procedures provide an objective Bayesian criterion for stopping training. They are also difficult to overfit, because the BRANN calculates and trains on an effective number of network parameters (weights), effectively turning off those that are not relevant. This effective number is usually considerably smaller than the number of weights in a standard fully connected back-propagation neural network. Automatic relevance determination (ARD) of the input variables can be used with BRANNs, allowing the network to “estimate” the importance of each input. The ARD method ensures that irrelevant or highly correlated indices used in the modeling are neglected, and it shows which variables are most important for modeling the activity data.
This chapter outlines the equations that define the BRANN method and presents a flowchart for producing a BRANN-QSAR model. Results of applying BRANNs to a number of data sets are illustrated and compared with those of other linear and nonlinear models.
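The "effective number of parameters" mentioned in the abstract can be illustrated with a minimal sketch. This is not the chapter's implementation: it assumes a linear model standing in for the network, a single weight-decay hyperparameter `alpha` and noise precision `beta`, and the standard evidence-framework re-estimation formulas gamma = Σ λᵢ/(λᵢ + α), α = γ/(2E_W), β = (N − γ)/(2E_D). The function name `evidence_ridge` and the synthetic data are hypothetical, chosen only for illustration. Three of the six inputs are near-copies of the other three, mimicking highly correlated descriptors.

```python
import numpy as np


def evidence_ridge(X, t, n_iter=50, alpha=1.0, beta=1.0):
    """Hypothetical helper: ridge regression with evidence-based
    re-estimation of alpha (weight penalty) and beta (noise precision)."""
    n_samples, n_weights = X.shape
    gram = X.T @ X
    # Eigenvalues of the data term of the Hessian (before scaling by beta);
    # clip tiny negative values caused by round-off.
    eig = np.clip(np.linalg.eigvalsh(gram), 0.0, None)
    for _ in range(n_iter):
        # Most probable weights for the current alpha and beta.
        A = beta * gram + alpha * np.eye(n_weights)
        w = beta * np.linalg.solve(A, X.T @ t)
        # Effective number of well-determined parameters.
        lam = beta * eig
        gamma = np.sum(lam / (lam + alpha))
        # Weight penalty and data misfit at the current solution.
        e_w = 0.5 * (w @ w)
        e_d = 0.5 * np.sum((t - X @ w) ** 2)
        # Re-estimate the hyperparameters from gamma.
        alpha = gamma / (2.0 * e_w)
        beta = (n_samples - gamma) / (2.0 * e_d)
    return w, alpha, beta, gamma


# Synthetic data (an assumption for illustration): three informative
# inputs plus three near-duplicates of them.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 3))
copies = base + 1e-3 * rng.normal(size=(100, 3))
X = np.hstack([base, copies])
true_coef = np.array([2.0, -1.0, 0.5])
t = base @ true_coef + 0.1 * rng.normal(size=100)

w, alpha, beta, gamma = evidence_ridge(X, t)
print(f"weights: {X.shape[1]}, effective parameters: {gamma:.2f}")
```

In this setup the model has six nominal weights but only three independent inputs, so `gamma` should settle near three rather than six. For an actual neural network the eigenvalues would come from the Hessian of the network error (or an approximation to it) rather than from `X.T @ X`.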
Keywords: QSAR; artificial neural network; Bayesian regularization; early stopping algorithm; automatic relevance determination (ARD); overtraining