Demonstration of the Bayesian Evidence Scheme for Regularisation
The Bayesian evidence approach to regularisation, derived in the previous chapter, is applied to the stochastic time series generated from the logistic-kappa map. The scheme is found to prevent overfitting and to stabilise the training process with respect to changes in the length of training time. For a small training set, it is also found to act as an automatic pruning scheme: the network complexity is reduced until all remaining parameters are well-determined by the data.
- 1. For the simulations reported here, this algorithm was slightly modified as follows. The order of steps 3 and 5 was inverted, and the hyperparameters αk were updated by finding the root of (10.75). Note that this equation is nonlinear in αk since the number of well-determined parameters λk on the right-hand side of (10.75) depends on αk via (10.72). In practice, a solution can easily be obtained by invocation of a root-finding algorithm, like Brent’s method (, Chapter 9). In this way the speed of the standard algorithm can be slightly improved.
- 2. The cross-validation ‘error’ was estimated on an independent test set of the same size as the training set.
- 3. This terminology is slightly imprecise, because the actual weight decay ‘constant’ is given by the ratio αk/βk.
- 4. Note that with a diverging weight-decay ‘constant’, λk, all the weights in the respective weight group decay to zero and leave the mapping implemented in the kth network branch completely misplaced. Consequently, the posterior probability for the kth component in the mixture will always be small, leading to a decay of the prior αk when updated with the EM algorithm.
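The hyperparameter update described in note 1 can be illustrated with a short sketch. Equations (10.72) and (10.75) are not reproduced here, so the code assumes the textbook evidence-framework forms: a count of well-determined parameters γ(α) = Σ_i λ_i / (λ_i + α), where the λ_i are Hessian eigenvalues, and the self-consistency condition α = γ(α) / (2 E_W), where E_W is the weight-decay energy of the weight group. The function names, eigenvalues and E_W value are hypothetical. To stay free of dependencies, the sketch swaps Brent's method for plain bisection on a logarithmic scale, which is slower but finds the same root.

```python
import math


def gamma(alpha, eigenvalues):
    """Assumed form of the well-determined parameter count:
    sum_i lam_i / (lam_i + alpha)."""
    return sum(lam / (lam + alpha) for lam in eigenvalues)


def residual(alpha, eigenvalues, e_w):
    """Zero exactly when alpha = gamma(alpha) / (2 * e_w) (assumed update rule)."""
    return 2.0 * alpha * e_w - gamma(alpha, eigenvalues)


def solve_alpha(eigenvalues, e_w, lo=1e-8, hi=1e8, tol=1e-12, max_iter=200):
    """Find the root of residual() by bisection on a log scale.

    Bisection stands in for Brent's method here; both need a bracket
    [lo, hi] on which the residual changes sign.
    """
    f_lo = residual(lo, eigenvalues, e_w)
    f_hi = residual(hi, eigenvalues, e_w)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = math.sqrt(lo * hi)  # geometric midpoint: alpha spans many decades
        f_mid = residual(mid, eigenvalues, e_w)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo <= tol * hi:
            break
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    # Hypothetical Hessian eigenvalues and weight-decay energy for one weight group.
    eigs = [50.0, 5.0, 0.5, 0.01]
    e_w = 0.8
    alpha = solve_alpha(eigs, e_w)
    print("alpha =", alpha, "gamma =", gamma(alpha, eigs))
```

Under these assumptions the residual increases monotonically in α, so the bracket contains exactly one root. If SciPy is available, `scipy.optimize.brentq(residual, lo, hi, args=(eigs, e_w))` could replace the bisection loop and would use Brent's method as the note suggests.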