Max-Pooling Dropout for Regularization of Convolutional Neural Networks
Recently, dropout has seen increasing use in deep learning. For deep convolutional neural networks, dropout is known to work well in fully-connected layers, but its effect in pooling layers is still not well understood. This paper shows that max-pooling dropout at training time is equivalent to randomly selecting an activation from each pooling region according to a multinomial distribution. In light of this insight, we advocate replacing the commonly used max-pooling at test time with our proposed probabilistic weighted pooling, which acts as a form of model averaging. Empirical evidence validates the superiority of probabilistic weighted pooling. We also compare max-pooling dropout with stochastic pooling, both of which introduce stochasticity at the pooling stage via multinomial distributions.
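The equivalence stated in the abstract can be made concrete: dropping each unit in a pooling region with probability p and then max-pooling picks the i-th smallest activation exactly when all larger ones are dropped and it survives, i.e. with probability (1-p)·p^(n-i); the test-time probabilistic weighted pooling is the expectation under that multinomial distribution. A minimal NumPy sketch of both (function and variable names are ours, not the paper's; it assumes non-negative activations, e.g. post-ReLU, so that an all-dropped region correctly pools to zero):

```python
import numpy as np

def max_pooling_dropout_train(region, retain_prob, rng):
    # Training: drop each activation with prob p = 1 - retain_prob, then max-pool.
    # With non-negative activations, the max over the masked region is the max
    # of the surviving units, or 0 if every unit was dropped.
    mask = rng.random(region.shape) < retain_prob
    return float((region * mask).max())

def multinomial_selection_probs(n, retain_prob):
    # P(the i-th smallest of n activations is the pooled output)
    #   = retain_prob * drop_prob**(n - i),  i = 1..n (ascending order):
    # all (n - i) larger activations are dropped and the i-th one survives.
    drop_prob = 1.0 - retain_prob
    return np.array([retain_prob * drop_prob ** (n - i) for i in range(1, n + 1)])

def probabilistic_weighted_pool(region, retain_prob):
    # Test time: expected value of max-pooling dropout over the pooling region,
    # i.e. the multinomial-probability-weighted sum of sorted activations.
    a = np.sort(region.ravel())            # ascending
    probs = multinomial_selection_probs(a.size, retain_prob)
    return float(np.dot(probs, a))         # all-dropped case contributes 0
```

For a 2x2 region with activations [1, 2, 3, 4] and retain probability 0.5, the selection probabilities are [0.0625, 0.125, 0.25, 0.5] (summing to 1 - p^4, with the remaining p^4 mass on the all-dropped outcome), so the weighted pool is 3.0625; averaging many stochastic training-time draws converges to the same value, which is the model-averaging argument for using this pooling at test time.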
Keywords: Deep learning · Convolutional neural network · Max-pooling dropout
This work was supported in part by National Natural Science Foundation of China under grant 61371148.