DropAll: Generalization of Two Convolutional Neural Network Regularization Methods

  • Xavier Frazão
  • Luís A. Alexandre
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8814)


We introduce DropAll, a generalization of DropOut [1] and DropConnect [2], for regularization of fully-connected layers within convolutional neural networks. Applying these methods amounts to sub-sampling a neural network by dropping units. When training with DropOut, a randomly selected subset of activations is dropped; when training with DropConnect, a randomly selected subset of weights is dropped. With DropAll we can perform both methods. We show the validity of our proposal by improving the classification error of networks trained with DropOut and DropConnect on a common image classification dataset. To further improve classification, we also used a new method for combining networks, which was proposed in [3].
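The difference between the two kinds of dropping can be illustrated with a minimal sketch of one fully-connected layer at training time. The abstract does not give the exact formulation, so this assumes that "performing both methods" means applying an independent Bernoulli mask to the weights (DropConnect-style) and another to the output activations (DropOut-style) in the same layer. The function name `dropall_forward`, the parameters `p_unit` and `p_weight`, and the choice of ReLU are hypothetical and not taken from the paper; inference-time rescaling is not covered.

```python
import random


def dropall_forward(x, W, p_unit=0.5, p_weight=0.5, rng=None):
    """Training-time forward pass of one fully-connected layer (illustrative only).

    x        : list of input activations, length n_in
    W        : list of n_out rows, each holding n_in weights
    p_unit   : assumed probability of dropping an output activation (DropOut-style)
    p_weight : assumed probability of dropping a single weight (DropConnect-style)
    """
    if rng is None:
        rng = random.Random()
    out = []
    for row in W:
        # DropConnect-style step: each weight is kept with probability 1 - p_weight.
        s = sum(w * xi for w, xi in zip(row, x) if rng.random() >= p_weight)
        a = max(0.0, s)  # ReLU non-linearity (an assumption of this sketch)
        # DropOut-style step: the whole activation is zeroed with probability p_unit.
        if rng.random() < p_unit:
            a = 0.0
        out.append(a)
    return out


# Example: a 2-input, 2-output layer with both drop rates set to 0.5.
print(dropall_forward([1.0, 2.0], [[1.0, 1.0], [1.0, -1.0]], rng=random.Random(0)))
```

Under these assumptions, setting `p_weight=0` leaves only the activation mask (plain DropOut), and setting `p_unit=0` leaves only the weight mask (plain DropConnect), which is one way to read the claim that DropAll generalizes both.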


Keywords: Graphic Processing Unit; Regularization Method; Simple Average; Convolutional Neural Network; Drop Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580 (2012)
  2. Wan, L., Zeiler, M., Zhang, S., LeCun, Y., Fergus, R.: Regularization of neural networks using DropConnect. In: Dasgupta, S., McAllester, D. (eds.) Proceedings of the 30th International Conference on Machine Learning (ICML 2013). JMLR Workshop and Conference Proceedings, vol. 28, pp. 1058–1066 (May 2013)
  3. Frazao, X., Alexandre, L.A.: Weighted convolutional neural network ensemble (submitted, 2014)
  4. Fukushima, K.: A neural network model for selective attention in visual pattern recognition. Biol. Cybern. 55(1), 5–16 (1986)
  5. Jarrett, K., Kavukcuoglu, K., Ranzato, M., LeCun, Y.: What is the best multi-stage architecture for object recognition? In: 2009 IEEE 12th International Conference on Computer Vision, pp. 2146–2153 (September 2009)
  6. Chellapilla, K., Puri, S., Simard, P.: High Performance Convolutional Neural Networks for Document Processing. In: Lorette, G. (ed.) Tenth International Workshop on Frontiers in Handwriting Recognition, La Baule (France). Suvisoft, Université de Rennes 1 (October 2006),
  7. Krizhevsky, A.: Cuda-convnet (2012),
  8. Krizhevsky, A.: Learning multiple layers of features from tiny images. Tech. Rep. (2009)
  9. Lin, M., Chen, Q., Yan, S.: Network in network. CoRR, abs/1312.4400 (2013)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Informatics, Univ. Beira Interior, Covilhã, Portugal
  2. Department of Informatics, Instituto de Telecomunicações, Covilhã, Portugal
