Multimedia Tools and Applications, Volume 78, Issue 1, pp 197–211

Deep networks with non-static activation function

  • Huajun Zhou
  • Zechao Li
Article

Abstract

Deep neural networks, which typically use a fixed activation function at each neuron, have achieved breakthrough performance. A fixed activation function, however, is not the optimal choice for every data distribution. To this end, this work improves deep neural networks by proposing a novel and efficient activation scheme called "Mutual Activation" (MAC), in which a non-static activation function is adaptively learned during the training phase of the deep network. Furthermore, the proposed activation neuron, when combined with maxout, is a potent higher-order function approximator that can break through the convex-curve limitation. Experimental results on object recognition benchmarks demonstrate the effectiveness of the proposed activation scheme.
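The abstract gives no implementation details, but the core idea of an activation whose shape is learned jointly with the network weights can be illustrated with a small sketch. The PyTorch module below is a hypothetical illustration, not the authors' MAC formulation: `LearnableActivation` makes the negative slope a trainable per-channel parameter, and `MaxoutConv` takes a maxout over several such pieces so the resulting unit is no longer restricted to a convex shape.

```python
import torch
import torch.nn as nn

class LearnableActivation(nn.Module):
    """Illustrative non-static activation: a per-channel negative slope
    learned jointly with the network weights (hypothetical sketch,
    not the MAC scheme from the paper)."""
    def __init__(self, num_channels):
        super().__init__()
        # Trainable negative-slope parameter, one value per channel.
        self.slope = nn.Parameter(torch.full((num_channels,), 0.25))

    def forward(self, x):
        # x: (batch, channels, height, width)
        slope = self.slope.view(1, -1, 1, 1)
        return torch.where(x >= 0, x, slope * x)

class MaxoutConv(nn.Module):
    """Maxout over k convolutional pieces, each followed by the learnable
    activation, yielding a non-convex, higher-order unit."""
    def __init__(self, in_ch, out_ch, k=2):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1) for _ in range(k))
        self.acts = nn.ModuleList(
            LearnableActivation(out_ch) for _ in range(k))

    def forward(self, x):
        pieces = [act(conv(x)) for conv, act in zip(self.convs, self.acts)]
        return torch.stack(pieces, dim=0).max(dim=0).values

# Usage: a maxout block whose activation shape is learned during training.
block = MaxoutConv(in_ch=3, out_ch=16, k=2)
y = block(torch.randn(8, 3, 32, 32))  # -> (8, 16, 32, 32)
```

Because the slope parameters receive gradients like any other weight, the activation adapts to the data distribution during training, which is the behaviour the abstract attributes to a non-static activation.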

Keywords

Object recognition · Activation neuron · Convolution network · Feature learning

Notes

Acknowledgments

This work was partially supported by the 973 Program (Project No. 2014CB347600), the National Natural Science Foundation of China (Grant No. 61772275, 61720106004, 61672285 and 61672304) and the Natural Science Foundation of Jiangsu Province (BK20170033).


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Nanjing University of Science and Technology, Nanjing, China
