
A Parallel Forward-Backward Propagation Learning Scheme for Auto-Encoders

  • Yoshihiro Ohama
  • Takayoshi Yoshimura
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10635)

Abstract

Auto-encoders are a popular deep learning architecture for feature extraction. Because an auto-encoder has at least one bottleneck layer for feature representation and at least five layers for fitting nonlinear transformations, back-propagation learning (BPL) with saturating activation functions sometimes suffers from the vanishing gradient problem, which slows convergence. Several modified methods have therefore been proposed to mitigate this problem. In this work, we propose computing forward-propagated errors in parallel with back-propagated errors in the network, without modifying the activation functions or the network structure. Although this scheme has a larger computational cost per iteration than BPL, the processing time until convergence can be reduced by parallel computing. To confirm the feasibility of the scheme, two simple problems were examined by training auto-encoders to acquire (1) identity mappings of two-dimensional points along a half-circle arc, extracting the central angle, and (2) handwritten digit images, extracting the labeled digits. Both results indicate that, compared with BPL, the proposed scheme requires only about half as many iterations to sufficiently reduce the cost value.
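
As a point of reference for task (1), the following is a minimal sketch, assuming a five-layer auto-encoder with a one-unit bottleneck, tanh hidden units, and plain back-propagation learning. The layer sizes, learning rate, and number of sample points are illustrative assumptions, and the authors' parallel forward-propagated error term is not specified in the abstract, so it is not implemented here; the sketch only sets up the BPL baseline that the proposed scheme is compared against.

```python
# Minimal sketch (not the authors' code): a five-layer auto-encoder trained by
# plain back-propagation on task (1) from the abstract -- identity mapping of
# two-dimensional points on a half-circle arc, with a one-unit bottleneck that
# should come to encode the central angle. Layer sizes, tanh activations, the
# learning rate, and the sample count are assumptions; the proposed parallel
# forward-propagated error term is NOT implemented here.
import numpy as np

rng = np.random.default_rng(0)

# Training data: points (cos(theta), sin(theta)) for theta uniform in [0, pi];
# for an identity mapping the targets are the inputs themselves.
theta = rng.uniform(0.0, np.pi, size=(200, 1))
X = np.hstack([np.cos(theta), np.sin(theta)])

sizes = [2, 10, 1, 10, 2]                               # one-unit bottleneck
W = [rng.normal(0.0, 0.5, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [np.zeros((1, n)) for n in sizes[1:]]

def forward(x):
    """Return the activations of every layer (tanh hidden, linear output)."""
    acts = [x]
    for i, (w, bi) in enumerate(zip(W, b)):
        z = acts[-1] @ w + bi
        acts.append(z if i == len(W) - 1 else np.tanh(z))
    return acts

lr = 0.05
for epoch in range(3001):
    acts = forward(X)
    delta = acts[-1] - X                                # reconstruction error
    for i in reversed(range(len(W))):                   # back-propagated errors
        grad_w = acts[i].T @ delta / len(X)
        grad_b = delta.mean(axis=0, keepdims=True)
        if i > 0:                                       # tanh'(z) = 1 - tanh(z)^2
            delta = (delta @ W[i].T) * (1.0 - acts[i] ** 2)
        W[i] -= lr * grad_w
        b[i] -= lr * grad_b
    if epoch % 500 == 0:
        print(epoch, float(np.mean((acts[-1] - X) ** 2)))
```

Under the abstract's claim, the proposed forward-backward scheme would reach a comparably low reconstruction cost in roughly half as many iterations as a BPL baseline of this kind.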

Keywords

Auto-encoder · Vanishing gradient · Credit assignment · Biological plausibility · Feature extraction · Parallel error propagation

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Toyota Central Research and Development Laboratories, Inc., Aichi, Japan
