Affine Transformation Capsule Net

  • Runkun Lu
  • Jianwei Liu
  • Siming Lian
  • Xin Zuo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11154)


CapsNet is a notable attempt to address a drawback of CNNs: its routing-by-agreement method is tolerant to small changes in the viewpoint of an entity inside an image. This breakthrough has attracted the attention of many researchers. However, the original CapsNet uses only the length of the digit capsules in the classification task and ignores their orientation information. Based on this observation, we propose the Affine Transformation Capsule Net (AT-CapsNet), which leverages both the length and the orientation information of the digit capsules by substituting a single-layer perceptron for the operation of computing vector lengths. In addition, we explain the architecture of the AT-CapsNet model from five perspectives and further analyse the model complexity and the difference between dynamic routing and the attention mechanism. The experimental results demonstrate the efficiency of our proposed algorithm on real-world data sets.
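To illustrate why replacing the length computation with a perceptron can retain orientation information, the following is a minimal sketch and not the authors' implementation. The function names, the toy capsule values, the weight matrix, and the choice to flatten all digit capsules into a single perceptron input are assumptions made for illustration only; the abstract does not specify these details.

```python
import math


def capsule_lengths(capsules):
    """Length-based class scores: the Euclidean norm of each digit capsule."""
    return [math.sqrt(sum(x * x for x in cap)) for cap in capsules]


def perceptron_scores(capsules, weights, biases):
    """Hypothetical single-layer perceptron over the flattened digit capsules.

    Assumed shapes: weights is (num_classes, num_capsules * capsule_dim),
    biases is (num_classes,). Each score is an affine function of every
    capsule component, so it depends on direction as well as length.
    """
    flat = [x for cap in capsules for x in cap]
    return [sum(w * x for w, x in zip(row, flat)) + b
            for row, b in zip(weights, biases)]


def argmax(values):
    """Index of the largest score."""
    return max(range(len(values)), key=lambda i: values[i])


# Two toy inputs whose capsules have identical lengths but different orientations.
caps_a = [[0.6, 0.0], [0.0, 0.3]]
caps_b = [[0.0, 0.6], [0.3, 0.0]]

# Illustrative (hand-picked, not learned) perceptron parameters.
weights = [[1.0, -1.0, 0.5, 0.0],
           [-1.0, 1.0, 0.0, 0.5]]
biases = [0.0, 0.0]

# Length-based scores cannot tell the two inputs apart.
same_lengths = capsule_lengths(caps_a) == capsule_lengths(caps_b)

# The perceptron scores differ because they see the capsule directions.
pred_a = argmax(perceptron_scores(caps_a, weights, biases))
pred_b = argmax(perceptron_scores(caps_b, weights, biases))
```

In this toy setting the two inputs produce the same length-based scores, while the perceptron assigns them to different classes. In a trained model the weights and biases would be learned jointly with the rest of the network.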


Keywords

CapsNet, AT-CapsNet, Perceptron, Affine transformation



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Automation, China University of Petroleum, Beijing Campus (CUP), Beijing, China
