Parallelizing Convolutional Neural Networks on Intel\(^{\textregistered }\) Many Integrated Core Architecture

  • Junjie Liu
  • Haixia Wang
  • Dongsheng Wang
  • Yuan Gao
  • Zuofeng Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9017)


Convolutional neural networks (CNNs) are state-of-the-art machine learning algorithms for low-resolution vision tasks and are widely used in practice. However, training them is very time-consuming. Many approaches to accelerating training have been proposed, among which parallelization is one of the most effective. In this article, we parallelize a classic CNN with OpenMP on a new platform, the Intel\(^{\textregistered}\) Xeon Phi\(^{\text{TM}}\) Coprocessor. Our implementation achieves a 131\(\times\) speedup over the serial version running on the coprocessor itself and an 8.3\(\times\) speedup over the serial baseline on the Xeon\(^{\textregistered}\) E5-2697 CPU.
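
The abstract does not say which loops were parallelized or how data was laid out, so the following is only a minimal sketch of one common way to apply OpenMP to a CNN: distributing the independent output positions of a convolutional layer's forward pass across threads. The function name `conv_forward`, the flat row-major array layout, and the choice of `collapse(2)` with a static schedule are assumptions made for illustration, not the authors' implementation.

```c
#include <assert.h>

/* Forward pass of one convolutional layer ("valid" mode, stride 1).
 * Hypothetical layout (row-major, flat arrays):
 *   in   : in_maps  x in_h x in_w
 *   w    : out_maps x in_maps x k x k
 *   bias : out_maps
 *   out  : out_maps x (in_h - k + 1) x (in_w - k + 1)
 */
void conv_forward(const float *in, const float *w, const float *bias,
                  float *out, int in_maps, int out_maps,
                  int in_h, int in_w, int k)
{
    assert(k >= 1 && k <= in_h && k <= in_w);
    const int out_h = in_h - k + 1;
    const int out_w = in_w - k + 1;

    /* Every (output map, output row) pair writes a disjoint slice of
     * `out`, so the two outer loops can be shared among threads
     * without synchronization. Without OpenMP enabled the pragma is
     * ignored and the loop nest simply runs serially. */
    #pragma omp parallel for collapse(2) schedule(static)
    for (int o = 0; o < out_maps; ++o) {
        for (int y = 0; y < out_h; ++y) {
            for (int x = 0; x < out_w; ++x) {
                float acc = bias[o];
                for (int i = 0; i < in_maps; ++i)
                    for (int ky = 0; ky < k; ++ky)
                        for (int kx = 0; kx < k; ++kx)
                            acc += in[(i * in_h + y + ky) * in_w + x + kx]
                                 * w[((o * in_maps + i) * k + ky) * k + kx];
                out[(o * out_h + y) * out_w + x] = acc;
            }
        }
    }
}
```

Collapsing the two outer loops exposes more iterations than parallelizing over output maps alone, which tends to matter on a many-core device where the thread count can exceed the number of feature maps in a small network. Whether this matches the decomposition used in the paper is not stated in the abstract.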


Keywords: Convolutional neural network · OpenMP · Intel Many Integrated Core Architecture · Xeon Phi





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Junjie Liu (1)
  • Haixia Wang (1)
  • Dongsheng Wang (1)
  • Yuan Gao (1)
  • Zuofeng Li (1)
  1. Tsinghua National Laboratory for Information Science and Technology, Beijing, China
