Abstract
With the increase in processor computing power has come a substantial rise in the development of scientific applications, such as weather forecasting, financial market analysis, and medical technology, and with it a significantly greater need for intelligent data processing. Deep learning, a framework able to extract abstract information from images, text, and sound, has become a challenging area of recent research. This makes both accuracy and speed essential when implementing a large neural network. In this paper, we therefore implement the Caffe deep learning framework on the Intel Xeon Phi and measure the performance of this environment through three experiments. First, we evaluate the accuracy of the Caffe deep learning framework on the Intel Xeon Phi over several numbers of training iterations. Second, to evaluate speed, we compare training time before and after optimization on the Intel Xeon E5-2650 and the Intel Xeon Phi 7210, using vectorization, OpenMP parallel processing, and the Message Passing Interface (MPI) for optimization. Third, we compare multi-node execution results on two nodes of Intel Xeon E5-2650 and two nodes of Intel Xeon Phi 7210.
Acknowledgements
This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant MOST 107-2221-E-029-008 and MOST 106-3114-E-029-003.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yang, CT., Liu, JC., Chan, YW., Kristiani, E., Kuo, CF. (2019). On Construction of a Caffe Deep Learning Framework based on Intel Xeon Phi. In: Xhafa, F., Leu, FY., Ficco, M., Yang, CT. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2018. Lecture Notes on Data Engineering and Communications Technologies, vol 24. Springer, Cham. https://doi.org/10.1007/978-3-030-02607-3_9
Print ISBN: 978-3-030-02606-6
Online ISBN: 978-3-030-02607-3