Abstract
Since the birth of artificial intelligence, its theory and technology have matured and its range of applications has kept expanding. Mobile networks and applications have grown quickly in recent years, and mobile computing has become the new computing paradigm for mobile networks. In this paper, we build an artificial intelligence (AI) platform for mobile services that supports deep learning frameworks such as TensorFlow and Caffe. We describe the overall architecture of the AI platform on a GPU cluster for mobile service computing. At the scheduling layer of the GPU cluster, we propose combining YARN with the Slurm scheduler: we improve the distributed TensorFlow plug-in for the Slurm scheduling layer and extend YARN to manage and schedule GPUs. The front end of the high-performance AI platform is designed for availability, scalability, and efficiency. Finally, we verify the convenience, scalability, and effectiveness of the AI platform by comparing the performance of single-chip and distributed versions of the TensorFlow, Caffe, and YARN systems.
Change history
19 November 2019
The Publisher regrets an error on the printed front cover of the October 2019 issue. The issue numbers were incorrectly listed as Volume 91, Nos. 10-12, October 2019. The correct number should be: "Volume 91, No. 10, October 2019"
Acknowledgments
This work was partly supported by the National Key R&D Program of China (No. 2017YFB0202202), the Major Research Plan of the National Natural Science Foundation of China (No. 91530324), and the Super-computing Resource Pool of Chinese Academy of Sciences Information Project (No. XXH13503).
About this article
Cite this article
Zhang, H., Lu, Z., Xu, K. et al. Artificial Intelligence Platform for Mobile Service Computing. J Sign Process Syst 91, 1179–1189 (2019). https://doi.org/10.1007/s11265-019-1438-3
DOI: https://doi.org/10.1007/s11265-019-1438-3