Abstract
Deep Learning has been gaining popularity rapidly. From the micro-architecture level to upper-layer end applications, a large body of research has been proposed to advance the state of Deep Learning, and Deep Learning benchmarking has become one of the hot topics in the community. Many Deep Learning benchmarks are already available, and new ones keep appearing. However, we find that few survey works give an overview of these useful benchmarks, and there is little discussion of what has been accomplished in Deep Learning benchmarking and what is still missing. To fill this gap, this paper provides a survey of multiple high-impact Deep Learning benchmarks that support both training and inference, and shares our observations and discussions on them. We believe the community still needs more benchmarks to capture different perspectives, while these benchmarks also need a path toward converging to a standard.
Acknowledgments
This research is supported in part by the Strategic Priority Research Program of the Chinese Academy of Sciences, Grant No. XDA19020400.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Zhang, Q. et al. (2019). A Survey on Deep Learning Benchmarks: Do We Still Need New Ones?. In: Zheng, C., Zhan, J. (eds) Benchmarking, Measuring, and Optimizing. Bench 2018. Lecture Notes in Computer Science(), vol 11459. Springer, Cham. https://doi.org/10.1007/978-3-030-32813-9_5
DOI: https://doi.org/10.1007/978-3-030-32813-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32812-2
Online ISBN: 978-3-030-32813-9