Abstract
We study the tradeoff between computational effort and classification accuracy in a cascade of deep neural networks. During inference, the user sets the acceptable accuracy degradation, which then automatically determines confidence thresholds for the intermediate classifiers. As soon as a confidence threshold is met, inference terminates immediately, without computing the output of the complete network. Confidence levels are derived directly from the softmax outputs of intermediate classifiers; we do not train special decision functions. We show that using the softmax output as a confidence measure in a cascade of deep neural networks reduces the number of multiply-accumulate (MAC) operations by \(15\%\)–\(50\%\) while degrading the classification accuracy by roughly \(1\%\). Our method can be easily incorporated into pre-trained, non-cascaded architectures, as we exemplify on ResNet. Our main contribution is a method that dynamically adjusts the tradeoff between accuracy and computation without retraining the model.
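To make the mechanism concrete, here is a minimal sketch of softmax-confidence early exiting, written by this editor as an illustration rather than taken from the paper's code. The function name `cascaded_predict`, the toy linear stages, and the specific threshold values are all hypothetical; in the paper, the thresholds are derived automatically from the user's acceptable accuracy degradation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cascaded_predict(x, stages, thresholds):
    """Evaluate cascade stages in order and exit as soon as the top
    softmax probability of an intermediate classifier reaches its
    confidence threshold. Returns (predicted class, exit stage)."""
    probs = None
    for i, (stage, tau) in enumerate(zip(stages, thresholds)):
        probs = softmax(stage(x))
        if probs.max() >= tau:
            return int(probs.argmax()), i  # confident enough: stop early
    return int(probs.argmax()), len(stages) - 1  # fell through to the full network

# Toy usage: three random linear "stages" standing in for sub-networks.
rng = np.random.default_rng(0)
stages = [lambda x, W=rng.normal(size=(4, 10)): x @ W for _ in range(3)]
thresholds = [0.9, 0.8, 0.0]  # a zero final threshold always accepts
label, exit_stage = cascaded_predict(rng.normal(size=4), stages, thresholds)
```

In a real cascade, each stage would reuse the features computed by earlier stages (e.g., intermediate classifiers attached to a ResNet backbone), so an early exit genuinely saves the MAC operations of the remaining layers rather than recomputing from the input.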
Acknowledgments
We thank Nissim Halabi, Moni Shahar and Daniel Soudry for useful conversations.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Berestizshevsky, K., Even, G. (2019). Dynamically Sacrificing Accuracy for Reduced Computation: Cascaded Inference Based on Softmax Confidence. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. ICANN 2019. Lecture Notes in Computer Science, vol. 11728. Springer, Cham. https://doi.org/10.1007/978-3-030-30484-3_26
DOI: https://doi.org/10.1007/978-3-030-30484-3_26
Publisher: Springer, Cham
Print ISBN: 978-3-030-30483-6
Online ISBN: 978-3-030-30484-3