NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications

  • Tien-Ju Yang
  • Andrew Howard
  • Bo Chen
  • Xiao Zhang
  • Alec Go
  • Mark Sandler
  • Vivienne Sze
  • Hartwig Adam
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11214)


This work proposes an algorithm, called NetAdapt, that automatically adapts a pre-trained deep neural network to a mobile platform given a resource budget. While many existing algorithms simplify networks based on the number of MACs or weights, optimizing these indirect metrics may not necessarily reduce direct metrics, such as latency and energy consumption. To solve this problem, NetAdapt incorporates direct metrics into its adaptation algorithm. These direct metrics are evaluated using empirical measurements, so detailed knowledge of the platform and toolchain is not required. NetAdapt automatically and progressively simplifies a pre-trained network until the resource budget is met while maximizing the accuracy. Experimental results show that NetAdapt achieves better accuracy-versus-latency trade-offs on both mobile CPU and mobile GPU, compared with state-of-the-art automated network simplification algorithms. For image classification on the ImageNet dataset, NetAdapt achieves up to a 1.7× speedup in measured inference latency with equal or higher accuracy on MobileNets (V1&V2).
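The adaptation loop described in the abstract (progressively simplify the network, evaluate direct metrics empirically, keep the most accurate proposal until the budget is met) can be sketched as follows. This is an illustrative toy, not the paper's implementation: `measure_latency` and `proxy_accuracy` are hypothetical stand-ins for the empirical on-device measurements and short-term fine-tuning that NetAdapt actually uses, and layers are modeled only by their filter counts.

```python
def measure_latency(filters):
    """Toy stand-in for an empirical direct-metric measurement; NetAdapt
    would query real on-device measurements for the target platform."""
    return sum(filters)

def proxy_accuracy(filters, original):
    """Toy stand-in for accuracy after short-term fine-tuning: here it
    simply degrades with the largest per-layer fraction of filters removed."""
    return min(f / o for f, o in zip(filters, original))

def netadapt(filters, budget, step=1):
    """Progressively shrink per-layer filter counts until the measured
    latency meets `budget`, greedily keeping, at each iteration, the
    single-layer proposal with the highest proxy accuracy."""
    original = list(filters)
    filters = list(filters)
    while measure_latency(filters) > budget:
        # Per-iteration resource reduction target, as in the paper's outer loop.
        target = measure_latency(filters) - step
        best = None
        for i in range(len(filters)):  # one simplification proposal per layer
            proposal = list(filters)
            # Shrink layer i until the proposal meets this iteration's target.
            while proposal[i] > 1 and measure_latency(proposal) > target:
                proposal[i] -= 1
            if measure_latency(proposal) > target:
                continue  # layer i alone cannot meet the target
            acc = proxy_accuracy(proposal, original)
            if best is None or acc > best[0]:
                best = (acc, proposal)
        if best is None:
            break  # no single-layer change can meet the target
        filters = best[1]
    return filters

print(netadapt([32, 64, 128], budget=200, step=8))  # → [32, 56, 112]
```

The greedy structure, with a small resource-reduction target per iteration, is what lets the algorithm trade accuracy against the measured (direct) metric without any analytical platform model.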



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Massachusetts Institute of Technology, Cambridge, USA
  2. Google Inc., Mountain View, USA
