Progressive Neural Architecture Search

  • Chenxi Liu
  • Barret Zoph
  • Maxim Neumann
  • Jonathon Shlens
  • Wei Hua
  • Li-Jia Li
  • Li Fei-Fei
  • Alan Yuille
  • Jonathan Huang
  • Kevin Murphy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11205)


Abstract

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of the number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state-of-the-art classification accuracies on CIFAR-10 and ImageNet.
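To make the search strategy concrete, here is a minimal sketch of a progressive, surrogate-guided search. It is an illustration under stated assumptions, not the authors' implementation: the operation set `OPS`, the toy scoring function `evaluate` (a stand-in for training a network and measuring validation accuracy), the `MeanOpSurrogate` class, and the parameters `max_blocks` and `beam_width` are all hypothetical names introduced here.

```python
import random

# Hypothetical operation set; the real search space is not given in this text.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]

# Toy per-operation values used only to fake an "accuracy" signal.
_OP_VALUE = {"conv3x3": 0.30, "conv5x5": 0.25, "maxpool": 0.10, "identity": 0.05}


def evaluate(structure, rng):
    """Stand-in for the expensive step (train the model, measure accuracy)."""
    mean_value = sum(_OP_VALUE[op] for op in structure) / len(structure)
    return mean_value + rng.gauss(0, 0.01)


class MeanOpSurrogate:
    """Toy surrogate: predicts a score from running per-operation averages."""

    def __init__(self):
        self.total = {op: 0.0 for op in OPS}
        self.count = {op: 0 for op in OPS}

    def update(self, structure, score):
        # Credit the observed score to every operation in the structure.
        for op in structure:
            self.total[op] += score
            self.count[op] += 1

    def predict(self, structure):
        per_op = [
            self.total[op] / self.count[op] if self.count[op] else 0.0
            for op in structure
        ]
        return sum(per_op) / len(per_op)


def progressive_search(max_blocks=3, beam_width=2, seed=0):
    """Search structures in order of increasing size, guided by a surrogate."""
    rng = random.Random(seed)
    surrogate = MeanOpSurrogate()
    num_evaluated = 0
    candidates = [(op,) for op in OPS]  # simplest structures: evaluate all
    beam = []
    for level in range(1, max_blocks + 1):
        if level > 1:
            # Grow each surviving structure by one operation, then let the
            # cheap surrogate pick which expansions deserve real evaluation.
            expanded = [s + (op,) for s in beam for op in OPS]
            expanded.sort(key=surrogate.predict, reverse=True)
            candidates = expanded[:beam_width]
        scored = []
        for s in candidates:
            score = evaluate(s, rng)      # expensive step in a real system
            surrogate.update(s, score)    # keep learning the surrogate
            scored.append((score, s))
            num_evaluated += 1
        scored.sort(reverse=True)
        beam = [s for _, s in scored[:beam_width]]
    return beam, num_evaluated


if __name__ == "__main__":
    best, n = progressive_search()
    print("structures kept:", best)
    print("structures evaluated:", n)
```

With these toy settings the loop evaluates 4 + 2 + 2 = 8 structures, whereas exhaustively scoring every structure of size 1 to 3 would need 4 + 16 + 64 = 84 evaluations. The saving comes from ranking expansions with the surrogate before spending any evaluation budget on them.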



Acknowledgements

We thank Quoc Le for inspiration, discussion and support; George Dahl for many fruitful discussions; Gabriel Bender and Vijay Vasudevan for developing much of the critical infrastructure; and the larger Google Brain team for support and discussions. CL also thanks Lingxi Xie for support.

Supplementary material

Supplementary material 1: 474172_1_En_2_MOESM1_ESM.pdf (PDF, 545 KB)


References

  1. Baisero, A., Pokorny, F.T., Ek, C.H.: On a family of decomposable kernels on sequences. CoRR abs/1501.06284 (2015)
  2. Baker, B., Gupta, O., Naik, N., Raskar, R.: Designing neural network architectures using reinforcement learning. In: ICLR (2017)
  3. Baker, B., Gupta, O., Raskar, R., Naik, N.: Accelerating neural architecture search using performance prediction. CoRR abs/1705.10823 (2017)
  4. Brock, A., Lim, T., Ritchie, J.M., Weston, N.: SMASH: one-shot model architecture search through hypernetworks. In: ICLR (2018)
  5. Cai, H., Chen, T., Zhang, W., Yu, Y., Wang, J.: Efficient architecture search by network transformation. In: AAAI (2018)
  6. Chen, Y., Li, J., Xiao, H., Jin, X., Yan, S., Feng, J.: Dual path networks. In: NIPS (2017)
  7. Cortes, C., Gonzalvo, X., Kuznetsov, V., Mohri, M., Yang, S.: AdaNet: adaptive structural learning of artificial neural networks. In: ICML (2017)
  8. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
  9. Devries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. CoRR abs/1708.04552 (2017)
  10. Domhan, T., Springenberg, J.T., Hutter, F.: Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In: IJCAI (2015)
  11. Dong, J.D., Cheng, A.C., Juan, D.C., Wei, W., Sun, M.: PPP-Net: platform-aware progressive search for Pareto-optimal neural architectures. In: ICLR Workshop (2018)
  12. Elsken, T., Metzen, J.H., Hutter, F.: Simple and efficient architecture search for convolutional neural networks. CoRR abs/1711.04528 (2017)
  13. Grosse, R.B., Salakhutdinov, R., Freeman, W.T., Tenenbaum, J.B.: Exploiting compositionality to explore a large space of model structures. In: UAI (2012)
  14. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. CoRR abs/1704.04861 (2017)
  15. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. CoRR abs/1709.01507 (2017)
  16. Huang, F., Ash, J.T., Langford, J., Schapire, R.E.: Learning deep ResNet blocks sequentially using boosting theory. CoRR abs/1706.04964 (2017)
  17. Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential model-based optimization for general algorithm configuration. In: Coello, C.A.C. (ed.) LION 2011. LNCS, vol. 6683, pp. 507–523. Springer, Heidelberg (2011)
  18. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
  19. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report, University of Toronto (2009)
  20. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014)
  21. Liu, H., Simonyan, K., Vinyals, O., Fernando, C., Kavukcuoglu, K.: Hierarchical representations for efficient architecture search. In: ICLR (2018)
  22. Loshchilov, I., Hutter, F.: SGDR: stochastic gradient descent with restarts. In: ICLR (2017)
  23. Mendoza, H., Klein, A., Feurer, M., Springenberg, J.T., Hutter, F.: Towards automatically-tuned neural networks. In: ICML Workshop on AutoML, pp. 58–65, December 2016
  24. Miikkulainen, R., et al.: Evolving deep neural networks. CoRR abs/1703.00548 (2017)
  25. Negrinho, R., Gordon, G.J.: DeepArchitect: automatically designing and training deep architectures. CoRR abs/1704.08792 (2017)
  26. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. CoRR abs/1802.03268 (2018)
  27. Real, E., Aggarwal, A., Huang, Y., Le, Q.V.: Regularized evolution for image classifier architecture search. CoRR abs/1802.01548 (2018)
  28. Real, E., et al.: Large-scale evolution of image classifiers. In: ICML (2017)
  29. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs/1707.06347 (2017)
  30. Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., de Freitas, N.: Taking the human out of the loop: a review of Bayesian optimization. Proc. IEEE 104(1), 148–175 (2016)
  31. Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: NIPS (2012)
  32. Stanley, K.O.: Neuroevolution: a different kind of deep learning, July 2017
  33. Stanley, K.O., Miikkulainen, R.: Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)
  34. Williams, R.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992)
  35. Xie, L., Yuille, A.L.: Genetic CNN. In: ICCV (2017)
  36. Xie, S., Girshick, R.B., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: CVPR (2017)
  37. Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. CoRR abs/1707.01083 (2017)
  38. Zhang, X., Li, Z., Loy, C.C., Lin, D.: PolyNet: a pursuit of structural diversity in very deep networks. In: CVPR (2017)
  39. Zhong, Z., Yan, J., Liu, C.L.: Practical network blocks design with Q-learning. In: AAAI (2018)
  40. Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning. In: ICLR (2017)
  41. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: CVPR (2018)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Chenxi Liu (1), email author
  • Barret Zoph (2)
  • Maxim Neumann (2)
  • Jonathon Shlens (2)
  • Wei Hua (2)
  • Li-Jia Li (2)
  • Li Fei-Fei (2, 3)
  • Alan Yuille (1)
  • Jonathan Huang (2)
  • Kevin Murphy (2)

  1. Johns Hopkins University, Baltimore, USA
  2. Google AI, Mountain View, USA
  3. Stanford University, Stanford, USA
