Constrained Optimization Based Low-Rank Approximation of Deep Neural Networks

  • Chong Li
  • C. J. Richard Shi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11214)


We present COBLA (Constrained Optimization Based Low-rank Approximation), a systematic method for finding an optimal low-rank approximation of a trained convolutional neural network, subject to constraints on the number of multiply-accumulate (MAC) operations and on the memory footprint. COBLA optimally allocates the constrained computational resources across the layers of the approximated network. The singular value decomposition of each layer's weight is computed, and a binary masking variable is introduced to indicate whether a particular singular value and its corresponding singular vectors are used in the low-rank approximation. With this formulation, the number of MAC operations and the memory footprint are expressed as linear constraints in the binary masking variables. The resulting 0–1 integer programming problem is approximately solved by sequential quadratic programming. COBLA introduces no hyperparameters. We empirically demonstrate that COBLA outperforms prior art using the SqueezeNet and VGG-16 architectures on the ImageNet dataset.
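The masked-SVD formulation described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the names (`mask`, `mac_per_component`) and the toy layer sizes are illustrative assumptions. It shows how a binary mask over singular values induces a low-rank approximation, and why the MAC and parameter costs of the factored layer are linear in the mask entries.

```python
import numpy as np

# Toy fully-connected layer weight W (m x n); in the paper the same idea
# is applied to (reshaped) convolutional weights.
m, n = 64, 32
W = np.random.randn(m, n)

# SVD of the layer weight: W = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = s.size

# Binary masking variable: mask[i] = 1 iff singular value i (and its
# singular vectors) are kept in the low-rank approximation.
mask = np.zeros(r, dtype=int)
mask[:10] = 1  # e.g. keep the 10 largest singular values

# Low-rank approximation induced by the mask: masked singular values
# zero out the discarded rank-1 components.
W_approx = (U * (mask * s)) @ Vt

# Both resource metrics are linear functions of the binary mask:
# keeping component i adds (m + n) multiply-accumulates per input
# vector and (m + n) stored parameters in the factored layer.
macs = (m + n) * mask.sum()
params = (m + n) * mask.sum()
```

Because `macs` and `params` are linear in `mask`, the resource budgets become linear constraints of a 0–1 integer program over the mask variables of all layers, which the paper then relaxes and solves approximately with sequential quadratic programming.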


Low-rank approximation · Resource allocation · Constrained optimization · Integer relaxation



The authors would like to thank the anonymous reviewers, particularly Reviewer 3, for their highly constructive advice. This work is supported by an Intel/Semiconductor Research Corporation Ph.D. Fellowship.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. University of Washington, Seattle, USA
