Learning Sparse Neural Networks via \(\ell _0\) and T\(\ell _1\) by a Relaxed Variable Splitting Method with Application to Multi-scale Curve Classification

  • Fanghui Xue
  • Jack Xin
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 991)


We study sparsification of convolutional neural networks (CNNs) by a relaxed variable splitting method with \(\ell _0\) and transformed-\(\ell _1\) (T\(\ell _1\)) penalties, with application to complex curves such as text written in different fonts and words written with trembling hands simulating those of Parkinson’s disease patients. The CNN contains 3 convolutional layers, each followed by a max pooling layer, and a final fully connected layer which contains the largest number of network weights. With the \(\ell _0\) penalty, we achieved over 99% test accuracy in distinguishing shaky vs. regular fonts or handwriting, with more than 86% of the weights in the fully connected layer being zero. Comparable sparsity and test accuracy are also reached with a proper choice of the T\(\ell _1\) penalty.
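As a rough illustration of the two penalties named above, the following sketch evaluates the transformed-\(\ell _1\) function in its standard form \(\rho _a(x) = (a+1)|x|/(a+|x|)\) and applies the hard-thresholding step that minimizes \(\lambda \Vert u\Vert _0 + \frac{\beta }{2}\Vert w-u\Vert ^2\) entrywise. The function names, the sample weights, and the values of `lam`, `beta`, and `a` are assumptions chosen for illustration; they are not taken from the paper, and this is not the authors' training code.

```python
import math

def transformed_l1(x, a=1.0):
    """Transformed-l1 penalty rho_a(x) = (a + 1)|x| / (a + |x|), with a > 0."""
    ax = abs(x)
    return (a + 1.0) * ax / (a + ax)

def hard_threshold(w, lam, beta):
    """Entrywise minimizer over u of lam*||u||_0 + (beta/2)*||w - u||^2.

    An entry is kept when |w_i| exceeds sqrt(2*lam/beta) and zeroed otherwise.
    """
    t = math.sqrt(2.0 * lam / beta)
    return [wi if abs(wi) > t else 0.0 for wi in w]

def sparsity(u):
    """Fraction of entries that are exactly zero."""
    return sum(1 for ui in u if ui == 0.0) / len(u)

# Hypothetical weights standing in for a slice of the fully connected layer.
w = [0.9, -0.05, 0.02, -1.3, 0.1]
u = hard_threshold(w, lam=0.02, beta=1.0)  # threshold is about 0.2
print(u, sparsity(u))
print(transformed_l1(1.0, a=1.0))
```

In a splitting scheme of this kind, the thresholded copy `u` carries the sparsity while the weights `w` continue to be updated by gradient steps on the loss plus the coupling term; only the thresholding step is shown here.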


Keywords: Convolutional neural network · Sparsification · Multi-scale curves · Classification



The work was partially supported by NSF grant IIS-1632935. The authors would like to thank Profs. Xiang Gao and Wenrui Hao at Penn State University for helpful discussions of handwriting and drawings on neuropsychological exams and diagnosis.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Mathematics, UC Irvine, Irvine, USA
