Learning Sparse Neural Networks via \(\ell _0\) and T\(\ell _1\) by a Relaxed Variable Splitting Method with Application to Multi-scale Curve Classification
We study the sparsification of convolutional neural networks (CNNs) by a relaxed variable splitting method with \(\ell _0\) and transformed-\(\ell _1\) (T\(\ell _1\)) penalties, with application to complex curves such as text written in different fonts and words written with trembling hands simulating those of Parkinson’s disease patients. The CNN contains 3 convolutional layers, each followed by a max-pooling layer, and a final fully connected layer that holds the largest number of network weights. With the \(\ell _0\) penalty, we achieved over 99% test accuracy in distinguishing shaky from regular fonts or handwriting, with more than 86% of the weights in the fully connected layer being zero. Comparable sparsity and test accuracy are also reached with a proper choice of the T\(\ell _1\) penalty.
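As a rough illustration of the two penalties named above, the following sketch evaluates the T\(\ell _1\) penalty and applies the hard-thresholding step associated with an \(\ell _0\) penalty to a weight vector. The formulas are the standard textbook definitions and are not taken from this abstract; the function names and the parameters `a`, `lam`, and `beta` are hypothetical, and the sketch is not the authors' implementation.

```python
import numpy as np


def transformed_l1(w, a=1.0):
    """Standard transformed-l1 penalty: sum_i (a + 1)|w_i| / (a + |w_i|), a > 0.

    Assumption: `a` is the usual shape parameter; its value here is illustrative.
    """
    abs_w = np.abs(w)
    return float(np.sum((a + 1.0) * abs_w / (a + abs_w)))


def hard_threshold(w, lam, beta):
    """Entrywise minimizer over u of lam * ||u||_0 + (beta / 2) * ||w - u||^2.

    Entries with |w_i| <= sqrt(2 * lam / beta) are set to zero; the rest are kept.
    Assumption: `lam` and `beta` are hypothetical penalty and splitting weights.
    """
    t = np.sqrt(2.0 * lam / beta)
    return np.where(np.abs(w) > t, w, 0.0)


def sparsity(u):
    """Fraction of entries that are exactly zero."""
    return float(np.mean(u == 0))


# Toy weights standing in for a fully connected layer (illustrative values only).
weights = np.array([0.05, -0.5, 2.0, -0.01])
sparse_weights = hard_threshold(weights, lam=0.02, beta=1.0)
```

In a splitting scheme of this kind, the thresholded copy is the one whose fraction of zeros would be reported as the layer's sparsity, while the dense weights continue to be updated by gradient steps.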
Keywords: Convolutional neural network, Sparsification, Multi-scale curves, Classification
The work was partially supported by NSF grant IIS-1632935. The authors would like to thank Profs. Xiang Gao and Wenrui Hao at Penn State University for helpful discussions of handwriting and drawings in neuropsychological exams and diagnosis.