Abstract
We study sparsification of convolutional neural networks (CNNs) by a relaxed variable splitting method with \(\ell _0\) and transformed-\(\ell _1\) (T\(\ell _1\)) penalties, applied to complex curves such as text written in different fonts and words written with trembling hands simulating those of Parkinson's disease patients. The CNN contains 3 convolutional layers, each followed by a max-pooling layer, and finally a fully connected layer, which contains the largest number of network weights. With the \(\ell _0\) penalty, we achieved over 99% test accuracy in distinguishing shaky from regular fonts or handwriting, with over 86% of the weights in the fully connected layer being zero. Comparable sparsity and test accuracy are also reached with a proper choice of the T\(\ell _1\) penalty.
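The relaxed variable splitting method described above couples the network weights \(w\) to an auxiliary sparse variable \(u\) through a quadratic term and alternates a gradient step on \(w\) with a thresholding (proximal) step on \(u\). A minimal NumPy sketch under these assumptions is below; the \(\ell _0\) proximal step is hard thresholding at level \(\sqrt{2\lambda /\beta }\), and the T\(\ell _1\) penalty is \(\rho _a(x)=(a+1)|x|/(a+|x|)\). Function names, the choice of coupling weight \(\beta \), and all hyperparameter values are illustrative, not the authors' exact implementation.

```python
import numpy as np

def hard_threshold(w, lam, beta):
    """Proximal step for the l0 penalty: zero out entries with
    magnitude at most sqrt(2*lam/beta), keep the rest unchanged."""
    t = np.sqrt(2.0 * lam / beta)
    return np.where(np.abs(w) > t, w, 0.0)

def transformed_l1(w, a=1.0):
    """Transformed-l1 penalty rho_a(x) = (a+1)|x| / (a + |x|),
    summed over all entries of w."""
    return np.sum((a + 1.0) * np.abs(w) / (a + np.abs(w)))

def rvsm_step(w, u, grad_f, lr=0.1, lam=1e-3, beta=1.0):
    """One iteration of the relaxed splitting scheme (sketch):
    minimize f(w) + lam*P(u) + (beta/2)*||w - u||^2 by alternating
    a gradient step on w and a thresholding step on u."""
    w = w - lr * (grad_f(w) + beta * (w - u))  # gradient step on w
    u = hard_threshold(w, lam, beta)           # l0 proximal step on u
    return w, u
```

After training, the sparse variable `u` (rather than `w`) carries the pruned weights, e.g. of the fully connected layer; the fraction of zeros in `u` is the sparsity reported in the abstract.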
Notes
1. When generating the figure, we used a tool by Alex Lenail available at http://alexlenail.me/NN-SVG/LeNet.html.
Acknowledgements
The work was partially supported by NSF grant IIS-1632935. The authors would like to thank Profs. Xiang Gao and Wenrui Hao at Penn State University for helpful discussions of handwriting and drawings on neuropsychological exams and diagnosis.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Xue, F., Xin, J. (2020). Learning Sparse Neural Networks via \(\ell _0\) and T\(\ell _1\) by a Relaxed Variable Splitting Method with Application to Multi-scale Curve Classification. In: Le Thi, H., Le, H., Pham Dinh, T. (eds) Optimization of Complex Systems: Theory, Models, Algorithms and Applications. WCGO 2019. Advances in Intelligent Systems and Computing, vol 991. Springer, Cham. https://doi.org/10.1007/978-3-030-21803-4_80
DOI: https://doi.org/10.1007/978-3-030-21803-4_80
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21802-7
Online ISBN: 978-3-030-21803-4
eBook Packages: Intelligent Technologies and Robotics (R0)