Abstract
This paper investigates the locally Lipschitz optimization problem (LLOP) with \(l_0\)-regularization in a finite-dimensional space, which is generally NP-hard but widely applicable in statistics, compressed sensing, and deep learning. First, we introduce two classes of stationary points for this problem: subdifferential-stationary points and proximal-stationary points. Second, based on these two concepts, we establish first-order necessary and sufficient optimality conditions for the LLOP with \(l_0\)-regularization. Finally, we present two examples to illustrate the validity of the proposed optimality conditions.
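The proximal-stationarity concept mentioned above rests on the proximal mapping of the \(l_0\) penalty alone, which is the classical componentwise hard-thresholding operator: for \(\lambda > 0\), an entry \(v_i\) survives when \(\tfrac{1}{2}v_i^2 > \lambda\), i.e. \(|v_i| > \sqrt{2\lambda}\). A minimal sketch (the function name `prox_l0` is illustrative, not from the paper):

```python
import math

def prox_l0(v, lam):
    """Proximal map of x -> lam * ||x||_0 (hard thresholding).

    Componentwise rule: keep v_i when |v_i| >= sqrt(2*lam),
    set it to zero otherwise (at exact ties either choice
    is a minimizer; we keep the entry).
    """
    t = math.sqrt(2.0 * lam)
    return [vi if abs(vi) >= t else 0.0 for vi in v]

# With lam = 0.5 the threshold is sqrt(1.0) = 1.0:
print(prox_l0([3.0, 0.5, -2.0, 0.1], lam=0.5))  # -> [3.0, 0.0, -2.0, 0.0]
```

For the full regularized problem, a proximal-stationary point is, roughly, a fixed point of a gradient step followed by this thresholding; the sketch only shows the thresholding component.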
Acknowledgements
The authors would like to thank the associate editor and two anonymous referees for their constructive comments, which have significantly improved the quality of the paper. This work is supported by the National Natural Science Foundation of China (Nos. 11971052 and 11801325).
Cite this article
Zhang, H., Pan, L. & Xiu, N. Optimality conditions for locally Lipschitz optimization with \(l_0\)-regularization. Optim Lett 15, 189–203 (2021). https://doi.org/10.1007/s11590-020-01579-y