
A new globally convergent algorithm for non-Lipschitz ℓp-ℓq minimization

Advances in Computational Mathematics

Abstract

We consider the non-Lipschitz ℓp-ℓq (0 < p < 1 ≤ q < ∞) minimization problem, which has many applications and poses a great challenge for optimization. The problem contains a non-Lipschitz regularization term and a possibly nonsmooth fidelity term. In this paper, we present a new globally convergent algorithm, which gradually shrinks the variable support and uses linearization and proximal approximations. The subproblem at each iteration is then convex, with increasingly fewer unknowns. By establishing a lower bound theory for the sequence generated by our algorithm, we prove that the sequence globally converges to a stationary point of the ℓp-ℓq objective function. Our method extends to the ℓp-regularized elastic net model. Numerical experiments demonstrate the performance and flexibility of the proposed algorithm, such as its applicability to measurements corrupted by either Gaussian or heavy-tailed noise.
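To make the support-shrinking idea concrete, here is a minimal Python sketch for the smooth fidelity case q = 2, under several simplifying assumptions (a fixed step size 1/L, a crude magnitude threshold as the shrinking rule, and a plain relative-change stopping test). It is not the paper's exact algorithm, whose shrinking rule is derived from the lower bound theory; it only illustrates the structure of linearizing the non-Lipschitz term on the current support and solving a convex subproblem in closed form.

```python
import numpy as np

def soft_threshold(v, t):
    """Entrywise soft-thresholding: closed-form prox of a weighted l1 term."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lp_l2_support_shrink(A, b, lam=0.1, p=0.5, shrink_tol=1e-6, max_iter=500):
    """Illustrative support-shrinking prox-linear iteration for
        min_x ||A x - b||_2^2 + lam * sum_i |x_i|^p,   0 < p < 1,
    i.e. the q = 2 instance of the lp-lq problem. The |t|^p term is
    linearized at the current (nonzero) iterate on the active support,
    so each subproblem is convex and solved in closed form; entries
    falling below `shrink_tol` leave the support permanently."""
    n = A.shape[1]
    x = np.linalg.lstsq(A, b, rcond=None)[0]   # dense initial guess
    active = np.abs(x) > shrink_tol
    x[~active] = 0.0
    L = 2.0 * np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the fidelity gradient
    for _ in range(max_iter):
        idx = np.flatnonzero(active)
        if idx.size == 0:
            break
        As, xs = A[:, idx], x[idx]
        grad = 2.0 * As.T @ (As @ xs - b)      # gradient of ||A x - b||_2^2 on the support
        w = p * np.abs(xs) ** (p - 1.0)        # linearization weights of |t|^p at xs != 0
        z = soft_threshold(xs - grad / L, lam * w / L)  # convex subproblem, closed form
        x_new = np.zeros(n)
        x_new[idx] = z
        active = np.abs(x_new) > shrink_tol    # shrink: small entries leave the support
        x_new[~active] = 0.0
        if np.linalg.norm(x_new - x) <= 1e-10 * max(1.0, np.linalg.norm(x)):
            x = x_new
            break
        x = x_new
    return x

# Small synthetic test: recover a 3-sparse signal from 40 noisy measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[3, 17, 60]] = [1.5, -2.0, 1.0]
b = A @ x_true + 0.01 * rng.standard_normal(40)
x_hat = lp_l2_support_shrink(A, b, lam=0.05, p=0.5)
print("estimated support:", np.flatnonzero(np.abs(x_hat) > 1e-4))
```

Each iteration solves a convex weighted soft-thresholding subproblem over the current support only, so the subproblems have increasingly fewer unknowns as the support shrinks, mirroring the structure described in the abstract.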


References

  1. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35(2), 438–457 (2010)

  2. Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Program. 137(1–2), 91–129 (2013)

  3. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)

  4. Bian, W., Chen, X.: Worst-case complexity of smoothing quadratic regularization methods for non-Lipschitzian optimization. SIAM J. Optim. 23(3), 1718–1741 (2013)

  5. Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4), 1205–1223 (2006)

  6. Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2), 556–572 (2007)

  7. Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)

  8. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)

  9. Bredies, K., Lorenz, D.A., Reiterer, S.: Minimization of non-smooth, non-convex functionals by iterative thresholding. J. Optim. Theory Appl. 165(1), 78–112 (2015)

  10. Candès, E.J., Wakin, M.B., Boyd, S.P.: Enhancing sparsity by reweighted ℓ1 minimization. J. Fourier Anal. Appl. 14(5), 877–905 (2008)

  11. Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011)

  12. Chan, R.H., Liang, H.-X.: Half-quadratic algorithm for ℓp-ℓq problems with applications to TV-ℓ1 image restoration and compressive sensing. In: Bruhn, A., Pock, T., Tai, X.-C. (eds.) Efficient Algorithms for Global Optimization Methods in Computer Vision, pp. 78–103. Springer, Berlin (2014)

  13. Chartrand, R.: Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Process. Lett. 14(10), 707–710 (2007)

  14. Chartrand, R., Yin, W.: Iteratively reweighted algorithms for compressive sensing. In: Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, pp. 3869–3872 (2008)

  15. Chen, X.: Smoothing methods for nonsmooth, nonconvex minimization. Math. Program. 134(1), 71–99 (2012)

  16. Chen, X., Niu, L., Yuan, Y.: Optimality conditions and a smoothing trust region Newton method for non-Lipschitz optimization. SIAM J. Optim. 23(3), 1528–1552 (2013)

  17. Chen, X., Xu, F., Ye, Y.: Lower bound theory of nonzero entries in solutions of ℓ2-ℓp minimization. SIAM J. Sci. Comput. 32(5), 2832–2852 (2010)

  18. Chen, X., Zhou, W.: Convergence of the reweighted ℓ1 minimization algorithm for ℓ2-ℓp minimization. Comput. Optim. Appl. 59(1), 47–61 (2014)

  19. Daubechies, I., Defrise, M., De Mol, C.: An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math. 57(11), 1413–1457 (2004)

  20. Dielman, T.E.: Least absolute value regression: recent contributions. J. Stat. Comput. Simul. 75(4), 263–286 (2005)

  21. Esser, E., Zhang, X., Chan, T.F.: A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sci. 3(4), 1015–1046 (2010)

  22. Foucart, S., Lai, M.-J.: Sparsest solutions of underdetermined linear systems via ℓq-minimization for 0 < q ≤ 1. Appl. Comput. Harmon. Anal. 26(3), 395–407 (2009)

  23. Goldstein, T., Osher, S.: The split Bregman method for L1-regularized problems. SIAM J. Imaging Sci. 2(2), 323–343 (2009)

  24. Gorodnitsky, I.F., Rao, B.D.: Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm. IEEE Trans. Signal Process. 45(3), 600–616 (1997)

  25. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, New York (2009)

  26. He, B., Yuan, X.: On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method. SIAM J. Numer. Anal. 50(2), 700–709 (2012)

  27. Huber, P.J., Ronchetti, E.M.: Robust Statistics, 2nd edn. Wiley (2009)

  28. Krishnan, D., Fergus, R.: Fast image deconvolution using hyper-Laplacian priors. In: Proc. 22nd Int. Conf. Neural Information Processing Systems, pp. 1033–1041 (2009)

  29. Kurdyka, K.: On gradients of functions definable in o-minimal structures. Ann. Inst. Fourier (Grenoble) 48(3), 769–783 (1998)

  30. Lai, M.-J., Wang, J.: An unconstrained ℓq minimization with 0 < q ≤ 1 for sparse solution of underdetermined linear systems. SIAM J. Optim. 21(1), 82–101 (2011)

  31. Lai, M.-J., Xu, Y., Yin, W.: Improved iteratively reweighted least squares for unconstrained smoothed ℓq minimization. SIAM J. Numer. Anal. 51(2), 927–957 (2013)

  32. Lanza, A., Morigi, S., Reichel, L., Sgallari, F.: A generalized Krylov subspace method for ℓp-ℓq minimization. SIAM J. Sci. Comput. 37(5), S30–S50 (2015)

  33. Laporte, L., Flamary, R., Canu, S., Déjean, S., Mothe, J.: Nonconvex regularizations for feature selection in ranking with sparse SVM. IEEE Trans. Neural Netw. Learn. Syst. 25(6), 1118–1130 (2014)

  34. Łojasiewicz, S.: Une propriété topologique des sous-ensembles analytiques réels. In: Les Équations aux Dérivées Partielles (Paris, 1962), pp. 87–89. Éditions du Centre National de la Recherche Scientifique, Paris (1963)

  35. Lu, Z.: Iterative reweighted minimization methods for ℓp regularized unconstrained nonlinear programming. Math. Program. 147(1), 277–307 (2014)

  36. Paredes, J.L., Arce, G.R.: Compressive sensing signal reconstruction by weighted median regression estimates. IEEE Trans. Signal Process. 59(6), 2585–2601 (2011)

  37. Price, B.S., Sherwood, B.: A cluster elastic net for multivariate regression. J. Mach. Learn. Res. 18(232), 1–39 (2018)

  38. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis, Volume 317 of Grundlehren der Mathematischen Wissenschaften. Springer, Berlin (1998)

  39. Shen, Y., Han, B., Braverman, E.: Stability of the elastic net estimator. J. Complexity 32(1), 20–39 (2016)

  40. Shen, Y., Li, S.: Restricted p-isometry property and its application for nonconvex compressive sensing. Adv. Comput. Math. 37(3), 441–452 (2012)

  41. Sun, Q.: Recovery of sparsest signals via ℓq-minimization. Appl. Comput. Harmon. Anal. 32(3), 329–341 (2012)

  42. Van den Dries, L., Miller, C.: Geometric categories and o-minimal structures. Duke Math. J. 84(2), 497–540 (1996)

  43. Wang, H., Pan, J., Su, Z., Liang, S.: Blind image deblurring using elastic-net based rank prior. Comput. Vis. Image Underst. 168, 157–171 (2018)

  44. Wu, C., Tai, X.-C.: Augmented Lagrangian method, dual methods, and split Bregman iteration for ROF, vectorial TV, and high order models. SIAM J. Imaging Sci. 3(3), 300–339 (2010)

  45. Xu, Z., Chang, X., Xu, F., Zhang, H.: L1/2 regularization: A thresholding representation theory and a fast solver. IEEE Trans. Neural Netw. Learn. Syst. 23(7), 1013–1027 (2012)

  46. Yin, W., Osher, S., Goldfarb, D., Darbon, J.: Bregman iterative algorithms for ℓ1-minimization with applications to compressed sensing. SIAM J. Imaging Sci. 1(1), 143–168 (2008)

  47. Yukawa, M., Amari, S.-I.: ℓp-regularized least squares (0 < p < 1) and critical path. IEEE Trans. Inform. Theory 62(1), 488–502 (2016)

  48. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Statist. Soc. B 67(2), 301–320 (2005)

  49. Zou, H., Li, R.: One-step sparse estimates in nonconcave penalized likelihood models. Ann. Statist. 36(4), 1509–1533 (2008)

  50. Zuo, W., Meng, D., Zhang, L., Feng, X., Zhang, D.: A generalized iterated shrinkage algorithm for non-convex sparse coding. In: Proc. IEEE Int. Conf. Computer Vision, pp. 217–224 (2013)


Funding

This work was supported by the National Natural Science Foundation of China (Grants 11301289, 11531013, and 11871035), the Recruitment Program of Global Youth Experts, and the Fundamental Research Funds for the Central Universities.

Author information


Corresponding author

Correspondence to Chunlin Wu.

Additional information

Communicated by: Gitta Kutyniok


Appendix

We recall some definitions and results here.

Definition A.1 (Subdifferentials [38])

Let \(h: \mathbb {R}^{\mathsf {N}} \to \mathbb {R}\cup \{+\infty \}\) be a proper, lower semicontinuous function.

  1. (i)

    The regular subdifferential of h at \(\bar{\mathbf{x}} \in \text{dom}\, h = \{ \mathbf{x} \in \mathbb{R}^{\mathsf{N}}: h(\mathbf{x}) < +\infty \}\) is defined as

    $$\widehat{\partial}h(\bar{\mathbf{x}}) := \left\{\mathbf{v} \in\mathbb{R}^{\mathsf{N}}: \liminf_{\substack{\mathbf{x} \to \bar{\mathbf{x}} \\ \mathbf{x} \neq \bar{\mathbf{x}}}}\frac{h(\mathbf{x})-h(\bar{\mathbf{x}})- \langle \mathbf{v},\mathbf{x} -\bar{\mathbf{x}} \rangle}{\|\mathbf{x}-\bar{\mathbf{x}}\|}\geq 0 \right\};$$

  2. (ii)

    The (limiting) subdifferential of h at \(\bar{\mathbf{x}} \in \text{dom}\, h\) is defined as

    $$\partial h(\bar{\mathbf{x}}) := \left\{\mathbf{v} \in\mathbb{R}^{\mathsf{N}}: \exists\, \mathbf{x}^{(k)} \to \bar{\mathbf{x}},\ h\left(\mathbf{x}^{(k)}\right) \to h(\bar{\mathbf{x}}),\ \mathbf{v}^{(k)}\in \widehat{\partial} h\left(\mathbf{x}^{(k)}\right),\ \mathbf{v}^{(k)} \to \mathbf{v} \right\}.$$

Remark 1

From Definition A.1, the following properties hold:

  1. (i)

    For any \(\bar{\mathbf{x}} \in \text{dom}\, h\), \(\widehat{\partial}h(\bar{\mathbf{x}}) \subset \partial h(\bar{\mathbf{x}})\). If h is continuously differentiable at \(\bar{\mathbf{x}}\), then \(\widehat{\partial}h(\bar{\mathbf{x}}) = \partial h(\bar{\mathbf{x}}) = \left\{\nabla h(\bar{\mathbf{x}})\right\}\);

  2. (ii)

    For any \(\bar{\mathbf{x}} \in \text{dom}\, h\), the subdifferential set \(\partial h(\bar{\mathbf{x}})\) is closed, i.e.,

    $$\left\{\mathbf{v} \in\mathbb{R}^{\mathsf{N}}: \exists\, \mathbf{x}^{(k)} \to \bar{\mathbf{x}},\ h\left(\mathbf{x}^{(k)}\right) \to h(\bar{\mathbf{x}}),\ \mathbf{v}^{(k)}\in\partial h\left(\mathbf{x}^{(k)}\right),\ \mathbf{v}^{(k)} \to \mathbf{v} \right\}\subset \partial h(\bar{\mathbf{x}}).$$
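For instance, for the one-dimensional functions h(x) = |x| and h(x) = |x|^p with 0 < p < 1, Definition A.1 gives at \(\bar{x} = 0\):

$$h(x) = |x|: \quad \widehat{\partial}h(0) = \partial h(0) = [-1, 1], \quad\text{since}\ \liminf_{x \to 0}\frac{|x| - vx}{|x|} = 1 - |v| \geq 0 \iff |v| \leq 1;$$

$$h(x) = |x|^{p}\ (0 < p < 1): \quad \widehat{\partial}h(0) = \partial h(0) = \mathbb{R}, \quad\text{since}\ \frac{|x|^{p} - vx}{|x|} = |x|^{p-1} - v\,\text{sign}(x) \to +\infty \ \text{as}\ x \to 0.$$

The second example shows that the subdifferential of a non-Lipschitz function can be unbounded at a point; this behavior at zero is one source of the difficulty of non-Lipschitz minimization.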

The fundamental works on the Kurdyka-Łojasiewicz (KL) property are due to Łojasiewicz [34] and Kurdyka [29]. For developments and applications of the KL property in optimization theory, see [1, 2, 5, 7] and the references therein.

Definition A.2 (Kurdyka-Łojasiewicz property [1])

A proper function h is said to have the Kurdyka-Łojasiewicz property at \(\bar{\mathbf{x}} \in \text{dom}\, \partial h = \{ \mathbf{x} \in \mathbb{R}^{\mathsf{N}}: \partial h(\mathbf{x}) \neq \emptyset \}\) if there exist ζ ∈ (0, +∞], a neighborhood U of \(\bar{\mathbf{x}}\), and a continuous concave function \(\varphi: [0, \zeta) \to \mathbb{R}_{+}\) such that

  1. (i)

    φ(0) = 0;

  2. (ii)

    φ is C1 on (0, ζ);

  3. (iii)

    for all s ∈ (0, ζ), φ′(s) > 0;

  4. (iv)

    for all x ∈ U satisfying \(h(\bar{\mathbf{x}}) < h(\mathbf{x}) < h(\bar{\mathbf{x}}) + \zeta\), the Kurdyka-Łojasiewicz inequality holds:

    $$\varphi^{\prime}(h(\mathbf{x}) - h(\bar{\mathbf{x}}))\, \text{dist}(0, \partial h(\mathbf{x})) \geq 1,$$

    where dist(0, ∂h(x)) = min{∥v∥ : v ∈ ∂h(x)}.

A proper, lower semicontinuous function h satisfying the KL property at all points in dom h is called a KL function. Some examples can be found in [2].
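One elementary example can be verified by hand: \(h(x) = x^{2}\) has the KL property at \(\bar{x} = 0\) with \(\varphi(s) = \sqrt{s}\), \(U = \mathbb{R}\), and \(\zeta = +\infty\), since for all x ≠ 0,

$$\varphi^{\prime}\left(h(x) - h(0)\right)\, \text{dist}(0, \partial h(x)) = \frac{1}{2\sqrt{x^{2}}} \cdot |2x| = 1 \geq 1.$$

The exponent 1/2 hidden in φ here is the classical Łojasiewicz exponent of this function.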

A rich class of KL functions widely used in practical applications consists of functions definable in an o-minimal structure, a notion introduced in [42]. The following definition is adopted from [1, Definition 4.1].

Definition A.3

Let \(\mathcal{O}=\left\{\mathcal{O}_{n}\right\}_{n\in \mathbb{N}}\) be such that each \(\mathcal{O}_{n}\) is a collection of subsets of \(\mathbb{R}^{n}\). The family \(\mathcal{O}\) is an o-minimal structure over \(\mathbb{R}\) if it satisfies the following axioms:

  1. (i)

    Each \(\mathcal{O}_{n}\) is a Boolean algebra, namely \(\emptyset \in \mathcal{O}_{n}\), and for each \(A, B\in \mathcal{O}_{n}\), the sets \(A\cup B\), \(A\cap B\), and \(\mathbb{R}^{n}\setminus A\) belong to \(\mathcal{O}_{n}\).

  2. (ii)

    For all \(A\in \mathcal{O}_{n}\), \(A\times \mathbb{R}\) and \(\mathbb{R} \times A\) belong to \(\mathcal{O}_{n + 1}\).

  3. (iii)

    For all \(A\in \mathcal{O}_{n + 1}\), the projection \({\Pi}(A):=\left\{(x_{1},\ldots,x_{n})\in \mathbb{R}^{n} \,|\, \exists\, x_{n+1} \in \mathbb{R},\ (x_{1},\ldots,x_{n},x_{n + 1})\in A\right\}\) belongs to \(\mathcal{O}_{n}\).

  4. (iv)

    For all \(i \neq j\) in {1, 2, ..., n}, \(\left\{(x_{1},\ldots,x_{n})\in \mathbb{R}^{n} \,|\, x_{i}=x_{j}\right\}\) belongs to \(\mathcal{O}_{n}\).

  5. (v)

    The set \(\left\{(x_{1},x_{2})\in \mathbb{R}^{2} \,|\, x_{1} < x_{2}\right\}\) belongs to \(\mathcal{O}_{2}\).

  6. (vi)

    The elements of \(\mathcal{O}_{1}\) are exactly the finite unions of intervals.

Let \(\mathcal{O}\) be an o-minimal structure on \(\mathbb{R}\). We say that a set \(A\subseteq \mathbb{R}^{n}\) is definable in \(\mathcal{O}\) if \(A\in \mathcal{O}_{n}\), and that a map \(f: \mathbb{R}^{n}\rightarrow \mathbb{R}^{m}\) is definable in \(\mathcal{O}\) if its graph \(\left\{(x,y)\in \mathbb{R}^{n}\times \mathbb{R}^{m}: y \in f(x)\right\}\) is definable in \(\mathcal{O}\). Definable functions are defined analogously. O-minimal structures have very useful properties; we list some elementary properties of definable functions below [1]:

  1. (i)

Finite sums of definable functions are definable.

  2. (ii)

    Compositions of definable functions are definable.

  3. (iii)

    Indicator functions of definable sets are definable.

An important example of an o-minimal structure is the log-exp structure [42, Example 2.5]. In this structure, the following functions are all definable:

  1. 1.

    Semi-algebraic functions [7, Definition 5], such as real polynomial functions and \(f: \mathbb{R}\rightarrow \mathbb{R}\) defined by \(x \mapsto |x|\).

  2. 2.

    The power function \(x^{r}: \mathbb{R}\rightarrow \mathbb{R}\) defined by

    $$a \mapsto\left\{\begin{array}{ll} a^{r}, & a>0,\\ 0, & a\leq 0, \end{array}\right.$$

    where \(r\in \mathbb{R}\).

  3. 3.

    The exponential function \(\mathbb{R}\rightarrow \mathbb{R}\) given by \(x \mapsto e^{x}\) and the logarithm function \((0, \infty) \rightarrow \mathbb{R}\) given by \(x \mapsto \log(x)\).

It is known that any proper lower semicontinuous function that is definable in an o-minimal structure is a KL function; see [6] and [1, Theorem 4.1]. Combining the elementary properties (i)-(ii) of definable functions with examples 1-2 above, one can see that the function \(\mathcal{E}\) in this paper is a KL function.
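To make the last claim concrete, suppose the objective takes the standard ℓp-ℓq form suggested by the abstract (an assumption here; the exact fidelity term used in the paper body is not reproduced on this page):

$$\mathcal{E}(\mathbf{x}) = \|A\mathbf{x} - \mathbf{b}\|_{q}^{q} + \lambda \sum\limits_{i=1}^{\mathsf{N}} |x_{i}|^{p}, \qquad 0 < p < 1 \leq q < \infty.$$

Each summand \(|x_{i}|^{p}\), and likewise each \(|(A\mathbf{x}-\mathbf{b})_{j}|^{q}\), is the composition of a semi-algebraic map (an affine function followed by \(|\cdot|\), example 1) with the definable power function of example 2, so \(\mathcal{E}\) is a finite sum of compositions of definable functions; definability, and hence the KL property, follows from properties (i)-(ii).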


About this article


Cite this article

Liu, Z., Wu, C. & Zhao, Y. A new globally convergent algorithm for non-Lipschitz ℓp-ℓq minimization. Adv. Comput. Math. 45, 1369–1399 (2019). https://doi.org/10.1007/s10444-019-09668-y

