Stochastic extra-gradient based alternating direction methods for graph-guided regularized minimization

  • Qiang Lan
  • Lin-bo Qiao
  • Yi-jie Wang

Abstract

In this study, we propose and compare two stochastic variants of the extra-gradient alternating direction method: the stochastic extra-gradient alternating direction method with the Lagrangian function (SEGL) and the stochastic extra-gradient alternating direction method with the augmented Lagrangian function (SEGAL). Both minimize large-scale graph-guided optimization problems composed of two convex objective functions. Many important machine learning applications follow the graph-guided optimization formulation, including linear regression, logistic regression, Lasso, structured extensions of Lasso, and structured regularized logistic regression. We conduct experiments on fused logistic regression and graph-guided regularized regression. Experimental results on several types of datasets demonstrate that the proposed algorithms outperform competing algorithms, and that SEGAL performs better than SEGL in practice.
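The graph-guided formulation described above can be sketched concretely. The following is a minimal illustration of the standard setup from the ADMM literature, min_x f(x) + λ‖Ax‖₁, where f is a smooth convex loss and A is the edge-incidence matrix of a feature graph; the paper's exact problem instances and the SEGL/SEGAL update rules are not reproduced here, so the loss, graph, and helper names below are assumptions for illustration only.

```python
import numpy as np

def incidence_matrix(n_features, edges):
    """Build the edge-incidence matrix A: each row encodes one
    edge (i, j) as the difference x_i - x_j, so ||A x||_1 fuses
    coefficients of features connected in the graph."""
    A = np.zeros((len(edges), n_features))
    for row, (i, j) in enumerate(edges):
        A[row, i] = 1.0
        A[row, j] = -1.0
    return A

def objective(X, y, x, A, lam):
    """Least-squares loss plus the graph-guided fusion penalty:
    0.5 * ||X x - y||^2 + lam * ||A x||_1."""
    residual = X @ x - y
    return 0.5 * residual @ residual + lam * np.abs(A @ x).sum()

# Toy example: 3 features on a chain graph 0-1-2.
A = incidence_matrix(3, [(0, 1), (1, 2)])
X = np.eye(3)
y = np.array([1.0, 2.0, 3.0])
x = np.array([1.0, 1.0, 1.0])
val = objective(X, y, x, A, lam=0.5)  # 0.5*(0+1+4) + 0.5*0 = 2.5
```

In ADMM form this objective is split as min f(x) + g(z) subject to Ax = z, which is the two-convex-function composite structure the abstract refers to; the stochastic variants replace the full gradient of f with a mini-batch estimate.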

Key words

Stochastic optimization; Graph-guided minimization; Extra-gradient method; Fused logistic regression; Graph-guided regularized logistic regression

CLC number

TP311 


Copyright information

© Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. College of Computer, National University of Defense Technology, Changsha, China
  2. National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China