Abstract
We propose accelerated randomized coordinate descent algorithms for stochastic optimization and online learning. Our algorithms have significantly lower per-iteration complexity than known accelerated gradient algorithms. The proposed algorithms for online learning achieve better regret bounds than known randomized online coordinate descent algorithms. Furthermore, the proposed algorithms for stochastic optimization match the convergence rates of the best known randomized coordinate descent algorithms. We also present simulation results demonstrating the performance of the proposed algorithms.
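As a rough, self-contained illustration of the algorithmic template (a sketch of one well-known accelerated randomized coordinate descent scheme, in the style of Fercoq and Richtárik's APPROX with single-coordinate sampling, not the paper's exact method), the snippet below minimizes a deterministic quadratic \(f(x) = \frac{1}{2}x^\top A x - b^\top x\). The function name `accelerated_rcd`, the quadratic objective, and all parameter choices are illustrative assumptions; the paper's algorithms additionally handle stochastic gradients and online regret, which this toy omits. Each iteration queries a single coordinate gradient, which is the source of the low per-iteration cost mentioned in the abstract.

```python
import numpy as np

def accelerated_rcd(A, b, n_iters=20000, seed=0):
    """Sketch of an APPROX-style accelerated randomized coordinate descent
    for f(x) = 0.5 * x^T A x - b^T x with A symmetric positive definite.
    Per-coordinate smoothness constants are L_i = A[i, i]."""
    rng = np.random.default_rng(seed)
    n = b.size
    L = np.diag(A)                         # coordinate Lipschitz constants
    x = np.zeros(n)
    z = np.zeros(n)
    theta = 1.0 / n                        # extrapolation weight, theta_0 = 1/n
    for _ in range(n_iters):
        y = (1.0 - theta) * x + theta * z  # momentum combination of x and z
        i = rng.integers(n)                # uniform coordinate sampling
        g_i = A[i] @ y - b[i]              # i-th partial derivative at y
        dz = -g_i / (n * theta * L[i])
        z[i] += dz                         # only one coordinate of z changes
        x = y                              # x_{k+1} = y_k + n*theta*(z_{k+1} - z_k)
        x[i] += n * theta * dz
        theta = 0.5 * (np.sqrt(theta**4 + 4.0 * theta**2) - theta**2)
    return x

# Usage on a small random positive definite quadratic.
rng = np.random.default_rng(1)
M = rng.standard_normal((50, 20))
A = M.T @ M + 0.1 * np.eye(20)
b = rng.standard_normal(20)
x_hat = accelerated_rcd(A, b)
print(np.linalg.norm(A @ x_hat - b))  # residual shrinks as n_iters grows
```

The full-vector operation forming \(y\) is kept for readability; efficient implementations maintain \(x\) and \(z\) implicitly so that each iteration costs roughly one coordinate-gradient evaluation.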
Notes
1. The proposed algorithm does not need the distribution of \(\xi\).
2. We allow \(\mu = 0\) to accommodate general convex loss functions; strong convexity corresponds to \(\mu > 0\).
3. Young's inequality states that \(\langle x,y\rangle \le \frac{\Vert x \Vert ^2}{2a} + \frac{a \Vert y \Vert ^2}{2}\) for any \(a > 0\); a one-line derivation appears after these notes.
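To make note 2 concrete, a differentiable function \(f\) is \(\mu\)-strongly convex when \(f(y) \ge f(x) + \langle \nabla f(x), y - x\rangle + \frac{\mu}{2}\Vert y - x \Vert^2\) for all \(x, y\); taking \(\mu = 0\) recovers ordinary convexity. The inequality in note 3 follows from the standard expand-a-square argument (a textbook fact, not specific to this paper): for any \(a > 0\),
\[ 0 \le \Big\Vert \tfrac{1}{\sqrt{a}}\,x - \sqrt{a}\,y \Big\Vert^2 = \frac{\Vert x \Vert^2}{a} - 2\langle x, y\rangle + a\Vert y \Vert^2 \quad\Longrightarrow\quad \langle x, y\rangle \le \frac{\Vert x \Vert^2}{2a} + \frac{a\Vert y \Vert^2}{2}. \]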
Acknowledgments
The second author acknowledges support of INSPIRE Faculty Research Grant (DSTO-1363).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Bhandari, A., Singh, C. (2019). Accelerated Randomized Coordinate Descent Algorithms for Stochastic Optimization and Online Learning. In: Battiti, R., Brunato, M., Kotsireas, I., Pardalos, P. (eds) Learning and Intelligent Optimization. LION 12 2018. Lecture Notes in Computer Science, vol. 11353. Springer, Cham. https://doi.org/10.1007/978-3-030-05348-2_1
Print ISBN: 978-3-030-05347-5
Online ISBN: 978-3-030-05348-2