
Distributed Block Coordinate Descent for Minimizing Partially Separable Functions

  • Conference paper
Numerical Analysis and Optimization

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 134)

Abstract

A distributed randomized block coordinate descent method for minimizing a convex function of a huge number of variables is proposed. The complexity of the method is analyzed under the assumption that the smooth part of the objective function is partially block separable. The number of iterations required is bounded by a function of the error and the degree of separability, which extends the results of Richtárik and Takáč (Parallel Coordinate Descent Methods for Big Data Optimization, Mathematical Programming, DOI:10.1007/s10107-015-0901-6) to a distributed environment. Several approaches to the distribution and synchronization of the computation across a cluster of multi-core computers are described, and promising computational results are provided.
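The following is a minimal single-machine sketch, not the authors' implementation, of the kind of parallel block coordinate descent step the abstract describes. It uses the step-size factor \(\beta = 1 + (\omega - 1)(\tau - 1)/\max(1, n-1)\) for \(\tau\)-nice sampling from Richtárik and Takáč [26], where \(\omega\) is the degree of partial separability; the quadratic test objective, the name `parallel_cd`, and all parameter choices are illustrative assumptions.

```python
import numpy as np

# Minimal single-machine sketch of parallel block coordinate descent on a
# partially separable objective; illustrative only, not the authors' code.
# Test problem: f(x) = 0.5 * ||A x - b||^2 with sparse A, so each term
# (row of A) couples at most `omega` coordinates.

def parallel_cd(A, b, tau, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Degree of partial separability: max number of nonzeros in a row of A.
    omega = int((A != 0).sum(axis=1).max())
    # Coordinate Lipschitz constants L_i = ||A_{:,i}||^2 (guarded against 0).
    L = np.maximum((A ** 2).sum(axis=0), 1e-12)
    # Step-size factor for tau-nice sampling (Richtarik & Takac [26]):
    # beta = 1 + (omega - 1)(tau - 1) / max(1, n - 1).
    beta = 1.0 + (omega - 1.0) * (tau - 1.0) / max(1.0, n - 1.0)
    x = np.zeros(n)
    for _ in range(iters):
        S = rng.choice(n, size=tau, replace=False)  # tau coordinates at random
        g = A.T[S] @ (A @ x - b)                    # partial gradients on S
        x[S] -= g / (beta * L[S])                   # updates applied in parallel
    return x
```

With \(\beta\) chosen this way, applying all \(\tau\) sampled updates simultaneously is safe for partially separable objectives: the smaller \(\omega\) is, the closer the parallel step length is to the serial one.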


Notes

  1. Note that \(\xi \in \{\lceil \tfrac{\omega}{C} \rceil, \ldots, \omega\}\).

  2. In fact, \(\vert \hat{Z} \vert = C\tau\) with probability 1; the sketch following these notes illustrates this.

  3. For the start of the algorithm we define \(\delta g_{l}^{(c)} = \delta G_{l}^{(c)} = \mathbf{0}\) for all \(l < 0\).
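A minimal sketch of the sampling described in note 2, assuming an even partition of the \(n\) blocks across \(C\) computers; the function name and interface are hypothetical, not taken from the paper.

```python
import random

# Hypothetical illustration of the distributed sampling in note 2: n blocks
# are split evenly across C computers, and each computer draws tau blocks
# uniformly without replacement from its own partition. Because the
# partitions are disjoint, the union Z_hat has exactly C * tau blocks.

def distributed_sample(n, C, tau, seed=None):
    rnd = random.Random(seed)
    size = n // C  # assume C divides n for simplicity
    partitions = [range(c * size, (c + 1) * size) for c in range(C)]
    Z_hat = set()
    for part in partitions:
        Z_hat.update(rnd.sample(part, tau))  # tau-nice sampling per computer
    assert len(Z_hat) == C * tau  # |Z_hat| = C * tau with probability 1
    return Z_hat

print(sorted(distributed_sample(n=12, C=3, tau=2, seed=42)))
```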

References

  1. Alham, N.K., Li, M., Liu, Y., Hammoud, S.: A MapReduce-based distributed SVM algorithm for automatic image annotation. Comput. Math. Appl. 62(7), 2801–2811 (2011)

  2. Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)

  3. Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods. Prentice-Hall, Upper Saddle River (1989)

  4. Chang, E.Y., Zhu, K., Wang, H., Bai, H., Li, J., Qiu, Z., Cui, H.: PSVM: parallelizing support vector machines on distributed computers. Adv. Neural Inf. Process. Syst. 20, 1–18 (2007)

  5. Fercoq, O., Richtárik, P.: Accelerated, parallel and proximal coordinate descent. arXiv:1312.5799 (2013)

  6. Fercoq, O., Richtárik, P.: Smooth minimization of nonsmooth functions with parallel coordinate descent methods. arXiv:1309.5885 (2013)

  7. Fercoq, O., Qu, Z., Richtárik, P., Takáč, M.: Fast distributed coordinate descent for non-strongly convex losses. In: IEEE Workshop on Machine Learning for Signal Processing (2014)

  8. Ge, D., Jiang, X., Ye, Y.: A note on the complexity of \(\ell_{p}\) minimization. Math. Program. 129(2), 285–299 (2011)

  9. Hsieh, C.-J., Chang, K.-W., Lin, C.-J., Sathiya Keerthi, S., Sundararajan, S.: A dual coordinate descent method for large-scale linear SVM. In: Proceedings of the 25th International Conference on Machine Learning, ICML '08, pp. 408–415. ACM, New York (2008)

  10. InfiniBand Trade Association: InfiniBand Architecture Specification, vol. 1, Release 1.0 (2005)

  11. Jaggi, M., Smith, V., Takáč, M., Terhorst, J., Hofmann, T., Jordan, M.I.: Communication-efficient distributed dual coordinate ascent. In: Advances in Neural Information Processing Systems, vol. 27, pp. 3068–3076 (2014). http://papers.nips.cc/book/advances-in-neural-information-processing-systems-27-2014

  12. Lee, Y.T., Sidford, A.: Efficient accelerated coordinate descent methods and faster algorithms for solving linear systems. In: 54th Annual Symposium on Foundations of Computer Science. IEEE, New York (2013)

  13. Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)

  14. LIBSVM Data: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html. Accessed 25 Oct 2014

  15. Liu, J., Wright, S.J., Ré, C., Bittorf, V.: An asynchronous parallel stochastic coordinate descent algorithm. arXiv:1311.1873 (2013)

  16. Lu, Z., Xiao, L.: On the complexity analysis of randomized block-coordinate descent methods. arXiv:1305.4723 (2013)

  17. Luo, Z.Q., Tseng, P.: On the convergence of the coordinate descent method for convex differentiable minimization. J. Optim. Theory Appl. 72(1), 7–35 (1992). http://link.springer.com/article/10.1007%2FBF00939948?LI=true

  18. Natarajan, B.K.: Sparse approximate solutions to linear systems. SIAM J. Comput. 24(2), 227–234 (1995)

  19. Necoara, I., Clipici, D.: Distributed coordinate descent methods for composite minimization. arXiv:1312.5302 (2013)

  20. Nesterov, Y.: Introductory Lectures on Convex Optimization. Applied Optimization, vol. 87. Kluwer, Boston (2004)

  21. Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2), 341–362 (2012)

  22. Niu, F., Recht, B., Ré, C., Wright, S.J.: Hogwild!: a lock-free approach to parallelizing stochastic gradient descent. Adv. Neural Inf. Process. Syst. 24, 693–701 (2011)

  23. OpenMP Architecture Review Board: OpenMP Application Program Interface (2011)

  24. Patrascu, A., Necoara, I.: Random coordinate descent methods for \(\ell_{0}\) regularized convex optimization. arXiv:1403.6622 (2014)

  25. Richtárik, P., Takáč, M.: Efficient serial and parallel coordinate descent methods for huge-scale truss topology design. In: Operations Research Proceedings 2011, pp. 27–32. Springer, New York (2012)

  26. Richtárik, P., Takáč, M.: Parallel coordinate descent methods for big data optimization. Math. Program. DOI:10.1007/s10107-015-0901-6 (2015)

  27. Richtárik, P., Takáč, M.: Distributed coordinate descent method for learning with big data. arXiv:1310.2059 (2013)

  28. Richtárik, P., Takáč, M.: On optimal probabilities in stochastic coordinate descent methods. arXiv:1310.3438 (2013)

  29. Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program. 144(1–2), 1–38 (2014)

  30. Saha, A., Tewari, A.: On the finite time convergence of cyclic coordinate descent methods. SIAM J. Optim. 23(1), 576–601 (2013)

  31. Salleh, N.S.M., Suliman, A., Ahmad, A.R.: Parallel execution of distributed SVM using MPI (CoDLib). In: Information Technology and Multimedia (ICIM), pp. 1–4. IEEE (2011)

  32. Scherrer, C., Tewari, A., Halappanavar, M., Haglin, D.: Feature clustering for accelerating parallel coordinate descent. Adv. Neural Inf. Process. Syst. 25, 28–36 (2012)

  33. Shalev-Shwartz, S., Zhang, T.: Stochastic dual coordinate ascent methods for regularized loss. J. Mach. Learn. Res. 14(1), 567–599 (2013)

  34. Shalev-Shwartz, S., Singer, Y., Srebro, N., Cotter, A.: Pegasos: primal estimated sub-gradient solver for SVM. Math. Program. 127(1), 3–30 (2011)

  35. Snir, M., Otto, S., Huss-Lederman, S., Walker, D., Dongarra, J.: MPI: The Complete Reference, Volume 1: The MPI Core, 2nd (revised) edn. MIT Press, Cambridge (1998)

  36. Takáč, M., Bijral, A.S., Richtárik, P., Srebro, N.: Mini-batch primal and dual methods for SVMs. J. Mach. Learn. Res. W&CP 28, 1022–1030 (2013)

  37. Tappenden, R., Richtárik, P., Gondzio, J.: Inexact coordinate descent: complexity and preconditioning. arXiv:1304.5530 (2013)

  38. Tappenden, R., Richtárik, P., Büke, B.: Separable approximations and decomposition methods for the augmented Lagrangian. Optim. Methods Softw. (2015). DOI:10.1080/10556788.2014.966824. arXiv:1308.6774

  39. Tappenden, R., Richtárik, P., Takáč, M.: On the complexity of parallel coordinate descent. Technical Report ERGO 15-001, The University of Edinburgh (2015). http://arxiv.org/abs/1503.03033

  40. Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3), 475–494 (2001)

  41. Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117(1), 387–423 (2008)

  42. Tseng, P., Yun, S.: Block-coordinate gradient descent method for linearly constrained nonsmooth separable optimization. J. Optim. Theory Appl. 140, 513–535 (2009)

  43. Zhao, P., Zhang, T.: Stochastic optimization with importance sampling. arXiv:1401.2753 (2014)


Acknowledgements

The first author was supported by EPSRC grant EP/I017127/1 (Mathematics for Vast Digital Resources) in 2012 and by the EU FP7 INSIGHT project (318225) subsequently. The second author was supported by EPSRC grant EP/I017127/1. The third author was supported by the Centre for Numerical Algorithms and Intelligent Software, funded by EPSRC grant EP/G036136/1 and the Scottish Funding Council.

Author information

Corresponding author

Correspondence to Peter Richtárik.


Notation Glossary

The notation used throughout the chapter is summarized in Tables 3 and 4.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Mareček, J., Richtárik, P., Takáč, M. (2015). Distributed Block Coordinate Descent for Minimizing Partially Separable Functions. In: Al-Baali, M., Grandinetti, L., Purnama, A. (eds) Numerical Analysis and Optimization. Springer Proceedings in Mathematics & Statistics, vol 134. Springer, Cham. https://doi.org/10.1007/978-3-319-17689-5_11
