ADMMLIB: A Library of Communication-Efficient AD-ADMM for Distributed Machine Learning

  • Jinyang Xie
  • Yongmei Lei
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11783)

Abstract

Alternating direction method of multipliers (ADMM) has recently been identified as a compelling approach for solving large-scale machine learning problems in the cluster setting. To reduce synchronization overhead in a distributed environment, asynchronous distributed ADMM (AD-ADMM) was proposed. However, AD-ADMM still scales poorly because of the high communication overhead of its master-slave architecture. To address this challenge, this paper proposes ADMMLIB, a library of AD-ADMM for distributed machine learning. We employ a set of network optimization techniques. First, a hierarchical communication architecture is adopted. Second, ring-based allreduce and mixed-precision training are integrated into ADMMLIB to further reduce the inter-node communication cost. Evaluation on large datasets demonstrates that ADMMLIB achieves a speedup of up to 2x over the original AD-ADMM implementation, while reducing the overall communication cost by 83%.
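As background, the consensus problem targeted here, minimizing the sum of per-worker losses f_i over a shared model, is commonly handled with the global-consensus ADMM splitting sketched below (scaled dual form, with penalty parameter ρ). This is a standard synchronous recursion given for orientation only, not ADMMLIB's exact asynchronous update rule: each of the N workers keeps a local copy x_i and a scaled dual u_i, while the master maintains the consensus variable z.

\begin{align*}
  x_i^{k+1} &= \operatorname*{arg\,min}_{x_i}\; f_i(x_i) + \tfrac{\rho}{2}\,\lVert x_i - z^{k} + u_i^{k}\rVert_2^2, \\
  z^{k+1}   &= \tfrac{1}{N}\sum_{i=1}^{N}\bigl(x_i^{k+1} + u_i^{k}\bigr), \\
  u_i^{k+1} &= u_i^{k} + x_i^{k+1} - z^{k+1}.
\end{align*}

The z-update is a global average, so every iteration exchanges full-size model vectors between workers and the master. A ring-based allreduce performs such a reduction with roughly 2(N-1)/N times the vector size transferred per node, independent of N, and a mixed-precision wire format shrinks each transfer further; this aggregation step is the communication path that ADMMLIB's optimizations target.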

Keywords

Asynchronous ADMM · Consensus optimization · Distributed machine learning

Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  1. School of Computer Engineering and Science, Shanghai University, Shanghai, China
