
Federated Optimization with Linear-Time Approximated Hessian Diagonal

  • Conference paper
Pattern Recognition and Machine Intelligence (PReMI 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14301)

Abstract

Federated learning (FL), or federated optimization, is a form of distributed optimization in which multiple clients collaboratively train a global model without sharing their local data. One of the key challenges in FL is the communication overhead caused by slow convergence of the global model. In this paper, we propose a federated learning algorithm that addresses this slow convergence by incorporating the Hessian diagonal while training clients' models. To reduce the computational and memory complexity on local clients, we introduce a linear-time Hessian diagonal approximation technique that uses only the first row of the Hessian. Our extensive experiments show that the proposed method outperforms the state-of-the-art FL algorithms FedAvg, FedProx, SCAFFOLD, and DONE in terms of training loss, test loss, and test accuracy.
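To illustrate the core idea, the sketch below shows how the first row of the Hessian can be obtained in linear time with automatic differentiation. This is a minimal, hypothetical reconstruction in PyTorch, not the authors' reference implementation: it assumes the first row is computed by differentiating the first component of the gradient, which costs a single extra backward pass, and that this row then stands in for the full Hessian diagonal during the local client update.

    import torch

    # Hypothetical sketch (not the authors' code): compute the first row of
    # the Hessian of a scalar loss in linear time. Differentiating the first
    # component of the gradient (built with create_graph=True) yields H[0, :]
    # at the cost of one additional backward pass.
    def hessian_first_row(loss, params):
        grads = torch.autograd.grad(loss, params, create_graph=True)
        flat_grad = torch.cat([g.reshape(-1) for g in grads])
        row = torch.autograd.grad(flat_grad[0], params)
        return torch.cat([r.reshape(-1) for r in row])

    # Toy check on a quadratic loss 0.5 * w^T A w, whose Hessian is exactly A:
    A = torch.tensor([[2.0, 1.0], [1.0, 3.0]])
    w = torch.randn(2, requires_grad=True)
    loss = 0.5 * (w @ A @ w)
    print(hessian_first_row(loss, [w]))  # expected: tensor([2., 1.]) == A[0]

How the approximated diagonal is then combined with each client's local update (for example, as a Newton-style preconditioner on the gradient) follows the algorithm described in the paper itself.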


References

  1. Agarwal, N., Bullins, B., Hazan, E.: Second-order stochastic optimization for machine learning in linear time. J. Mach. Learn. Res. 18, 116:1–116:40 (2017)

  2. Battiti, R.: First- and second-order methods for learning: between steepest descent and Newton's method. Neural Comput. 4(2), 141–166 (1992)

  3. Dinh, C.T., et al.: DONE: distributed approximate Newton-type method for federated edge learning. IEEE Trans. Parallel Distrib. Syst. 33(11), 2648–2660 (2022)

  4. Gao, L., Fu, H., Li, L., Chen, Y., Xu, M., Xu, C.: FedDC: federated learning with Non-IID data via local drift decoupling and correction. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022, pp. 10102–10111. IEEE (2022)

  5. Karimireddy, S.P., Kale, S., Mohri, M., Reddi, S.J., Stich, S.U., Suresh, A.T.: SCAFFOLD: stochastic controlled averaging for federated learning. In: Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13–18 July 2020, Virtual Event. Proceedings of Machine Learning Research, vol. 119, pp. 5132–5143. PMLR (2020)

  6. Li, Q., He, B., Song, D.: Model-contrastive federated learning. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, 19–25 June 2021, pp. 10713–10722 (2021)

  7. Li, T., Sahu, A.K., Zaheer, M., Sanjabi, M., Talwalkar, A., Smith, V.: FedDANE: a federated Newton-type method. CoRR abs/2001.01920 (2020)

  8. Li, T., Sahu, A.K., Zaheer, M., Sanjabi, M., Talwalkar, A., Smith, V.: Federated optimization in heterogeneous networks. In: Proceedings of Machine Learning and Systems 2020, MLSys 2020, Austin, TX, USA, 2–4 March 2020. mlsys.org (2020)

  9. Li, X., Huang, K., Yang, W., Wang, S., Zhang, Z.: On the convergence of FedAvg on Non-IID data. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020

  10. Ma, X., et al.: FedSSO: a federated server-side second-order optimization algorithm. CoRR abs/2206.09576 (2022)

  11. McMahan, B., Moore, E., Ramage, D., Hampson, S., y Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, 20–22 April 2017, Fort Lauderdale, FL, USA, vol. 54, pp. 1273–1282. PMLR (2017)

  12. Qian, X., Islamov, R., Safaryan, M., Richtárik, P.: Basis matters: better communication-efficient second order methods for federated learning. In: Camps-Valls, G., Ruiz, F.J.R., Valera, I. (eds.) International Conference on Artificial Intelligence and Statistics, AISTATS 2022, 28–30 March 2022, Virtual Event. Proceedings of Machine Learning Research, vol. 151, pp. 680–720. PMLR (2022)

  13. Safaryan, M., Islamov, R., Qian, X., Richtárik, P.: FedNL: making Newton-type methods applicable to federated learning. In: Chaudhuri, K., Jegelka, S., Song, L., Szepesvári, C., Niu, G., Sabato, S. (eds.) International Conference on Machine Learning, ICML 2022, 17–23 July 2022, Baltimore, Maryland, USA. Proceedings of Machine Learning Research, vol. 162, pp. 18959–19010. PMLR (2022)

  14. Shamir, O., Srebro, N., Zhang, T.: Communication-efficient distributed optimization using an approximate Newton-type method. In: Proceedings of the 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 21–26 June 2014, vol. 32, pp. 1000–1008 (2014)

  15. Sun, S., Spall, J.C.: SPSA method using diagonalized Hessian estimate. In: 58th IEEE Conference on Decision and Control, CDC 2019, Nice, France, 11–13 December 2019, pp. 4922–4927. IEEE (2019)

  16. Tan, A.Z., Yu, H., Cui, L., Yang, Q.: Towards personalized federated learning. CoRR abs/2103.00710 (2021)

  17. Wang, J., Liu, Q., Liang, H., Joshi, G., Poor, H.V.: Tackling the objective inconsistency problem in heterogeneous federated optimization. In: Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, 6–12 December 2020, virtual (2020)

  18. Wang, S., Roosta-Khorasani, F., Xu, P., Mahoney, M.W.: GIANT: globally improved approximate Newton method for distributed optimization. In: Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3–8 December 2018, Montréal, Canada, pp. 2338–2348 (2018)


Author information


Corresponding author

Correspondence to Mrinmay Sen.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Sen, M., Mohan, C.K., Qin, A.K. (2023). Federated Optimization with Linear-Time Approximated Hessian Diagonal. In: Maji, P., Huang, T., Pal, N.R., Chaudhury, S., De, R.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2023. Lecture Notes in Computer Science, vol 14301. Springer, Cham. https://doi.org/10.1007/978-3-031-45170-6_12

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-45170-6_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-45169-0

  • Online ISBN: 978-3-031-45170-6

  • eBook Packages: Computer Science, Computer Science (R0)
