Functional Federated Learning in Erlang (ffl-erl)

Conference paper published in: Functional and Constraint Logic Programming (WFLP 2018)

Abstract

The functional programming language Erlang is well suited for concurrent and distributed applications, but numerical computing is not seen as one of its strengths. Yet the recent introduction of Federated Learning, in which client devices carry out decentralized machine learning tasks while a central server updates and distributes a global model, motivated us to explore how well Erlang is suited to that problem. We present the Federated Learning framework ffl-erl and evaluate it in two scenarios: one in which the entire system is written in Erlang, and another in which Erlang is relegated to coordinating client processes that perform their numerical computations in C. Each scenario has both a concurrent and a distributed implementation. We show that Erlang incurs a performance penalty, but for certain use cases this may not be detrimental, given the trade-off between speed of development (Erlang) and execution speed (C). Thus, Erlang may be a viable alternative to C for some practical machine learning tasks.
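To give a flavor of the coordination pattern described in the abstract, the following sketch shows how a central server process could drive a set of client processes in Erlang. It is illustrative only: the module name, message formats, and helper functions are assumptions and do not reproduce the actual ffl-erl code.

%% Illustrative sketch of federated coordination in Erlang; not the ffl-erl API.
-module(fl_sketch).
-export([run/3]).

%% Spawns one client process per data partition and runs the given number
%% of federated rounds starting from InitialModel.
run(Partitions, InitialModel, Rounds) ->
    Server = self(),
    Clients = [spawn(fun() -> client(Server, P) end) || P <- Partitions],
    loop(Clients, InitialModel, Rounds).

loop(_Clients, Model, 0) ->
    Model;
loop(Clients, Model, RoundsLeft) ->
    %% Broadcast the current global model, then collect one update per client.
    [C ! {train, self(), Model} || C <- Clients],
    Updates = [receive {update, C, M} -> M end || C <- Clients],
    loop(Clients, aggregate(Updates), RoundsLeft - 1).

client(Server, Partition) ->
    receive
        {train, Server, Model} ->
            Server ! {update, self(), local_update(Model, Partition)},
            client(Server, Partition)
    end.

%% Placeholders: local SGD on the client and model aggregation on the server
%% (see the weighted-average sketch in the appendix for the latter).
local_update(Model, _Partition) -> Model.
aggregate([First | _]) -> First.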

Notes

  1. Source code artifacts accompanying this paper are available at https://gitlab.com/fraunhofer_chalmers_centre/functional_federated_learning.

  2. For illustrative purposes, we chose clear code over computationally more efficient code at some points. For instance, the function forward in Code Listing 1.3 constructs a temporary list, which could be avoided by computing the dot product with an accumulator. For benchmarking, however, we used the more efficient code. A sketch contrasting the two variants follows after these notes.

  3. Training normally ends after a given number of iterations or once a predefined error threshold has been met. The latter would make use of the computed error, based on the list Delta, but the corresponding code is omitted as it is not conceptually interesting.

  4. Refer to the section on Native Implemented Functions (NIFs) in the official Erlang documentation for further details: http://erlang.org/doc/man/erl_nif.html (accessed on June 28, 2018). A minimal Erlang-side loading stub is sketched after these notes.

  5. The corresponding code repository is located at https://bitbucket.org/nato/yanni (accessed on August 6, 2018).
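The sketch below illustrates the point made in note 2. The paper's Code Listing 1.3 is not reproduced here; the module, function names, and network shape (a single neuron over flat weight lists) are assumptions chosen to contrast a clear version, which builds a temporary list of products, with an accumulator-based dot product that avoids the intermediate allocation.

%% Illustrative sketch only; not the paper's Code Listing 1.3.
-module(forward_sketch).
-export([forward_clear/2, forward_fast/2]).

%% Clear version: builds a temporary list of pairwise products before summing.
forward_clear(Inputs, Weights) ->
    sigmoid(lists:sum(lists:zipwith(fun(X, W) -> X * W end, Inputs, Weights))).

%% Faster version: computes the dot product with an accumulator,
%% so no intermediate list is allocated.
forward_fast(Inputs, Weights) ->
    sigmoid(dot(Inputs, Weights, 0.0)).

dot([], [], Acc) -> Acc;
dot([X | Xs], [W | Ws], Acc) -> dot(Xs, Ws, Acc + X * W).

sigmoid(X) -> 1.0 / (1.0 + math:exp(-X)).

%% Example: forward_clear([1.0, 2.0], [0.5, -0.25]) and
%% forward_fast([1.0, 2.0], [0.5, -0.25]) both return sigmoid(0.0) = 0.5.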
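As a companion to note 4, here is a minimal sketch of the Erlang side of a NIF. It assumes a native library has been compiled to ffl_nif.so in the current directory; the module and function names are hypothetical and the C side is not shown.

%% Minimal Erlang-side NIF stub; names and paths are assumptions.
-module(ffl_nif).
-export([forward/2]).
-on_load(init/0).

init() ->
    %% Loads the native library (".so" extension added automatically);
    %% if loading fails, the module fails to load as well.
    erlang:load_nif("./ffl_nif", 0).

%% Fallback that is replaced by the native implementation once the
%% library has been loaded successfully.
forward(_Inputs, _Weights) ->
    erlang:nif_error(nif_not_loaded).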

References

  1. Allison, L.: Models for machine learning and data mining in functional programming. J. Funct. Program. 15(1), 15–32 (2005)

  2. Bauer, H., Goh, Y., Schlink, S., Thomas, C.: The supercomputer in your pocket. McKinsey on Semiconductors, pp. 14–27 (2012)

  3. Chen, D., Zhao, H.: Data security and privacy protection issues in cloud computing. In: Proceedings of the 2012 International Conference on Computer Science and Electronics Engineering (ICCSEE), vol. 1, pp. 647–651. IEEE (2012)

  4. Coppola, R., Morisio, M.: Connected car: technologies, issues, future trends. ACM Comput. Surv. (CSUR) 49(3), 1–36 (2016)

  5. Cuccu, G., Togelius, J., Cudre-Mauroux, P.: Playing Atari with six neurons. arXiv preprint arXiv:1806.01363 (2018)

  6. Evans-Pughe, C.: The connected car. IEE Rev. 51(1), 42–46 (2005)

  7. Fisher, R., Marshall, M.: Iris Data Set. UC Irvine Machine Learning Repository (1936)

  8. Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2(4), 303–314 (1989)

  9. Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991)

  10. Johansson, E., Pettersson, M., Sagonas, K.: A high performance Erlang system. In: Proceedings of the 2nd ACM SIGPLAN International Conference on Principles and Practice Of Declarative Programming, pp. 32–43. ACM (2000)

  11. LeCun, Y., Cortes, C., Burges, C.J.: MNIST handwritten digit database. AT&T Labs (2010). http://yann.lecun.com/exdb/mnist

  12. Lee, J., Kim, C.M.: A roadside unit placement scheme for vehicular telematics networks. In: Kim, T., Adeli, H. (eds.) ACN/AST/ISA/UCMA -2010. LNCS, vol. 6059, pp. 196–202. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13577-4_17

  13. Löscher, A., Sagonas, K.: The Nifty way to call hell from heaven. In: Proceedings of the 15th International Workshop on Erlang, pp. 1–11. ACM (2016)

  14. McMahan, H.B., Moore, E., Ramage, D., Hampson, S., et al.: Communication-efficient learning of deep networks from decentralized data. arXiv preprint arXiv:1602.05629 (2016)

  15. Nissen, S.: Implementation of a fast artificial neural network library (FANN). Report, Department of Computer Science University of Copenhagen (DIKU) 31, 29 (2003)

  16. Orr, G.B., Müller, K.R.: Neural Networks: Tricks of the Trade. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-642-35289-8

  17. Sagonas, K., Pettersson, M., Carlsson, R., Gustafsson, P., Lindahl, T.: All you wanted to know about the HiPE compiler (but might have been afraid to ask). In: Proceedings of the 2003 ACM SIGPLAN Workshop on Erlang, pp. 36–42. ACM (2003)

  18. Sher, G.I.: Handbook of Neuroevolution Through Erlang. Springer, Heidelberg (2013). https://doi.org/10.1007/978-1-4614-4463-3

  19. Srihari, S.N., Kuebert, E.J.: Integration of hand-written address interpretation technology into the United States postal service remote computer reader system. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition, vol. 2, pp. 892–896. IEEE (1997)

  20. Tene, O., Polonetsky, J.: Privacy in the age of big data: a time for big decisions. Stan. L. Rev. Online 64, 63–69 (2011)

  21. Ulm, G., Gustavsson, E., Jirstrand, M.: OODIDA: On-board/Off-board distributed data analytics for connected vehicles. arXiv preprint arXiv:1902.00319 (2019)

  22. Yu, T., Clack, C.: PolyGP: a polymorphic genetic programming system in Haskell. In: Genetic Programming, vol. 98 (1998)

Acknowledgements

Our research was financially supported by the project On-board/Off-board Distributed Data Analytics (OODIDA) in the funding program FFI: Strategic Vehicle Research and Innovation (DNR 2016-04260), which is administered by VINNOVA, the Swedish Government Agency for Innovation Systems. It was carried out in the Fraunhofer Cluster of Excellence “Cognitive Internet Technologies.” Adrian Nilsson and Simon Smith assisted with the implementation. Melinda Tóth pointed us to Sher’s work. We also thank our anonymous reviewers for their helpful feedback.

Author information

Corresponding author

Correspondence to Gregor Ulm.

A Mathematical Derivation of Federated Stochastic Gradient Descent

In Sect. 2.2 we briefly describe Federated Stochastic Gradient Descent. In this appendix, we present the complete derivation. As a reminder, in Stochastic Gradient Descent the weights are updated as follows:

$$\begin{aligned} w := w - \frac{\eta }{n} \displaystyle \sum _{i=1}^{n} \nabla F_i(w). \end{aligned}$$
(6)

Furthermore, we started with the following equation, which is the objective function we would like to minimize:

$$\begin{aligned} F(w) = \frac{1}{n} \displaystyle \sum _{j=1}^{k} |P_j| F^{j}(w). \end{aligned}$$
(7)

The gradient of \(F^{j}\) is expressed in the following formula:

$$\begin{aligned} \nabla F^{j}(w) = \frac{1}{|P_j|} \displaystyle \sum _{i \in P_{j}} \nabla F_i(w), \quad j = 1,\ldots ,k. \end{aligned}$$
(8)

To continue from here, each client updates the weights of the machine learning model the following way:

$$\begin{aligned} w_j = w - \frac{\eta }{|P_j|} \displaystyle \sum _{i \in P_{j}}^{} \nabla F_i(w). \end{aligned}$$
(9)

On the server, the weights of the global model are updated as the weighted average of the client models. This update rule can be reformulated in a few steps:

$$\begin{aligned} w :=&\frac{1}{n}\left( \displaystyle \sum _{j=1}^{k} w_j |P_j|\right) \end{aligned}$$
(10)
$$\begin{aligned} =&\frac{1}{n}\displaystyle \sum _{j=1}^{k} \left( w - \frac{\eta }{|P_j|} \displaystyle \sum _{i \in P_{j}}^{} \nabla F_i(w)\right) |P_j| \end{aligned}$$
(11)
$$\begin{aligned} =&\frac{1}{n}\displaystyle \sum _{j=1}^{k} |P_j|w - \frac{1}{n}\eta \displaystyle \sum _{j=1}^{k} \displaystyle \sum _{i \in P_{j}}^{} \nabla F_i(w) \end{aligned}$$
(12)
$$\begin{aligned} =&w - \frac{\eta }{n} \displaystyle \sum _{i=1}^{n} \nabla F_i (w). \end{aligned}$$
(13)

The reformulation in the last line is equivalent to Eq. 6 above. In case the transformation between Eqs. 12 and 13 is unclear, consider that, since \(\sum _{j=1}^{k} |P_j| = n\), the first summand in Eq. 12 simplifies to

$$\begin{aligned} \frac{1}{n}\displaystyle \sum _{j=1}^{k} |P_j|w = \frac{1}{n} n w = w. \end{aligned}$$
(14)

The second summand in Eq. 12 can be simplified as follows:

$$\begin{aligned} \frac{1}{n}\eta \displaystyle \sum _{j=1}^{k} \displaystyle \sum _{i \in P_{j}}^{} \nabla F_i(w) = \frac{1}{n}\eta \displaystyle \sum _{i = 1}^{n} \nabla F_i(w) = \frac{\eta }{n} \displaystyle \sum _{i = 1}^{n} \nabla F_i(w). \end{aligned}$$
(15)
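As a concrete illustration of the server-side update in Eq. 10, the following sketch computes the weighted average of client weight vectors in Erlang, assuming each client j reports a pair {Wj, Nj} where Wj is its flat weight list and Nj = |P_j|. The module and function names are illustrative assumptions, not part of ffl-erl.

%% Sketch of the weighted average in Eq. 10; names are illustrative.
-module(fed_avg).
-export([aggregate/1]).

%% aggregate([{W1, N1}, {W2, N2}, ...]) computes
%% w := (1/n) * sum_j |P_j| * w_j, with n = sum_j |P_j|.
aggregate(ClientUpdates) ->
    N = lists:sum([Nj || {_Wj, Nj} <- ClientUpdates]),
    Scaled = [[Nj * W || W <- Wj] || {Wj, Nj} <- ClientUpdates],
    [lists:sum(Col) / N || Col <- transpose(Scaled)].

%% Transposes a list of equal-length lists so we can sum column-wise.
transpose([]) -> [];
transpose([[] | _]) -> [];
transpose(Lists) -> [[hd(L) || L <- Lists] | transpose([tl(L) || L <- Lists])].

%% Example: aggregate([{[0.1, 0.2], 3}, {[0.3, 0.4], 1}]) returns [0.15, 0.25].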

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ulm, G., Gustavsson, E., Jirstrand, M. (2019). Functional Federated Learning in Erlang (ffl-erl). In: Silva, J. (ed.) Functional and Constraint Logic Programming. WFLP 2018. Lecture Notes in Computer Science, vol. 11285. Springer, Cham. https://doi.org/10.1007/978-3-030-16202-3_10

  • DOI: https://doi.org/10.1007/978-3-030-16202-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16201-6

  • Online ISBN: 978-3-030-16202-3

  • eBook Packages: Computer Science; Computer Science (R0)
