Abstract
We study distributed training of a linear classifier in which the data is separated into many shards and each worker has access only to its own shard. The goal of such distributed training is to exploit the data of all shards to obtain a well-performing linear classifier. The iterative parameter mixture (IPM) framework (Mann et al., 2009) is a state-of-the-art distributed learning framework with a strong theoretical guarantee when the data are clean. However, contamination of shards, which sometimes arises in real-world environments, severely degrades the performance of distributed training. To remedy the negative effect of this contamination, we propose a divergence minimization principle for determining the mixture weights in IPM. From this principle we naturally derive the Beta-IPM scheme, which leverages robust estimation based on the beta divergence. A mistake/loss bound analysis indicates the advantage of Beta-IPM in contaminated environments. Experiments with various datasets reveal that, even when 80% of the shards are contaminated, Beta-IPM suppresses the influence of the contamination.
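For readers unfamiliar with the IPM framework (Mann et al., 2009; McDonald et al., 2010), the following is a minimal sketch of its basic train-then-mix loop, using a perceptron learner on each shard and uniform mixing weights. The function names, the perceptron shard trainer, and the uniform weights are illustrative assumptions for this sketch only; they are not the paper's Beta-IPM weight determination.

```python
import numpy as np

def train_shard(w, X, y, epochs=1):
    """One pass of perceptron updates on a single shard, starting from w."""
    w = w.copy()
    for _ in range(epochs):
        for x, t in zip(X, y):          # t in {-1, +1}
            if t * np.dot(w, x) <= 0:   # misclassified: perceptron update
                w += t * x
    return w

def iterative_parameter_mixture(shards, dim, rounds=10):
    """Generic IPM loop: train on each shard independently, then mix parameters.

    `shards` is a list of (X, y) pairs, one per worker. Here the mixture
    weights mu are uniform; a robust scheme such as Beta-IPM would instead
    choose them so that contaminated shards receive small weight.
    """
    w = np.zeros(dim)
    for _ in range(rounds):
        # In a real deployment each worker trains in parallel on its own shard.
        local = [train_shard(w, X, y) for X, y in shards]
        mu = np.full(len(shards), 1.0 / len(shards))      # uniform mixing weights
        w = sum(m * wi for m, wi in zip(mu, local))       # parameter mixture step
    return w
```

The sketch shows only the baseline mixing step; the paper's contribution is to determine the mixing weights by a divergence minimization principle, with Beta-IPM built on the beta divergence (the density power divergence of Basu et al., 1998), which reduces to the Kullback-Leibler divergence as the beta parameter tends to zero.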
References
Aberdeen, D., Pacovsky, O., Slater, A.: The learning behind Gmail Priority Inbox. In: LCCC: NIPS 2010 Workshop on Learning on Cores, Clusters and Clouds (2010)
Basu, A., Harris, I.R., Hjort, N.L., Jones, M.C.: Robust and efficient estimation by minimising a density power divergence. Biometrika 85(3), 549–559 (1998)
Chouvardas, S., Slavakis, K., Theodoridis, S.: Adaptive robust distributed learning in diffusion sensor networks. IEEE Transactions on Signal Processing 59(10), 4692–4707 (2011)
Chu, C.T., Kim, S.K., Lin, Y.A., Yu, Y., Bradski, G.R., Ng, A.Y., Olukotun, K.: Map-reduce for machine learning on multicore. In: NIPS, pp. 281–288 (2006)
Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., Singer, Y.: Online passive-aggressive algorithms. Journal of Machine Learning Research 7, 551–585 (2006)
Crammer, K., Kulesza, A., Dredze, M.: Adaptive regularization of weight vectors. Machine Learning 91(2), 155–187 (2013)
Curtsinger, C., Livshits, B., Zorn, B.G., Seifert, C.: ZOZZLE: Fast and precise in-browser JavaScript malware detection. In: USENIX Security Symposium (2011)
Daumé III, H., Phillips, J.M., Saha, A., Venkatasubramanian, S.: Efficient protocols for distributed classification and optimization. In: Bshouty, N.H., Stoltz, G., Vayatis, N., Zeugmann, T. (eds.) ALT 2012. LNCS, vol. 7568, pp. 154–168. Springer, Heidelberg (2012)
Dekel, O., Shamir, O., Xiao, L.: Learning to classify with missing and corrupted features. Mach. Learn. 81(2), 149–178 (2010)
Djuric, N., Grbovic, M., Vucetic, S.: Distributed confidence-weighted classification on MapReduce. In: IEEE BigData (2013)
Eguchi, S., Kano, Y.: Robustifying maximum likelihood estimation. Technical report, Institute of Statistical Mathematics (June 2001)
Gimpel, K., Das, D., Smith, N.A.: Distributed asynchronous online learning for natural language processing. In: CoNLL 2010, pp. 213–222 (2010)
Gong, P., Ye, J., Zhang, C.: Robust multi-task feature learning. In: KDD, pp. 895–903 (2012)
Hall, K.B., Gilpin, S., Mann, G.: MapReduce/Bigtable for distributed optimization. In: LCCC: NIPS 2010 Workshop on Learning on Cores, Clusters and Clouds (2010)
Hoi, S.C.H., Wang, J., Zhao, P.: Exact soft confidence-weighted learning. In: ICML (2012)
Mann, G., McDonald, R.T., Mohri, M., Silberman, N., Walker, D.: Efficient large-scale distributed training of conditional maximum entropy models. In: NIPS, pp. 1231–1239 (2009)
McDonald, R., Hall, K., Mann, G.: Distributed training strategies for the structured perceptron. In: NAACL HLT 2010, pp. 456–464 (2010)
Meyer, T.A., Whateley, B.: SpamBayes: Effective open-source, Bayesian based, email classification system. In: CEAS (2004)
Rahm, E., Do, H.H.: Data cleaning: Problems and current approaches. IEEE Data Engineering Bulletin 23(4), 3–13 (2000)
Rosenblatt, F.: The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review 65(6), 386–408 (1958)
Runnalls, A.R.: A Kullback-Leibler approach to Gaussian mixture reduction. IEEE Transactions on Aerospace and Electronic Systems 43(3), 989–999 (2007)
Tsitsiklis, J., Bertsekas, D., Athans, M.: Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Transactions on Automatic Control 31(9), 803–812 (1986)
Xu, H., Leng, C.: Robust multi-task regression with grossly corrupted observations. Journal of Machine Learning Research - Proceedings Track 22, 1341–1349 (2012)
Zinkevich, M., Smola, A.J., Langford, J.: Slow learners are fast. In: NIPS, pp. 2331–2339 (2009)
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Komiyama, J., Oiwa, H., Nakagawa, H. (2014). Robust Distributed Training of Linear Classifiers Based on Divergence Minimization Principle. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science(), vol 8725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44851-9_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44850-2
Online ISBN: 978-3-662-44851-9