Abstract
In this paper, we study distributed logistic regression for processing separated large-scale data stored on different linked computers. Based on the Alternating Direction Method of Multipliers (ADMM) algorithm, we transform the logistic regression problem into a multistep iteration process and propose a distributed logistic algorithm with controllable communication cost. Specifically, in each iteration of the distributed algorithm, every computer updates its local estimator and simultaneously exchanges it with its neighbors. We then prove the convergence of the distributed logistic algorithm. Owing to the decentralized structure of the computer network, the proposed distributed logistic algorithm is robust. The classification results of our distributed logistic method are the same as those of the non-distributed approach. Numerical studies show that our approach is both effective and efficient, performing well in distributed massive data analysis.
References
McDonald, R., Mohri, M., Silberman, N., Walker, D., Mann, G.: Efficient large-scale distributed training of conditional maximum entropy models. In: Advances in Neural Information Processing Systems, vol. 1, pp. 1231–1239. NIPS, La Jolla (2009)
McDonald, R., Hall, K., Mann, G.: Distributed training strategies for the structured perceptron. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 456–464. ACL, Los Angeles (2010)
Zhang, Y., Duchi, J., Wainwright, M.: Communication-efficient algorithms for statistical optimization. J. Mach. Learn. Res. 14(1), 3321–3363 (2013)
Zhang, Y., Duchi, J., Wainwright, M.: Divide and conquer kernel ridge regression: a distributed algorithm with minimax optimal rates. J. Mach. Learn. Res. 30(1), 592–617 (2013)
Mateos, G., Bazerque, J., Giannakis, G.: Distributed sparse linear regression. IEEE Trans. Signal Process. 58(10), 5262–5276 (2010)
Wang, P., Zhang, H., Liang, Y.: Model selection with distributed SCAD penalty. J. Appl. Stat. 45(11), 1938–1955 (2017)
Wang J., Kolar M., Srebro N., Zhang T.: Efficient distributed learning with sparsity. In: International Conference on Machine Learning, vol. 70, pp. 3636–3645. PMLR, Sydney (2017)
Menendez, M.L., Pardo, L., Pardo, M.C.: Preliminary \(\phi \)-divergence test estimators for linear restrictions in a logistic regression model. Stat. Pap. 50(2), 277–300 (2009)
Pardo, J.A., Pardo, L., Pardo, M.C.: Minimum \(\phi \)-divergence estimator in logistic regression models. Stat. Pap. 47(1), 91–108 (2006)
Özkale, M.R.: Iterative algorithms of biased estimation methods in binary logistic regression. Stat. Pap. 57(4), 991–1016 (2016)
Lange, T., Mosler, K., Mozharovskyi, P.: Fast nonparametric classification based on data depth. Stat. Pap. 55(1), 49–69 (2015)
Boyd, S., Parikh, N., Chu, E.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Xie, P., Jin, K., Xing, E.: Distributed machine learning via sufficient factor broadcasting. arXiv preprint, http://arxiv.org/abs/1409.5705. Accessed 7 Sep 2015
Gopal, S., Yang, Y.: Distributed training of large-scale logistic models. In: Proceedings of the 30th International Conference on Machine Learning, vol. 28, pp. 289–297. PMLR, Atlanta (2013)
Peng, H., Liang, D., Choi, C.: Evaluating parallel logistic regression models. In: IEEE International Conference on Big Data, pp. 119–126. IEEE, Silicon Valley (2013)
Kang, D., Lim, W., Shin, K.: Data/feature distributed stochastic coordinate descent for logistic regression. In: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, pp. 1269–1278. ACM, Shanghai (2014)
Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 2(1), 17–40 (1976)
Glowinski, R., Marroco, A.: On the solution of a class of non linear Dirichlet problems by a penalty-duality method and finite elements of order one. In: Marchuk, G.I. (ed.) Optimization Techniques IFIP Technical Conference. LNCS, pp. 327–333. Springer, Berlin (1974). https://doi.org/10.1007/978-3-662-38527-2_45
Bertsekas, D., Tsitsiklis, J.: Parallel and Distributed Computation: Numerical Methods, 2nd edn. Athena Scientific, Belmont (1997)
Wang, Y., Yin, W., Zeng, J.: Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78(1), 29–63 (2019)
Alon, U., et al.: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. U.S.A. 96(12), 6745–6750 (1999)
Blake, C., Merz, C.: UCI repository of machine learning databases (1998). http://www.ics.uci.edu/~mlearn/MLRepository.html
Appendix
1.1 ADMM algorithm
Consider the following optimization problem
First, we form the quadratically augmented Lagrangian function
where \(V:=\{V_{j}\}_{j\in \mathcal {C}}\) denotes the Lagrange multipliers and \(c>0\) is a preselected penalty coefficient. Then we apply the ADMM algorithm to solve this problem.
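A minimal sketch of the constrained formulation and its augmented Lagrangian, assuming a separable objective \(\sum _{j\in \mathcal {C}}\ell _{j}(R_{j})+f(\theta )\) (local logistic losses \(\ell _{j}\) plus the term \(f(\theta )\) referenced in the proof of Theorem 2) coupled by the linear constraint \(MR+M^{'}\theta =0\) implied by the multiplier update in Step 3, is
\[ \min _{R,\theta }\ \sum _{j\in \mathcal {C}}\ell _{j}(R_{j})+f(\theta )\quad \text {s.t.}\quad MR+M^{'}\theta =0, \]
\[ L(R,\theta ,V)=\sum _{j\in \mathcal {C}}\ell _{j}(R_{j})+f(\theta )+V^{\top }(MR+M^{'}\theta )+\frac{c}{2}\Vert MR+M^{'}\theta \Vert _{2}^{2}. \]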
The ADMM algorithm entails three steps per iteration.
Step 1: R update, \(R^{k+1}=\arg \min _{R}L(R,\theta ^{k},V^{k})\);
Step 2: \(\theta \) update, \(\theta ^{k+1}=\arg \min _{\theta }L(R^{k+1},\theta ,V^{k})\);
Step 3: V update, \(V^{k+1}=V^{k}+c(MR^{k+1}+M^{'}\theta ^{k+1})\).
That is, in iteration \(k+1\) we update \(R^{k+1}\) by minimizing \(L(R,\theta ^{k},V^{k})\) with respect to R, update \(\theta ^{k+1}\) by minimizing \(L(R^{k+1},\theta ,V^{k})\) with respect to \(\theta \), and update the Lagrange multiplier via \(V^{k+1}=V^{k}+c(MR^{k+1}+M^{'}\theta ^{k+1})\), until the algorithm converges.
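To make the iteration concrete, the following Python sketch implements consensus ADMM for logistic regression across several machines. It is illustrative rather than the implementation used in the paper: it assumes the simple global-consensus constraint \(R_{j}=\theta \) (a special case of \(MR+M^{'}\theta =0\)), and the function names and the use of SciPy's L-BFGS-B solver for the local \(R_{j}\)-update are illustrative choices.

# Illustrative sketch (not the authors' code) of consensus-ADMM logistic
# regression: machine j holds (X_j, y_j) with y in {0, 1}, keeps a local
# estimator R_j and multiplier V_j, and the average plays the role of theta.
import numpy as np
from scipy.optimize import minimize

def local_update(X, y, V, theta, c, R0):
    """R_j-update: local negative log-likelihood plus augmented Lagrangian terms."""
    def obj(R):
        z = X @ R
        loss = np.sum(np.logaddexp(0.0, z) - y * z)   # logistic negative log-likelihood
        return loss + V @ (R - theta) + 0.5 * c * np.sum((R - theta) ** 2)
    return minimize(obj, R0, method="L-BFGS-B").x

def admm_logistic(data, c=1.0, iters=100):
    """data: list of (X_j, y_j) blocks, one block per machine."""
    p = data[0][0].shape[1]
    R = [np.zeros(p) for _ in data]          # local estimators
    V = [np.zeros(p) for _ in data]          # Lagrange multipliers
    theta = np.zeros(p)                      # shared (consensus) parameter
    for _ in range(iters):
        # Step 1: each machine updates its local estimator in parallel.
        R = [local_update(X, y, V_j, theta, c, R_j)
             for (X, y), V_j, R_j in zip(data, V, R)]
        # Step 2: theta-update reduces to averaging under the consensus constraint.
        theta = np.mean([R_j + V_j / c for R_j, V_j in zip(R, V)], axis=0)
        # Step 3: dual (multiplier) update.
        V = [V_j + c * (R_j - theta) for V_j, R_j in zip(V, R)]
    return theta

Here data would be a list such as [(X1, y1), (X2, y2)], with each X_j an \(n_{j}\times p\) design matrix and y_j a 0/1 response vector held on machine j; only \(\theta \) and the local summaries need to be communicated in each iteration.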
1.2 Proof of Theorem 2
Proof
Theorem 2 can be considered as a special case of the result in [20], which analyzes the convergence of ADMM for minimizing possibly nonconvex objectives. It suffices to show that our distributed optimization problem satisfies the convergence conditions A1-A5 in [20]. Condition A1 obviously holds. Since both M and M\(^{'}\) have full rank, A2 and A5 are established. Because \(f(\theta )\) is Lipschitz differentiable with constant \(L_f\), A4 holds. For any \(j\in J\) and any \(R_j\), \(R_j^{'}\), (2.6) holds, so A3 is established. This proves Theorem 2.