Scattered Data and Aggregated Inference

Part of the book series: Springer Handbooks of Computational Statistics (SHCS)

Abstract

Scattered Data and Aggregated Inference (SDAI) represents a class of problems in which data cannot be stored at a centralized location while modeling and inference are pursued. Distributed statistical inference is a technique for tackling one type of SDAI problem and has recently attracted enormous attention. Much existing work focuses on the averaging estimator, e.g., Zhang et al. (2013) and many others. In this chapter, we propose a one-step approach to enhance a simple-averaging-based distributed estimator. We derive the corresponding asymptotic properties of the newly proposed estimator and find that the one-step estimator enjoys the same asymptotic properties as the centralized estimator. The one-step approach requires only one additional round of communication relative to the averaging estimator, so the extra communication burden is insignificant. In finite-sample cases, numerical examples show that the proposed estimator outperforms the simple averaging estimator by a large margin in terms of mean squared error. A potential application of the one-step approach is to use multiple machines to speed up large-scale statistical inference with little compromise in the quality of the estimators. The proposed method becomes more valuable when data are available only at distributed machines with limited communication bandwidth. We discuss other types of SDAI problems at the end.
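
The abstract does not spell out the update rule, so the following is only a minimal sketch under stated assumptions: it takes the one-step approach to be a single Newton-Raphson step started from the simple average of the local estimators, using gradients and Hessians summed across machines. The exponential-rate model, the shard sizes, and all function names (`local_mle`, `local_derivatives`, `one_step_estimate`) are hypothetical choices made for illustration and are not taken from the chapter.

```python
import random

def local_mle(xs):
    """Local maximum likelihood estimate of an exponential rate: n / sum(x)."""
    return len(xs) / sum(xs)

def local_derivatives(xs, rate):
    """Gradient and Hessian of the local negative log-likelihood at `rate`."""
    n = len(xs)
    grad = -n / rate + sum(xs)
    hess = n / rate ** 2
    return grad, hess

def one_step_estimate(shards):
    """Return (averaging estimate, one-step estimate) for equal-sized shards."""
    # Round 1: every machine reports its local estimate; the hub averages them.
    avg = sum(local_mle(xs) for xs in shards) / len(shards)
    # Round 2 (the one extra round): the hub broadcasts `avg`, and every
    # machine reports its local gradient and Hessian evaluated at `avg`.
    derivs = [local_derivatives(xs, avg) for xs in shards]
    total_grad = sum(g for g, _ in derivs)
    total_hess = sum(h for _, h in derivs)
    # Assumed update: one Newton-Raphson step on the pooled objective.
    return avg, avg - total_grad / total_hess

# Simulated data: 50 machines with 20 observations each (assumed sizes).
rng = random.Random(0)
true_rate = 2.0
shards = [[rng.expovariate(true_rate) for _ in range(20)] for _ in range(50)]

avg, one_step = one_step_estimate(shards)
# Centralized estimate, computable only if all data were pooled in one place.
central = sum(len(xs) for xs in shards) / sum(sum(xs) for xs in shards)

print("averaging estimate:  ", avg)
print("one-step estimate:   ", one_step)
print("centralized estimate:", central)
```

In this toy model each local estimate is biased upward for small shards, and averaging does not remove that bias, whereas the single Newton-type step moves the averaged value toward the centralized estimate. Only two scalars per machine are exchanged in the second round here; in a model with a d-dimensional parameter the corresponding messages would be a gradient vector and a d-by-d Hessian.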

References

  • Arjevani Y, Shamir O (2015) Communication complexity of distributed convex learning and optimization. Technical report. http://arxiv.org/abs/1506.01900. Accessed 28 Oct 2015

  • Balcan M-F, Blum A, Fine S, Mansour Y (2012) Distributed learning, communication complexity and privacy. https://arxiv.org/abs/1204.3514. Accessed 25 May 2012

  • Balcan M-F, Kanchanapally V, Liang Y, Woodruff D (2014) Improved distributed principal component analysis. Technical report. http://arxiv.org/abs/1408.5823. Accessed 23 Dec 2014

  • Battey H, Fan J, Liu H, Lu J, Zhu Z (2015) Distributed estimation and inference with statistical guarantees. https://arxiv.org/abs/1509.05457. Accessed 17 Sept 2015

  • Bickel PJ (1975) One-step Huber estimates in the linear model. J Am Stat Assoc 70(350):428–434

  • Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122

  • Bradley JK, Kyrola A, Bickson D, Guestrin C (2011) Parallel coordinate descent for L1-regularized loss minimization. In: Proceedings of the 28th international conference on machine learning. https://arxiv.org/abs/1105.5379. Accessed 26 May 2011

  • Chen X, Xie M-g (2014) A split-and-conquer approach for analysis of extraordinarily large data. Stat Sin 24:1655–1684

  • Chen S, Donoho DL, Saunders MA (1998) Atomic decomposition by basis pursuit. SIAM J Sci Comput 20(1):33–61

  • Cichocki A, Amari S-I, Zdunek R, Phan AH (2009) Non-negative matrix and tensor factorizations: applications to exploratory multi-way data analysis and blind source separation. Wiley-Blackwell, Hoboken

  • Corbett JC, Dean J, Epstein M, Fikes A, Frost C, Furman JJ, Ghemawat S, Gubarev A, Heiser C, Hochschild P et al. (2012) Spanner: Google's globally distributed database. In: Proceedings of the USENIX symposium on operating systems design and implementation

  • Dekel O, Gilad-Bachrach R, Shamir O, Xiao L (2012) Optimal distributed online prediction using mini-batches. J Mach Learn Res 13:165–202

  • Ding C, He X, Simon HD (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. In: SIAM international conference on data mining, pp 606–610

  • Donoho D, Stodden V (2003) When does non-negative matrix factorization give a correct decomposition into parts? In: Advances in neural information processing systems. Stanford University, Stanford

  • El Gamal M, Lai L (2015) Are Slepian-Wolf rates necessary for distributed parameter estimation? Technical report. http://arxiv.org/abs/1508.02765. Accessed 10 Nov 2015

  • Fan J, Chen J (1999) One-step local quasi-likelihood estimation. J R Stat Soc Ser B Stat Methodol 61(4):927–943

  • Fan J, Feng Y, Song R (2012) Nonparametric independence screening in sparse ultra-high-dimensional additive models. J Am Stat Assoc 106:544–557

  • Fan J, Han F, Liu H (2014) Challenges of big data analysis. Natl Sci Rev 1:293–314

  • Fevotte C, Bertin N, Durrieu JL (2009) Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis. Neural Comput 21(3):793–830

  • Forero PA, Cano A, Giannakis GB (2010) Consensus-based distributed support vector machines. J Mach Learn Res 11:1663–1707

  • Gillis N, Luce R (2014) Robust near-separable nonnegative matrix factorization using linear optimization. J Mach Learn Res 15:1249–1280

  • Huang C, Huo X (2015) A distributed one-step estimator. Technical report. http://arxiv.org/abs/1511.01443. Accessed 10 Nov 2015

  • Huang K, Sidiropoulos ND, Swami A (2014) Non-negative matrix factorization revisited: uniqueness and algorithm for symmetric decomposition. IEEE Trans Signal Process 62(1):211–224

  • Jaggi M, Smith V, Takác M, Terhorst J, Krishnan S, Hofmann T, Jordan MI (2014) Communication-efficient distributed dual coordinate ascent. In: Advances in neural information processing systems, pp 3068–3076

  • Kleiner A, Talwalkar A, Sarkar P, Jordan MI (2014) A scalable bootstrap for massive data. J R Stat Soc Ser B Stat Methodol 76(4):795–816

  • Lang S (1993) Real and functional analysis, vol 142. Springer Science & Business Media, Berlin

  • Lee DD, Seung HS (1999) Learning the parts of objects by nonnegative matrix factorization. Nature 401:788–791

  • Lee JD, Sun Y, Liu Q, Taylor JE (2015) Communication-efficient sparse regression: a one-shot approach. arXiv preprint arXiv:1503.04337

  • Liu Q, Ihler AT (2014) Distributed estimation, information loss and exponential families. In: Advances in neural information processing systems, pp 1098–1106

  • McDonald R, Hall K, Mann G (2010) Distributed training strategies for the structured perceptron. In: North American chapter of the Association for Computational Linguistics (NAACL)

  • Mitra S, Agrawal M, Yadav A, Carlsson N, Eager D, Mahanti A (2011) Characterizing web-based video sharing workloads. ACM Trans Web 5(2):8

  • Mizutani T (2014) Ellipsoidal rounding for nonnegative matrix factorization under noisy separability. J Mach Learn Res 15:1011–1039

  • Neiswanger W, Wang C, Xing E (2013) Asymptotically exact, embarrassingly parallel MCMC. arXiv preprint arXiv:1311.4780

  • Nowak RD (2003) Distributed EM algorithms for density estimation and clustering in sensor networks. IEEE Trans Signal Process 51(8):2245–2253

  • Paatero P, Tapper U (1994) Positive matrix factorization: a nonnegative factor model with optimal utilization of error estimates of data values. Environmetrics 5(2):111–126

  • Pauca VP, Piper J, Plemmons RJ (2006) Nonnegative matrix factorization for spectral data analysis. Linear Algebra Appl 401(1):29–47

  • Ravikumar P, Lafferty J, Liu H, Wasserman L (2009) Sparse additive models. J R Stat Soc Ser B Stat Methodol 71(5):1009–1030

  • Rosenblatt J, Nadler B (2014) On the optimality of averaging in distributed statistical learning. arXiv preprint arXiv:1407.2724

  • Schmidt MN, Larson J, Hsiao FT (2007) Wind noise reduction using non-negative sparse coding. In: Machine learning for signal processing, IEEE workshop, pp 431–436

  • Shamir O, Srebro N, Zhang T (2014) Communication-efficient distributed optimization using an approximate Newton-type method. In: Proceedings of the 31st international conference on machine learning, pp 1000–1008

  • Song Q, Liang F (2015) A split-and-merge Bayesian variable selection approach for ultrahigh dimensional regression. J R Stat Soc B 77(Part 5):947–972

  • Städler N, Bühlmann P, Van De Geer S (2010) ℓ1-penalization for mixture regression models. Test 19(2):209–256

  • Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B 58(1):267–288

  • van der Vaart AW (2000) Asymptotic statistics. Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge

  • Wainwright M (2014) Constrained forms of statistical minimax: computation, communication, and privacy. In: Proceedings of international congress of mathematicians

  • Wang X, Peng P, Dunson DB (2014) Median selection subset aggregation for parallel inference. In: Advances in neural information processing systems, pp 2195–2203

  • Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: The 26th annual international ACM SIGIR conference on research and development in information retrieval, pp 267–273

  • Yang Y, Barron A (1999) Information-theoretic determination of minimax rates of convergence. Ann Stat 27(5):1564–1599

  • Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B Stat Methodol 68(1):49–67

  • Zhang Y, Duchi JC, Wainwright MJ (2013) Communication-efficient algorithms for statistical optimization. J Mach Learn Res 14:3321–3363

  • Zhang Y, Duchi JC, Jordan MI, Wainwright MJ (2013) Information-theoretic lower bounds for distributed statistical estimation with communication constraints. Technical report, UC Berkeley. Presented at the NIPS Conference 2013

  • Zhao T, Cheng G, Liu H (2014) A partially linear framework for massive heterogeneous data. arXiv preprint arXiv:1410.8570

  • Zinkevich M, Weimer M, Li L, Smola AJ (2010) Parallelized stochastic gradient descent. In: Advances in neural information processing systems, pp 2595–2603

  • Zou H, Li R (2008) One-step sparse estimates in nonconcave penalized likelihood models. Ann Stat 36(4):1509–1533

Author information

Corresponding author

Correspondence to Xiaoming Huo.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Huo, X., Huang, C., Ni, X.S. (2018). Scattered Data and Aggregated Inference. In: Härdle, W., Lu, HS., Shen, X. (eds) Handbook of Big Data Analytics. Springer Handbooks of Computational Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-18284-1_4
