Distributed Processing of Networked Data

  • Living reference work entry
Encyclopedia of Social Network Analysis and Mining

Synonyms

Parallel processing

Glossary

BSP: Bulk Synchronous Parallel

MapReduce: A distributed programming model derived from the functional paradigm, designed for complex distributed computations

SNA: Social network analysis
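
To make the MapReduce entry above more concrete, the following is a minimal single-machine sketch of the map, shuffle, and reduce phases applied to a simple network measure, node degree. It is an illustration only and not code from the entry: the function names (`map_edge`, `shuffle`, `reduce_degree`, `node_degrees`) and the toy edge list are assumptions, and a real deployment would run the same two user-defined functions on a cluster framework rather than in one Python process.

```python
from collections import defaultdict
from itertools import chain


def map_edge(edge):
    """Map step (hypothetical name): emit one (node, 1) pair per endpoint
    of an undirected edge."""
    u, v = edge
    return [(u, 1), (v, 1)]


def shuffle(pairs):
    """Group emitted values by key. In a real framework this grouping is
    done by the runtime between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_degree(node, counts):
    """Reduce step (hypothetical name): sum the partial counts of one node."""
    return node, sum(counts)


def node_degrees(edges):
    """Run map, shuffle, and reduce in sequence over an edge list."""
    mapped = chain.from_iterable(map_edge(e) for e in edges)
    grouped = shuffle(mapped)
    return dict(reduce_degree(node, counts) for node, counts in grouped.items())


# Toy undirected network, invented for this illustration.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees = node_degrees(edges)
print(degrees)
```

Because the map function looks at one edge at a time and the reduce function at one node at a time, both phases can in principle be spread across many machines, which is the property that makes the model attractive for large networks.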

Definition

The rapid development of the Internet has made available many data sets from which large and complex social networks can be extracted. Such structures are characterized by the 3V rule typical of big data sets: variety, volume, and velocity. These properties call for a sophisticated environment and specialized methods for processing and analyzing large social networks. The primary purpose of the various techniques, measures, and methods collectively called social network analysis (SNA) is to extract useful knowledge from such structures in order to support, e.g., targeted marketing, recommender and personalized systems, or efficient human collaboration and knowledge exchange.

Due to efficiency reasons, to process large networked data, some complex cluster computer...

Acknowledgments

This work was partially supported by the National Science Centre, Poland, under decisions no. DEC-2016/21/B/ST6/01463 and DEC-2016/21/D/ST6/02948.

Author information

Corresponding author

Correspondence to Przemysław Kazienko.

Copyright information

© 2017 Springer Science+Business Media LLC

About this entry

Cite this entry

Kazienko, P., Indyk, W., Kajdanowicz, T., Bartusiak, R. (2017). Distributed Processing of Networked Data. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7163-9_258-1

  • DOI: https://doi.org/10.1007/978-1-4614-7163-9_258-1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-7163-9

  • Online ISBN: 978-1-4614-7163-9

  • eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
