Exploring Graph Analytics with the PCJ Toolbox

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10778)

Abstract

Graph analysis is an intrinsic tool of the big data domain. The demand for processing ever larger graphs requires highly efficient parallel applications. In this work we explore the possibility of employing the new PCJ library for distributed calculations in Java. We apply the toolbox to sparse matrix-matrix multiplication and to the k-means clustering problem, and we benchmark its strong scaling performance against an equivalent C++/MPI implementation. Our benchmarks show comparably good scaling for algorithms that rely mainly on local point-to-point communication, and they expose the potential of making logarithmic collective operations directly available in the PCJ library. Furthermore, we experienced a shorter development time to solution as a result of the high-level abstractions provided by Java and PCJ.
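
To make the phrase "logarithmic collective operations" concrete, the following is a minimal sketch of a binary-tree sum reduction, the kind of step a distributed k-means needs when per-process partial centroid sums are combined. It is plain single-process Java that only simulates the ranks with array slots; it does not use the PCJ API and is not the authors' implementation. The class name `TreeReductionSketch` and the methods `treeReduce` and `rounds` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

public class TreeReductionSketch {

    /**
     * Element-wise sum of per-rank vectors, combined pairwise along a binary tree.
     * Slot r of the input stands in for the partial result held by rank r.
     * With P ranks the outer loop runs ceil(log2(P)) times, instead of the
     * P - 1 sequential steps of a naive "everyone sends to rank 0" scheme.
     */
    public static double[] treeReduce(double[][] partials) {
        int p = partials.length;
        if (p == 0) {
            throw new IllegalArgumentException("need at least one rank");
        }
        // Copy the input so the caller's arrays are left untouched.
        double[][] buf = new double[p][];
        for (int r = 0; r < p; r++) {
            buf[r] = partials[r].clone();
        }
        // In each round, rank r absorbs the data of rank r + stride.
        for (int stride = 1; stride < p; stride *= 2) {
            for (int r = 0; r + stride < p; r += 2 * stride) {
                for (int i = 0; i < buf[r].length; i++) {
                    buf[r][i] += buf[r + stride][i];
                }
            }
        }
        return buf[0]; // the simulated rank 0 holds the global sum
    }

    /** Number of communication rounds the tree needs for p ranks. */
    public static int rounds(int p) {
        int n = 0;
        for (int stride = 1; stride < p; stride *= 2) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Five simulated ranks, each holding a two-component partial sum.
        double[][] partials = { {1, 10}, {2, 20}, {3, 30}, {4, 40}, {5, 50} };
        System.out.println(Arrays.toString(treeReduce(partials)));
        System.out.println(rounds(5));
    }
}
```

In a real distributed run the inner addition would be preceded by a point-to-point transfer from rank `r + stride` to rank `r`; the sketch keeps only the communication pattern, which is what determines the logarithmic number of rounds.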

Notes

  1. This is especially relevant in the one-dimensional domain decomposition used in the SPMM algorithm.

Acknowledgements

The authors wish to thank Piotr Bała and Marek Nowicki for driving the development of the PCJ library and for fruitful discussions and debugging. This work was partially supported by the CHIST-ERA consortium.

Author information

Corresponding author

Correspondence to Roxana Istrate.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Istrate, R., Barkoutsos, P.K., Dolfi, M., Staar, P.W.J., Bekas, C. (2018). Exploring Graph Analytics with the PCJ Toolbox. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2017. Lecture Notes in Computer Science, vol 10778. Springer, Cham. https://doi.org/10.1007/978-3-319-78054-2_29

  • DOI: https://doi.org/10.1007/978-3-319-78054-2_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78053-5

  • Online ISBN: 978-3-319-78054-2

  • eBook Packages: Computer Science (R0)
