Comparison of Large Graphs Using Distance Information

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9573)

Abstract

We present a new framework for the analysis and visualization of large complex networks based on structural information retrieved from their distance k-graphs and B-matrices. Constructing B-matrices for graphs with more than 1 million edges requires massive BFS computations, which we facilitate using Cassovary, an open-source in-memory graph processing engine. The approach described in this paper enables efficient generation of expressive, multi-dimensional descriptors useful in graph embedding and graph mining tasks. In the experimental section, we show how the developed tools helped in the analysis of real-world graphs from the Stanford Large Network Dataset Collection.
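
As a rough illustration of the BFS-based construction mentioned in the abstract, the following Python sketch builds a B-matrix for a small undirected graph. It assumes the definitions commonly used in the earlier work cited as [7, 9]: the distance k-graph connects every pair of vertices whose shortest-path distance is exactly k, and the B-matrix entry for (k, l) counts the vertices that have exactly l vertices at distance k, so row k is the degree histogram of the distance k-graph. The function names `bfs_distances` and `b_matrix` are hypothetical, and plain Python stands in for Cassovary, whose API is not shown here.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Return {vertex: hop distance} for every vertex reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def b_matrix(adj):
    """Build the B-matrix of an undirected graph given as {vertex: neighbours}.

    Assumed definition: rows[k - 1][l] is the number of vertices that have
    exactly l vertices at shortest-path distance k (k = 1 .. longest distance).
    """
    n = len(adj)
    shells_per_vertex = []
    max_k = 0
    for source in adj:
        dist = bfs_distances(adj, source)
        # Count how many vertices sit at each distance k >= 1 from source.
        shells = Counter(d for d in dist.values() if d > 0)
        shells_per_vertex.append(shells)
        if shells:
            max_k = max(max_k, max(shells))

    # A vertex has between 0 and n - 1 vertices at any distance k >= 1.
    rows = [[0] * n for _ in range(max_k)]
    for shells in shells_per_vertex:
        for k in range(1, max_k + 1):
            rows[k - 1][shells.get(k, 0)] += 1
    return rows


# Usage: a path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
B = b_matrix(path)
```

This sketch runs one BFS per vertex, so it costs O(n(n + m)) time for n vertices and m edges. That cost is why the construction becomes demanding for graphs with millions of edges.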

Notes

  1. Diameter, efficiency, characteristic path length, vertex betweenness, vertex closeness, vertex eccentricity, transitivity, clustering coefficient, assortativity [3].

References

  1. Avery, C.: Giraph: large-scale graph processing infrastructure on hadoop. In: Proceedings of the Hadoop Summit. Santa Clara (2011)

  2. Barabási, A., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509 (1999)

  3. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.: Complex networks: structure and dynamics. Phys. Rep. 424(4–5), 175–308 (2006)

  4. Borzeshi, E.Z., Piccardi, M., Riesen, K., Bunke, H.: Discriminative prototype selection methods for graph embedding. Pattern Recogn. 46(6), 1648–1657 (2013)

  5. Brandes, U., Pfeffer, J., Mergel, I.: Studying Social Networks: A Guide to Empirical Research. Campus Verlag, Frankfurt (2012)

  6. Bu, Y., Howe, B., Balazinska, M., Ernst, M.D.: Haloop: efficient iterative data processing on large clusters. Proc. VLDB Endowment 3(1–2), 285–296 (2010)

  7. Czech, W.: Graph descriptors from B-matrix representation. In: Jiang, X., Ferrer, M., Torsello, A. (eds.) GbRPR 2011. LNCS, vol. 6658, pp. 12–21. Springer, Heidelberg (2011)

  8. Czech, W., Goryczka, S., Arodz, T., Dzwinel, W., Dudek, A.: Exploring complex networks with graph investigator research application. Comput. Inform. 30(2), 381–410 (2011)

  9. Czech, W.: Invariants of distance k-graphs for graph embedding. Pattern Recogn. Lett. 33(15), 1968–1979 (2012)

  10. D’Alberto, P., Nicolau, A.: R-kleene: a high-performance divide-and-conquer algorithm for the all-pair shortest path for densely connected networks. Algorithmica 47(2), 203–213 (2007)

  11. Dzwinel, W., Wcisło, R.: Very fast interactive visualization of large sets of high-dimensional data. In: Proceedings of ICCS 2015, Reykjavik, 1–3 June 2015, Iceland, Procedia Computer Science (2015) (in print)

  12. Ekanayake, J., Li, H., Zhang, B., Gunarathne, T., Bae, S.H., Qiu, J., Fox, G.: Twister: a runtime for iterative mapreduce. In: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pp. 810–818. ACM (2010)

  13. Emms, D., Wilson, R.C., Hancock, E.R.: Graph matching using the interference of continuous-time quantum walks. Pattern Recogn. 42(5), 985–1002 (2009)

  14. Foggia, P., Percannella, G., Vento, M.: Graph matching and learning in pattern recognition in the last 10 years. Int. J. Pattern Recogn. Artif. Intell. 28(01), 1450001 (2014)

  15. Gibert, J., Valveny, E., Bunke, H.: Dimensionality reduction for graph of words embedding. In: Jiang, X., Ferrer, M., Torsello, A. (eds.) GbRPR 2011. LNCS, vol. 6658, pp. 22–31. Springer, Heidelberg (2011)

  16. Gupta, P., Goel, A., Lin, J., Sharma, A., Wang, D., Zadeh, R.: Wtf: The who to follow service at twitter. In: Proceedings of the 22nd International Conference on World Wide Web, pp. 505–514. International World Wide Web Conferences Steering Committee (2013)

  17. Han, M., Daudjee, K., Ammar, K., Ozsu, M.T., Wang, X., Jin, T.: An experimental comparison of pregel-like graph processing systems. Proc. VLDB Endowment 7(12), 1047–1058 (2014)

  18. Lee, W.-J., Duin, R.P.W.: A labelled graph based multiple classifier system. In: Benediktsson, J.A., Kittler, J., Roli, F. (eds.) MCS 2009. LNCS, vol. 5519, pp. 201–210. Springer, Heidelberg (2009)

  19. Leskovec, J., Krevl, A.: SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data

  20. Leskovec, J., Sosič, R.: SNAP: A general purpose network analysis and graph mining library in C++. http://snap.stanford.edu/snap

  21. Low, Y., Gonzalez, J.E., Kyrola, A., Bickson, D., Guestrin, C.E., Hellerstein, J.: Graphlab: a new framework for parallel machine learning (2014). arXiv:1408.2041

  22. Malewicz, G., Austern, M.H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pp. 135–146. ACM (2010)

  23. Qiu, H., Hancock, E.: Clustering and embedding using commute times. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1873–1890 (2007)

  24. Salihoglu, S., Widom, J.: Gps: a graph processing system. In: Proceedings of the 25th International Conference on Scientific and Statistical Database Management, p. 22. ACM (2013)

  25. Watts, D., Strogatz, S.: Collective dynamics of small-world networks. Nature 393(6684), 440–442 (1998)

  26. Xiao, B., Hancock, E., Wilson, R.: A generative model for graph matching and embedding. Comput. Vis. Image Underst. 113(7), 777–789 (2009)

  27. Zhang, Y., Gao, Q., Gao, L., Wang, C.: Priter: a distributed framework for prioritizing iterative computations. IEEE Trans. Parallel Distrib. Syst. 24(9), 1884–1893 (2013)

Acknowledgments

This research is supported by the National Science Centre, Poland (NCN), grant DEC-2013/09/B/ST6/01549.

Author information

Corresponding author

Correspondence to Wojciech Czech.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Czech, W., Mielczarek, W., Dzwinel, W. (2016). Comparison of Large Graphs Using Distance Information. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2015. Lecture Notes in Computer Science, vol 9573. Springer, Cham. https://doi.org/10.1007/978-3-319-32149-3_19

  • DOI: https://doi.org/10.1007/978-3-319-32149-3_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-32148-6

  • Online ISBN: 978-3-319-32149-3

  • eBook Packages: Computer Science, Computer Science (R0)
