The Various Graphs in Graph Computing

  • Rujun Sun
  • Lufei Zhang
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 994)

Abstract

The world is full of relationships, and a graph is the most evident representation of them. As data scale increases, graphs grow larger and open up a new world of analysis. What can we learn from a graph? How many kinds of graphs are there? How does a graph from one area differ from a graph from another? All these questions need answers, but previous research on graph computing has mainly focused on computing frameworks and systems, paying little attention to the graph itself.

In this paper, we study graphs of different kinds, different scales, and different mining methods, aiming to give a sketch and classification of graph categories. In addition, we study the characteristics of each category and analyze its algorithms. We survey public graph datasets to show the current graph scale and its trend, as a guide for future infrastructure.

Keywords

  • Graph category
  • Extremely large scale graph
  • Graph characters

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi, China
