Same Stats, Different Graphs

(Graph Statistics and Why We Need Graph Drawings)
  • Hang Chen
  • Utkarsh Soni
  • Yafeng Lu
  • Ross Maciejewski
  • Stephen Kobourov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11282)

Abstract

Data analysts commonly use statistics to summarize large datasets. While it is often sufficient to explore only the summary statistics of a dataset (e.g., min/mean/max), Anscombe’s Quartet demonstrates how such statistics can be misleading. We consider a similar problem in the context of graph mining. To study the relationships between different graph properties and statistics, we examine all low-order (\(\le 10\) nodes) non-isomorphic graphs and provide a simple visual analytics system to explore correlations across multiple graph properties. However, for graphs with more than ten nodes, generating the entire space of graphs quickly becomes intractable. We use different random graph generation methods to further examine the distribution of graph statistics for higher-order graphs and investigate the impact of various sampling methodologies. We also describe a method for generating many graphs that are identical over a number of graph properties and statistics yet are clearly different and identifiably distinct.
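The core observation, that graphs can agree on summary statistics while differing in structure, can be illustrated with a small hand-picked pair. The sketch below is not the generation method described in the paper; it is a minimal example that assumes a 6-cycle and two disjoint triangles as the pair, and the helper names (make_graph, summary_stats, count_triangles, count_components) are hypothetical. It uses only the Python standard library.

```python
from itertools import combinations


def make_graph(n, edges):
    """Build an undirected graph on nodes 0..n-1 as a dict of neighbor sets."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def summary_stats(adj):
    """Coarse statistics of the kind a summary table might report."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge is seen twice
    return {
        "nodes": n,
        "edges": m,
        "density": 2 * m / (n * (n - 1)),
        "mean_degree": 2 * m / n,
        "degree_sequence": tuple(sorted(len(nbrs) for nbrs in adj.values())),
    }


def count_triangles(adj):
    """Count unordered node triples that are pairwise adjacent."""
    return sum(
        1
        for a, b, c in combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def count_components(adj):
    """Count connected components with an iterative depth-first search."""
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
    return components


# Assumed example pair: one 6-cycle versus two disjoint triangles.
cycle6 = make_graph(6, [(i, (i + 1) % 6) for i in range(6)])
two_triangles = make_graph(6, [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])

same_summary = summary_stats(cycle6) == summary_stats(two_triangles)
print("identical summary statistics:", same_summary)
print("triangles:", count_triangles(cycle6), "vs", count_triangles(two_triangles))
print("components:", count_components(cycle6), "vs", count_components(two_triangles))
```

Both graphs have six nodes, six edges, and every node of degree two, so any statistic derived from those quantities coincides, yet one graph is connected and triangle-free while the other splits into two triangles. A drawing of the two graphs makes the difference immediately visible.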

Keywords

Graph mining, Graph properties, Graph generators

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Hang Chen (1)
  • Utkarsh Soni (2)
  • Yafeng Lu (2)
  • Ross Maciejewski (2)
  • Stephen Kobourov (1)
  1. University of Arizona, Tucson, USA
  2. Arizona State University, Tempe, USA
