Data Mining and Knowledge Discovery, Volume 28, Issue 4, pp 1107–1128

Assessing the quality of multilevel graph clustering

  • François Queyroi
  • Maylis Delest
  • Jean-Marc Fédou
  • Guy Melançon

Abstract

A popular strategy for hierarchically clustering a graph is to “lift up” a non-hierarchical approach by applying it iteratively. However, these lifted iterative strategies cannot reasonably guide the overall nesting process, precisely because they fail to evaluate the very hierarchical character of the clustering they produce. In this study, we develop a criterion that can evaluate the quality of the subgraph hierarchy. The multilevel criterion we present and discuss in this paper generalizes a measure designed for one-level (flat) graph clustering to take the nesting of clusters into account. We borrow ideas from standard techniques in algebraic combinatorics and exploit a variable \(q\) to keep track of the depth of the clusters in which edges occur. Our multilevel measure relies on a recursive definition involving the variable \(q\) and outputs a one-variable polynomial. This paper examines archetypal examples as proofs of concept; these simple cases are useful in understanding how the multilevel measure actually works. We also apply this multilevel modularity to real-world networks to demonstrate how it can be used to compare hierarchical clusterings of graphs.

Keywords

  • Graph clustering
  • Graph hierarchies
  • Hierarchical clustering
  • Multilevel modularity

Copyright information

© The Author(s) 2013

Authors and Affiliations

  • François Queyroi (1)
  • Maylis Delest (1)
  • Jean-Marc Fédou (2)
  • Guy Melançon (1)

  1. CNRS, LaBRI, INRIA Bordeaux – Sud-Ouest, Université de Bordeaux, Bordeaux, France
  2. Université de Nice, Nice, France