Assessing the quality of multilevel graph clustering

Data Mining and Knowledge Discovery

Abstract

A popular strategy for hierarchical graph clustering is to “lift” a non-hierarchical (flat) method by applying it iteratively to the graph. However, such lifted strategies cannot reasonably guide the overall nesting process, precisely because they never evaluate the hierarchical character of the clustering they produce. In this study, we develop a criterion that evaluates the quality of the subgraph hierarchy. The multilevel criterion we present and discuss generalizes a measure designed for one-level (flat) graph clustering so that it takes the nesting of clusters into account. We borrow ideas from standard techniques in algebraic combinatorics and use a variable \(q\) to keep track of the depth of the clusters in which edges occur. Our multilevel measure relies on a recursive definition involving \(q\) and outputs a one-variable polynomial. We examine archetypal examples as proofs of concept; these simple cases help in understanding how the multilevel measure actually works. We also apply this multilevel modularity to real-world networks to demonstrate how it can be used to compare hierarchical clusterings of graphs.
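To make the role of the variable \(q\) concrete, the following Python sketch builds a polynomial whose coefficient of \(q^d\) counts the edges whose deepest common cluster lies at depth \(d\) (the root being depth 0). It only illustrates the depth bookkeeping described in the abstract and is not the multilevel measure defined in the paper: the nested-list encoding of the hierarchy and the names `leaf_paths`, `edge_depth_polynomial`, and `evaluate` are assumptions made for this example.

```python
from collections import Counter


def leaf_paths(tree, path=()):
    """Map each vertex to the tuple of child indices of the clusters containing it.

    `tree` is a nested list (hypothetical encoding): inner lists are clusters,
    any non-list item is a vertex.
    """
    paths = {}
    for i, child in enumerate(tree):
        if isinstance(child, list):
            paths.update(leaf_paths(child, path + (i,)))
        else:
            paths[child] = path
    return paths


def edge_depth_polynomial(edges, tree):
    """Return {d: count}: the coefficient of q**d is the number of edges
    whose deepest common cluster is at depth d (root = depth 0)."""
    paths = leaf_paths(tree)
    coeffs = Counter()
    for u, v in edges:
        depth = 0
        # The depth of the deepest common cluster is the length of the
        # common prefix of the two vertex paths.
        for a, b in zip(paths[u], paths[v]):
            if a != b:
                break
            depth += 1
        coeffs[depth] += 1
    return dict(sorted(coeffs.items()))


def evaluate(coeffs, q):
    """Evaluate the polynomial sum(c * q**d) at a numeric value of q."""
    return sum(c * q ** d for d, c in coeffs.items())


# Toy hierarchy: two top-level clusters, the first split into two sub-clusters.
tree = [[[1, 2], [3, 4]], [5, 6]]
edges = [(1, 2), (3, 4), (2, 3), (4, 5), (5, 6)]

poly = edge_depth_polynomial(edges, tree)
print(poly)               # {0: 1, 1: 2, 2: 2}, i.e. 1 + 2q + 2q^2
print(evaluate(poly, 1))  # 5, the total number of edges
```

Evaluating at \(q = 1\) recovers the total number of edges, while other values of \(q\) reweight edges according to how deep in the hierarchy they are captured.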

Notes

  1. See www.labri.fr/perso/queyroi/tulip/HierarchicalMQ.py.

  2. Actually, the groups of teams provided by the authors correspond to the 2001 conferences. Thanks to Evans (2010), here we use the correct conferences for the 2000 season.

  3. Source: INSEE, www.insee.fr. The data are unfortunately not publicly available but must be purchased from INSEE.

References

  • Auber D (2003) Tulip—a huge graph visualization framework. In: Mutzel P, Jünger M (eds) Graph drawing software, mathematics and visualization series. Springer, New York

  • Auber D, Archambault D, Bourqui R, Lambert A, Mathiaut M, Mary P, Delest M, Dubois J, Melançon G (2012) The tulip 3 framework: a scalable software library for information visualization applications based on relational data. Technical report RR-7860. INRIA Bordeaux Sud-Ouest, Bordeaux

  • Auber D, Chiricota Y, Jourdan F, Melançon G (2003) Multiscale navigation of small world networks. In: IEEE symposium on information visualisation, pp 75–81. IEEE Computer Society, Washington, DC

  • Batty M (2006) Hierarchy in cities and city systems. In: Pumain D (ed) Hierarchy in natural and social sciences, methodos series, vol 3. Springer, Berlin, pp 143–168

  • Blondel V, Guillaume J, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech: P10008

  • Boutin F, Hascoët M (2004) Cluster validity indices for graph partitioning. In: IV’04: 8th IEEE international conference on information visualization. IEEE, London, pp 376–381. http://hal-lirmm.ccsd.cnrs.fr/lirmm-00108948/en/

  • Brandes U, Gaertler M, Wagner D (2007) Engineering graph clustering: models and experimental evaluation. J Exp Algorithmics 12:1–26

  • Chakrabarti D, Faloutsos C (2006) Graph mining: laws, generators, and algorithms. ACM Comput Surv 38(1):2

  • Cook DJ, Holder LB (eds) (2006) Mining graph data. Wiley, Hoboken

  • De Montis A, Caschili S, Chessa A (2011) Commuter networks and community detection: a method for planning sub regional areas. Arxiv, arXiv:1103.2467 (preprint)

  • Delest M, Fédou J, Melançon G (2007) A quality measure for multi-level community structure. In: Eighth international symposium on symbolic and numeric algorithms for scientific computing, SYNASC’06. IEEE, Washington D.C., pp 63–68

  • Delest MP, Fédou JM (1992) Attribute grammars are useful for combinatorics. Theor Comput Sci 98(1):65–76

  • Erdös P, Rényi A (1959) On random graphs I. Publ Math Debr 6:290–297

  • Evans T (2010) Clique graphs and overlapping communities. J Stat Mech: P12037. doi:10.1088/1742-5468/2010/12/P12037

  • Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174

  • Gaume B, Venant F, Victorri B (2006) Hierarchy in lexical organization of natural language. In: Pumain D (ed) Hierarchy in natural and social sciences, methodos series, vol 3. Springer, New York, pp 121–142

  • Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 99:7821–7826

  • Good B, De Montjoye Y, Clauset A (2010) Performance of modularity maximization in practical contexts. Phys Rev E 81(4):046106

  • Jonyer I, Cook D, Holder L (2002) Graph-based hierarchical conceptual clustering. J Mach Learn Res 2:19–43

  • Mancoridis S, Mitchell BS, Rorres C, Chen Y, Gansner E (1998) Using automatic clustering to produce high-level system organizations of source code. In: IEEE international workshop on program understanding (IWPC’98). IEEE Computer Society, Washington D.C., pp 173–179

  • Mishna M (2003) Attribute grammars and automatic complexity analysis. Adv Appl Math 30(1–2):189–207

  • Newman MEJ (2004) Fast algorithm for detecting community structure in networks. Phys Rev E 69:066133

  • Newman MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69:026113

  • Patuelli R, Reggiani A, Gorman S, Nijkamp P, Bade F (2007) Network analysis of commuting flows: a comparative static approach to German data. Netw Spat Econ 7(4):315–331

  • Pflieger G, Rozenblat C (2010) Introduction. Urban networks and network theory: the city as the connector of multiple networks. Urban Stud (special issue: Urban Netw Netw Theory) 47(13):2723–2735

  • Pons P, Latapy M (2011) Post-processing hierarchical community structures: quality improvements and multi-scale view. Theor Comput Sci 412(8–10):892–900

  • Pumain D (ed) (2006) Hierarchy in natural and social sciences, methodos series, vol 3. Springer, Heidelberg

  • Queyroi F, Chiricota Y (2011) Visualization-based communities discovering in commuting networks: a case study. Technical report. LaBRI, INRIA. http://hal.archives-ouvertes.fr/hal-00593734/PDF/queyroi_cga.pdf

  • Rouwendal J, Nijkamp P (2004) Living in two worlds: a review of home-to-work decisions. Growth Change 35(3):287–303

  • Satuluri V, Parthasarathy S (2009) Scalable graph clustering using stochastic flows: applications to community discovery. In: KDD, pp 737–746

  • Schaeffer SE (2007) Graph clustering. Comput Sci Rev 1:27–64

  • Sozio M, Gionis A (2010) The community-search problem and how to plan a successful cocktail party. In: KDD, pp 939–948

  • Vespignani A (2003) Evolution thinks modular. Nature 35(2):118–119

Author information

Corresponding author

Correspondence to François Queyroi.

Additional information

Responsible editor: Ian Davidson.

About this article

Cite this article

Queyroi, F., Delest, M., Fédou, JM. et al. Assessing the quality of multilevel graph clustering. Data Min Knowl Disc 28, 1107–1128 (2014). https://doi.org/10.1007/s10618-013-0335-9

  • DOI: https://doi.org/10.1007/s10618-013-0335-9
