Abstract
In recent years, a wide range of very large networks has become available to researchers. Discovering natural groups in such datasets, a task called graph clustering, is a challenge that arises in many applications, such as the analysis of neural, social, and communication networks.
We present Orca, a new graph clustering algorithm that operates locally and hierarchically contracts the input. In contrast to most existing graph clustering algorithms, which operate globally, Orca can cluster inputs with hundreds of millions of edges in less than 2.5 hours while identifying clusterings of measurably high quality. Our approach explicitly avoids maximizing any single index value such as modularity and instead relies on simple and sound structural operations. We present and discuss the Orca algorithm and evaluate its clustering quality and running time in comparison with other graph clustering algorithms.
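To make the phrase "hierarchically contracts the input" concrete, the following is a minimal sketch of a generic contraction step: given a clustering, each cluster collapses into one super-node and parallel edges between clusters are merged by summing their weights. The abstract does not spell out Orca's actual local operations, so this is only an illustration of contraction in general; the function name `contract` and the edge-list representation are assumptions made for this example.

```python
from collections import defaultdict


def contract(edges, cluster_of):
    """Contract a weighted undirected graph along a clustering (illustrative).

    edges:      iterable of (u, v, weight) triples
    cluster_of: dict mapping every node to its cluster id
    Returns (inter, intra):
      inter maps an ordered pair of cluster ids to the total weight
            of edges running between those two clusters,
      intra maps a cluster id to the total weight of edges inside it.
    """
    inter = defaultdict(float)
    intra = defaultdict(float)
    for u, v, w in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            # Edge disappears into the super-node; remember its weight.
            intra[cu] += w
        else:
            # Normalize the key so (0, 1) and (1, 0) are the same edge.
            key = (cu, cv) if cu <= cv else (cv, cu)
            inter[key] += w
    return dict(inter), dict(intra)


# Two triangles joined by a single bridge edge c-d.
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("c", "d", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
]
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
inter, intra = contract(edges, cluster_of)
```

Applying such a step repeatedly, each time to the contracted graph of the previous level, yields a hierarchy of progressively coarser graphs.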
This work was partially supported by the DFG under grant WA 654/15-1 and by the EU under grant DELIS (contract no. 001907).
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Delling, D., Görke, R., Schulz, C., Wagner, D. (2009). Orca Reduction and ContrAction Graph Clustering. In: Goldberg, A.V., Zhou, Y. (eds) Algorithmic Aspects in Information and Management. AAIM 2009. Lecture Notes in Computer Science, vol 5564. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02158-9_14
DOI: https://doi.org/10.1007/978-3-642-02158-9_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02157-2
Online ISBN: 978-3-642-02158-9
eBook Packages: Computer Science, Computer Science (R0)