Abstract
Community detection algorithms are widely used to study the structural properties of real-world networks. In this paper, we experimentally evaluate the qualitative performance of several community detection algorithms on large-scale email networks. The email networks were generated from real email traffic and contain both legitimate email (ham) and unsolicited email (spam). We compare the algorithms with respect to a number of structural quality functions and a logical quality measure, which assesses how well an algorithm separates ham and spam emails by clustering them into distinct communities. Our study reveals that the algorithms that perform well with respect to structural quality do not achieve high logical quality. We also show that algorithms with similar structural quality have similar logical quality, regardless of their approach to clustering. Finally, we show that the algorithm that performs link community detection is more suitable for clustering email networks than the node-based approaches, and that it creates more distinct communities of ham and spam edges.
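The abstract does not define the logical quality measure, so the following is only an illustrative sketch of one way to score how well communities separate ham from spam. It assumes a simple majority-label purity over edges; the function name `label_purity` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def label_purity(community_of, label_of):
    """Fraction of items whose label matches the majority label of their community.

    Assumption: purity stands in here for a "logical quality" score.
    The paper's actual measure may differ.
    """
    # Count labels inside each community.
    counts = defaultdict(Counter)
    for item, community in community_of.items():
        counts[community][label_of[item]] += 1

    total = sum(sum(c.values()) for c in counts.values())
    if total == 0:
        raise ValueError("no items to score")

    # Items that agree with their community's majority label.
    majority = sum(max(c.values()) for c in counts.values())
    return majority / total


# Hypothetical toy data: email edges (sender, receiver) mapped to a community id,
# as a link community algorithm would produce, plus a ham/spam label per edge.
community_of = {
    ("a", "b"): 0, ("b", "c"): 0, ("a", "c"): 0,
    ("s", "x"): 1, ("s", "y"): 1, ("s", "a"): 1,
}
label_of = {
    ("a", "b"): "ham", ("b", "c"): "ham", ("a", "c"): "ham",
    ("s", "x"): "spam", ("s", "y"): "spam", ("s", "a"): "ham",
}

score = label_purity(community_of, label_of)
print(f"purity = {score:.3f}")
```

A score of 1.0 would mean every community contains only ham or only spam edges. The same function works for node-based partitions if the keys are nodes instead of edges.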
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Moradi, F., Olovsson, T., Tsigas, P. (2012). An Evaluation of Community Detection Algorithms on Large-Scale Email Traffic. In: Klasing, R. (eds) Experimental Algorithms. SEA 2012. Lecture Notes in Computer Science, vol 7276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30850-5_25
DOI: https://doi.org/10.1007/978-3-642-30850-5_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30849-9
Online ISBN: 978-3-642-30850-5
eBook Packages: Computer Science (R0)