Clustering Communities Using Interval K-Means

  • Conference paper

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 288))

Abstract

In large networks there is a specific need to identify patterns associated with structured groups of nodes, which can be defined as communities. In this work we propose an approach to clustering these communities using interval data. The approach is relevant to the analysis of large networks and, in particular, to discovering the different functionalities of the communities inside a network. We illustrate it on several example networks built from synthetic data, and then apply it to a large real network, the co-authorship network in Astrophysics.
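The abstract does not specify which node-level features, interval construction, or distance the paper uses, so the following is only a minimal sketch of the general idea under stated assumptions: each community is summarised by a [min, max] interval per node feature (here a hypothetical degree and clustering coefficient), and the communities are then grouped with a k-means variant that uses squared Euclidean distance on the interval bounds. All function names and the toy data are illustrative and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
IntervalVector = List[Interval]


def to_intervals(node_features: Sequence[Sequence[float]]) -> IntervalVector:
    """Summarise one community as a [min, max] interval per node feature."""
    columns = list(zip(*node_features))
    return [(min(col), max(col)) for col in columns]


def interval_distance(a: IntervalVector, b: IntervalVector) -> float:
    """Squared Euclidean distance on lower and upper bounds (an assumption)."""
    return sum((al - bl) ** 2 + (au - bu) ** 2
               for (al, au), (bl, bu) in zip(a, b))


def mean_interval(items: Sequence[IntervalVector]) -> IntervalVector:
    """Prototype of a cluster: mean lower bound and mean upper bound."""
    n = len(items)
    dims = len(items[0])
    return [(sum(x[d][0] for x in items) / n,
             sum(x[d][1] for x in items) / n) for d in range(dims)]


def interval_kmeans(data: Sequence[IntervalVector], k: int,
                    max_iter: int = 100) -> List[int]:
    """Lloyd-style k-means on interval vectors; assumes 1 <= k <= len(data)."""
    # Deterministic farthest-first initialisation of the k prototypes.
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(
            data, key=lambda x: min(interval_distance(x, c) for c in centers)))
    labels = [-1] * len(data)
    for _ in range(max_iter):
        # Assignment step: nearest prototype for every community.
        new_labels = [min(range(k),
                          key=lambda j: interval_distance(x, centers[j]))
                      for x in data]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: recompute each prototype from its members.
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centers[j] = mean_interval(members)
    return labels


# Toy communities: each node is (degree, clustering coefficient).
communities = {
    "A": [(1, 0.1), (2, 0.2), (3, 0.15)],
    "B": [(2, 0.1), (3, 0.2)],
    "C": [(20, 0.8), (30, 0.9)],
    "D": [(22, 0.7), (28, 0.9)],
}
data = [to_intervals(nodes) for nodes in communities.values()]
print(interval_kmeans(data, k=2))  # -> [0, 0, 1, 1]
```

In this toy run the two sparse communities (A, B) and the two dense ones (C, D) end up in separate clusters. Other interval distances, such as a Hausdorff-type distance, could be substituted in `interval_distance` without changing the rest of the sketch.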

Notes

  1. I thank the referees for their helpful suggestions.

Author information

Corresponding author

Correspondence to Carlo Drago.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Drago, C. (2019). Clustering Communities Using Interval K-Means. In: Petrucci, A., Racioppi, F., Verde, R. (eds) New Statistical Developments in Data Science. SIS 2017. Springer Proceedings in Mathematics & Statistics, vol 288. Springer, Cham. https://doi.org/10.1007/978-3-030-21158-5_3
