A New Clustering Algorithm Based on Graph Connectivity

  • Yu-Feng Li
  • Liang-Hung Lu
  • Ying-Chao Hung
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 858)


A new clustering algorithm based on the concept of graph connectivity is introduced. The idea is to develop a meaningful graph representation for data, in which each resulting sub-graph corresponds to a cluster of highly similar objects connected by edges. The proposed algorithm has a fairly strong theoretical basis that supports its originality and computational efficiency. Further, some useful guidelines are provided so that the algorithm can be tuned to optimize well-designed quality indices. Numerical evidence shows that the proposed algorithm provides very good clustering accuracy on a number of benchmark data sets and has relatively low computational complexity compared with some sophisticated clustering methods.
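The general idea of reading clusters off a graph can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, whose details are not given in the abstract; it is a generic connectivity-based grouping under stated assumptions. The function name `connectivity_clusters`, the Euclidean distance rule, and the `threshold` parameter are all hypothetical choices made for illustration: two objects are joined by an edge when their distance is at most `threshold`, and each connected component of the resulting graph is treated as one cluster.

```python
from collections import deque
from math import dist  # Euclidean distance, available in Python 3.8+


def connectivity_clusters(points, threshold):
    """Label points by connected component of a distance-threshold graph.

    Assumption (not from the paper): two points share an edge when their
    Euclidean distance is at most `threshold`.
    """
    n = len(points)

    # Build the adjacency list of the threshold graph (O(n^2) comparisons).
    adjacency = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= threshold:
                adjacency[i].append(j)
                adjacency[j].append(i)

    labels = [-1] * n  # -1 marks a point not yet assigned to a cluster
    cluster_id = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Breadth-first search assigns one label to a whole component.
        labels[start] = cluster_id
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if labels[neighbor] == -1:
                    labels[neighbor] = cluster_id
                    queue.append(neighbor)
        cluster_id += 1
    return labels


# Toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.2), (5.0, 5.0), (5.4, 5.1)]
labels = connectivity_clusters(points, threshold=1.0)
print(labels)  # [0, 0, 0, 1, 1]
```

In this toy run the first and third points are farther apart than the threshold, yet they land in the same cluster because both are linked to the second point. That chaining effect is what distinguishes connectivity-based grouping from a purely pairwise similarity rule, and it is also why the choice of edge rule matters so much in practice.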


Keywords: Clustering, Graph theory, Time complexity


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
  2. Department of Statistics, National Chengchi University, Taipei, Taiwan
