Efficiency of Transformations of Proximity Measures for Graph Clustering

  • Rinat Aynulin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11631)

Abstract

The choice of proximity measure for the nodes greatly affects the results of graph clustering. In this paper, we consider several proximity measures transformed with a number of functions, including the logarithmic function, the power function, and a family of activation functions. The transformations are tested in experiments in which several classical datasets are clustered using the k-means, Ward, and spectral methods. Statistical analysis of the experimental results shows that a number of transformed proximity measures outperform their non-transformed versions. The top-performing transformed measures are the Heat measure transformed with the power function, the Forest measure transformed with the power function, and the Forest measure transformed with the logarithmic function.
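
As a rough illustration of what a transformed proximity measure looks like, the sketch below builds the Forest kernel (I + tL)^{-1} and the Heat kernel exp(-tL) from a graph Laplacian L and applies element-wise power and logarithmic transformations. It is not the paper's implementation: the parameter t, the exponent 0.5, the small clipping floor, the toy graph, and all function names are assumptions chosen for the example, and the graph is assumed to be connected and undirected.

```python
import numpy as np


def laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    a = np.asarray(adjacency, dtype=float)
    return np.diag(a.sum(axis=1)) - a


def forest_kernel(adjacency, t=1.0):
    """Forest kernel (I + tL)^{-1}; the value of t is an assumption."""
    lap = laplacian(adjacency)
    return np.linalg.inv(np.eye(lap.shape[0]) + t * lap)


def heat_kernel(adjacency, t=1.0):
    """Heat kernel exp(-tL), via the eigendecomposition of the symmetric L."""
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian(adjacency))
    return (eigenvectors * np.exp(-t * eigenvalues)) @ eigenvectors.T


def transform(kernel, kind, exponent=0.5, floor=1e-12):
    """Element-wise transformation of a proximity matrix.

    Entries are clipped to a small positive floor (an assumption of this
    sketch) so that the logarithm and fractional powers stay finite.
    """
    k = np.clip(kernel, floor, None)
    if kind == "power":
        return np.power(k, exponent)
    if kind == "log":
        return np.log(k)
    raise ValueError(f"unknown transformation: {kind}")


# Hypothetical toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

forest = forest_kernel(A)
heat = heat_kernel(A)
forest_log = transform(forest, "log")
heat_power = transform(heat, "power")
```

The transformed matrices could then be handed to a clustering method such as k-means, Ward, or spectral clustering. That step is omitted here because it depends on how each method consumes a proximity matrix.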

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Kotel’nikov Institute of Radio-engineering and Electronics (IRE) of Russian Academy of Sciences, Moscow, Russia
  2. Moscow Institute of Physics and Technology, Dolgoprudny, Russia
