A degree-related and link clustering coefficient approach for link prediction in complex networks

Abstract

Link prediction plays a significant role in both the theoretical study and the practical application of complex network analysis, and has therefore attracted much attention. Numerous similarity-based methods have been proposed for the link prediction problem, exploiting various topological features of the network to construct the similarity score. Most of these methods focus on the topological features of nodes rather than those of links. We define a degree-related and link clustering coefficient that better describes the role of a common neighbor in distinct local areas. The proposed clustering coefficient is then used to determine the similarity of node pairs. In particular, the degree of each endpoint is used to reflect the influence of the end node on the similarity score. Our method is compared with representative methods, including local similarity-based methods and graph embedding-based methods, on small-scale, medium-scale, and large-scale real-world networks from different fields, and performance is evaluated with two commonly used metrics. The experimental results show the feasibility and effectiveness of our method on networks of different scales, and demonstrate that prediction accuracy can be further improved by the proposed degree-related and link clustering coefficient.
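
The general idea of a similarity score built from common neighbors and weighted by link-level clustering can be sketched as follows. This is only an illustration under stated assumptions, not the paper's definition: the abstract does not give the formula, so the sketch uses a simple triangle-based link clustering coefficient (triangles through a link divided by the maximum possible number) and omits the degree-related term entirely. The function names `build_adjacency`, `link_clustering`, `similarity`, and `rank_candidates` are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def link_clustering(adj, u, v):
    """Assumed link clustering coefficient of the link (u, v):
    triangles containing the link / maximum possible triangles."""
    triangles = len(adj[u] & adj[v])
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return triangles / possible if possible > 0 else 0.0


def similarity(adj, x, y):
    """Illustrative score: each common neighbor z contributes the mean
    clustering of the links (x, z) and (z, y). No degree term is used."""
    score = 0.0
    for z in adj[x] & adj[y]:
        score += 0.5 * (link_clustering(adj, x, z) + link_clustering(adj, z, y))
    return score


def rank_candidates(edges):
    """Score every unconnected node pair and sort by descending score."""
    adj = build_adjacency(edges)
    scored = []
    for x, y in combinations(sorted(adj), 2):
        if y not in adj[x]:
            scored.append(((x, y), similarity(adj, x, y)))
    return sorted(scored, key=lambda item: -item[1])


# Toy network: two triangles sharing the link (b, c), plus a pendant node e.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("d", "e")]
ranking = rank_candidates(toy_edges)
for pair, score in ranking:
    print(pair, round(score, 3))
```

On this toy network the pair (a, d) ranks first because both of its common neighbors sit on links that close triangles, while (b, e) and (c, e) share only the neighbor d, whose link to e closes none.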

Data availability statement

This manuscript has associated data in a data repository. [Authors’ comment: The datasets utilized in this paper are downloaded from the following academic web sites: http://vlado.fmf.uni-lj.si/pub/networks/data/default.htm, http://www.linkprediction.org/index.php/link/resource/data/2, http://www-personal.umich.edu/mejn/netdata/, https://github.com/gephi/gephi/wiki/Datasets, and http://snap.stanford.edu/data/.]

Acknowledgements

This work is partially supported by the Jiangsu Provincial Natural Science Foundation of China (BK20201340) and the China Postdoctoral Science Foundation (2018M642160). The authors would like to thank the anonymous reviewers for their valuable comments on the manuscript.

Author information

Contributions

MW and XL designed research; MW performed research; MW and XL analyzed data; and MW, XL, and BC wrote the paper.

Corresponding author

Correspondence to Xuyang Lou.

Cite this article

Wang, M., Lou, X. & Cui, B. A degree-related and link clustering coefficient approach for link prediction in complex networks. Eur. Phys. J. B 94, 33 (2021). https://doi.org/10.1140/epjb/s10051-020-00037-z
