Towards k-vertex connected component discovery from large networks

  • Yuan Li
  • Guoren Wang
  • Yuhai Zhao
  • Feida Zhu
  • Yubao Wu
Article
Part of the following topical collections:
  1. Special Issue on Graph Data Management in Online Social Networks

Abstract

In many real-life network-based applications, such as social relation analysis, Web analysis, collaborative networks, road networks and bioinformatics, discovering components with high connectivity is an important problem. In particular, the k-edge connected component (k-ECC) model has recently been studied extensively as a way to discover disjoint components. Yet many real scenarios call for overlapping components, which raises additional challenges. In this paper, we propose a k-vertex connected component (k-VCC) model, which is more cohesive than the k-ECC model and naturally allows components to overlap. To discover k-VCCs, we propose three frameworks: top-down, bottom-up and hybrid. The top-down framework is developed first and finds the exact k-VCCs by recursively dividing the whole network. To reduce the high computational cost on large input networks, a bottom-up framework is then proposed that locally identifies seed subgraphs and obtains heuristic k-VCCs by expanding and merging these seeds. Finally, the hybrid framework takes advantage of both. It uses the results of the bottom-up framework to construct a mixed graph and then discovers the exact k-VCCs by contracting the mixed graph in a top-down way. Because the mixed graph is smaller than the original network, the hybrid framework runs much faster than the top-down framework. Comprehensive experiments on large real and synthetic networks demonstrate the efficiency and effectiveness of the proposed exact and heuristic approaches.
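
To make the top-down idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the usual definition of a k-VCC (a maximal subgraph with more than k vertices that stays connected after removing any fewer than k vertices) and finds vertex cuts by brute force, which is only feasible on tiny graphs. All function names (`from_edges`, `k_core`, `find_small_cut`, `k_vccs`) are hypothetical and chosen for illustration.

```python
from itertools import combinations


def from_edges(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def connected_parts(adj, removed=frozenset()):
    """Connected components of the graph after deleting `removed`."""
    seen, parts = set(removed), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        part, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    part.add(v)
                    stack.append(v)
        parts.append(part)
    return parts


def induced(adj, keep):
    """Subgraph induced by the vertex set `keep`."""
    return {u: adj[u] & keep for u in keep}


def k_core(adj, k):
    """Repeatedly delete vertices of degree < k (they cannot be in a k-VCC)."""
    adj = {u: set(nb) for u, nb in adj.items()}
    low = [u for u in adj if len(adj[u]) < k]
    while low:
        u = low.pop()
        if u not in adj:  # already deleted via a duplicate entry
            continue
        for v in adj.pop(u):
            adj[v].discard(u)
            if len(adj[v]) < k:
                low.append(v)
    return adj


def find_small_cut(adj, k):
    """Brute force (assumption): first vertex set of size < k that disconnects."""
    for size in range(k):
        for cut in combinations(list(adj), size):
            if len(connected_parts(adj, frozenset(cut))) > 1:
                return set(cut)
    return None


def k_vccs(adj, k):
    """Top-down division: split on a small cut, keep the cut in every part."""
    adj = k_core(adj, k)
    if not adj:
        return []
    cut = find_small_cut(adj, k)
    if cut is None:  # no cut smaller than k: the graph is k-vertex connected
        return [set(adj)]
    found = []
    for part in connected_parts(adj, frozenset(cut)):
        found.extend(k_vccs(induced(adj, part | cut), k))
    return found


if __name__ == "__main__":
    # Two 4-cliques sharing vertices a and b, plus a pendant vertex g.
    left = list(combinations("abcd", 2))
    right = list(combinations("abef", 2))
    graph = from_edges(left + right + [("f", "g")])
    print(sorted(sorted(c) for c in k_vccs(graph, 3)))
```

In the usage example, the two resulting 3-VCCs share the vertices a and b, which illustrates how k-VCCs can overlap. Because each part keeps a copy of the cut, the division produces overlapping subgraphs rather than a disjoint partition. A practical implementation would replace the brute-force cut search with a flow-based method.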

Keywords

  • k-vertex connected component (k-VCC)
  • Component detection
  • Large network

Notes

Acknowledgments

This research is partially supported by the NSFC (61672041, 61772124, 61732003, 61902004, 61977001), the National Key Research and Development Program of China (2018YFB1004402), the Start-up Funds of North China University of Technology, and the National Research Foundation, Prime Minister's Office, Singapore, under its International Research Centres in Singapore Funding Initiative and the Pinnacle Lab for Analytics at SMU.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. North China University of Technology, Beijing, China
  2. Beijing Institute of Technology, Beijing, China
  3. Northeastern University, Shenyang, China
  4. Singapore Management University, Singapore, Singapore
  5. Georgia State University, Atlanta, USA