Cluster Computing, Volume 22, Supplement 3, pp 5521–5533

Research on subgraph distribution algorithm based on label null model

  • Zhisong Wang


To address the low accuracy of subgraph distribution algorithms in existing graph classification methods, this paper constructs an n-order label null model based on the n-order null model, and proposes an index algorithm, Build Graph Location Index (BGLI), and a subgraph distribution algorithm, Estimate Subgraph on Spark (ESGS). First, to enrich the classification features of a graph, the n-order label null model is constructed from the graph's topological structure together with its vertex and edge label information, and the validity of using the label null model for classification is proved. Second, to improve the efficiency of subgraph retrieval, the BGLI algorithm is proposed to build an index based on the label null model. Then, building on BGLI, the ESGS algorithm is proposed. Finally, experiments show that using the subgraphs extracted by ESGS as classification features improves classification accuracy.
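The abstract does not define the n-order label null model, so the following is only a rough illustration and not the paper's construction. It assumes, hypothetically, that "n-order" refers to degree-based statistics (order 1: the degree distribution; order 2: the joint degree distribution over edges) and that the label variant additionally keys these statistics by vertex and edge labels. The function names `labeled_1k` and `labeled_2k` and the edge-list format are invented for this sketch.

```python
from collections import Counter

def degrees(edges):
    """Return {vertex: degree} for an undirected edge list of (u, v, edge_label)."""
    deg = Counter()
    for u, v, _ in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def labeled_1k(vertex_labels, edges):
    """Assumed order-1 statistic: number of vertices per (vertex_label, degree)."""
    deg = degrees(edges)
    # Isolated vertices do not appear in the edge list, so default to degree 0.
    return Counter((vertex_labels[v], deg.get(v, 0)) for v in vertex_labels)

def labeled_2k(vertex_labels, edges):
    """Assumed order-2 statistic: number of edges per
    (endpoint signature, edge_label, endpoint signature),
    where an endpoint signature is (vertex_label, degree)."""
    deg = degrees(edges)
    counts = Counter()
    for u, v, edge_label in edges:
        a = (vertex_labels[u], deg[u])
        b = (vertex_labels[v], deg[v])
        # Sort the endpoints so that (u, v) and (v, u) map to the same key.
        first, second = sorted((a, b))
        counts[(first, edge_label, second)] += 1
    return counts

# Toy labeled graph (hypothetical data): a three-vertex path.
vertex_labels = {1: "C", 2: "C", 3: "O"}
edges = [(1, 2, "single"), (2, 3, "double")]

print(labeled_1k(vertex_labels, edges))
print(labeled_2k(vertex_labels, edges))
```

Under these assumptions, a null-model graph of a given order would be a randomized graph that reproduces the corresponding statistic of the original graph; comparing subgraph counts between the original and such randomized graphs is one common way null models are used.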


Graph classification; Subgraph distribution; Label graph; Null model



This work is supported by the National Science Foundation of China (No. 61472340), the National Youth Science Foundation of China (No. 61602401), the National Youth Science Foundation of Hebei (No. F2017209070), and the Hebei Province Colleges and Universities Science and Technology Research for Youth Fund Project (No. QN2017058).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  1. School of Mechanical Engineering, Yanshan University, Qinhuangdao, China
  2. Key Laboratory of Advanced Forging & Stamping Technology and Science (Yanshan University), Ministry of Education of China, Qinhuangdao, China
  3. Hebei Provincial Key Laboratory of Parallel Robot and Mechatronic System, Yanshan University, Qinhuangdao, China
