Subgraph-Centric Graph Mining

Yan, Da; Tian, Yuanyuan; Cheng, James

doi:10.1007/978-3-319-58217-7_6

Part of the book series: SpringerBriefs in Computer Science (BRIEFSCOMPUTER)

Abstract

The computation models we have seen so far are all data-intensive: the cost of message transmission is often much higher than that of message processing, which makes distributed execution communication-intensive. Graph mining tasks, however, are often computation-intensive and cannot be executed efficiently on a data-intensive system. The vertex-centric API is also unsuitable for writing graph mining algorithms, which often check subgraphs rather than individual vertices. This chapter introduces a couple of subgraph-centric systems for graph mining, among which only G-thinker is able to handle computation-intensive workloads. G-thinker targets problems that find all subgraphs of a big graph that satisfy certain requirements (e.g., graph matching and community detection). It provides an intuitive subgraph-centric API for graph exploration, which can be used to conveniently implement various graph mining algorithms.

Notes

1. http://www.cis.uab.edu/yanda/gthinker
2. http://www.cis.uab.edu/yanda/gthinker
3. You may simply use the hash partitioner, which distributes vertices to workers by hashing the vertex ID. Refer to the example application code for its usage.
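
The hash partitioning mentioned in note 3 can be illustrated with a minimal sketch. This is not G-thinker's actual partitioner or API; the function names `hash_partition` and `partition_vertices` are hypothetical, and the sketch assumes integer vertex IDs with a simple modulo as the hash function.

```python
from typing import Iterable, List


def hash_partition(vertex_id: int, num_workers: int) -> int:
    """Return the index of the worker that owns vertex_id.

    Assumption: vertex IDs are integers and the hash is a plain modulo.
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    return vertex_id % num_workers


def partition_vertices(vertex_ids: Iterable[int], num_workers: int) -> List[List[int]]:
    """Group vertex IDs into one list per worker, preserving input order."""
    workers: List[List[int]] = [[] for _ in range(num_workers)]
    for v in vertex_ids:
        workers[hash_partition(v, num_workers)].append(v)
    return workers
```

Because the owner of a vertex is computed from its ID alone, any worker can locate a vertex without consulting a lookup table, which is the usual reason for choosing this scheme as a default.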

Author information

Authors and Affiliations

Department of Computer and Information Science, University of Alabama at Birmingham, Birmingham, AL, USA
Da Yan
IBM Almaden Research Center, San Jose, CA, USA
Yuanyuan Tian
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
James Cheng

About this chapter

Cite this chapter

Yan, D., Tian, Y., Cheng, J. (2017). Subgraph-Centric Graph Mining. In: Systems for Big Graph Analytics. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-58217-7_6

DOI: https://doi.org/10.1007/978-3-319-58217-7_6
Published: 02 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58216-0
Online ISBN: 978-3-319-58217-7
eBook Packages: Computer Science, Computer Science (R0)
