Abstract
The computation models we have seen so far are all data-intensive: the cost of message transmission is often much higher than that of message processing, which renders the distributed execution communication-intensive. Graph mining tasks, however, are often computation-intensive and cannot be executed efficiently on a data-intensive system. The vertex-centric API is also ill-suited to writing graph mining algorithms, which typically examine subgraphs rather than individual vertices. This chapter introduces a couple of subgraph-centric systems for graph mining, among which only G-thinker is able to handle computation-intensive workloads. G-thinker targets problems that find all subgraphs of a big graph that satisfy certain requirements (e.g., graph matching and community detection). It provides an intuitive subgraph-centric API for graph exploration, which can be used to conveniently implement various graph mining algorithms.
Notes
1. http://www.cis.uab.edu/yanda/gthinker.
2. http://www.cis.uab.edu/yanda/gthinker.
3. You may simply use the hash-partitioner, which distributes vertices to workers by hashing the vertex ID. Please refer to the example application code for its usage.
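
The hash-partitioning idea in note 3 can be sketched as follows. This is a minimal illustration in Python, not G-thinker's actual code or API: the names `worker_of` and `partition` are hypothetical, and vertex IDs are assumed to be non-negative integers, so a plain modulo serves as the hash.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def worker_of(vertex_id: int, num_workers: int) -> int:
    """Hypothetical helper: pick the worker that owns a vertex.

    Assumes non-negative integer IDs, for which modulo is a simple,
    deterministic hash that every worker can compute independently.
    """
    return vertex_id % num_workers


def partition(vertex_ids: Iterable[int], num_workers: int) -> Dict[int, List[int]]:
    """Group vertex IDs by the worker they hash to."""
    parts: Dict[int, List[int]] = defaultdict(list)
    for v in vertex_ids:
        parts[worker_of(v, num_workers)].append(v)
    return dict(parts)


if __name__ == "__main__":
    # Distribute ten vertices over three workers and show the assignment.
    for worker, vertices in sorted(partition(range(10), 3).items()):
        print(worker, vertices)
```

Because the assignment depends only on the vertex ID and the number of workers, any worker can locate the owner of a vertex without a lookup table. Hashing of this kind ignores graph structure, so neighbouring vertices may well land on different workers.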
© 2017 The Author(s)
About this chapter
Cite this chapter
Yan, D., Tian, Y., Cheng, J. (2017). Subgraph-Centric Graph Mining. In: Systems for Big Graph Analytics. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-58217-7_6
DOI: https://doi.org/10.1007/978-3-319-58217-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58216-0
Online ISBN: 978-3-319-58217-7
eBook Packages: Computer Science, Computer Science (R0)