gPrune: A Constraint Pushing Framework for Graph Pattern Mining

Zhu, Feida; Yan, Xifeng; Han, Jiawei; Yu, Philip S.

doi:10.1007/978-3-540-71701-0_38

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

In graph mining applications, there is increasingly strong demand for imposing user-specified constraints on the mining results. However, unlike most traditional itemset constraints, structural constraints, such as the density and diameter of a graph, are very hard to push deep into the mining process.

In this paper, we give the first comprehensive study of the pruning properties of both traditional and structural constraints, aiming to reduce not only the pattern search space but also the data search space. A new general framework, called gPrune, is proposed to incorporate all the constraints in such a way that they recursively reinforce each other throughout the mining process. A new concept, Pattern-inseparable Data-antimonotonicity, is proposed to handle the structural constraints unique to graphs; combined with known pruning properties, it provides a comprehensive and unified classification framework for structural constraints. Exploring these antimonotonicities in the context of graph pattern mining significantly extends the known classification of constraints and deepens our understanding of the pruning properties of structural graph constraints.
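
As a rough illustration of why a structural constraint such as density resists simple pruning, the sketch below checks a minimum-density constraint on a complete graph and on one of its subgraphs. It assumes the common definition of density for a simple undirected graph, 2|E| / (|V|(|V| - 1)), which may differ from the definition used in the paper; the threshold `MIN_DENSITY` and the helper names are hypothetical and are not taken from gPrune.

```python
from itertools import combinations


def density(num_vertices, edges):
    """Edge density of a simple undirected graph: 2|E| / (|V| (|V| - 1)).

    Assumed definition for illustration only.
    """
    if num_vertices < 2:
        return 0.0
    return 2 * len(edges) / (num_vertices * (num_vertices - 1))


# Hypothetical user-specified constraint: density must be at least 0.8.
MIN_DENSITY = 0.8


def satisfies(num_vertices, edges):
    """Return True if the graph meets the minimum-density constraint."""
    return density(num_vertices, edges) >= MIN_DENSITY


# K4: the complete graph on vertices 0..3 (6 edges).
k4_edges = set(combinations(range(4), 2))

# A path 0-1-2, which is a subgraph of K4.
path_edges = {(0, 1), (1, 2)}

k4_ok = satisfies(4, k4_edges)      # the larger pattern meets the constraint
path_ok = satisfies(3, path_edges)  # its sub-pattern does not

print("K4 density:", density(4, k4_edges), "satisfies:", k4_ok)
print("path density:", density(3, path_edges), "satisfies:", path_ok)
```

Here the larger pattern satisfies the constraint while one of its sub-patterns violates it, so a miner that discarded the violating path and all of its extensions would miss K4. Under these assumptions, a minimum-density constraint is therefore not antimonotone in the pattern, which is the kind of difficulty the abstract attributes to structural constraints.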

Author information

Authors and Affiliations

Computer Science, UIUC: Feida Zhu, Xifeng Yan & Jiawei Han
IBM T. J. Watson Research Center: Philip S. Yu

Editor information

Zhi-Hua Zhou Hang Li Qiang Yang

Cite this paper

Zhu, F., Yan, X., Han, J., Yu, P.S. (2007). gPrune: A Constraint Pushing Framework for Graph Pattern Mining. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_38

DOI: https://doi.org/10.1007/978-3-540-71701-0_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science (R0)