gPrune: A Constraint Pushing Framework for Graph Pattern Mining

  • Feida Zhu
  • Xifeng Yan
  • Jiawei Han
  • Philip S. Yu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4426)

Abstract

In graph mining applications, there is a growing demand for imposing user-specified constraints on the mining results. However, unlike most traditional itemset constraints, structural constraints such as the density and diameter of a graph are very hard to push deep into the mining process.

In this paper, we give the first comprehensive study of the pruning properties of both traditional and structural constraints, aiming to reduce not only the pattern search space but also the data search space. A new general framework, called gPrune, is proposed to incorporate all the constraints in such a way that they recursively reinforce each other throughout the mining process. A new concept, Pattern-inseparable Data-antimonotonicity, is proposed to handle the structural constraints unique to graphs; combined with known pruning properties, it provides a comprehensive and unified classification framework for structural constraints. The exploration of these antimonotonicities in the context of graph pattern mining significantly extends the known classification of constraints and deepens our understanding of the pruning properties of structural graph constraints.
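
As a rough illustration of why a structural constraint such as density resists classical pruning, the sketch below checks a minimum-density constraint on three nested patterns. It is not the gPrune algorithm. The density definition (edges divided by the number of vertex pairs), the threshold `MIN_DENSITY`, and the helper names are assumptions made for this example only; the paper may define its density ratio differently.

```python
def density(num_vertices, edges):
    """Edge density of a simple undirected graph: |E| / C(|V|, 2).

    Assumed definition, used here only for illustration.
    """
    if num_vertices < 2:
        return 0.0
    return len(edges) / (num_vertices * (num_vertices - 1) / 2)


# Three nested patterns, each a subgraph of the next:
# a 3-vertex path, a triangle, and a triangle with one pendant vertex.
path = (3, {(0, 1), (1, 2)})
triangle = (3, {(0, 1), (1, 2), (0, 2)})
triangle_plus_pendant = (4, {(0, 1), (1, 2), (0, 2), (2, 3)})

MIN_DENSITY = 0.9  # hypothetical user-specified threshold


def satisfies(pattern):
    """Return True if the pattern meets the minimum-density constraint."""
    num_vertices, edges = pattern
    return density(num_vertices, edges) >= MIN_DENSITY


results = [satisfies(p) for p in (path, triangle, triangle_plus_pendant)]
print(results)
```

Under these assumptions the path fails the constraint, the triangle satisfies it, and the larger pattern fails again. Satisfaction is therefore neither inherited by subpatterns nor by superpatterns, so a miner cannot discard a pattern's extensions just because the pattern itself violates the constraint.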

Keywords

Density Ratio, Frequent Pattern, Pattern Mining, Graph Mining, Frequent Subgraph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Feida Zhu (1)
  • Xifeng Yan (1)
  • Jiawei Han (1)
  • Philip S. Yu (2)
  1. Computer Science, UIUC
  2. IBM T. J. Watson Research Center
