Data Mining and Knowledge Discovery, Volume 21, Issue 3, pp 472–508

Frequent subgraph mining in outerplanar graphs

  • Tamás Horváth
  • Jan Ramon
  • Stefan Wrobel


In recent years there has been increased interest in frequent pattern discovery in large databases of graph-structured objects. While the frequent connected subgraph mining problem for tree datasets can be solved in incremental polynomial time, it becomes intractable for arbitrary graph databases. Existing approaches have therefore resorted to various heuristic strategies and restrictions of the search space, but have not identified a practically relevant tractable graph class beyond trees. In this paper, we consider the class of outerplanar graphs, a strict generalization of trees, develop a frequent subgraph mining algorithm for outerplanar graphs, and show that it works in incremental polynomial time for the practically relevant subclass of well-behaved outerplanar graphs, i.e., those that have only polynomially many simple cycles. We evaluate the algorithm empirically on chemo- and bioinformatics applications.
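
The phrase "only polynomially many simple cycles" can be made concrete with a small brute-force sketch. This is not the mining algorithm of the paper; it only counts simple cycles in tiny undirected graphs. The names `count_simple_cycles` and `fan_edges` are hypothetical helpers introduced for illustration. A fan (one hub vertex joined to every vertex of a path) is outerplanar, and with k triangles it has k(k+1)/2 simple cycles, i.e., quadratically many.

```python
def count_simple_cycles(n, edges):
    """Count simple cycles of an undirected simple graph on vertices 0..n-1.

    Brute-force DFS, exponential in the worst case; meant for tiny examples only.
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    count = 0

    def extend(start, current, visited):
        nonlocal count
        for nxt in adj[current]:
            if nxt == start and len(visited) >= 3:
                # Closed a cycle whose smallest vertex is `start`.
                count += 1
            elif nxt > start and nxt not in visited:
                # Only visit vertices larger than `start`, so every cycle
                # is found from its smallest vertex alone.
                visited.add(nxt)
                extend(start, nxt, visited)
                visited.remove(nxt)

    for s in range(n):
        extend(s, s, {s})

    # Each cycle is traversed once in each of its two directions.
    return count // 2


def fan_edges(k):
    """Edges of a fan with k triangles: hub 0 joined to the path 1, 2, ..., k+1."""
    spokes = [(0, i) for i in range(1, k + 2)]
    path = [(i, i + 1) for i in range(1, k + 1)]
    return spokes + path


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, count_simple_cycles(k + 2, fan_edges(k)))
```

A tree has no cycles at all, so the count is zero there; the fan shows an outerplanar graph whose cycle count grows only polynomially in its size.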


Keywords: Graph mining, Frequent pattern mining, Algorithms, Complexity, Applications





Copyright information

© The Author(s) 2010

Authors and Affiliations

  1. Department of Computer Science III, University of Bonn, Bonn, Germany
  2. Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium
  3. Fraunhofer Institute IAIS, Schloss Birlinghoven, Sankt Augustin, Germany