Full Perfect Extension Pruning for Frequent Subgraph Mining

Borgelt, Christian; Meinl, Thorsten

doi:10.1007/978-3-540-88067-7_11

Part of the book series: Studies in Computational Intelligence (SCI, volume 165)

Summary

Mining graph databases for frequent subgraphs has recently developed into an area of intensive research. Its main goals are to reduce the execution time of the existing basic algorithms and to enhance their capability to find meaningful graph fragments. Here we present a method to achieve the former, namely an improvement of what we called “perfect extension pruning” in an earlier paper [4]. With this method the number of generated fragments and visited search tree nodes can be reduced, often considerably, thus accelerating the search. We describe the method in detail and present experimental results that demonstrate its usefulness.

References

[1] Borgelt, C.: On Canonical Forms for Frequent Graph Mining. In: Proc. 3rd Int. Workshop on Mining Graphs, Trees and Sequences, MGTS 2005, Porto, Portugal, pp. 1–12. ECML/PKDD 2005 Organization Committee, Porto, Portugal (2005)
[2] Borgelt, C.: Combining Ring Extensions and Canonical Form Pruning. In: Proc. 4th Int. Workshop on Mining and Learning in Graphs, MLG 2006, Berlin, pp. 109–116. ECML/PKDD 2006 Organization Committee, Berlin (2006)
[3] Borgelt, C., Berthold, M.R.: Mining Molecular Fragments: Finding Relevant Substructures of Molecules. In: Proc. IEEE Int. Conf. on Data Mining, ICDM 2002, Maebashi, Japan, pp. 51–58. IEEE Press, Piscataway (2002)
[4] Borgelt, C., Meinl, T., Berthold, M.R.: Advanced Pruning Strategies to Speed Up Mining Closed Molecular Fragments. In: Proc. IEEE Conf. on Systems, Man and Cybernetics, SMC 2004, The Hague, Netherlands. IEEE Press, Piscataway (2004)
[5] Cook, D.J., Holder, L.B.: Graph-Based Data Mining. IEEE Trans. on Intelligent Systems 15(2), 32–41 (2000)
[6] Di Fatta, G., Berthold, M.R.: Distributed Mining of Molecular Fragments. In: Workshop on Data Mining and the Grid, IEEE Int. Conf. on Data Mining, pp. 1–9. IEEE Press, Piscataway (2004)
[7] Finn, P.W., Muggleton, S., Page, D., Srinivasan, A.: Pharmacophore Discovery Using the Inductive Logic Programming System PROGOL. Machine Learning 30(2-3), 241–270 (1998)
[8] Hofer, H., Borgelt, C., Berthold, M.R.: Large Scale Mining of Molecular Fragments with Wildcards. Intelligent Data Analysis 8, 495–504 (2004)
[9] Huan, J., Wang, W., Prins, J.: Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism. In: Proc. 3rd IEEE Int. Conf. on Data Mining, ICDM 2003, Melbourne, FL, pp. 549–552. IEEE Press, Piscataway (2003)
[10] Index Chemicus, Subset from 1993. Institute of Scientific Information, Inc. (ISI). Thomson Scientific, Philadelphia, PA, USA (1993), http://www.thomsonscientific.com/products/indexchemicus/
[11] Kramer, S., de Raedt, L., Helma, C.: Molecular Feature Mining in HIV Data. In: Proc. 7th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, KDD 2001, San Francisco, CA, pp. 136–143. ACM Press, New York (2001)
[12] Kuramochi, M., Karypis, G.: Frequent Subgraph Discovery. In: Proc. 1st IEEE Int. Conf. on Data Mining, ICDM 2001, San Jose, CA, pp. 313–320. IEEE Press, Piscataway (2001)
[13] DTP AIDS Antiviral Screen (HIV Data Set), Subset from 2001. Developmental Therapeutics Program (DTP), National Cancer Institute, USA (2001), http://dtp.nci.nih.gov/docs/aids/aids_data.html
[14] Nijssen, S., Kok, J.N.: A Quickstart in Frequent Structure Mining Can Make a Difference. In: Proc. 10th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, KDD 2004, Seattle, WA, pp. 647–652. ACM Press, New York (2004)
[15] Washio, T., Motoda, H.: State of the Art of Graph-Based Data Mining. SIGKDD Explorations Newsletter 5(1), 59–68 (2003)
[16] Yan, X., Han, J.: gSpan: Graph-Based Substructure Pattern Mining. In: Proc. 2nd IEEE Int. Conf. on Data Mining, ICDM 2002, Maebashi, Japan, pp. 721–724. IEEE Press, Piscataway (2002)
[17] Yan, X., Han, J.: CloseGraph: Mining Closed Frequent Graph Patterns. In: Proc. 9th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, KDD 2003, Washington, DC, pp. 286–295. ACM Press, New York (2003)

Author information

Authors and Affiliations

European Center for Soft Computing, c/ Gonzalo Gutiérrez Quirós s/n, 33600, Mieres, Spain
Christian Borgelt
Nycomed Chair for Bioinformatics and Information Mining Dept. of Computer and Information Science, University of Konstanz, Box M712, 78457, Konstanz, Germany
Thorsten Meinl

Editor information

Editors and Affiliations

University of Lyon, Lyon, France
Djamel A. Zighed & Hakim Hacid
Shimane University, Shimane, Japan
Shusaku Tsumoto
University of North Carolina, Charlotte, NC, USA
Zbigniew W. Ras

About this chapter

Cite this chapter

Borgelt, C., Meinl, T. (2009). Full Perfect Extension Pruning for Frequent Subgraph Mining. In: Zighed, D.A., Tsumoto, S., Ras, Z.W., Hacid, H. (eds) Mining Complex Data. Studies in Computational Intelligence, vol 165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88067-7_11

DOI: https://doi.org/10.1007/978-3-540-88067-7_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88066-0
Online ISBN: 978-3-540-88067-7
eBook Packages: Engineering, Engineering (R0)
