Predictive Graph Mining

Karwath, Andreas; De Raedt, Luc

doi:10.1007/978-3-540-30214-8_1

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3245)

Abstract

Graph mining approaches are extremely popular and effective on molecular databases. The vast majority of these approaches first derive interesting (i.e., frequent) patterns and then use them as features to build predictive models. Rather than building these models in an indirect, two-step way, the SMIREP system introduced in this paper derives predictive rule models directly from molecular data. SMIREP combines the SMILES and SMARTS representation languages, which are popular in computational chemistry, with the IREP rule-learning algorithm by Fürnkranz. Even though SMIREP focuses on SMILES, its principles also apply to graph mining problems in other domains. SMIREP is experimentally evaluated on two benchmark databases.
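
The abstract does not spell out how SMIREP works, so the following is only a rough illustration of the general idea of learning rules directly from SMILES strings with an IREP-style grow-and-prune loop. It is a minimal sketch under strong assumptions, not the authors' implementation: plain substring tests stand in for SMARTS matching, the grow/prune split is a fixed every-third-example split rather than a random one, and all function names, the toy molecules, and their labels (here simply "contains a nitrogen atom") are hypothetical.

```python
from math import log2

def covers(rule, smiles):
    """A rule is a list of fragments; it covers a string containing all of them."""
    return all(frag in smiles for frag in rule)

def counts(rule, examples):
    """Return (covered positives, covered negatives)."""
    p = sum(1 for s, y in examples if y and covers(rule, s))
    n = sum(1 for s, y in examples if not y and covers(rule, s))
    return p, n

def candidate_fragments(examples, max_len=3):
    """Substrings (length 1..max_len) of positive examples; an assumed
    stand-in for SMARTS patterns, not real substructure matching."""
    frags = set()
    for smiles, label in examples:
        if label:
            for i in range(len(smiles)):
                for j in range(i + 1, min(i + max_len, len(smiles)) + 1):
                    frags.add(smiles[i:j])
    return sorted(frags)

def foil_gain(rule, frag, examples):
    """FOIL-style information gain of adding frag to rule."""
    p0, n0 = counts(rule, examples)
    p1, n1 = counts(rule + [frag], examples)
    if p0 == 0 or p1 == 0:
        return 0.0
    return p1 * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))

def grow_rule(grow):
    """Greedily add fragments until no negative growing example is covered."""
    rule, frags = [], candidate_fragments(grow)
    while counts(rule, grow)[1] > 0:
        best = max(frags, key=lambda f: foil_gain(rule, f, grow), default=None)
        if best is None or foil_gain(rule, best, grow) <= 0:
            break
        rule.append(best)
    return rule

def prune_value(rule, prune):
    """Accuracy of the rule on the pruning set: (p + (N - n)) / (P + N)."""
    p, n = counts(rule, prune)
    neg = sum(1 for _, y in prune if not y)
    return (p + (neg - n)) / len(prune)

def prune_rule(rule, prune):
    """Drop trailing conditions while pruning-set accuracy does not get worse."""
    best = rule
    for k in range(len(rule) - 1, 0, -1):
        if prune_value(rule[:k], prune) >= prune_value(best, prune):
            best = rule[:k]
    return best

def irep(examples):
    """Learn rules one at a time, removing covered examples after each rule."""
    rules, remaining = [], list(examples)
    while any(y for _, y in remaining):
        # Deterministic 2/3 grow, 1/3 prune split (a simplification).
        grow = [e for i, e in enumerate(remaining) if i % 3 != 2]
        prune = [e for i, e in enumerate(remaining) if i % 3 == 2]
        if not prune or not any(y for _, y in grow):
            break
        rule = prune_rule(grow_rule(grow), prune)
        # Baseline: the empty theory, which predicts every example negative.
        baseline = sum(1 for _, y in prune if not y) / len(prune)
        if not rule or prune_value(rule, prune) < baseline:
            break
        rules.append(rule)
        remaining = [e for e in remaining if not covers(rule, e[0])]
    return rules

def predict(rules, smiles):
    """Positive if any learned rule covers the molecule."""
    return any(covers(rule, smiles) for rule in rules)

# Toy data with arbitrary labels (1 = contains nitrogen); not real activity data.
TOY_DATA = [
    ("CCN", 1), ("CCO", 0), ("c1ccccc1N", 1), ("NCC(=O)O", 1), ("CC(=O)O", 0),
    ("c1ccccc1", 0), ("CN(C)C", 1), ("CCCC", 0), ("NCCO", 1), ("c1ccccc1O", 0),
]

rules = irep(TOY_DATA)
print(rules)
```

The point of the sketch is the contrast drawn in the abstract: no frequent-pattern mining pass is run first. Fragments are scored against the class labels while each rule is being grown, so pattern discovery and model building happen in one step.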

References

Dehaspe, L.: Frequent Pattern Discovery in First-Order Logic. K. U. Leuven (1998)
Deshpande, M., Kuramochi, M., Karypis, G.: Frequent sub-structure-based approaches for classifying chemical compounds. In: Proc. ICDM 2003, pp. 35–42 (2003)
Kramer, S., De Raedt, L., Helma, C.: Molecular feature mining in HIV data. In: Provost, F., Srikant, R. (eds.) Proc. KDD 2001, pp. 136–143. ACM Press, New York (2001)
Zaki, M.: Efficiently mining frequent trees in a forest. In: Hand, D., Keim, D., Ng, R. (eds.) Proc. KDD 2002, pp. 71–80. ACM Press, New York (2002)
Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: Proc. ICDM 2002 (2002)
Inokuchi, A., Kashima, H.: Mining significant pairs of patterns from graph structures with class labels. In: Proc. ICDM 2003, pp. 83–90 (2003)
Inokuchi, A., Washio, T., Motoda, H.: Complete mining of frequent patterns from graphs: Mining graph data. Machine Learning 50, 321–354 (2003)
Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: Proc. ICDM 2001, pp. 179–186 (2001)
Yan, X., Han, J.: CloseGraph: Mining closed frequent graph patterns. In: Proc. KDD 2003 (2003)
Fürnkranz, J., Widmer, G.: Incremental reduced error pruning. In: Cohen, W.W., Hirsh, H. (eds.) Proc. ICML 1994, pp. 70–77. Morgan Kaufmann, San Francisco (1994)
Cohen, W.W.: Fast effective rule induction. In: Proc. ICML 1995, pp. 115–123. Morgan Kaufmann, San Francisco (1995)
King, R.D., Muggleton, S., Srinivasan, A., Sternberg, M.J.E.: Structure-activity relationships derived by machine learning: The use of atoms and their bond connectivities to predict mutagenicity by inductive logic programming. Proc. of the National Academy of Sciences 93, 438–442 (1996)
Weininger, D.: SMILES, a chemical language and information system 1. Introduction and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988)
Quinlan, J.R.: Learning logical definitions from relations. Machine Learning 5, 239–266 (1990)
Srinivasan, A., Muggleton, S., Sternberg, M.E., King, R.D.: Theories for mutagenicity: a study of first-order and feature based induction. A.I. Journal 85, 277–299 (1996)
Cook, Holder: Graph-based data mining. ISTA: Intelligent Systems & their applications 15 (2000)
Gonzalez, J.A., Holder, L.B., Cook, D.J.: Experimental comparison of graph-based relational concept learning with inductive logic programming systems. In: Matwin, S., Sammut, C. (eds.) ILP 2002. LNCS (LNAI), vol. 2583, pp. 84–100. Springer, Heidelberg (2003)
Warodom, G., Matsuda, T., Yoshida, T., Motoda, H., Washio, T.: Classifier construction by graph-based induction for graph-structured data. In: Whang, K.-Y., Jeon, J., Shim, K., Srivastava, J. (eds.) PAKDD 2003. LNCS (LNAI), vol. 2637, pp. 52–62. Springer, Heidelberg (2003)
Geamsakul, W., Matsuda, T., Yoshida, T., Motoda, H., Washio, T.: Constructing a decision tree for graph structured data. In: Proc. MGTS 2003, pp. 1–10 (2003), http://www.ar.sanken.osaka-u.ac.jp/MGTS-2003CFP.html
Horvath, T., Wrobel, S., Bohnebeck, U.: Relational instance-based learning with lists and terms. Machine Learning 43, 53–80 (2001)
Muggleton, S.: Inverting entailment and Progol. Machine Intelligence 14, 133–188 (1995)
Srinivasan, A., King, R.D., Bristol, D.W.: An assessment of ILP-assisted models for toxicology and the PTE-3 experiment. In: Džeroski, S., Flach, P.A. (eds.) ILP 1999. LNCS (LNAI), vol. 1634, pp. 291–302. Springer, Heidelberg (1999)

Author information

Authors and Affiliations

Institut für Informatik, Albert-Ludwigs-Universität Freiburg, Georges-Köhler-Allee 079, D-79110, Freiburg, Germany
Andreas Karwath & Luc De Raedt

Editor information

Editors and Affiliations

Department of Informatics, Graduate School of Information Science and Electrical Engineering, Kyushu University, 744 Motooka, Nishi, 819-0395, Fukuoka, Japan
Einoshin Suzuki
Kyushu University, 6–10–1 Hakozaki Higashi-ku, 812–8581, Fukuoka, Japan
Setsuo Arikawa

About this paper

Cite this paper

Karwath, A., De Raedt, L. (2004). Predictive Graph Mining. In: Suzuki, E., Arikawa, S. (eds) Discovery Science. DS 2004. Lecture Notes in Computer Science, vol 3245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30214-8_1

DOI: https://doi.org/10.1007/978-3-540-30214-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23357-2
Online ISBN: 978-3-540-30214-8
eBook Packages: Springer Book Archive
