Abstract
Graph mining approaches are extremely popular and effective in molecular databases. The vast majority of these approaches first derive interesting, i.e. frequent, patterns and then use these as features to build predictive models. Rather than building these models in a two-step, indirect way, the SMIREP system introduced in this paper derives predictive rule models directly from molecular data. SMIREP combines the SMILES and SMARTS representation languages, which are popular in computational chemistry, with the IREP rule-learning algorithm by Fürnkranz. Even though SMIREP is focused on SMILES, its principles are also applicable to graph mining problems in other domains. SMIREP is experimentally evaluated on two benchmark databases.
References
Dehaspe, L.: Frequent Pattern Discovery in First-Order Logic. K. U. Leuven (1998)
Deshpande, M., Kuramochi, M., Karypis, G.: Frequent sub-structure-based approaches for classifying chemical compounds. In: Proc. ICDM 2003, pp. 35–42 (2003)
Kramer, S., De Raedt, L., Helma, C.: Molecular feature mining in HIV data. In: Provost, F., Srikant, R. (eds.) Proc. KDD 2001, pp. 136–143. ACM Press, New York (2001)
Zaki, M.: Efficiently mining frequent trees in a forest. In: Hand, D., Keim, D., Ng, R. (eds.) Proc. KDD 2002, pp. 71–80. ACM Press, New York (2002)
Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: Proc. ICDM 2002 (2002)
Inokuchi, A., Kashima, H.: Mining significant pairs of patterns from graph structures with class labels. In: Proc. ICDM 2003, pp. 83–90 (2003)
Inokuchi, A., Washio, T., Motoda, H.: Complete mining of frequent patterns from graphs: Mining graph data. Machine Learning 50, 321–354 (2003)
Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: Proc. ICDM 2001, pp. 179–186 (2001)
Yan, X., Han, J.: CloseGraph: Mining closed frequent graph patterns. In: Proc. KDD 2003 (2003)
Fürnkranz, J., Widmer, G.: Incremental reduced error pruning. In: Cohen, W.W., Hirsh, H. (eds.) Proc. ICML 1994, pp. 70–77. Morgan Kaufmann, San Francisco (1994)
Cohen, W.W.: Fast effective rule induction. In: Proc. ICML 1995, pp. 115–123. Morgan Kaufmann, San Francisco (1995)
King, R.D., Muggleton, S., Srinivasan, A., Sternberg, M.J.E.: Structure-activity relationships derived by machine learning: The use of atoms and their bond connectivities to predict mutagenicity by inductive logic programming. Proc. of the National Academy of Sciences 93, 438–442 (1996)
Weininger, D.: SMILES, a chemical language and information system 1. Introduction and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988)
Quinlan, J.R.: Learning logical definitions from relations. Machine Learning 5, 239–266 (1990)
Srinivasan, A., Muggleton, S., Sternberg, M.E., King, R.D.: Theories for mutagenicity: a study of first-order and feature based induction. A.I. Journal 85, 277–299 (1996)
Cook, Holder: Graph-based data mining. ISTA: Intelligent Systems & their applications 15 (2000)
Gonzalez, J.A., Holder, L.B., Cook, D.J.: Experimental comparison of graph-based relational concept learning with inductive logic programming systems. In: Matwin, S., Sammut, C. (eds.) ILP 2002. LNCS (LNAI), vol. 2583, pp. 84–100. Springer, Heidelberg (2003)
Warodom, G., Matsuda, T., Yoshida, T., Motoda, H., Washio, T.: Classifier construction by graph-based induction for graph-structured data. In: Whang, K.-Y., Jeon, J., Shim, K., Srivastava, J. (eds.) PAKDD 2003. LNCS (LNAI), vol. 2637, pp. 52–62. Springer, Heidelberg (2003)
Geamsakul, W., Matsuda, T., Yoshida, T., Motoda, H., Washio, T.: Constructing a decision tree for graph structured data. In: Proc. MGTS 2003, pp. 1–10 (2003), http://www.ar.sanken.osaka-u.ac.jp/MGTS-2003CFP.html
Horvath, T., Wrobel, S., Bohnebeck, U.: Relational instance-based learning with lists and terms. Machine Learning 43, 53–80 (2001)
Muggleton, S.: Inverting entailment and Progol. Machine Intelligence 14, 133–188 (1995)
Srinivasan, A., King, R.D., Bristol, D.W.: An assessment of ILP-assisted models for toxicology and the PTE-3 experiment. In: Džeroski, S., Flach, P.A. (eds.) ILP 1999. LNCS (LNAI), vol. 1634, pp. 291–302. Springer, Heidelberg (1999)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Karwath, A., De Raedt, L. (2004). Predictive Graph Mining. In: Suzuki, E., Arikawa, S. (eds) Discovery Science. DS 2004. Lecture Notes in Computer Science, vol 3245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30214-8_1
DOI: https://doi.org/10.1007/978-3-540-30214-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23357-2
Online ISBN: 978-3-540-30214-8
eBook Packages: Springer Book Archive