Abstract
Most relations can be represented by a graph structure, e.g., chemical bonding, Web browsing records, DNA sequences, and inference patterns (program traces). Thus, efficiently finding characteristic substructures in a graph is a useful technique in many important KDD/ML applications. However, graph pattern matching is a hard problem. We propose a machine learning technique called Graph-Based Induction (GBI) that efficiently extracts typical patterns from graph data in an approximate manner by stepwise pair expansion (pairwise chunking). It can handle general graph-structured data, i.e., directed/undirected, colored/uncolored graphs with/without (self) loops and with colored/uncolored links. We show that its time complexity is almost linear in the size of the graph. We further show that GBI can be applied effectively to the extraction of typical patterns from chemical compound data, from which classification rules are then generated, and that GBI also works as a feature construction component for other machine learning tools.
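To make the idea of stepwise pair expansion concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a directed graph with integer node ids, node labels ("colors") and uncolored links, and it assumes that the pair to chunk at each step is simply the most frequent pair of linked labels; the abstract does not state the actual selection criterion. The function names and the `min_count` parameter are hypothetical.

```python
from collections import Counter


def most_frequent_pair(labels, edges):
    """Return the most common (source label, target label) pair and its count."""
    counts = Counter((labels[u], labels[v]) for u, v in edges if u != v)
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


def chunk_pair(labels, edges, pair):
    """Merge non-overlapping occurrences of `pair` into single chunk nodes."""
    labels = dict(labels)
    merged = {}  # old node id -> id of the chunk node that absorbed it
    next_id = max(labels) + 1  # assumes integer node ids
    for u, v in sorted(edges):
        if u == v or u in merged or v in merged:
            continue  # each node may join at most one chunk per step
        if (labels[u], labels[v]) == pair:
            merged[u] = merged[v] = next_id
            labels[next_id] = pair  # the chunk's label is the pair itself
            next_id += 1
    for old in merged:
        del labels[old]
    new_edges = set()
    for u, v in edges:
        a, b = merged.get(u, u), merged.get(v, v)
        if a != b:  # simplification: internal links and self-loops are dropped
            new_edges.add((a, b))
    return labels, new_edges


def pairwise_chunking(labels, edges, min_count=2):
    """Repeatedly chunk the most frequent pair; return the patterns found."""
    patterns = []
    while True:
        pair, count = most_frequent_pair(labels, edges)
        if pair is None or count < min_count:
            return patterns
        patterns.append((pair, count))
        labels, edges = chunk_pair(labels, edges, pair)


# Toy input: two C->O links and one O->N link.
toy_labels = {0: "C", 1: "O", 2: "C", 3: "O", 4: "N"}
toy_edges = {(0, 1), (2, 3), (1, 4)}
print(pairwise_chunking(toy_labels, toy_edges))  # [(('C', 'O'), 2)]
```

Because each step rewrites the graph greedily and never backtracks, the search is approximate: overlapping occurrences of a pair are not all chunked, and a pattern is only found if it can be built up from frequent pairs. Unlike the method described above, this sketch ignores link colors and discards self-loops.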
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Matsuda, T., Horiuchi, T., Motoda, H., Washio, T. (2000). Graph-Based Induction for General Graph Structured Data and Its Application to Chemical Compound Data. In: Arikawa, S., Morishita, S. (eds) Discovery Science. DS 2000. Lecture Notes in Computer Science, vol 1967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44418-1_9
DOI: https://doi.org/10.1007/3-540-44418-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41352-3
Online ISBN: 978-3-540-44418-3
eBook Packages: Springer Book Archive