Abstract
Given a set of graphs with class labels, the discriminative subgraphs appearing in them are useful for constructing a classification model. A graph mining technique called Chunkingless Graph-Based Induction (Cl-GBI) can find such discriminative subgraphs in graph-structured data. However, because of its time and space complexity, Cl-GBI sometimes fails to extract subgraphs that are good enough to characterize the given data. To improve its efficiency, we propose pruning methods based on the upper bound of information gain, which Cl-GBI uses as its criterion for the discriminability of subgraphs. The upper bound of the information gain of a subgraph is the maximum information gain that any of its supergraphs can achieve. By comparing the upper bound of each subgraph with the best information gain found so far, Cl-GBI can exclude unfruitful subgraphs from its search space. Furthermore, we experimentally evaluate the effectiveness of the pruning methods on a real-world dataset and artificial datasets.
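The pruning idea can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: there are exactly two classes, a subgraph is summarized by the number of positive graphs (x) and negative graphs (y) that contain it, and any supergraph is contained in no more graphs of either class than the subgraph itself. Under those assumptions, the convexity of information gain yields the bound max(IG(x, 0), IG(0, y)), in the spirit of the statistical metric pruning cited in the references. All function names below are hypothetical.

```python
from math import log2


def entropy(pos, neg):
    """Binary class entropy (in bits) of a set with pos/neg members."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * log2(p)
    return h


def information_gain(x, y, P, N):
    """Gain from splitting P positive and N negative graphs by whether
    they contain a subgraph found in x positive and y negative graphs."""
    total = P + N
    inside = x + y
    outside = total - inside
    return (entropy(P, N)
            - (inside / total) * entropy(x, y)
            - (outside / total) * entropy(P - x, N - y))


def gain_upper_bound(x, y, P, N):
    """Assumed bound on the gain of any supergraph: its counts (x', y')
    satisfy x' <= x and y' <= y, and a convex function over that
    rectangle peaks at a corner."""
    return max(information_gain(x, 0, P, N),
               information_gain(0, y, P, N))


def can_prune(x, y, P, N, best_gain):
    """True when no supergraph can reach the best gain found so far."""
    return gain_upper_bound(x, y, P, N) < best_gain
```

For example, with P = N = 10, a subgraph found in 2 positive and 2 negative graphs has a bound of about 0.108 bits, so `can_prune(2, 2, 10, 10, best_gain=0.5)` returns `True` and its supergraphs need not be expanded. The strict comparison is a conservative choice that keeps candidates whose bound ties the current best.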
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ohara, K., Hara, M., Takabayashi, K., Motoda, H., Washio, T. (2009). Pruning Strategies Based on the Upper Bound of Information Gain for Discriminative Subgraph Mining. In: Richards, D., Kang, BH. (eds) Knowledge Acquisition: Approaches, Algorithms and Applications. PKAW 2008. Lecture Notes in Computer Science, vol 5465. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01715-5_5
DOI: https://doi.org/10.1007/978-3-642-01715-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01714-8
Online ISBN: 978-3-642-01715-5
eBook Packages: Computer Science (R0)