Pruning Strategies Based on the Upper Bound of Information Gain for Discriminative Subgraph Mining

Ohara, Kouzou; Hara, Masahiro; Takabayashi, Kiyoto; Motoda, Hiroshi; Washio, Takashi

doi:10.1007/978-3-642-01715-5_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5465)

Included in the following conference series:

Pacific Rim Knowledge Acquisition Workshop

Abstract

Given a set of graphs with class labels, the discriminative subgraphs that appear in them are useful for constructing a classification model. A graph mining technique called Chunkingless Graph-Based Induction (Cl-GBI) can find such discriminative subgraphs in graph-structured data. However, because of its time and space complexity, Cl-GBI sometimes fails to extract subgraphs that characterize the given data well enough. To improve its efficiency, we propose pruning methods based on the upper bound of information gain, which Cl-GBI uses as its criterion for the discriminability of subgraphs. The upper bound of the information gain of a subgraph is the maximum information gain that any of its supergraphs can achieve. By comparing the upper bound of each subgraph with the best information gain found so far, Cl-GBI can exclude unfruitful subgraphs from its search space. Furthermore, we experimentally evaluate the effectiveness of the pruning methods on real-world and artificial datasets.
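
The abstract does not state the bound itself, so the following is only a minimal sketch of the pruning idea, under assumptions that are not taken from the paper: two classes, a subgraph summarized by the number of positive and negative graphs that contain it, and the well-known convexity argument for information gain (a supergraph can occur only in graphs that already contain the subgraph, so its gain is largest when it keeps all graphs of one class and none of the other). All names (`info_gain`, `gain_upper_bound`, `prune_candidates`) are hypothetical and not part of Cl-GBI.

```python
import math


def entropy(pos: int, neg: int) -> float:
    """Entropy in bits of a set holding pos positive and neg negative graphs."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count > 0:
            p = count / total
            result -= p * math.log2(p)
    return result


def info_gain(x: int, y: int, P: int, N: int) -> float:
    """Information gain of a pattern contained in x of P positive graphs
    and y of N negative graphs, used as a binary test."""
    total = P + N
    inside = x + y
    outside = total - inside
    return (entropy(P, N)
            - inside / total * entropy(x, y)
            - outside / total * entropy(P - x, N - y))


def gain_upper_bound(x: int, y: int, P: int, N: int) -> float:
    """Assumed bound (not taken from the paper): any supergraph occurs in
    x' <= x positive and y' <= y negative graphs, and information gain is
    convex in (x', y'), so the best case is the corner (x, 0) or (0, y)."""
    return max(info_gain(x, 0, P, N), info_gain(0, y, P, N))


def prune_candidates(candidates, P: int, N: int):
    """Keep only patterns whose upper bound reaches the best gain seen.

    candidates is a list of (name, x, y) tuples; the layout is hypothetical.
    """
    best = max(info_gain(x, y, P, N) for _, x, y in candidates)
    return [name for name, x, y in candidates
            if gain_upper_bound(x, y, P, N) >= best]


# Toy data: 10 positive and 10 negative graphs, three candidate patterns.
survivors = prune_candidates([("A", 8, 1), ("B", 2, 2), ("C", 9, 6)], 10, 10)
print(survivors)
```

In this toy run, a pattern is dropped when no extension of it could match the best information gain already observed, which is the comparison the abstract describes. The actual bound and search order used by Cl-GBI may differ from this sketch.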

Author information

Authors and Affiliations

The Institute of Scientific and Industrial Research, Osaka University, 8-1, Mihogaoka, Ibaraki, Osaka, 567-0047, Japan
Kouzou Ohara, Masahiro Hara, Kiyoto Takabayashi, Hiroshi Motoda & Takashi Washio

Editor information

Editors and Affiliations

Computing Department, Division of Information and Communication Sciences, Macquarie University, NSW, 2109, Sydney, Australia
Debbie Richards
School of Computing and Information Systems, University of Tasmania, Launceston, TAS 7250, Tasmania, Australia
Byeong-Ho Kang

About this paper

Cite this paper

Ohara, K., Hara, M., Takabayashi, K., Motoda, H., Washio, T. (2009). Pruning Strategies Based on the Upper Bound of Information Gain for Discriminative Subgraph Mining. In: Richards, D., Kang, BH. (eds) Knowledge Acquisition: Approaches, Algorithms and Applications. PKAW 2008. Lecture Notes in Computer Science, vol 5465. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01715-5_5

DOI: https://doi.org/10.1007/978-3-642-01715-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01714-8
Online ISBN: 978-3-642-01715-5
eBook Packages: Computer Science, Computer Science (R0)
