An Efficient Algorithm of Frequent Connected Subgraph Extraction

Hong, Mingsheng; Zhou, Haofeng; Wang, Wei; Shi, Baile

doi:10.1007/3-540-36175-8_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2637)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Mining frequent patterns from datasets is one of the key success stories of data mining research. Most existing work focuses on independent data, such as the items in a market basket. Real-world objects, however, are often closely related to one another, and the objective of this paper is to extract frequent patterns from these relations. We model the relations as graphs and select a simple type of graph for analysis. Combining graph theory with algorithms for generating frequent patterns, we propose a new algorithm, Topology, that mines these graphs efficiently. We evaluate its performance through experiments on synthetic datasets and real data; the experimental results show that Topology performs well. The paper closes with a discussion of potential improvements.

This paper was supported by the Key Program of the National Natural Science Foundation of China (No. 69933010) and the China National 863 High-Tech Projects (No. 2002AA4Z3430 and 2002AA231041).

Author information

Authors and Affiliations

Department of Computing and Information Technology Science, Fudan University, Shanghai, 200433, P.R.China
Mingsheng Hong, Haofeng Zhou, Wei Wang & Baile Shi

Editor information

Editors and Affiliations

Computer Science Department, Korea Advanced Institute of Science and Technology, 373-1 Koo-Sung Dong, Yoo-Sung Ku, Daejeon, 305-701, Korea
Kyu-Young Whang
Department of Statistics, Seoul National University, Sillimdong Kwanakgu, Seoul, 151-742, Korea
Jongwoo Jeon
School of Electrical Engineering and Computer Science, Seoul National University, Kwanak P.O. Box 34, Seoul, 151-742, Korea
Kyuseok Shim
Department of Computer Science and Engineering, University of Minnesota, 200 Union St SE, Minneapolis, MN, 55455, USA
Jaideep Srivastava

About this paper

Cite this paper

Hong, M., Zhou, H., Wang, W., Shi, B. (2003). An Efficient Algorithm of Frequent Connected Subgraph Extraction. In: Whang, KY., Jeon, J., Shim, K., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2003. Lecture Notes in Computer Science, vol 2637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36175-8_5

DOI: https://doi.org/10.1007/3-540-36175-8_5
Published: 30 April 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-04760-5
Online ISBN: 978-3-540-36175-6
eBook Packages: Springer Book Archive
