Abstract
Mining frequent patterns from datasets is one of the key success stories of data mining research. Most existing work focuses on independent data, such as the items in a market basket. However, objects in the real world are often closely related to one another, and the objective of this paper is to extract frequent patterns from these relations. We model the relations as graphs and select a simple type of graph for analysis. By combining graph theory with algorithms for generating frequent patterns, we propose a new algorithm, Topology, which mines these graphs efficiently. We evaluate the performance of the algorithm through experiments on synthetic datasets and real data. The experimental results show that Topology performs this task well. Finally, we discuss potential improvements.
This paper was supported by the Key Program of the National Natural Science Foundation of China (No. 69933010) and China National 863 High-Tech Projects (No. 2002AA4Z3430 and 2002AA231041).
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hong, M., Zhou, H., Wang, W., Shi, B. (2003). An Efficient Algorithm of Frequent Connected Subgraph Extraction. In: Whang, KY., Jeon, J., Shim, K., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2003. Lecture Notes in Computer Science, vol 2637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36175-8_5
DOI: https://doi.org/10.1007/3-540-36175-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-04760-5
Online ISBN: 978-3-540-36175-6
eBook Packages: Springer Book Archive