Abstract
Mining frequent patterns has been studied extensively in the data mining area. However, little work has addressed mining patterns when fresh data constantly flow into the database. In such dynamic scenarios, efficient maintenance of the discovered patterns is crucial. Most existing methods must scan the entire database repeatedly, which is an obvious disadvantage. In this paper, an efficient incremental mining algorithm, Incremental-Mining (IM), is proposed for maintaining the frequent patterns when incremental data arrive. Based on the frequent pattern tree (FP-tree) structure, IM makes full use of the results of the previous mining process and requires scanning the original data at most once. Furthermore, IM can directly identify the differential set of frequent patterns, which may be more informative to users. Moreover, IM can deal with changing thresholds as well as changing data, thus providing a full maintenance scheme. IM has been implemented, and the performance study shows that it outperforms three other incremental approaches: FUP, DB-tree, and re-running frequent pattern growth (FP-growth).
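To make the maintenance problem concrete, the following is a minimal Python sketch of what a "differential set" of frequent patterns looks like when an increment arrives. It is not the IM algorithm and builds no FP-tree: it is a brute-force illustration that assumes the counts of all small itemsets from the previous run were kept, so the original data never has to be rescanned. The function names, the `max_size` cap, and the toy transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items (brute force, illustration only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return counts


def frequent(counts, n_transactions, min_support):
    """Return the itemsets whose count reaches min_support * n_transactions."""
    threshold = min_support * n_transactions
    return {itemset for itemset, c in counts.items() if c >= threshold}


def incremental_update(old_counts, old_n, increment, min_support, max_size=2):
    """Merge the increment's counts into the stored counts.

    Only the increment is scanned. Returns the merged counts, the new
    database size, and the differential set as (emerged, vanished).
    """
    new_counts = old_counts + count_itemsets(increment, max_size)
    new_n = old_n + len(increment)
    before = frequent(old_counts, old_n, min_support)
    after = frequent(new_counts, new_n, min_support)
    return new_counts, new_n, after - before, before - after


# Toy data (assumed for illustration).
original = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
increment = [["b", "c"], ["b", "c"]]

old_counts = count_itemsets(original)
new_counts, new_n, emerged, vanished = incremental_update(
    old_counts, len(original), increment, min_support=0.5
)
print("emerged:", sorted(sorted(s) for s in emerged))
print("vanished:", sorted(sorted(s) for s in vanished))
```

With a minimum support of 0.5, the itemset {b, c} becomes frequent after the increment, while {a, b} and {a, c} drop out. Storing counts for every itemset is only feasible on toy data; avoiding that cost is what compact structures such as the FP-tree are for.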
Additional information
Supported by the National Basic Research 973 Program of China under Grant No.G1999032705.
Xiu-Li Ma received the Ph.D. degree in computer science from Peking University in 2003. She is currently a postdoctoral researcher at the National Lab on Machine Perception of Peking University. Her main research interests include data warehousing, data mining, intelligent online analysis, and sensor networks.
Yun-Hai Tong received the Ph.D. degree in computer software from Peking University in 2002. He is currently an assistant professor at the School of Electronics Engineering and Computer Science of Peking University. His research interests include data warehousing, online analytical processing, and data mining.
Shi-Wei Tang received the B.S. degree in mathematics from Peking University in 1964. He is now a professor and Ph.D. supervisor at the School of Electronics Engineering and Computer Science of Peking University. His research interests include DBMS, information integration, data warehousing, OLAP, data mining, and database technology in specific application fields. He is the vice chair of the Database Society of China Computer Federation.
Dong-Qing Yang received the B.S. degree in mathematics from Peking University in 1969. She is now a professor and Ph.D. supervisor at the School of Electronics Engineering and Computer Science of Peking University. Her research interests include database design methodology, database system implementation techniques, data warehousing and data mining, and information integration and sharing in the Web environment. She is a member of the academic committee of the Database Society of China Computer Federation.
About this article
Cite this article
Ma, XL., Tong, YH., Tang, SW. et al. Efficient incremental maintenance of frequent patterns with FP-tree. J. Comput. Sci. & Technol. 19, 876–884 (2004). https://doi.org/10.1007/BF02973451
DOI: https://doi.org/10.1007/BF02973451