Abstract
The FP-Growth algorithm has been studied extensively in the field of frequent pattern mining. Compared with Apriori-based algorithms, it has the advantage of avoiding costly database scans. However, because it still requires two database scans, it cannot be used on streaming data. The algorithm is also designed for static datasets, where the input transactions are fixed, and thus cannot be used for incremental or interactive mining. Existing incremental mining algorithms are not easily adaptable for on-the-fly, fast, and memory-efficient FP-tree mining. In this paper we propose a novel SPFP-tree (single-pass frequent pattern tree) algorithm that scans the database only once and produces the same tree as FP-Growth. Our algorithm changes the tree structure dynamically to create a highly compact, frequency-ordered tree on the fly. With the insertion of each new transaction, our algorithm dynamically maintains a tree identical to an FP-tree. Experimental results show the efficiency of the SPFP-tree algorithm in both incremental and interactive mining of frequent patterns.
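For context, the two database scans that conventional FP-tree construction relies on can be sketched as follows. This is a minimal Python illustration of the standard two-pass build only, not the SPFP-tree algorithm proposed in the paper. The names `Node`, `build_fp_tree`, and `count_nodes`, the sample transactions, and the alphabetical tie-break between equally frequent items are all assumptions made for this illustration.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, and child links (illustrative)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    # Scan 1: count how many transactions contain each item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in support.items() if c >= min_support}

    # Scan 2: insert each transaction with its frequent items sorted by
    # descending support (ties broken alphabetically, an assumption here).
    root = Node(None)
    for t in transactions:
        items = sorted(
            (item for item in set(t) if item in frequent),
            key=lambda item: (-support[item], item),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, support


def count_nodes(node):
    # Number of nodes below `node` (the root itself is not counted).
    return sum(1 + count_nodes(c) for c in node.children.values())


# Hypothetical sample data for illustration.
transactions = [["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]]
root, support = build_fp_tree(transactions, min_support=2)
```

The second scan can only start once the first has seen every transaction, because the insertion order depends on the final support counts. That dependency is why this form of construction does not fit streaming or incrementally growing data.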
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Shahbazi, N., Soltani, R., Gryz, J., An, A. (2016). Building FP-Tree on the Fly: Single-Pass Frequent Itemset Mining. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_30
DOI: https://doi.org/10.1007/978-3-319-41920-6_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)