Higher Performance IPPC⁺ Tree for Parallel Incremental Frequent Itemsets Mining

Huynh, Van Quoc Phuong; Küng, Josef

doi:10.1007/978-3-030-03192-3_10

Conference paper
First Online: 27 October 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11251)

Abstract

The IPPC tree offers incremental properties and high performance for mining frequent itemsets through the shared-memory parallel algorithm IFIN⁺. However, for datasets that comprise a large number of distinct items of which only a small percentage are frequent, the IPPC tree begins to lose its advantage in running time and memory during tree construction. Motivated by the goal of reducing the execution time of tree building, this paper proposes an improved version of the IPPC tree, called IPPC⁺, that increases the performance of tree construction. We conducted extensive experiments on both synthetic and real datasets to evaluate the IPPC⁺ tree against the IPPC tree. In addition, IFIN⁺ with the new tree is compared with the well-known FP-Growth algorithm and two other state-of-the-art algorithms, FIN and PrePost⁺. The experimental results show that the construction time of the IPPC⁺ tree is remarkably improved compared with that of the IPPC tree, and that IFIN⁺ is the most efficient algorithm, especially when mining at different support thresholds within the same running session.

Notes

1. Swapping two nodes simply means exchanging their item names: each node takes the item name of the other.
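
The note above can be illustrated with a minimal sketch. It assumes a generic prefix-tree node; the `Node` class, its fields (`item`, `count`, `parent`, `children`) and the function name `swap_nodes` are hypothetical and are not taken from the paper. The point shown is only that the two nodes exchange item names while every link in the tree stays where it is.

```python
from typing import List, Optional


class Node:
    """Hypothetical prefix-tree node (field names are assumptions)."""

    def __init__(self, item: str, count: int = 0,
                 parent: Optional["Node"] = None) -> None:
        self.item = item
        self.count = count
        self.parent = parent
        self.children: List["Node"] = []
        if parent is not None:
            parent.children.append(self)


def swap_nodes(a: Node, b: Node) -> None:
    """Swap two nodes by exchanging only their item names.

    Parent and child links are left untouched, so no re-linking
    of the tree is needed.
    """
    a.item, b.item = b.item, a.item


# Small chain: root -> x -> y
root = Node("root")
x = Node("x", count=3, parent=root)
y = Node("y", count=3, parent=x)

swap_nodes(x, y)
```

After the call, the node object that was created as `x` carries the item name `y` and vice versa, while `y.parent is x` still holds. Whether other per-node data is affected by a swap in the actual IPPC⁺ tree is not stated in the note, so this sketch leaves everything except the item name unchanged.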

References

Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of 20th International Conference on VLDB, pp. 487–499 (1994)
Han, J., Pei, J., Yin, Y.: Mining frequent itemsets without candidate generation. ACM SIGMOD Rec. 29(2), 1–12 (2000)
Deng, Z.-H., Lv, S.-L.: Fast mining frequent itemsets using nodesets. Expert Syst. Appl. 41(10), 4505–4512 (2014)
Deng, Z.-H., Lv, S.-L.: PrePost⁺: an efficient N-lists-based algorithm for mining frequent itemsets via children-parent equivalence pruning. Expert Syst. Appl. 42(13), 5424–5432 (2015)
Market-Basket Synthetic Data Generator. https://synthdatagen.codeplex.com/
Frequent Itemset Mining Dataset Repository: Kosarak, online news portal click-stream data. http://fimi.ua.ac.be/data/kosarak.dat.gz
Savasere, A., Omiecinski, E., Navathe, S.: An efficient algorithm for mining association rules in large databases. In: VLDB, pp. 432–443 (1995)
Perego, R., Orlando, S., Palmerini, P.: Enhancing the Apriori algorithm for frequent set counting. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds.) DaWaK 2001. LNCS, vol. 2114, pp. 71–82. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44801-2_8
Park, J.S., Chen, M.S., Yu, P.S.: Using a hash-based method with transaction trimming and database scan reduction for mining association rules. IEEE Trans. Knowl. Data Eng. 9(5), 813–825 (1997)
Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)
Grahne, G., Zhu, J.: Fast algorithms for frequent itemset mining using FP-trees. Trans. Knowl. Data Eng. 17(10), 1347–1362 (2005)
Liu, G., Lu, H., Lou, W., Xu, Y., Yu, J.X.: Efficient mining of frequent itemsets using ascending frequency ordered prefix-tree. DMKD J. 9(3), 249–274 (2004)
Shenoy, P., Haritsa, J.R., Sudarshan, S.: Turbo-charging vertical mining of large databases. In: 2000 SIGMOD, pp. 22–33 (2000)
Zaki, M.J., Gouda, K.: Fast vertical mining using diffsets. In: 9th SIGKDD, pp. 326–335 (2003)
Huynh, V.Q.P., Küng, J., Dang, T.K.: Incremental frequent itemsets mining with IPPC tree. In: Benslimane, D., Damiani, E., Grosky, W.I., Hameurlain, A., Sheth, A., Wagner, R.R. (eds.) DEXA 2017. LNCS, vol. 10438, pp. 463–477. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-64468-4_35
Huynh, V.Q.P., Küng, J., Jäger, M., Dang, T.K.: IFIN⁺: a parallel incremental frequent itemsets mining in shared-memory environment. In: Dang, T.K., Wagner, R., Küng, J., Thoai, N., Takizawa, M., Neuhold, E.J. (eds.) FDSE 2017. LNCS, vol. 10646, pp. 121–138. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70004-5_9

Author information

Authors and Affiliations

Institute for Application Oriented Knowledge Processing (FAW), Faculty of Engineering and Natural Sciences (TNF), Johannes Kepler University (JKU), Linz, Austria
Van Quoc Phuong Huynh & Josef Küng

Corresponding author

Correspondence to Van Quoc Phuong Huynh.

Editor information

Editors and Affiliations

Ho Chi Minh City University of Technology, Ho Chi Minh, Vietnam
Tran Khanh Dang
Johannes Kepler University of Linz, Linz, Austria
Josef Küng
Johannes Kepler University of Linz, Linz, Austria
Roland Wagner
Ho Chi Minh City University of Technology, Ho Chi Minh, Vietnam
Nam Thoai
Hosei University, Tokyo, Japan
Makoto Takizawa

About this paper

Cite this paper

Huynh, V.Q.P., Küng, J. (2018). Higher Performance IPPC⁺ Tree for Parallel Incremental Frequent Itemsets Mining. In: Dang, T., Küng, J., Wagner, R., Thoai, N., Takizawa, M. (eds) Future Data and Security Engineering. FDSE 2018. Lecture Notes in Computer Science, vol 11251. Springer, Cham. https://doi.org/10.1007/978-3-030-03192-3_10

DOI: https://doi.org/10.1007/978-3-030-03192-3_10
Published: 27 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03191-6
Online ISBN: 978-3-030-03192-3
eBook Packages: Computer Science, Computer Science (R0)
