Mining Compact High Utility Itemsets Without Candidate Generation

Wu, Cheng-Wei; Fournier-Viger, Philippe; Gu, Jia-Yuan; Tseng, Vincent S.

doi:10.1007/978-3-030-04921-8_11

Chapter
First Online: 19 January 2019

Part of the book series: Studies in Big Data (SBD, volume 51)

Abstract

Although high utility itemset (HUI) mining has received extensive attention in recent years, current algorithms suffer from a crucial problem: they tend to produce too many HUIs. This seriously degrades the performance of HUI mining in terms of both execution time and memory usage. Moreover, it is very hard for users to find meaningful information in a huge number of HUIs. In this paper, we address this issue by proposing a framework with a novel algorithm named CHUI (Compact High Utility Itemset)-Mine to discover closed\(^{+}\) HUIs and maximal HUIs, which are compact representations of HUIs. The main merits of CHUI-Mine lie in two aspects. First, in terms of efficiency, unlike existing algorithms that tend to produce a large number of candidates during the mining process, CHUI-Mine computes the utility of itemsets directly without generating candidates. Second, in terms of losslessness, unlike current algorithms that provide incomplete results, CHUI-Mine discovers the complete set of closed\(^{+}\) or maximal HUIs without missing any. A comprehensive investigation is also presented to compare the relative advantages of the different compact representations in terms of computational cost and compactness. To the best of our knowledge, this is the first work to address the mining of compact high utility itemsets, in terms of closed\(^{+}\) and maximal HUIs, without candidate generation. Experimental results show that CHUI-Mine achieves a massive reduction in the number of HUIs and is several orders of magnitude faster than benchmark algorithms.
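To make the three result sets named in the abstract concrete, the following minimal Python sketch enumerates them by brute force on a toy database. It is not CHUI-Mine: it generates every candidate itemset, which is exactly what the chapter's algorithm avoids. The definitions it uses are assumptions taken from the usual HUI literature rather than from the abstract itself: the utility of an itemset is the sum of quantity times unit profit over the transactions that contain it, an HUI has utility of at least min_util, a closed HUI has no proper superset with the same support, and a maximal HUI has no proper superset that is also an HUI. The data, the threshold, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data (not from the chapter): item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "c": 1},
    {"a": 2, "b": 1, "c": 3},
    {"a": 1, "b": 2},
    {"b": 3},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 2, "c": 4}


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def utility(itemset, transactions, profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def mine_compact(transactions, profit, min_util):
    """Brute-force HUIs, closed HUIs and maximal HUIs (assumes min_util > 0)."""
    items = sorted(profit)
    # Exhaustive candidate enumeration: fine for a toy example only.
    candidates = [
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    ]
    sup = {x: support(x, transactions) for x in candidates}
    util = {x: utility(x, transactions, profit) for x in candidates}

    # High utility itemsets: utility reaches the threshold.
    huis = {x: util[x] for x in candidates if util[x] >= min_util}
    # Closed HUIs: no proper superset occurs in exactly the same number of transactions.
    closed = {
        x for x in huis
        if not any(x < y and sup[y] == sup[x] for y in candidates)
    }
    # Maximal HUIs: no proper superset is itself an HUI.
    maximal = {x for x in huis if not any(x < y for y in huis)}
    return huis, closed, maximal


huis, closed, maximal = mine_compact(TRANSACTIONS, PROFIT, min_util=15)
print(len(huis), len(closed), len(maximal))
```

On this toy data with min_util = 15, the sketch finds five HUIs, four of which are closed and one of which is maximal, which illustrates how each representation is more compact than the last. It also shows why utility cannot be pruned like support: {b} has utility 12 and is not an HUI, yet its superset {a, b} has utility 21 and is.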

Author information

Authors and Affiliations

National Ilan University, Ilan, Taiwan
Cheng-Wei Wu
Harbin Institute of Technology (Shenzhen), Shenzhen, China
Philippe Fournier-Viger
Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
Jia-Yuan Gu
National Chiao Tung University, Hsinchu, Taiwan
Vincent S. Tseng

Corresponding author

Correspondence to Cheng-Wei Wu.

Editor information

Editors and Affiliations

Harbin Institute of Technology (Shenzhen), Shenzhen, China
Philippe Fournier-Viger
Western Norway University of Applied Sciences, Bergen, Norway
Jerry Chun-Wei Lin
Université du Québec à Montréal, Montreal, QC, Canada
Roger Nkambou
Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
Bay Vo
National Chiao Tung University, Hsinchu, Taiwan
Vincent S. Tseng

About this chapter

Cite this chapter

Wu, CW., Fournier-Viger, P., Gu, JY., Tseng, V.S. (2019). Mining Compact High Utility Itemsets Without Candidate Generation. In: Fournier-Viger, P., Lin, JW., Nkambou, R., Vo, B., Tseng, V. (eds) High-Utility Pattern Mining. Studies in Big Data, vol 51. Springer, Cham. https://doi.org/10.1007/978-3-030-04921-8_11

DOI: https://doi.org/10.1007/978-3-030-04921-8_11
Published: 19 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04920-1
Online ISBN: 978-3-030-04921-8
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
