TKEH: an efficient algorithm for mining top-k high utility itemsets

Singh, Kuldeep; Singh, Shashank Sheshar; Kumar, Ajay; Biswas, Bhaskar

doi:10.1007/s10489-018-1316-x


Published: 25 October 2018

Volume 49, pages 1078–1097, (2019)

Applied Intelligence


Abstract

High utility itemset mining is a subfield of data mining with wide applications. Although existing high utility itemset mining algorithms can discover all itemsets that satisfy a given minimum utility threshold, users often find it difficult to set a proper threshold. A low threshold may produce a huge number of itemsets, whereas a high one may produce too few. Choosing the threshold is therefore difficult and time-consuming. To address these issues, top-k high utility itemset mining has been defined, where k is the number of high utility itemsets to be found. In this paper, we present an efficient algorithm, named TKEH, for finding top-k high utility itemsets. TKEH uses transaction merging and dataset projection to reduce the cost of dataset scans; these techniques shrink the dataset as larger itemsets are explored. TKEH employs three strategies for raising the minimum utility threshold and two strategies for pruning the search space efficiently. To calculate the utility of items and upper bounds in linear time, TKEH uses an array-based utility technique. We carried out extensive experiments on real datasets. The results show that TKEH outperforms the state-of-the-art algorithms. Moreover, TKEH always performs better for dense datasets.
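The top-k formulation in the abstract can be illustrated with a small sketch. The code below is not TKEH: it is a brute-force enumeration that omits transaction merging, dataset projection, pruning, and the paper's specific threshold raising strategies. It only shows the general idea that the internal minimum utility threshold starts at zero and rises to the utility of the current k-th best itemset. The data layout (each transaction as a dict from item to its utility in that transaction) and the names `itemset_utility` and `top_k_high_utility_itemsets` are assumptions made for illustration.

```python
import heapq
from itertools import combinations


def itemset_utility(itemset, transactions):
    """Sum the utility of `itemset` over every transaction containing all its items."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] for item in itemset)
    return total


def top_k_high_utility_itemsets(transactions, k):
    """Brute-force top-k search (illustrative only, exponential in the item count).

    Returns (results, min_util), where results is a list of (utility, itemset)
    sorted by descending utility and min_util is the final internal threshold.
    """
    items = sorted({item for tx in transactions for item in tx})
    heap = []      # min-heap of (utility, itemset); heap[0] is the current k-th best
    min_util = 0   # internal threshold, raised as better itemsets are found
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions)
            if utility <= 0:
                continue  # itemset never occurs in the data
            if len(heap) < k:
                heapq.heappush(heap, (utility, itemset))
            elif utility > min_util:
                heapq.heapreplace(heap, (utility, itemset))
            if len(heap) == k:
                min_util = heap[0][0]  # raise the threshold to the k-th best utility
    results = sorted(heap, key=lambda pair: (-pair[0], pair[1]))
    return results, min_util


# Hypothetical toy dataset: values are per-transaction item utilities
# (for example quantity times unit profit, already multiplied).
example_transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
]
top_itemsets, final_threshold = top_k_high_utility_itemsets(example_transactions, k=3)
```

With this layout the user supplies only k; the threshold that a conventional high utility itemset miner would require as input is derived during the search instead.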



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221–005, India
Kuldeep Singh, Shashank Sheshar Singh, Ajay Kumar & Bhaskar Biswas


Corresponding author

Correspondence to Kuldeep Singh.

Ethics declarations

The article uses threshold raising and memory reduction techniques to mine top-k high utility itemsets.

Conflict of interests

The authors declare no conflicts of interest.


About this article

Cite this article

Singh, K., Singh, S.S., Kumar, A. et al. TKEH: an efficient algorithm for mining top-k high utility itemsets. Appl Intell 49, 1078–1097 (2019). https://doi.org/10.1007/s10489-018-1316-x

Issue Date: 15 March 2019
