Efficiently Finding High Utility-Frequent Itemsets Using Cutoff and Suffix Utility

Uday Kiran, R.; Yashwanth Reddy, T.; Fournier-Viger, Philippe; Toyoda, Masashi; Krishna Reddy, P.; Kitsuregawa, Masaru

doi:10.1007/978-3-030-16145-3_15


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11440)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

High utility itemset mining is an important model with many real-world applications. However, the wide adoption and successful industrial application of this model have been hindered by two limitations: (i) the model is computationally expensive, and (ii) infrequent itemsets may be output as high utility itemsets. This paper addresses both limitations. A generic high utility-frequent itemset model is introduced to find all itemsets in the data that satisfy user-specified minimum support and minimum utility constraints. Two new pruning measures, named cutoff utility and suffix utility, are introduced to reduce the computational cost of finding the desired itemsets. A fast single-phase algorithm, called High Utility Frequent Itemset Miner (HU-FIMi), is introduced to discover the itemsets efficiently. Experimental results demonstrate that the proposed algorithm is efficient.
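To make the model in the abstract concrete, the following is a minimal brute-force sketch of what "satisfying minimum support and minimum utility constraints" could mean on a toy database. It is not HU-FIMi and it does not use the cutoff utility or suffix utility measures, which are not defined on this page. The utility definition (sum of the item utilities of an itemset over every transaction that contains it) is the usual one in the high utility itemset mining literature and is assumed here; the database `DB`, the function names, and the thresholds are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to the utility of
# that item in the transaction (e.g. quantity * unit profit). Values are made up.
DB = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "b": 2},
]


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db):
    """Assumed definition: sum of the itemset's item utilities over all
    transactions that contain the whole itemset."""
    return sum(
        sum(t[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )


def high_utility_frequent_itemsets(db, min_sup, min_util):
    """Exhaustively enumerate itemsets and keep those meeting both thresholds.

    Exponential in the number of items; only meant to illustrate the
    constraints, not an efficient mining algorithm.
    """
    items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            s = support(itemset, db)
            if s < min_sup:
                continue  # fails the minimum support constraint
            u = utility(itemset, db)
            if u >= min_util:
                result[itemset] = (s, u)  # (support, utility)
    return result


both = high_utility_frequent_itemsets(DB, min_sup=2, min_util=15)
utility_only = high_utility_frequent_itemsets(DB, min_sup=1, min_util=15)
print(both)
print(utility_only)
```

On this toy data, setting `min_sup=1` effectively disables the support constraint, and the itemset `("b", "c", "d")` is reported even though it occurs in a single transaction. Raising `min_sup` to 2 filters it out, which illustrates limitation (ii) from the abstract.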


Notes

1. More details of this dataset are presented in later parts of this paper.
2. Since the local utility measure generalizes the TWU measure by taking into account itemsets, we use the former measure throughout this paper for brevity.


Acknowledgements

We would like to thank Yahoo Japan Corporation for providing the retail transaction data.

Author information

Authors and Affiliations

National Institute of Information and Communications Technology, Tokyo, Japan
R. Uday Kiran
The University of Tokyo, Tokyo, Japan
R. Uday Kiran, Masashi Toyoda & Masaru Kitsuregawa
International Institute of Information Technology-Hyderabad, Hyderabad, India
T. Yashwanth Reddy & P. Krishna Reddy
Harbin Institute of Technology (Shenzhen), Shenzhen, China
Philippe Fournier-Viger
National Institute of Informatics, Tokyo, Japan
Masaru Kitsuregawa


Corresponding author

Correspondence to R. Uday Kiran.

Editor information

Editors and Affiliations

Hong Kong University of Science and Technology, Hong Kong, China
Qiang Yang
Nanjing University, Nanjing, China
Zhi-Hua Zhou
University of Macau, Taipa, Macau, China
Zhiguo Gong
Southeast University, Nanjing, China
Min-Ling Zhang
Nanjing University of Aeronautics and Astronautics, Nanjing, China
Sheng-Jun Huang


About this paper

Cite this paper

Uday Kiran, R., Yashwanth Reddy, T., Fournier-Viger, P., Toyoda, M., Krishna Reddy, P., Kitsuregawa, M. (2019). Efficiently Finding High Utility-Frequent Itemsets Using Cutoff and Suffix Utility. In: Yang, Q., Zhou, ZH., Gong, Z., Zhang, ML., Huang, SJ. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2019. Lecture Notes in Computer Science, vol 11440. Springer, Cham. https://doi.org/10.1007/978-3-030-16145-3_15


DOI: https://doi.org/10.1007/978-3-030-16145-3_15
Published: 22 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16144-6
Online ISBN: 978-3-030-16145-3
eBook Packages: Computer Science, Computer Science (R0)
