Abstract
Frequent pattern mining has been widely studied over the past decades and applied in many domains. Numerical transaction databases, which record not only the items in each user transaction but also the utility associated with each item, are particularly useful in real applications. For example, the mobile App traffic data collected by mobile service providers contains such information. In this paper, we aim to find frequent patterns that occupy a large portion of the total utility of their supporting transactions, to answer questions like “On which mobile Apps do customers spend most of their data traffic?” Towards this goal, we define a measure called utility occupancy, which quantifies the contribution of a pattern within a transaction. The challenge in discovering high utility occupancy itemsets is that utility occupancy has neither the monotone nor the anti-monotone property. We therefore derive an upper bound for utility occupancy and design an efficient mining algorithm called OCEAN, based on a fast implementation of the utility list. Evaluations on real-world mobile App traffic data and three other datasets show that OCEAN is efficient and effective in finding frequent patterns with large utility occupancy.
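To make the measure concrete, the following is a minimal sketch of how utility occupancy could be computed by brute force. The abstract does not give the formula, so the sketch assumes that the occupancy of a pattern in one transaction is the pattern's share of that transaction's total utility, and that the occupancy over a database is the average of this share across the supporting transactions. The function name `utility_occupancy` and the sample App names are hypothetical, and the code illustrates the measure only; it is not the OCEAN algorithm and uses neither the upper bound nor the utility list.

```python
from typing import Dict, Iterable, List, Tuple

# A transaction maps each item to its utility (e.g. an App to its traffic volume).
Transaction = Dict[str, float]


def utility_occupancy(
    pattern: Iterable[str], transactions: List[Transaction]
) -> Tuple[int, float]:
    """Return (support, utility occupancy) of `pattern`.

    Assumed definition: average, over the transactions that contain every
    item of the pattern, of (utility of the pattern's items) divided by
    (total utility of the transaction).
    """
    items = frozenset(pattern)
    shares = []
    for transaction in transactions:
        # Only transactions containing the whole pattern support it.
        if not items.issubset(transaction):
            continue
        total = sum(transaction.values())
        if total <= 0:
            continue  # skip degenerate transactions with no utility
        shares.append(sum(transaction[i] for i in items) / total)
    support = len(shares)
    occupancy = sum(shares) / support if support else 0.0
    return support, occupancy


if __name__ == "__main__":
    # Hypothetical traffic data: one dict per customer.
    db = [
        {"maps": 30.0, "video": 60.0, "chat": 10.0},
        {"video": 80.0, "chat": 20.0},
        {"maps": 50.0, "chat": 50.0},
    ]
    print(utility_occupancy({"video", "chat"}, db))
```

In this toy database the pattern {video, chat} is supported by the first two customers, where it accounts for 70% and 100% of the traffic respectively. A frequent-pattern miner would combine a minimum support threshold with a minimum occupancy threshold on these two returned values.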
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Shen, B., Wen, Z., Zhao, Y., Zhou, D., Zheng, W. (2016). OCEAN: Fast Discovery of High Utility Occupancy Itemsets. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_29
DOI: https://doi.org/10.1007/978-3-319-31753-3_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3
eBook Packages: Computer Science, Computer Science (R0)