Abstract
Most studies have considered the frequency as sole interestingness measure for identifying high quality patterns. However, each object is different in nature, in terms of criteria such as the utility, risk, or interest. Besides, another limitation of frequent patterns is that they generally have a low occupancy, and may not be truly representative. Thus, this paper extends the occupancy measure to assess the utility of patterns in transaction databases. The High Utility Occupancy Pattern Mining (HUOPM) algorithm considers user preferences in terms of frequency, utility, and occupancy. Several novel data structures are designed to discover the complete set of high quality patterns without candidate generation. Extensive experiments have been conducted on several datasets to evaluate the effectiveness and efficiency of HUOPM.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Frequent itemset mining dataset repository (2012). http://fimi.ua.ac.be/data/
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: International Conference on Very Large Data Bases, pp. 487–499 (1994)
Ahmed, C.F., Tanbeer, S.K., Jeong, B.S., Le, Y.K.: Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans. Knowl. Data Eng. 21(12), 1708–1721 (2009)
Chan, R., Yang, Q., Shen, Y.D.: Mining high utility itemsets. In: International Conference on Data Mining, pp. 19–26 (2003)
Fournier-Viger, P., Lin, J.C.W., Kiran, R.U., Koh, Y.S., Thomas, R.: A survey of sequential pattern mining. Data Sci. Pattern Recogn. 1(1), 54–77 (2017)
Han, J., Pei, J., Yin, Y., Mao, R.: Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min. Knowl. Discov. 8(1), 53–87 (2004)
Liu, M., Qu, J.: Mining high utility itemsets without candidate generation. In: ACM International Conference on Information and Knowledge Management, pp. 55–64 (2012)
Lin, J.C.W., Gan, W., Fournier-Viger, P., Hong, T.P., Tseng, V.S.: Efficient algorithms for mining high-utility itemsets in uncertain databases. Knowl. Based Syst. 96, 171–187 (2016)
Yao, H., Hamilton, J., Butz, C.J.: A foundational approach to mining itemset utilities from databases. In: SIAM International Conference on Data Mining, pp. 211–225 (2004)
Rymon, R.: Search through systematic set enumeration. Technical reports (CIS), p. 297 (1992)
Shen, B., Wen, Z., Zhao, Y., Zhou, D., Zheng, W.: OCEAN: fast discovery of high utility occupancy itemsets. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 354–365 (2016)
Tang, L., Zhang, L., Luo, P., Wang, M.: Incorporating occupancy into frequent pattern mining for high quality pattern recommendation. In: Proceedings of the 21st ACM International Conference on Information and knowledge management, pp. 75–84 (2012)
Tseng, V.S., Shie, B.E., Wu, C.W., Yu, P.S.: Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Trans. Knowl. Data Eng. 25(8), 1772–1786 (2013)
Acknowledgments
This research was partially supported by the National Natural Science Foundation of China (NSFC) under grant No. 61503092, by the Research on the Technical Platform of Rural Cultural Tourism Planning Basing on Digital Media under grant 2017A020220011, and by the Tencent Project under grant CCF-Tencent IAGR20160115.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Gan, W., Lin, J.CW., Fournier-Viger, P., Chao, HC. (2017). Exploiting High Utility Occupancy Patterns. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science(), vol 10366. Springer, Cham. https://doi.org/10.1007/978-3-319-63579-8_19
Download citation
DOI: https://doi.org/10.1007/978-3-319-63579-8_19
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63578-1
Online ISBN: 978-3-319-63579-8
eBook Packages: Computer ScienceComputer Science (R0)