OCEAN: Fast Discovery of High Utility Occupancy Itemsets

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9651)

Abstract

Frequent pattern mining has been widely studied over the past decades and applied in many domains. Numerical transaction databases, which record not only the items in each user transaction but also the utility associated with each item, are particularly useful in real applications. For example, the mobile App traffic data that mobile service providers collect from their customers contains such information. In this paper, we aim to find frequent patterns that occupy a large portion of the total utility of their supporting transactions, to answer questions such as “On which mobile Apps do customers spend most of their data traffic?” Toward this goal, we define a measure called utility occupancy, which quantifies the contribution of a pattern within a transaction. The challenge in discovering high utility occupancy itemsets is that utility occupancy has neither the monotone nor the anti-monotone property. We therefore derive an upper bound on utility occupancy and design an efficient mining algorithm, OCEAN, based on a fast implementation of the utility list. Evaluations on real-world mobile App traffic data and three other datasets show that OCEAN is efficient and effective at finding frequent patterns with large utility occupancy.
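
To make the measure concrete, the following is a minimal sketch of how utility occupancy might be computed by brute force. The abstract does not state the formula, so the definitions used here are assumptions: the occupancy of a pattern in one transaction is taken as the pattern's utility divided by the transaction's total utility, and the occupancy of a pattern over the database as the mean of that ratio over its supporting transactions. The function names and the toy traffic data are hypothetical, and this is not the OCEAN algorithm, which relies on an upper bound and utility lists rather than a full scan.

```python
from typing import Dict, FrozenSet, List, Tuple

# A transaction maps each item (e.g. a mobile App) to its utility
# (e.g. megabytes of traffic). This representation is an assumption.
Transaction = Dict[str, float]


def occupancy_in_transaction(itemset: FrozenSet[str], t: Transaction) -> float:
    """Share of the transaction's total utility contributed by the itemset.

    Returns 0.0 if the transaction does not contain every item of the itemset.
    """
    if not itemset.issubset(t):
        return 0.0
    total = sum(t.values())
    if total == 0:
        return 0.0
    return sum(t[item] for item in itemset) / total


def support_and_occupancy(
    itemset: FrozenSet[str], db: List[Transaction]
) -> Tuple[int, float]:
    """Support count and mean utility occupancy over supporting transactions."""
    supporting = [t for t in db if itemset.issubset(t)]
    if not supporting:
        return 0, 0.0
    mean = sum(occupancy_in_transaction(itemset, t) for t in supporting) / len(supporting)
    return len(supporting), mean


if __name__ == "__main__":
    # Hypothetical per-customer traffic (MB) per App.
    db = [
        {"video": 80.0, "chat": 20.0},
        {"video": 50.0, "mail": 50.0},
        {"chat": 10.0, "mail": 90.0},
    ]
    print(support_and_occupancy(frozenset({"video"}), db))
```

Under these assumed definitions, a pattern qualifies when both its support and its mean occupancy reach user-chosen thresholds. Adding an item to a pattern can raise or lower the mean occupancy, because the numerator grows while the set of supporting transactions shrinks; this is one way to see why the measure lacks a monotone or anti-monotone property.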

Author information

Corresponding author

Correspondence to Ying Zhao.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Shen, B., Wen, Z., Zhao, Y., Zhou, D., Zheng, W. (2016). OCEAN: Fast Discovery of High Utility Occupancy Itemsets. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_29

  • DOI: https://doi.org/10.1007/978-3-319-31753-3_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31752-6

  • Online ISBN: 978-3-319-31753-3

  • eBook Packages: Computer Science, Computer Science (R0)
