An Efficient Approach for Mining High Utility Itemsets Over Data Streams

Chapter in: Data Science and Big Data: An Environment of Computational Intelligence

Part of the book series: Studies in Big Data (SBD, volume 24)

Abstract

Frequent itemset mining considers only how often an itemset occurs in the transaction database. High utility itemset mining also considers the purchased quantities and the profits of the items in each transaction, so that profitable products can be found. Transactions accumulate continuously over time, so the database keeps growing, while older transactions that no longer reflect current user behavior need to be removed. An environment in which transactions are continuously added and removed over time is called a data stream. When transactions are added or deleted, the set of high utility itemsets changes. Previously proposed algorithms for mining high utility itemsets over data streams need to rescan the original database and generate a large number of candidate high utility itemsets, without making use of the previously discovered high utility itemsets. This chapter therefore proposes an approach for efficiently mining high utility itemsets over data streams. When transactions are added to or removed from the transaction database, our algorithm needs neither to rescan the original transaction database nor to search through a large number of candidate itemsets. Experimental results show that our algorithm outperforms the previous approaches.
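To make the distinction between frequency and utility concrete, the following is a minimal sketch and not the chapter's algorithm. It assumes the common definition in which the utility of an itemset is the sum of purchased quantity times unit profit over every transaction that contains the whole itemset. The profit table, the sample transactions, the window size, and the function names `itemset_utility` and `support` are all hypothetical, chosen only for illustration. The sketch recomputes utilities from scratch over a sliding window, which is exactly the naive rescanning that the chapter aims to avoid.

```python
from collections import deque

# Hypothetical unit profits per item (illustrative values, not from the chapter).
PROFIT = {"A": 3, "B": 10, "C": 1}


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for tx in transactions if all(item in tx for item in itemset))


def itemset_utility(itemset, transactions, profit=PROFIT):
    """Sum of quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for tx in transactions:  # each tx maps item -> purchased quantity
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


# Hypothetical stream of transactions, oldest first.
stream = [
    {"A": 1, "B": 1},
    {"A": 2, "C": 6},
    {"B": 2, "C": 1},
    {"A": 1, "B": 3},
]

# A sliding window of the 3 most recent transactions: appending a new
# transaction automatically drops the oldest one once the window is full.
window = deque(maxlen=3)
for tx in stream:
    window.append(tx)

# The window now holds the last three transactions.
print(support({"A", "B"}, window))          # 1
print(itemset_utility({"A", "B"}, window))  # 1*3 + 3*10 = 33
print(itemset_utility({"B"}, window))       # 2*10 + 3*10 = 50
```

In this toy window, {A, B} occurs only once, yet its utility is high because of the quantity and profit of B. Each call rescans the whole window, so the cost grows with the window size; an incremental approach would instead update stored results as transactions enter and leave.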

Author information

Corresponding author

Correspondence to Show-Jane Yen.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Yen, SJ., Lee, YS. (2017). An Efficient Approach for Mining High Utility Itemsets Over Data Streams. In: Pedrycz, W., Chen, SM. (eds) Data Science and Big Data: An Environment of Computational Intelligence. Studies in Big Data, vol 24. Springer, Cham. https://doi.org/10.1007/978-3-319-53474-9_7

  • DOI: https://doi.org/10.1007/978-3-319-53474-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-53473-2

  • Online ISBN: 978-3-319-53474-9

  • eBook Packages: Engineering, Engineering (R0)
