Mining High-Utility Itemsets with Both Positive and Negative Unit Profits from Uncertain Databases

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10234)

Abstract

Some important limitations of frequent itemset mining are that it assumes that each item appears at most once in each transaction and that all items have the same importance (weight, cost, risk, unit profit or value). These assumptions often do not hold in real-world applications. For example, consider a database of customer transactions that records the purchase quantity of each item in each transaction and the positive or negative unit profit of each item. Moreover, uncertainty is commonly embedded in data collected from real-life applications. To address these issues, we propose an efficient algorithm named HUPNU (mining High-Utility itemsets with both Positive and Negative unit profits from Uncertain databases), with which high-quality patterns can be discovered effectively for decision-making. Based on the designed vertical PU\(^{\pm }\)-list (Probability-Utility list with Positive-and-Negative profits) structure and several pruning strategies, HUPNU can directly discover the potential high-utility itemsets without generating candidates.
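To make the problem setting concrete, the following is a minimal brute-force sketch in Python. It is not HUPNU: it enumerates every itemset and uses neither the PU\(^{\pm }\)-list nor any pruning strategy. It assumes a tuple-level uncertainty model in which each transaction carries an existence probability, and it treats the two thresholds (`min_util`, `min_prob`) as absolute values. The toy data, the function names and both thresholds are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical toy database: (purchase quantity per item, existence probability).
TRANSACTIONS = [
    ({"a": 2, "b": 1}, 0.9),
    ({"a": 1, "c": 3}, 0.6),
    ({"b": 2, "c": 1}, 0.8),
    ({"a": 1, "b": 1, "c": 2}, 0.5),
]

# Hypothetical unit profits; item "b" is sold at a loss.
UNIT_PROFIT = {"a": 5, "b": -2, "c": 1}


def utility_and_probability(itemset, transactions, unit_profit):
    """Sum the utility and the existence probability of `itemset`
    over every transaction that contains all of its items."""
    total_utility = 0
    total_prob = 0.0
    for quantities, prob in transactions:
        if all(item in quantities for item in itemset):
            total_utility += sum(quantities[i] * unit_profit[i] for i in itemset)
            total_prob += prob
    return total_utility, total_prob


def brute_force_mine(transactions, unit_profit, min_util, min_prob):
    """Return every itemset whose utility and summed probability both
    reach the (assumed absolute) thresholds. Exponential in the item count."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            util, prob = utility_and_probability(itemset, transactions, unit_profit)
            if util >= min_util and prob >= min_prob:
                result[itemset] = (util, prob)
    return result


if __name__ == "__main__":
    found = brute_force_mine(TRANSACTIONS, UNIT_PROFIT, min_util=10, min_prob=1.0)
    for itemset, (util, prob) in found.items():
        print(itemset, util, round(prob, 2))
```

In this toy data the itemset ("a", "b") has a utility of 11, lower than the utility of 20 for ("a",) alone, because "b" carries a negative unit profit. Utility therefore does not behave monotonically as an itemset grows, which is one reason dedicated pruning strategies are needed instead of the simple support-based pruning used in frequent itemset mining.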

Acknowledgement

This research was partially supported by the National Natural Science Foundation of China (NSFC) under grant No. 61503092 and by the Tencent Project under grant CCF-Tencent IAGR20160115.

Author information

Corresponding author

Correspondence to Jerry Chun-Wei Lin.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Gan, W., Lin, J.CW., Fournier-Viger, P., Chao, HC., Tseng, V.S. (2017). Mining High-Utility Itemsets with Both Positive and Negative Unit Profits from Uncertain Databases. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10234. Springer, Cham. https://doi.org/10.1007/978-3-319-57454-7_34

  • DOI: https://doi.org/10.1007/978-3-319-57454-7_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57453-0

  • Online ISBN: 978-3-319-57454-7

  • eBook Packages: Computer Science (R0)
