Discovering High-Utility Itemsets at Multiple Abstraction Levels

  • Luca Cagliero
  • Silvia Chiusano
  • Paolo Garza
  • Giuseppe Ricupero
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 767)

Abstract

High-Utility Itemset Mining (HUIM) is a relevant data mining task. The goal is to discover recurrent combinations of items characterized by high profit from transactional datasets. HUIM has a wide range of applications, including market basket analysis and service profiling. Based on the observation that items can be clustered into domain-specific categories, a parallel research issue is generalized itemset mining, which entails discovering correlations among data items at multiple abstraction levels. The extraction of multiple-level patterns affords new insights into the analyzed data from different viewpoints. This paper aims at discovering a novel pattern that combines the expressiveness of generalized and high-utility itemsets. According to a user-defined taxonomy, items are first aggregated into semantically related categories. Then, a new type of pattern, namely the Generalized High-Utility Itemset (GHUI), is extracted. It represents a combination of items at different granularity levels characterized by high profit (utility). While profitable combinations of item categories provide interesting high-level information, GHUIs at lower abstraction levels represent more specific correlations among profitable items. A single-phase algorithm is proposed to efficiently discover high-utility itemsets at multiple abstraction levels. The experiments, which were performed on both real and synthetic data, demonstrate the effectiveness and usefulness of the proposed approach.
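
To make the notion of a generalized high-utility itemset concrete, the following is a minimal brute-force sketch in Python; it is not the single-phase algorithm proposed in the paper. The toy transactions, unit profits, two-level taxonomy, and all function names are hypothetical. The sketch also assumes that the utility of a category in a transaction is the sum of the utilities of its descendant items, which the abstract does not state explicitly.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"apple": 2, "milk": 1},
    {"pear": 1, "milk": 2},
    {"apple": 1, "cheese": 1},
]
# Hypothetical unit profits and a two-level taxonomy (item -> category).
unit_profit = {"apple": 1, "pear": 2, "milk": 3, "cheese": 5}
taxonomy = {"apple": "fruit", "pear": "fruit", "milk": "dairy", "cheese": "dairy"}


def covered_utility(element, transaction):
    """Utility contributed in one transaction by an item or by a category.

    Assumption: a category collects the utility of all its descendant items
    in the transaction. Returns None when the element does not occur.
    """
    total, present = 0, False
    for item, qty in transaction.items():
        if item == element or taxonomy.get(item) == element:
            total += qty * unit_profit[item]
            present = True
    return total if present else None


def utility(itemset):
    """Sum the itemset utility over the transactions containing every element."""
    total = 0
    for t in transactions:
        parts = [covered_utility(e, t) for e in itemset]
        if all(p is not None for p in parts):
            total += sum(parts)
    return total


def mine_ghuis(min_utility, max_size=2):
    """Exhaustively enumerate itemsets that mix items and categories."""
    elements = sorted(set(taxonomy) | set(taxonomy.values()))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(elements, size):
            # Skip itemsets that contain both an item and its own category.
            if any(taxonomy.get(e) in candidate for e in candidate):
                continue
            u = utility(candidate)
            if u >= min_utility:
                result[candidate] = u
    return result


if __name__ == "__main__":
    for itemset, u in mine_ghuis(min_utility=12).items():
        print(itemset, u)
```

On this toy data, the category-level itemset (dairy, fruit) reaches the threshold although no combination of individual items does, which illustrates why mining at multiple abstraction levels can surface profitable patterns that item-level mining misses. Exhaustive enumeration is exponential in the number of elements, so it is only suitable for illustration.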

Keywords

High-Utility Itemset Mining, Generalized itemset mining, Data mining, Knowledge discovery

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Luca Cagliero (1), corresponding author
  • Silvia Chiusano (1)
  • Paolo Garza (1)
  • Giuseppe Ricupero (1)
  1. Dipartimento di Automatica e Informatica, Politecnico di Torino, Turin, Italy