On Maximal Frequent Itemsets Mining with Constraints

  • Said Jabbour
  • Fatima Ezzahra Mana
  • Imen Ouled Dlala
  • Badran Raddaoui
  • Lakhdar Sais
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11008)

Abstract

Recently, a new declarative mining framework based on constraint programming (CP) and propositional satisfiability (SAT) has been designed to deal with several pattern mining tasks. The itemset mining problem has been modeled using constraints whose models correspond to the patterns to be mined. In this paper, we propose a new propositional satisfiability based approach for mining maximal frequent itemsets that extends the one proposed in [20]. We show that instead of adding constraints to the initial SAT-based itemset mining encoding, the maximal itemsets can be obtained by performing clause learning during search. A major strength of our approach lies in the compactness of the proposed encoding and the efficiency of the SAT-based maximal itemsets enumeration derived using blocked clauses. Experimental results on several datasets show the feasibility and the efficiency of our approach.
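
The blocked-clause idea mentioned in the abstract can be illustrated with a small sketch. This is not the encoding or solver used in the paper: a brute-force search stands in for the SAT solver, and the names `support`, `maximal_frequent_itemsets`, and `blocking_clauses` are hypothetical. The sketch assumes that, once a maximal frequent itemset M is found, a clause requiring at least one item outside M is added, so that no later candidate can be a subset of M.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Enumerate maximal frequent itemsets using blocking clauses.

    Assumption: a brute-force search replaces the SAT solver call.
    Each blocking clause is a set of items of which at least one
    must appear in any further candidate.
    """
    items = sorted(set().union(*transactions))
    blocking_clauses = []
    maximal = []

    def find_model():
        # Stand-in for a solver call: return any frequent itemset
        # that satisfies every blocking clause, or None if none exists.
        for size in range(len(items) + 1):
            for combo in combinations(items, size):
                cand = frozenset(combo)
                if support(cand, transactions) < min_support:
                    continue
                if all(cand & clause for clause in blocking_clauses):
                    return cand
        return None

    while True:
        model = find_model()
        if model is None:
            break
        # Greedily extend the model; since support is anti-monotone,
        # one pass over the items yields a maximal frequent itemset.
        current = set(model)
        for item in items:
            if item not in current and \
                    support(current | {item}, transactions) >= min_support:
                current.add(item)
        maximal.append(frozenset(current))
        # Block every subset of the itemset just found.
        blocking_clauses.append(frozenset(items) - current)
    return maximal


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"d"}]
    for m in maximal_frequent_itemsets(db, min_support=2):
        print(sorted(m))
```

In this toy database with a minimum support of 2, the blocking clauses accumulate until no frequent itemset can satisfy all of them, at which point the enumeration stops. A real implementation would delegate `find_model` to a SAT solver working on a propositional encoding of the transactions.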

References

  1. Abío, I., Nieuwenhuis, R., Oliveras, A., Rodríguez-Carbonell, E., Mayer-Eichberger, V.: A new look at bdds for pseudo-boolean constraints. J. Artif. Intell. Res. (JAIR) 45, 443–480 (2012)
  2. Agarwal, R.C., Aggarwal, C.C., Prasad, V.V.V.: Depth first generation of long patterns. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2000, pp. 108–118 (2000)
  3. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, SIGMOD 1993, pp. 207–216. ACM, New York (1993)
  4. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings of 20th International Conference on Very Large Data Bases VLDB 1994, pp. 487–499 (1994)
  5. Borgelt, C.: Frequent item set mining. Wiley Interdisc. Rew.: Data Min. Knowl. Disc. 2(6), 437–456 (2012)
  6. Burdick, D., Calimlim, M., Gehrke, J.: Mafia: a maximal frequent itemset algorithm for transactional databases. In: ICDE, pp. 443–452 (2001)
  7. Coquery, E., Jabbour, S., Saïs, L., Salhi, Y.: A sat-based approach for discovering frequent, closed and maximal patterns in a sequence. In: Proceedings of the 20th European Conference on Artificial Intelligence (ECAI 2012), pp. 258–263 (2012)
  8. Davis, M., Logemann, G., Loveland, D.: A machine program for theorem proving. Commun. ACM 5, 394–397 (1962)
  9. Dlala, I.O., Jabbour, S., Raddaoui, B., Sais, L., Yaghlane, B.B.: A sat-based approach for enumerating interesting patterns from uncertain data. In: Proceedings of 28th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2016, San Jose, CA, USA, pp. 255–262, 6–8 November 2016
  10. Dlala, I.O., Jabbour, S., Sais, L., Yaghlane, B.B.: A comparative study of SAT-based itemsets mining. In: Bramer, M., Petridis, M. (eds.) Research and Development in Intelligent Systems XXXIII, pp. 37–52. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-47175-4_3
  11. Eén, N., Sörensson, N.: Translating pseudo-boolean constraints into SAT. JSAT 2(1–4), 1–26 (2006)
  12. Gebser, M., Guyet, T., Quiniou, R., Romero, J., Schaub, T.: Knowledge-based sequence mining with ASP. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9–15 July 2016
  13. Gouda, K., Zaki, M.J.: GenMax: an efficient algorithm for mining maximal frequent itemsets. Data Min. Knowl. Discov. 11(3), 223–242 (2005)
  14. Guns, T., Nijssen, S., Raedt, L.D.: Itemset mining: a constraint programming perspective. Artif. Intell. 175(12–13), 1951–1983 (2011)
  15. Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. SIGMOD Rec. 29, 1–12 (2000)
  16. Henriques, R., Lynce, I., Manquinho, V.M.: On when and how to use sat to mine frequent itemsets. CoRR, abs/1207.6253 (2012)
  17. Heule, M., Järvisalo, M., Biere, A.: Revisiting hyper binary resolution. In: International Conference on Integration of AI and OR Techniques in Constraint Programming, pp. 77–93 (2013)
  18. Jabbour, S., Sais, L., Salhi, Y.: Boolean satisfiability for sequence mining. In: Proceedings of 22nd ACM International Conference on Information and Knowledge Management (CIKM 2013), pp. 649–658. ACM (2013)
  19. Jabbour, S., Sais, L., Salhi, Y.: A pigeon-hole based encoding of cardinality constraints. TPLP, 13(4-5-Online-Supplement) (2013)
  20. Jabbour, S., Sais, L., Salhi, Y.: The top-k frequent closed itemset mining using top-k sat problem. In: European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), pp. 403–418 (2013)
  21. Jabbour, S., Sais, L., Salhi, Y.: Mining top-k motifs with a sat-based framework. Artif. Intell. 244, 30–47 (2017)
  22. Bayardo Jr., R.J.: Efficiently mining long patterns from databases. In: Proceedings ACM SIGMOD International Conference on Management of Data SIGMOD 1998, Seattle, Washington, USA, pp. 85–93, 2–4 June 1998
  23. Lin, D.-I., Kedem, Z.M.: Pincer-search: a new algorithm for discovering the maximum frequent set. In: Schek, H.-J., Alonso, G., Saltor, F., Ramos, I. (eds.) EDBT 1998. LNCS, vol. 1377, pp. 103–119. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0100980
  24. Nijssen, S., Guns, T.: Integrating constraint programming and itemset mining. In: Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2010, Proceedings, Part II, Barcelona, Spain, pp. 467–482, 20–24 September 2010
  25. Pei, J., Han, J., Lu, H., Nishio, S., Tang, S., Yang, D.: H-mine: hyper-structure mining of frequent patterns in large databases. In: Proceedings IEEE International Conference on Data Mining ICDM 2001, pp. 441–448 (2001)
  26. Pei, J., Han, J., Mao, R.: CLOSET: an efficient algorithm for mining frequent closed itemsets. In: 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 21–30 (2000)
  27. Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: ACM SIGKDD, pp. 204–212 (2008)
  28. Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, pp. 204–212, 24–27 August 2008
  29. Tiwari, A., Gupta, R., Agrawal, D.: A survey on frequent pattern mining: current status and challenging issues. Inform. Technol. J 9, 1278–1293 (2010)
  30. Uno, T., Kiyomi, M., Arimura, H.: LCM ver. 2: efficient mining algorithms for frequent/closed/maximal itemsets. In: Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations FIMI 2004, Brighton, UK, 1 November 2004
  31. Warners, J.P.: A linear-time transformation of linear inequalities into conjunctive normal form. Inf. Process. Lett. 68(2), 63–69 (1998)
  32. Zaki, M.J., Hsiao, C.: CHARM: an efficient algorithm for closed itemset mining. In: Proceedings of the Second SIAM International Conference on Data Mining, pp. 457–473 (2002)
  33. Zou, Q., Chu, W.W., Lu, B.: Smartminer: a depth first algorithm guided by tail information for mining maximal frequent itemsets. In: Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), Maebashi City, Japan, pp. 570–577, 9–12 December 2002

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Said Jabbour
    • 1
  • Fatima Ezzahra Mana
    • 1
    • 3
  • Imen Ouled Dlala
    • 1
    • 4
  • Badran Raddaoui
    • 2
  • Lakhdar Sais
    • 1
  1. CRIL-CNRS, Université d’Artois, Lens Cedex, France
  2. SAMOVAR, Télécom SudParis, CNRS, Univ. Paris-Saclay, Evry, France
  3. INPT, Institut National des Postes et Telecommunications, Rabat, Morocco
  4. LARODEC, University of Tunis, Tunis, Tunisia