Abstract
Discovering significant itemsets is one of the fundamental tasks in data mining. It has recently been shown that constraint programming is a flexible way to tackle data mining tasks. With a constraint programming approach, we can easily express and efficiently answer queries with user constraints on itemsets. However, in many practical cases, queries also involve user constraints on the dataset itself. For instance, in a dataset of purchases, the user may want to know which itemset is frequent and the day on which it is frequent. This paper presents a general constraint programming model able to handle any kind of query on the dataset for itemset mining.
Notes
- 1. A constraint c is monotone if any superset of an itemset P satisfying c also satisfies c.
- 2. A CP expert may object that disjunctions of predicates are not the most efficient way to express constraints. This operational concern can be addressed by capturing \((1) \vee (2)\vee (3)\) into a single global constraint, or by simply adding redundant constraints \(V_p=1 \rightarrow V_r=1\) for every pair (p, r) of transactions in the same city, and \((V_p=1 \wedge V_q=1)\rightarrow V_r=1\) for every triplet (p, q, r) of transactions in the same region (resp. department) such that p and q are not in the same department (resp. city).
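The monotonicity definition in Note 1 can be illustrated with a small brute-force check. The sketch below is not taken from the paper: the helper names (`powerset`, `is_monotone`), the item universe `ITEMS`, and the two example constraints are assumptions introduced only for illustration, and the check enumerates all subsets, so it is practical only for very small item sets.

```python
from itertools import combinations


def powerset(items):
    """Yield every subset of `items` as a frozenset."""
    ordered = sorted(items)
    for size in range(len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)


def is_monotone(constraint, items):
    """Brute-force check of the definition in Note 1 (hypothetical helper).

    Returns True if, for every itemset P over `items` that satisfies
    `constraint`, every superset of P over `items` satisfies it as well.
    """
    subsets = list(powerset(items))
    for p in subsets:
        if not constraint(p):
            continue
        for q in subsets:
            # q is a superset of p that violates the constraint
            if p <= q and not constraint(q):
                return False
    return True


# Illustrative item universe (an assumption, not data from the paper).
ITEMS = {"a", "b", "c", "d"}


def min_size_2(p):
    """A minimum-size constraint: adding items can never violate it."""
    return len(p) >= 2


def max_size_2(p):
    """A maximum-size constraint: adding items can violate it."""
    return len(p) <= 2


if __name__ == "__main__":
    print(is_monotone(min_size_2, ITEMS))
    print(is_monotone(max_size_2, ITEMS))
```

Under these assumptions, the minimum-size constraint passes the check and the maximum-size constraint fails it, since the empty itemset satisfies the latter while some of its supersets do not.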
Acknowledgment
Christian Bessiere was partially supported by the ANR project DEMOGRAPH (ANR-16-CE40-0028). Nadjib Lazaar is supported by the project I3A TRACT (CNRS INSMI INS2I - AMIES - 2018). Mehdi Maamar is supported by the project CPER Data from the region “Hauts-de-France”. We thank Yahia Lebbah for the discussions we shared during this work.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Bessiere, C., Lazaar, N., Maamar, M. (2018). User’s Constraints in Itemset Mining. In: Hooker, J. (ed.) Principles and Practice of Constraint Programming. CP 2018. Lecture Notes in Computer Science, vol 11008. Springer, Cham. https://doi.org/10.1007/978-3-319-98334-9_35
DOI: https://doi.org/10.1007/978-3-319-98334-9_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98333-2
Online ISBN: 978-3-319-98334-9
eBook Packages: Computer Science, Computer Science (R0)