Abstract
Many pattern mining tasks and algorithms have been proposed over the years. Traditionally, these studies focused on computational efficiency and scalability to very large databases. Little research, however, has been done on a general framework that encompasses several of these problems. In earlier work we showed how constraint programming (CP) can offer such a general framework; however, we also found that out-of-the-box CP solvers lack the efficiency and scalability of specialized itemset mining systems, which could discourage their use. Here we study whether a framework can be built that inherits both the generality of CP systems and the efficiency of specialized algorithms. We propose a CP-based framework for pattern mining that avoids the redundant representations and propagations found in existing CP systems. We show experimentally that an implementation of this framework performs comparably to specialized itemset mining systems; furthermore, under certain conditions it lists itemsets with polynomial delay, which demonstrates that it is also a promising approach for analyzing pattern mining tasks from a more theoretical perspective. We illustrate this on a graph mining problem.
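To make the setting concrete, the following is a minimal sketch of frequent itemset mining viewed as a constraint problem: an itemset is a solution if its cover (the transactions containing all of its items) has at least `min_support` elements. It is not the framework proposed in the paper; the function name `frequent_itemsets`, its parameters, and the toy data are assumptions made purely for illustration. The sketch uses a plain depth-first search that shrinks the cover as items are added and prunes any extension that violates the frequency constraint, which is sound because adding items can only shrink the cover.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple


def frequent_itemsets(
    transactions: Sequence[Set[str]], min_support: int
) -> List[Tuple[FrozenSet[str], int]]:
    """Return every itemset whose cover has at least min_support transactions.

    Hypothetical illustration only: a depth-first search with
    frequency-based pruning, not the solver described in the paper.
    The empty itemset is included when the database itself is large enough.
    """
    items = sorted({item for t in transactions for item in t})
    results: List[Tuple[FrozenSet[str], int]] = []

    def search(itemset: List[str], cover: List[int], candidates: List[str]) -> None:
        # 'cover' holds the indices of transactions containing all of 'itemset'.
        results.append((frozenset(itemset), len(cover)))

        # Keep only the extensions that still satisfy the frequency constraint.
        viable = []
        for item in candidates:
            new_cover = [t for t in cover if item in transactions[t]]
            if len(new_cover) >= min_support:
                viable.append((item, new_cover))

        # Branch on each viable item; later items form the remaining candidates,
        # so every itemset is generated exactly once.
        for idx, (item, new_cover) in enumerate(viable):
            rest = [other for other, _ in viable[idx + 1:]]
            search(itemset + [item], new_cover, rest)

    all_tids = list(range(len(transactions)))
    if len(all_tids) >= min_support:
        search([], all_tids, items)
    return results


if __name__ == "__main__":
    toy_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    for itemset, support in frequent_itemsets(toy_db, min_support=2):
        print(sorted(itemset), support)
```

In this sketch the pruning step plays the role that propagation of the frequency constraint would play in a CP solver; a general-purpose solver would typically represent items and transactions as Boolean variables instead of explicit lists.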
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nijssen, S., Guns, T. (2010). Integrating Constraint Programming and Itemset Mining. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6322. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15883-4_30
DOI: https://doi.org/10.1007/978-3-642-15883-4_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15882-7
Online ISBN: 978-3-642-15883-4
eBook Packages: Computer Science (R0)