Abstract
The number of applications generating sequential data is exploding. This work studies the discovery of frequent patterns in a large sequence of events, possibly time-stamped. This problem is known as Frequent Episode Mining (FEM). Like the mining problems recently tackled by Constraint Programming (CP), FEM would benefit from the modularity that CP offers for easily accommodating additional constraints on the patterns. These advantages, however, do not guarantee efficiency. We therefore introduce two global constraints for solving FEM problems with or without time consideration. The time-stamped version can accommodate gap and span constraints on the matched sequences. Our experiments on real data sets of different levels of complexity show that the introduced constraints are competitive with state-of-the-art methods in terms of execution time and memory consumption, while offering the flexibility of adding constraints on the patterns.
J. O. R. Aoga—This author is supported by the FRIA-FNRS, Belgium.
Notes
1. https://bitbucket.org/projetsJOHN/episodesupport (also available in [31]).
2. Results provided in [14] are directly used since the implementation is not available.
References
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I., et al.: Fast discovery of association rules. Adv. Knowl. Discov. Data Min. 12(1), 307–328 (1996)
Aoga, J.O.R., Guns, T., Schaus, P.: An efficient algorithm for mining frequent sequence with constraint programming. In: Frasconi, P., Landwehr, N., Manco, G., Vreeken, J. (eds.) ECML PKDD 2016. LNCS (LNAI), vol. 9852, pp. 315–330. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46227-1_20
Aoga, J.O.R., Guns, T., Schaus, P.: Mining time-constrained sequential patterns with constraint programming. Constraints 22(4), 548–570 (2017)
Calders, T., Dexters, N., Goethals, B.: Mining frequent itemsets in a stream. In: 2007 Seventh IEEE International Conference on Data Mining, ICDM 2007, pp. 83–92. IEEE (2007)
UniProt Consortium: The universal protein resource (UniProt). Nucleic Acids Res. 36(Suppl. 1), D190–D195 (2008)
Cule, B., Goethals, B., Robardet, C.: A new constraint for mining sets in sequences. In: Proceedings of the 2009 SIAM International Conference on Data Mining, pp. 317–328. SIAM (2009)
Das, G., Lin, K.I., Mannila, H., Renganathan, G., Smyth, P.: Rule discovery from time series. In: KDD, vol. 98, pp. 16–22 (1998)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002). https://doi.org/10.1007/s101070100263
Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., Hutter, F.: Efficient and robust automated machine learning. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems 28, pp. 2962–2970. Curran Associates, Inc. (2015). http://papers.nips.cc/paper/5872-efficient-and-robust-automated-machine-learning.pdf
Ghahramani, Z.: Automating machine learning. In: Lecture Notes in Computer Science, vol. 9852 (2016)
Guns, T., Dries, A., Tack, G., Nijssen, S., De Raedt, L.: MiningZinc: a modeling language for constraint-based mining. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, pp. 1365–1372. AAAI Press (2013)
Guns, T., Nijssen, S., De Raedt, L.: Itemset mining: a constraint programming perspective. Artif. Intell. 175(12–13), 1951–1983 (2011)
Han, J., Pei, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., Hsu, M.: Prefixspan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceedings of the 17th International Conference on Data Engineering, pp. 215–224 (2001)
Huang, K.Y., Chang, C.H.: Efficient mining of frequent episodes from complex sequences. Inf. Syst. 33(1), 96–114 (2008)
Iwanuma, K., Takano, Y., Nabeshima, H.: On anti-monotone frequency measures for extracting sequential patterns from a single very-long data sequence. In: 2004 IEEE Conference on Cybernetics and Intelligent Systems, vol. 1, pp. 213–217. IEEE (2004)
Kemmar, A., Loudni, S., Lebbah, Y., Boizumault, P., Charnois, T.: A global constraint for mining sequential patterns with GAP constraint. In: Quimper, C.-G. (ed.) CPAIOR 2016. LNCS, vol. 9676, pp. 198–215. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-33954-2_15
Kotthoff, L., Nanni, M., Guidotti, R., O’Sullivan, B.: Find your way back: mobility profile mining with constraints. In: Pesant, G. (ed.) CP 2015. LNCS, vol. 9255, pp. 638–653. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23219-5_44
Kotthoff, L., Thornton, C., Hoos, H.H., Hutter, F., Leyton-Brown, K.: Auto-WEKA 2.0: automatic model selection and hyperparameter optimization in WEKA. J. Mach. Learn. Res. 17, 1–5 (2017)
Laxman, S., Sastry, P., Unnikrishnan, K.: A fast algorithm for finding frequent episodes in event streams. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 410–419. ACM (2007)
Lichman, M.: UCI machine learning repository (2013). https://archive.ics.uci.edu/ml/datasets/UNIX+User+Data
Mannila, H., Toivonen, H.: Discovering generalized episodes using minimal occurrences. In: KDD, vol. 96, pp. 146–151 (1996)
Mannila, H., Toivonen, H., Verkamo, A.I.: Discovering frequent episodes in sequences extended abstract. In: 1st Conference on Knowledge Discovery and Data Mining (1995)
Negrevergne, B., Guns, T.: Constraint-based sequence mining using constraint programming. In: Michel, L. (ed.) CPAIOR 2015. LNCS, vol. 9075, pp. 288–305. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-18008-3_20
Nijssen, S., Guns, T.: Integrating constraint programming and itemset mining. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010. LNCS (LNAI), vol. 6322, pp. 467–482. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15883-4_30
Pesant, G.: A regular language membership constraint for finite sequences of variables. In: Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 482–495. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30201-8_36
Rawassizadeh, R., Momeni, E., Dobbins, C., Mirza-Babaei, P., Rahnamoun, R.: Lesson learned from collecting quantified self information via mobile and wearable devices. J. Sens. Actuator Netw. 4(4), 315–335 (2015)
Rawassizadeh, R., Tomitsch, M., Wac, K., Tjoa, A.M.: UbiqLog: a generic mobile phone-based life-log framework. Pers. Ubiquit. Comput. 17(4), 621–637 (2013)
Schaus, P., Aoga, J.O.R., Guns, T.: CoverSize: a global constraint for frequency-based itemset mining. In: Beck, J.C. (ed.) CP 2017. LNCS, vol. 10416, pp. 529–546. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66158-2_34
Shokoohi-Yekta, M., Chen, Y., Campana, B., Hu, B., Zakaria, J., Keogh, E.: Discovery of meaningful rules in time series. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1085–1094. ACM (2015)
Tatti, N., Cule, B.: Mining closed strict episodes. In: 2010 IEEE 10th International Conference on Data Mining (ICDM), pp. 501–510. IEEE (2010)
Team, O.: OscaR: Scala in OR (2012)
Yang, Q., Wu, X.: 10 challenging problems in data mining research. Int. J. Inf. Technol. Decis. Making 5(04), 597–604 (2006)
Yang, Z., Wang, Y., Kitsuregawa, M.: LAPIN: effective sequential pattern mining algorithms by last position induction for dense databases. In: Kotagiri, R., Krishna, P.R., Mohania, M., Nantajeewarawat, E. (eds.) DASFAA 2007. LNCS, vol. 4443, pp. 1020–1023. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-71703-4_95
Zhou, W., Liu, H., Cheng, H.: Mining closed episodes from event sequences efficiently. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds.) PAKDD 2010. LNCS (LNAI), vol. 6118, pp. 310–318. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13657-3_34
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Cappart, Q., Aoga, J.O.R., Schaus, P. (2018). EpisodeSupport: A Global Constraint for Mining Frequent Patterns in a Long Sequence of Events. In: van Hoeve, WJ. (eds) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2018. Lecture Notes in Computer Science, vol 10848. Springer, Cham. https://doi.org/10.1007/978-3-319-93031-2_7
DOI: https://doi.org/10.1007/978-3-319-93031-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93030-5
Online ISBN: 978-3-319-93031-2
eBook Packages: Computer Science, Computer Science (R0)