Abstract
Sequential pattern mining aims to extract correlations among temporal data. Many methods have been proposed to enumerate either sequences of set-valued data (i.e., itemsets) or sequences of multidimensional items. In real-world scenarios, however, data sequences are often described by a combination of both multidimensional items and itemsets, and traditional approaches cannot handle such heterogeneous descriptions. In this paper we propose a new approach called MMISP (Mining Multidimensional Itemset Sequential Patterns) to extract patterns from complex sequential databases that include both multidimensional items and itemsets. The novelties of the proposal lie in: (i) the way in which the data are efficiently compressed; (ii) the ability to reuse and adapt existing sequential pattern mining algorithms; and (iii) the extraction of new kinds of patterns. We present a case study on real-world data from a regional healthcare system and point out the usefulness of the extracted patterns. Additional experiments on synthetic data highlight the efficiency and scalability of the MMISP approach.
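To make the notion of a heterogeneous sequence concrete, the following minimal sketch models each event as a pair made of one multidimensional item and one itemset, then counts the support of a pattern by plain subsequence matching. It is not the MMISP algorithm, which is not detailed here; the event layout, the names `event_contains`, `is_subsequence` and `support`, and the sample hospital, diagnosis and procedure values are all assumptions chosen for illustration.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical event layout: (multidimensional item, itemset).
# The multidimensional item is a tuple of dimension values; the itemset is a set of codes.
Event = Tuple[Tuple[str, ...], FrozenSet[str]]
Sequence = List[Event]


def event_contains(event: Event, pattern_event: Event) -> bool:
    """True if the dimension values are equal and the pattern itemset is a subset."""
    dims, items = event
    p_dims, p_items = pattern_event
    return dims == p_dims and p_items <= items


def is_subsequence(pattern: Sequence, sequence: Sequence) -> bool:
    """Greedy left-to-right matching that preserves the order of events."""
    pos = 0
    for event in sequence:
        if pos < len(pattern) and event_contains(event, pattern[pos]):
            pos += 1
    return pos == len(pattern)


def support(pattern: Sequence, database: List[Sequence]) -> int:
    """Number of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in database)


# Made-up patient trajectories: ((hospital, diagnosis), {procedures}).
database: List[Sequence] = [
    [(("H1", "cancer"), frozenset({"p1", "p2"})), (("H2", "cancer"), frozenset({"p1"}))],
    [(("H1", "cancer"), frozenset({"p1"})), (("H2", "cancer"), frozenset({"p1", "p3"}))],
    [(("H2", "cancer"), frozenset({"p1"})), (("H1", "cancer"), frozenset({"p1"}))],
]

pattern: Sequence = [
    (("H1", "cancer"), frozenset({"p1"})),
    (("H2", "cancer"), frozenset({"p1"})),
]

print(support(pattern, database))  # 2: the third trajectory has the events in reverse order
```

A traditional itemset miner would see only the procedure sets, and a multidimensional miner only the (hospital, diagnosis) tuples; matching both components of each event at once is the kind of heterogeneous description the abstract refers to.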
Notes
Programme de Médicalisation des Systèmes d’Information.
“Classification Commune des Actes Médicaux”: the French classification of medical and surgical procedures.
Cite this article
Egho, E., Jay, N., Raïssi, C. et al. A contribution to the discovery of multidimensional patterns in healthcare trajectories. J Intell Inf Syst 42, 283–305 (2014). https://doi.org/10.1007/s10844-014-0309-4
DOI: https://doi.org/10.1007/s10844-014-0309-4