Mining Itemset-based Distinguishing Sequential Patterns with Gap Constraint

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2015)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9049))

Abstract

Mining contrast sequential patterns, which are sequential patterns that characterize a given sequence class and distinguish that class from another given sequence class, has a wide range of applications including medical informatics, computational finance and consumer behavior analysis. In previous studies on contrast sequential pattern mining, each element in a sequence is a single item or symbol. This paper considers a more general case where each element in a sequence is a set of items. The associated contrast sequential patterns are called itemset-based distinguishing sequential patterns (itemset-DSPs). After discussing the challenges of mining itemset-DSPs, we present iDSP-Miner, a mining method with various pruning techniques, for mining itemset-DSPs that satisfy given support and gap constraints. We also propose a concise border-like representation (with exclusive bounds) for sets of similar itemset-DSPs and use that representation to improve the efficiency of the proposed algorithm. Our empirical study on both real and synthetic data demonstrates that iDSP-Miner is effective and efficient.
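To make the setting concrete, the following is a minimal sketch of what checking a single candidate itemset-DSP could look like. It is not iDSP-Miner and performs no pruning. It assumes definitions that the abstract does not spell out: a pattern element matches a sequence element when it is a subset of it, the gap constraint is a maximum number of skipped elements between consecutive matches, and a pattern is distinguishing when its support is at least a threshold `alpha` in the positive class and at most `beta` in the negative class. All names (`matches`, `support`, `is_distinguishing`, `max_gap`, `alpha`, `beta`) are hypothetical.

```python
from typing import FrozenSet, Sequence

Itemset = FrozenSet[str]
ItemsetSeq = Sequence[Itemset]


def matches(pattern: ItemsetSeq, seq: ItemsetSeq, max_gap: int) -> bool:
    """Return True if `pattern` occurs in `seq` with at most `max_gap`
    skipped elements between consecutive matched elements (assumed semantics)."""

    def extend(p_idx: int, prev_pos: int) -> bool:
        if p_idx == len(pattern):
            return True
        if prev_pos < 0:
            # The first pattern element may match anywhere in the sequence.
            candidates = range(len(seq))
        else:
            # Later elements must start within max_gap skipped positions.
            stop = min(len(seq), prev_pos + max_gap + 2)
            candidates = range(prev_pos + 1, stop)
        for pos in candidates:
            # Itemset containment: every item of the pattern element is present.
            if pattern[p_idx] <= seq[pos] and extend(p_idx + 1, pos):
                return True
        return False

    return extend(0, -1)


def support(pattern: ItemsetSeq, dataset: Sequence[ItemsetSeq], max_gap: int) -> float:
    """Fraction of sequences in `dataset` that contain `pattern`."""
    if not dataset:
        return 0.0
    hits = sum(1 for seq in dataset if matches(pattern, seq, max_gap))
    return hits / len(dataset)


def is_distinguishing(pattern, pos, neg, max_gap, alpha, beta) -> bool:
    """Assumed criterion: frequent in `pos` (>= alpha), infrequent in `neg` (<= beta)."""
    return (support(pattern, pos, max_gap) >= alpha
            and support(pattern, neg, max_gap) <= beta)


def fs(*items: str) -> Itemset:
    return frozenset(items)


# Toy data, invented purely for illustration.
POS = [[fs("a", "b"), fs("c"), fs("d", "e")], [fs("a"), fs("d"), fs("f")]]
NEG = [[fs("a"), fs("b"), fs("c"), fs("d")], [fs("b", "d"), fs("a")]]
PATTERN = [fs("a"), fs("d")]

if __name__ == "__main__":
    print(is_distinguishing(PATTERN, POS, NEG, max_gap=1, alpha=0.6, beta=0.2))
    print(is_distinguishing(PATTERN, POS, NEG, max_gap=2, alpha=0.6, beta=0.2))
```

On this toy data, loosening `max_gap` from 1 to 2 lets the pattern match one negative sequence as well, so it stops being distinguishing under the assumed thresholds. A brute-force check like this has to be repeated for every candidate pattern, which is the cost that pruning techniques and a concise representation are meant to reduce.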

This work was supported in part by NSFC 61103042, SKLSE2012-09-32, and China Postdoctoral Science Foundation 2014M552371. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

Author information

Corresponding author

Correspondence to Lei Duan.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Yang, H., Duan, L., Dong, G., Nummenmaa, J., Tang, C., Li, X. (2015). Mining Itemset-based Distinguishing Sequential Patterns with Gap Constraint. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9049. Springer, Cham. https://doi.org/10.1007/978-3-319-18120-2_3

  • DOI: https://doi.org/10.1007/978-3-319-18120-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18119-6

  • Online ISBN: 978-3-319-18120-2

  • eBook Packages: Computer Science, Computer Science (R0)
