Mining Weighted Sequential Patterns in a Sequence Database with Itemset-Interval Measurement

  • Yu Fu
  • Yanhua Yu
  • Meina Song
  • Xiaosu Zhan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8351)

Abstract

Weighted sequential pattern mining algorithms discover weighted sequences by taking into account the different significance of each item in a sequence database. However, current algorithms do not consider the itemset-interval information between two items in the same itemset. Hence, although a large number of sequences are discovered, most of them are not useful for analysis. In this study, we propose a new algorithm, called ItemSet-interval Weighted Sequences (ISiWS), to efficiently discover useful sequences. In ISiWS, a matrix structure called the Transaction Bit Matrix (TBM) represents a sequence. ISiWS first uses TBMs to represent the sequences in a sequence database. It then uses a projection-based technique to discover weighted sequences, and applies an approximate sequence match algorithm to calculate the support of sequences based on their itemset-intervals. Experiments show that ISiWS produces significantly fewer weighted sequences than WSpan.
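The abstract does not give the layout of a TBM or the matching rule, so the following is only an illustrative sketch under stated assumptions, not the authors' ISiWS. It assumes a bit matrix with one row per item and one column per itemset position, replaces the approximate match with a plain maximum-gap constraint between consecutive matched itemsets, and assumes a WSpan-style weighted support (average item weight times relative support). The names `build_tbm`, `contains_with_gap`, `weighted_support`, and `max_gap` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set

Itemset = Set[str]


def build_tbm(sequence: Sequence[Itemset], items: Sequence[str]) -> List[List[int]]:
    """Assumed TBM layout: rows are items, columns are itemset positions.

    A cell is 1 when the item occurs in the itemset at that position.
    """
    return [[1 if item in itemset else 0 for itemset in sequence] for item in items]


def contains_with_gap(sequence: Sequence[Itemset],
                      pattern: Sequence[Itemset],
                      max_gap: int) -> bool:
    """True if the pattern's itemsets occur in order in the sequence and
    consecutive matches are at most max_gap positions apart.

    This gap constraint is a stand-in for the paper's approximate match.
    """
    def search(start: int, k: int, prev: Optional[int]) -> bool:
        if k == len(pattern):
            return True
        # The first itemset may match anywhere; later ones must stay within the gap.
        end = len(sequence) if prev is None else min(len(sequence), prev + max_gap + 1)
        for pos in range(start, end):
            if pattern[k] <= sequence[pos] and search(pos + 1, k + 1, pos):
                return True
        return False

    return search(0, 0, None)


def weighted_support(db: Sequence[Sequence[Itemset]],
                     pattern: Sequence[Itemset],
                     weights: Dict[str, float],
                     max_gap: int) -> float:
    """Assumed measure: average item weight times the fraction of matching sequences."""
    matches = sum(contains_with_gap(seq, pattern, max_gap) for seq in db)
    pattern_items = [item for itemset in pattern for item in itemset]
    avg_weight = sum(weights[item] for item in pattern_items) / len(pattern_items)
    return avg_weight * matches / len(db)


# Toy data, invented for illustration only.
db = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"d"}, {"d"}, {"b"}],
    [{"a", "b"}, {"c"}],
]
weights = {"a": 0.8, "b": 0.4, "c": 0.5, "d": 0.1}
pattern = [{"a"}, {"b"}]

tbm = build_tbm(db[1], ["a", "b", "d"])
tight = weighted_support(db, pattern, weights, max_gap=1)
loose = weighted_support(db, pattern, weights, max_gap=3)
```

On this toy database, tightening `max_gap` from 3 to 1 drops the second sequence from the match count, which illustrates how an interval-aware support could prune patterns that a purely order-based support would keep.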

Keywords

weighted sequential patterns mining; approximate sequence match algorithm; transaction bit matrix; itemset-interval

References

  1. Yun, U., Leggett, J.J.: WSpan: Weighted Sequential pattern mining in large sequence databases. In: International IEEE Conference on Intelligent Systems, IS, pp. 512–517 (2006)
  2. Ji, X., Bailey, J., Dong, G.: Mining Minimal Distinguishing Subsequence Patterns with Gap Constraints. In: Proceedings of the Fifth IEEE International Conference on Data Mining, pp. 194–201 (2005)
  3. Chang, J.H.: Mining weighted sequential patterns in a sequence database with a time-interval weight. Knowledge-Based Systems 24(1), 1–9 (2011)
  4. Ji, X., Bailey, J.: An Efficient technique for mining approximately frequent substring patterns. In: IEEE International Conference on Data Mining, pp. 325–330 (2007)
  5. Kum, H.-C., Pei, J., Wang, W., Duncan, D.: ApproxMap: Approximate Mining of Consensus Sequential Patterns. In: SIAM International Conference on Data Mining (2003)
  6. Yin, J., Zheng, Z., Cao, L.: USpan: an efficient algorithm for mining high utility sequential patterns. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 660–668 (2012)
  7. Agrawal, R., Srikant, R.: Mining Sequential Patterns. In: Proceeding of 7th International Conference on Data Engineering, pp. 3–14 (1995)
  8. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H.: PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth. In: Proceedings of the 17th International Conference on Data Engineering, pp. 215–224 (2001)
  9. Zaki, M.J.: SPADE: An Efficient Algorithm for Mining Frequent Sequences. Machine Learning 42(1/2), 31–60 (2001)
  10. Ayres, J., Flannick, J., Gehrke, J., Yiu, T.: Sequential Pattern mining using a bitmap representation. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 429–435 (2002)
  11. Yun, U.: A new framework for detecting weighted sequential patterns in large sequence databases. Knowledge-Based Systems 21(2), 110–122 (2008)
  12. Lo, S.: Binary prediction based on weighted sequential mining method. In: Proceeding of the 2005 International Conference on Web Intelligence, pp. 755–761 (2005)
  13. Yan, X., Han, J., Afshar, R.: CloSpan: Mining Closed Sequential Patterns in Large Databases. In: ACM SIAM International Conference on Data Mining, pp. 166–177 (2003)
  14. Fradkin, D., Mörchen, F.: Margin-Closed Frequent Sequential Pattern Mining. In: Proceedings of the ACM SIGKDD Workshop on Useful Patterns, pp. 45–54 (2010)
  15. Wang, J., Han, J.: BIDE: Efficient Mining of Frequent Closed Sequences. In: IEEE International Conference on Data Engineering, pp. 79–90 (2003)
  16. Chen, E., Cao, H., Li, Q., Qian, T.: Efficient strategies for tough aggregate constraint-based sequential pattern mining. Information Sciences 178(6), 1498–1518 (2008)
  17. Pei, J., Han, J., Wang, W.: Mining sequential patterns with constraints in large databases. In: Proceedings of the 2002 ACM International Conference on Information and Knowledge Management, pp. 18–25 (2002)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Yu Fu (1)
  • Yanhua Yu (1)
  • Meina Song (1)
  • Xiaosu Zhan (2)
  1. PCN&CAD Center, Beijing University of Posts and Telecommunications, Beijing, China
  2. Institute of Military Operation Research and Analysis, Academy of Military Science, Beijing, China
