Mining Maximal Sequential Patterns without Candidate Maintenance
Sequential pattern mining is an important data mining task with wide applications. However, it may return too many sequential patterns, which increases the execution time and memory requirements of the mining task and makes the results difficult for users to interpret. The problem becomes worse when dealing with dense or long sequences. As a solution, several studies have addressed mining maximal sequential patterns. However, previous algorithms are not memory efficient, since they must maintain a large number of intermediate candidates in main memory during the mining process. To address these problems, we present MaxSP (Maximal Sequential Pattern miner), a time- and memory-efficient algorithm that computes all maximal sequential patterns without storing intermediate candidates in main memory. Experimental results on real datasets show that MaxSP is an efficient solution for mining maximal sequential patterns.
Keywords: sequences, sequential pattern mining, compact representation, maximal sequential patterns
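To make the notion of a maximal sequential pattern concrete, the following is a minimal brute-force sketch in Python. It is not the MaxSP algorithm. It assumes, for simplicity, that every sequence is a list of single items rather than itemsets, and the names `is_subsequence`, `frequent_patterns`, `maximal_patterns` and `minsup` are hypothetical, chosen only for this illustration. A pattern is treated as frequent if it is a subsequence of at least `minsup` sequences, and as maximal if no other frequent pattern strictly contains it.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in pattern)


def frequent_patterns(database, minsup):
    """Brute force: enumerate every subsequence of every sequence,
    then keep those supported by at least minsup sequences."""
    candidates = set()
    for seq in database:
        for k in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), k):
                candidates.add(tuple(seq[i] for i in idx))
    return {
        p for p in candidates
        if sum(is_subsequence(p, s) for s in database) >= minsup
    }


def maximal_patterns(database, minsup):
    """Keep only frequent patterns not strictly contained in another frequent pattern."""
    freq = frequent_patterns(database, minsup)
    return {
        p for p in freq
        if not any(p != q and is_subsequence(p, q) for q in freq)
    }


if __name__ == "__main__":
    # Toy database (an assumption for illustration, not data from the paper).
    db = [("a", "b", "c"), ("a", "c"), ("a", "b", "c", "d"), ("b", "d")]
    print(sorted(maximal_patterns(db, minsup=2)))
    # prints [('a', 'b', 'c'), ('b', 'd')]
```

In this toy database nine patterns are frequent at `minsup=2`, but only two are maximal, which illustrates why the maximal set is a more compact result. The sketch holds every candidate in memory at once, which is the kind of cost the abstract attributes to earlier approaches and that MaxSP is designed to avoid.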