Mining Frequent Sequential Patterns under a Similarity Constraint
Many practical applications, ranging from Web usage mining to bioinformatics, rely on frequent sequential pattern mining. To keep the extraction cost acceptable for useful mining tasks, a key issue is to push user-defined constraints deep inside the mining algorithms. In this paper, we study the search for frequent sequential patterns that are also similar to a user-defined reference pattern. While the effective processing of frequency constraints is well understood, our contribution is the identification of a relaxation of the similarity constraint into a convertible anti-monotone constraint. Both constraints are then used to prune the search space during a levelwise search. Preliminary experimental validation has confirmed the algorithm's efficiency.
Keywords: Sequential Pattern, Mining Algorithm, Similarity Threshold, Editing Operation, Similarity Constraint
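To make the idea of combining both constraints in a levelwise search concrete, here is a minimal Python sketch. It is not the algorithm of the paper, whose relaxation is not described in the abstract. It assumes sequences of single items, support counted as subsequence containment, and similarity measured by an edit distance with unit-cost insertions, deletions, and substitutions. For pruning it swaps in a standard prefix bound: the minimum of the last row of the edit-distance table never decreases when a pattern is extended, so a prefix whose row minimum exceeds the threshold cannot lead to a similar pattern. The names `mine`, `min_support`, and `max_distance` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the items of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> int:
    """Number of database sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in database)


def next_row(prev_row: List[int], item: str, reference: Sequence[str]) -> List[int]:
    """Extend the edit-distance table by one row for a pattern grown by `item`."""
    row = [prev_row[0] + 1]
    for j, ref_item in enumerate(reference, start=1):
        cost = 0 if item == ref_item else 1
        row.append(min(prev_row[j] + 1,          # drop `item` from the pattern
                       row[j - 1] + 1,           # insert `ref_item`
                       prev_row[j - 1] + cost))  # match or substitute
    return row


def mine(database: List[Sequence[str]], reference: Sequence[str],
         min_support: int, max_distance: int) -> List[Pattern]:
    """Levelwise search for frequent patterns within `max_distance` of `reference`."""
    items = sorted({x for s in database for x in s})
    # The empty pattern and its table row (distance to each reference prefix).
    level = [((), list(range(len(reference) + 1)))]
    results: List[Pattern] = []
    while level:
        next_level = []
        for pattern, row in level:
            for item in items:
                candidate = pattern + (item,)
                new_row = next_row(row, item, reference)
                # Similarity pruning (assumed bound, see lead-in): no extension
                # of `candidate` can be closer than the minimum of this row.
                if min(new_row) > max_distance:
                    continue
                # Frequency pruning: support is anti-monotone under extension.
                if support(candidate, database) < min_support:
                    continue
                next_level.append((candidate, new_row))
                # The last cell is the exact distance to the full reference.
                if new_row[-1] <= max_distance:
                    results.append(candidate)
        level = next_level
    return results
```

For example, `mine(["abcd", "abd", "acd", "bcd"], "abd", min_support=2, max_distance=1)` explores the candidate `bc` but discards it on the similarity bound alone, even though it is frequent, which is the kind of saving that pushing the constraint into the search is meant to provide.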