Abstract
Sequential pattern mining has been broadly studied, and many algorithms have been proposed. The first part of this chapter proposes a new algorithm for mining frequent sequences. This algorithm requires only one scan of the database thanks to an indexed structure associated with a bit map representation, which allows fast data access and compact storage in main memory. Experiments have been conducted using real and synthetic datasets, and the results show the efficiency of our method compared to existing algorithms. Beyond mining plain sequences, taking into account the multidimensional information associated with sequential data is of great interest for many applications. In the second part, we propose a characterization-based method for mining multidimensional sequential patterns. This method first groups sequences by similarity, then characterizes each cluster using the multidimensional properties that describe its sequences. The clusters are built around the frequent sequential patterns. Thus, the whole process results in rules characterizing sequential patterns using multidimensional information. This method has been applied to a survey of population daily activity and mobility in order to analyze the profile of the population having typical activity sequences. The extracted rules show the effectiveness of our method.
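To make the notion of a frequent sequence concrete, the following is a minimal sketch of support counting over a sequence database. It is not the chapter's indexed bit map algorithm: the function names, the toy activity data, and the naive subsequence test are all assumptions made for illustration. The only idea borrowed from the abstract is that the database is read once, here to build an in-memory index of distinct sequences and their counts.

```python
from collections import Counter


def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # 'item in it' advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def index_database(database):
    """Single pass over the data: map each distinct sequence to its count."""
    return Counter(tuple(seq) for seq in database)


def support(pattern, index):
    """Number of database sequences containing pattern as a subsequence."""
    return sum(
        count for seq, count in index.items() if is_subsequence(pattern, seq)
    )


def is_frequent(pattern, index, min_support):
    """A pattern is frequent if its support reaches the chosen threshold."""
    return support(pattern, index) >= min_support


# Hypothetical toy database of daily activity sequences (illustration only).
database = [
    ["home", "work", "shop", "home"],
    ["home", "work", "home"],
    ["home", "shop", "home"],
    ["home", "work", "shop", "home"],
]
index = index_database(database)

print(support(("home", "work", "home"), index))
print(is_frequent(("work", "shop"), index, min_support=3))
```

After the index is built, every support query runs against the compact in-memory structure rather than the original database, which is the general motivation for one-scan approaches.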
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Zeitouni, K. (2009). From Sequence Mining to Multidimensional Sequence Mining. In: Zighed, D.A., Tsumoto, S., Ras, Z.W., Hacid, H. (eds) Mining Complex Data. Studies in Computational Intelligence, vol 165. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88067-7_8
DOI: https://doi.org/10.1007/978-3-540-88067-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88066-0
Online ISBN: 978-3-540-88067-7
eBook Packages: Engineering (R0)