TKS: Efficient Mining of Top-K Sequential Patterns

  • Philippe Fournier-Viger
  • Antonio Gomariz
  • Ted Gueniche
  • Espérance Mwamikazi
  • Rincy Thomas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8346)

Abstract

Sequential pattern mining is a well-studied data mining task with wide applications. However, fine-tuning the minsup (minimum support) parameter of sequential pattern mining algorithms to generate enough patterns is difficult and time-consuming. To address this issue, the task of top-k sequential pattern mining has been defined, where k is the number of sequential patterns to be found and is set by the user. In this paper, we present an efficient algorithm for this problem named TKS (Top-K Sequential pattern mining). TKS utilizes a vertical bitmap database representation, a novel data structure named PMAP (Precedence Map), and several efficient strategies to prune the search space. An extensive experimental study on real datasets shows that TKS outperforms TSP, the current state-of-the-art algorithm for top-k sequential pattern mining, by more than an order of magnitude in execution time and memory.
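
To make the task definition concrete, the sketch below shows one simple way to mine the k most frequent sequential patterns without a user-supplied minsup. It is an illustration only and not the TKS algorithm: it omits bitmaps, PMAP, and the paper's pruning strategies, and it assumes each sequence is a plain list of single items. The function name `top_k_sequential_patterns` and the toy database are hypothetical. It keeps a vertical index (item to positions per sequence), starts with an internal threshold of 1, and raises that threshold whenever k patterns have been collected.

```python
import heapq
from bisect import bisect_right
from collections import defaultdict


def top_k_sequential_patterns(database, k):
    """Return up to k (pattern, support) pairs with the highest support.

    Simplifying assumption: every sequence is a list of single items,
    and support is the number of sequences containing the pattern as a
    subsequence. Ties at the k-th position are broken arbitrarily.
    """
    if k < 1:
        return []

    # Vertical index: item -> {sequence id -> sorted positions of the item}.
    index = defaultdict(lambda: defaultdict(list))
    for sid, sequence in enumerate(database):
        for pos, item in enumerate(sequence):
            index[item][sid].append(pos)
    items = sorted(index)

    heap = []    # min-heap of (support, pattern), never more than k entries
    minsup = 1   # internal threshold, raised once k patterns are known

    def save(pattern, support):
        nonlocal minsup
        heapq.heappush(heap, (support, pattern))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the weakest pattern
        if len(heap) == k:
            # A new pattern must now beat the weakest one that is kept.
            minsup = heap[0][0] + 1

    def extend(pattern, ends):
        # ends: sequence id -> earliest position where `pattern` ends.
        for item in items:
            new_ends = {}
            for sid, end in ends.items():
                positions = index[item].get(sid)
                if positions:
                    i = bisect_right(positions, end)
                    if i < len(positions):
                        new_ends[sid] = positions[i]
            support = len(new_ends)
            # Support cannot grow when a pattern is extended, so a
            # pattern below the threshold can be pruned with its subtree.
            if support >= minsup:
                longer = pattern + (item,)
                save(longer, support)
                extend(longer, new_ends)

    # Start from the empty pattern, which "ends" before position 0.
    extend((), {sid: -1 for sid in range(len(database))})
    ranked = sorted(heap, key=lambda entry: (-entry[0], entry[1]))
    return [(pattern, support) for support, pattern in ranked]


toy_database = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
result = top_k_sequential_patterns(toy_database, 3)
```

In this sketch the user chooses only k; the threshold is discovered during the search, which is the point of the top-k formulation. The pruning is safe because the support of a pattern can never exceed the support of any of its prefixes.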

Keywords

top-k, sequential pattern, sequence database, pattern mining

References

  1. Agrawal, R., Srikant, R.: Mining Sequential Patterns. In: Proc. Int. Conf. on Data Engineering, pp. 3–14 (1995)
  2. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., Hsu, M.: Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach. IEEE Trans. Knowledge and Data Engineering 16(10), 1–17 (2001)
  3. Ayres, J., Flannick, J., Gehrke, J., Yiu, T.: Sequential PAttern mining using a bitmap representation. In: Proc. 8th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD 2002), Edmonton, Alberta, July 23-26, pp. 429–435 (2002)
  4. Tzvetkov, P., Yan, X., Han, J.: TSP: Mining Top-k Closed Sequential Patterns. Knowledge and Information Systems 7(4), 438–457 (2005)
  5. Mabroukeh, N.R., Ezeife, C.I.: A taxonomy of sequential pattern mining algorithms. ACM Computing Surveys 43(1), 1–41 (2010)
  6. Fournier-Viger, P., Wu, C.-W., Tseng, V.S.: Mining Top-K Association Rules. In: Kosseim, L., Inkpen, D. (eds.) Canadian AI 2012. LNCS, vol. 7310, pp. 61–73. Springer, Heidelberg (2012)
  7. Kun Ta, C., Huang, J.-L., Chen, M.-S.: Mining Top-k Frequent Patterns in the Presence of the Memory Constraint. VLDB Journal 17(5), 1321–1344 (2008)
  8. Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 2nd edn. Morgan Kaufmann Publ., San Francisco (2006)
  9. Cormen, T.H., Leiserson, C.E., Rivest, R., Stein, C.: Introduction to Algorithms, 3rd edn. MIT Press, Cambridge (2009)
  10. Fournier-Viger, P., Tseng, V.S.: Mining Top-K Sequential Rules. In: Tang, J., King, I., Chen, L., Wang, J. (eds.) ADMA 2011, Part II. LNCS, vol. 7121, pp. 180–194. Springer, Heidelberg (2011)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Philippe Fournier-Viger (1)
  • Antonio Gomariz (2)
  • Ted Gueniche (1)
  • Espérance Mwamikazi (1)
  • Rincy Thomas (3)
  1. Department of Computer Science, University of Moncton, Canada
  2. Dept. of Information and Communication Engineering, University of Murcia, Spain
  3. Department of Computer Science, SCT, Bhopal, India