The Augmented Itemset Tree: A Data Structure for Online Maximum Frequent Pattern Mining

  • Conference paper
Discovery Science (DS 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6926)

Abstract

This paper introduces an approach for incremental maximal frequent pattern (MFP) mining in sparse binary data, where instances are observed one by one. For this purpose, we propose the Augmented Itemset Tree (AIST), a data structure that incorporates features of the FP-tree into the itemset tree. In the given setting, we assume that just the data structure is maintained in main memory, and each instance is observed only once. The AIST not only stores observed frequent patterns, but also allows for quick frequency updates of relevant subpatterns. In order to quickly identify the current set of exact MFPs, potential candidates are extracted from former MFPs and patterns that occur in the new instance. The presented approach is evaluated concerning the runtime and memory requirements depending on the number of instances, minimum support and different settings of pattern properties. The obtained results suggest that AISTs are useful for mining maximal frequent itemsets in an online setting whenever larger patterns can be expected.
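
To make the setting concrete, the following is a minimal sketch of online maximal frequent itemset maintenance. It is a naive baseline, not the AIST described in the abstract: it simply counts every subset of each incoming instance, which is only feasible when instances are small and sparse. The class name `NaiveOnlineMFP`, its methods, and the use of an absolute count as minimum support are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


class NaiveOnlineMFP:
    """Naive baseline (hypothetical, not the paper's AIST).

    Each instance is seen once; only the subset counts are kept in memory.
    """

    def __init__(self, min_support):
        # Assumption: min_support is an absolute count, not a fraction.
        self.min_support = min_support
        self.counts = Counter()  # frozenset(pattern) -> frequency

    def observe(self, instance):
        """Update the frequency of every non-empty subpattern of the instance."""
        items = sorted(set(instance))
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                self.counts[frozenset(subset)] += 1

    def maximal_frequent(self):
        """Return frequent patterns that have no frequent proper superset."""
        frequent = [p for p, c in self.counts.items() if c >= self.min_support]
        return {p for p in frequent if not any(p < q for q in frequent)}


miner = NaiveOnlineMFP(min_support=2)
for instance in [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}]:
    miner.observe(instance)

# {a, b} occurs three times; c and d occur twice each, but {c, d} only once,
# so the maximal frequent patterns are {a, b}, {c} and {d}.
current = miner.maximal_frequent()
```

The cost of `observe` grows exponentially with the instance size, which is the kind of overhead a dedicated structure for larger patterns would need to avoid.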

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schmidt, J., Kramer, S. (2011). The Augmented Itemset Tree: A Data Structure for Online Maximum Frequent Pattern Mining. In: Elomaa, T., Hollmén, J., Mannila, H. (eds) Discovery Science. DS 2011. Lecture Notes in Computer Science, vol 6926. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24477-3_23

  • DOI: https://doi.org/10.1007/978-3-642-24477-3_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24476-6

  • Online ISBN: 978-3-642-24477-3

  • eBook Packages: Computer Science, Computer Science (R0)
