Mining Patterns from Large Star Schemas Based on Streaming Algorithms

  • Chapter
Computer and Information Science 2012

Part of the book series: Studies in Computational Intelligence (SCI, volume 429)

Abstract

A growing challenge in data mining is the ability to deal with complex, voluminous and dynamic data. In many real-world applications, complex data is not only organized in multiple database tables, but also arrives continuously and endlessly in the form of streams. Although there are some algorithms for mining multiple relations, and many more for mining data streams, very few combine the multi-relational case with the data stream case. In this paper we describe a new algorithm, Star FP-Stream, for finding frequent patterns in multi-relational data streams that follow a star schema. Experiments on the AdventureWorks data warehouse show that Star FP-Stream is accurate and performs better than the equivalent algorithm, FP-Streaming, for mining patterns in a single data stream.

Author information

Corresponding author

Correspondence to Andreia Silva.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Silva, A., Antunes, C. (2012). Mining Patterns from Large Star Schemas Based on Streaming Algorithms. In: Lee, R. (ed.) Computer and Information Science 2012. Studies in Computational Intelligence, vol 429. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30454-5_10

  • DOI: https://doi.org/10.1007/978-3-642-30454-5_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30453-8

  • Online ISBN: 978-3-642-30454-5

  • eBook Packages: Engineering, Engineering (R0)
