Mining Patterns from Large Star Schemas Based on Streaming Algorithms

Silva, Andreia; Antunes, Cláudia

doi:10.1007/978-3-642-30454-5_10

Part of the book series: Studies in Computational Intelligence (SCI, volume 429)

Abstract

A growing challenge in data mining is dealing with complex, voluminous and dynamic data. In many real-world applications, complex data is not only organized in multiple database tables but also arrives continuously and endlessly in the form of streams. Although there are some algorithms for mining multiple relations, and many more for mining data streams, very few combine the multi-relational case with the data stream case. In this paper we describe a new algorithm, Star FP-Stream, for finding frequent patterns in multi-relational data streams that follow a star schema. Experiments on the AdventureWorks data warehouse show that Star FP-Stream is accurate and performs better than the equivalent algorithm, FP-Streaming, for mining patterns in a single data stream.
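
To make the setting concrete, the following is a minimal sketch of frequent pattern counting over one batch of a star-schema stream. It is not Star FP-Stream: it enumerates itemsets exhaustively instead of building an FP-tree, and it handles a single batch rather than maintaining counts over time. The dimension tables, keys, item names and the `denormalize` and `frequent_itemsets` helpers are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical dimension tables: primary key -> set of "attribute=value" items.
DIMENSIONS = {
    "customer": {1: {"country=PT"}, 2: {"country=US"}},
    "product": {10: {"category=bike"}, 11: {"category=helmet"}},
}


def denormalize(fact):
    """Turn one fact row (dimension name -> foreign key) into an itemset."""
    items = set()
    for dim, key in fact.items():
        items |= DIMENSIONS[dim][key]
    return frozenset(items)


def frequent_itemsets(batch, min_support):
    """Count every itemset in a batch of fact rows and keep those whose
    relative support is at least min_support (exhaustive, not FP-tree based)."""
    counts = Counter()
    for fact in batch:
        items = sorted(denormalize(fact))
        for size in range(1, len(items) + 1):
            counts.update(combinations(items, size))
    threshold = min_support * len(batch)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


# One batch of the (hypothetical) fact stream: each row references one
# customer and one product through foreign keys.
batch = [
    {"customer": 1, "product": 10},
    {"customer": 1, "product": 10},
    {"customer": 2, "product": 11},
    {"customer": 1, "product": 11},
]
patterns = frequent_itemsets(batch, min_support=0.5)
```

With a minimum support of 0.5, `patterns` keeps the itemsets that occur in at least two of the four fact rows, including the cross-dimension pattern `("category=bike", "country=PT")`. A streaming algorithm would additionally have to bound memory and combine counts across batches, which this sketch does not attempt.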

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Instituto Superior Técnico / Technical University of Lisbon, Lisbon, Portugal
Andreia Silva & Cláudia Antunes

Corresponding author

Correspondence to Andreia Silva.

Editor information

Editors and Affiliations

Software Engineering & Information Technology Institute, Central Michigan University, Mt. Pleasant, 48859, Michigan, USA
Roger Lee

About this chapter

Cite this chapter

Silva, A., Antunes, C. (2012). Mining Patterns from Large Star Schemas Based on Streaming Algorithms. In: Lee, R. (eds) Computer and Information Science 2012. Studies in Computational Intelligence, vol 429. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30454-5_10

DOI: https://doi.org/10.1007/978-3-642-30454-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30453-8
Online ISBN: 978-3-642-30454-5
eBook Packages: Engineering, Engineering (R0)
