Abstract
Some data mining tasks produce so much output that managing the results becomes a knowledge management problem in its own right. Frequent itemset mining is one such task. Several approaches have been proposed to handle or avoid this problem, but each has limitations. In particular, most of them need the original data during the analysis phase, which is not feasible for data streams. The DWFIST (Data Warehouse of Frequent ItemSets Tactics) approach aims to provide a powerful environment for the analysis of itemsets and derived patterns, such as association rules, without accessing the original data during the analysis phase. The approach is based on a Data Warehouse of Frequent Itemsets, which provides frequent itemsets in a flexible and efficient way and offers a standardized logical view upon which analytical tools can be developed. This paper presents how such a data warehouse can be built.
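As a rough illustration of why stored frequent itemsets can be enough for the analysis phase, the sketch below derives association rules from itemset support counts alone, using the standard definition confidence(X → Y) = support(X ∪ Y) / support(X). It is not the DWFIST schema or implementation: the dictionary layout, the function name `derive_rules`, and the sample counts are all assumptions made for this example.

```python
from itertools import combinations


def derive_rules(itemset_counts, min_confidence):
    """Derive association rules from stored itemset support counts only.

    itemset_counts is a hypothetical stand-in for the warehouse contents:
    a dict mapping frozenset(items) -> absolute support count.
    """
    rules = []
    for itemset, count in itemset_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for antecedent_items in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent_items)
                antecedent_count = itemset_counts.get(antecedent)
                if not antecedent_count:
                    continue  # antecedent support was not stored
                confidence = count / antecedent_count
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Made-up sample counts, for illustration only.
STORED = {
    frozenset({"bread"}): 60,
    frozenset({"milk"}): 50,
    frozenset({"bread", "milk"}): 40,
}

RULES = derive_rules(STORED, min_confidence=0.75)
for antecedent, consequent, confidence in RULES:
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

No transaction data is read at any point; every quantity comes from the stored counts, which is the property the abstract relies on for data streams.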
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Monteiro, R.S., Zimbrão, G., Schwarz, H., Mitschang, B., de Souza, J.M. (2005). Building the Data Warehouse of Frequent Itemsets in the DWFIST Approach. In: Hacid, MS., Murray, N.V., Raś, Z.W., Tsumoto, S. (eds) Foundations of Intelligent Systems. ISMIS 2005. Lecture Notes in Computer Science, vol 3488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11425274_31
DOI: https://doi.org/10.1007/11425274_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25878-0
Online ISBN: 978-3-540-31949-8
eBook Packages: Computer Science, Computer Science (R0)