Abstract
View materialization is recognized as one of the most effective ways to improve Data Warehouse performance; nevertheless, because the techniques for choosing the best set of views to materialize are computationally complex, this task is mainly carried out manually when large workloads are involved. In this paper we propose a set of statistical indicators that the designer can use to characterize the Data Warehouse workload, thus driving the logical and physical optimization tasks; furthermore, we propose a clustering algorithm that reduces the cardinality of the workload and uses these indicators to measure the quality of the reduced workload. Using the reduced workload as the input to a view materialization algorithm allows large workloads to be handled efficiently.
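The abstract does not spell out the clustering algorithm, so the following is only an illustrative sketch of the general idea of reducing workload cardinality, not the method of the paper. It assumes each query is represented by the set of its group-by attributes, uses Jaccard distance as the similarity measure, and applies a simple greedy "leader" clustering; the function names, the threshold, and the sample workload are all hypothetical.

```python
def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two attribute sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_workload(queries, threshold):
    """Greedy single-pass clustering (an assumption, not the paper's algorithm).

    Each query joins the first cluster whose representative lies within
    `threshold`; otherwise it starts a new cluster and becomes its representative.
    """
    clusters = []  # list of (representative, members) pairs
    for q in queries:
        for rep, members in clusters:
            if jaccard_distance(q, rep) <= threshold:
                members.append(q)
                break
        else:
            clusters.append((q, [q]))
    return clusters


def reduced_workload(clusters):
    """Keep one representative per cluster, weighted by the number of member queries."""
    return [(rep, len(members)) for rep, members in clusters]


# Hypothetical workload: each query is the set of attributes it groups by.
workload = [
    frozenset({"month", "store"}),
    frozenset({"month", "store", "product"}),
    frozenset({"year"}),
    frozenset({"year", "region"}),
    frozenset({"month", "store"}),
]

reduced = reduced_workload(cluster_workload(workload, threshold=0.5))
for representative, weight in reduced:
    print(sorted(representative), weight)
```

In this sketch the five queries collapse into two weighted representatives, which could then be passed to a view materialization algorithm in place of the full workload. A quality measure for the reduced workload, such as the statistical indicators the abstract mentions, would be needed to judge whether the threshold is too coarse.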
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Golfarelli, M. (2003). Handling Large Workloads by Profiling and Clustering. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2003. Lecture Notes in Computer Science, vol 2737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45228-7_22
DOI: https://doi.org/10.1007/978-3-540-45228-7_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40807-9
Online ISBN: 978-3-540-45228-7
eBook Packages: Springer Book Archive