Handling Large Workloads by Profiling and Clustering

  • Matteo Golfarelli
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2737)


View materialization is recognized as one of the most effective ways to improve Data Warehouse performance; nevertheless, because the techniques for choosing the best set of views to materialize are computationally complex, this task is mainly carried out manually when large workloads are involved. In this paper we propose a set of statistical indicators that the designer can use to characterize the workload of the Data Warehouse, thus driving the logical and physical optimization tasks; furthermore, we propose a clustering algorithm that reduces the cardinality of the workload and uses these indicators to measure the quality of the reduced workload. Using the reduced workload as the input to a view materialization algorithm allows large workloads to be handled efficiently.
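
To make the idea of reducing workload cardinality concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes that each query is described only by its set of group-by attributes, that similarity is measured with the Jaccard index, and that a cluster is represented by the union of its members' group-by sets (ignoring dimension hierarchies). The names `jaccard`, `reduce_workload`, the threshold of 0.5, and the sample attributes are all hypothetical.

```python
from typing import FrozenSet, List

# Assumption: a query is modeled only by its group-by attributes.
Query = FrozenSet[str]


def jaccard(a: Query, b: Query) -> float:
    """Jaccard similarity of two attribute sets; 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def reduce_workload(queries: List[Query], threshold: float = 0.5) -> List[Query]:
    """Greedy single-pass clustering (illustrative, not the paper's method).

    Each query joins the first cluster whose representative is similar
    enough. The representative is then widened to the union of the group-by
    sets, so a view on it can answer every member by further aggregation.
    """
    representatives: List[Query] = []
    for q in queries:
        for i, rep in enumerate(representatives):
            if jaccard(q, rep) >= threshold:
                representatives[i] = rep | q
                break
        else:
            # No sufficiently similar cluster: the query starts a new one.
            representatives.append(q)
    return representatives


workload = [
    frozenset({"month", "product"}),
    frozenset({"month", "product", "store"}),
    frozenset({"year", "customer"}),
    frozenset({"month", "store"}),
]
reduced = reduce_workload(workload)
print(len(workload), "->", len(reduced))  # prints "4 -> 2"
```

The reduced list, rather than the full workload, would then be passed to a view materialization algorithm. How well the reduced workload preserves the characteristics of the original is exactly what the statistical indicators mentioned in the abstract are meant to measure; this sketch does not compute them.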





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Matteo Golfarelli, DEIS, University of Bologna, Bologna, Italy
