Handling Large Workloads by Profiling and Clustering

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2737)

Abstract

View materialization is recognized as one of the most effective ways to improve Data Warehouse performance; nevertheless, because of the computational complexity of the techniques for choosing the best set of views to materialize, this task is mainly carried out manually when large workloads are involved. In this paper we propose a set of statistical indicators that the designer can use to characterize the workload of the Data Warehouse, thus driving the logical and physical optimization tasks; furthermore, we propose a clustering algorithm that reduces the cardinality of the workload and uses these indicators to measure the quality of the reduced workload. Using the reduced workload as the input to a view materialization algorithm allows large workloads to be handled efficiently.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Golfarelli, M. (2003). Handling Large Workloads by Profiling and Clustering. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2003. Lecture Notes in Computer Science, vol 2737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45228-7_22

  • DOI: https://doi.org/10.1007/978-3-540-45228-7_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40807-9

  • Online ISBN: 978-3-540-45228-7

  • eBook Packages: Springer Book Archive
