
A Randomized Approach for the Incremental Design of an Evolving Data Warehouse

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2224))

Abstract

A Data Warehouse (DW) can be used to integrate data from multiple distributed data sources. A DW can be seen as a set of materialized views that determine its schema and its content in terms of the schema and the content of the data sources. DW applications require high query performance. For this reason, the design of a typical DW consists of selecting views to materialize that are able to answer a set of input user queries. However, the cost of answering the queries has to be balanced against the cost of maintaining the materialized views. In an evolving DW application, new queries need to be answered by the DW. An incremental selection of materialized views uses the materialized views already in the DW to answer parts of the new queries, and avoids the re-implementation of the DW from scratch. This incremental design is complex and an exhaustive approach is not feasible. We have developed a randomized approach for incrementally selecting a set of views that are able to answer a set of input user queries locally while minimizing a combination of the query evaluation and view maintenance cost. In this process we exploit “common sub-expressions” among new queries and between new queries and old views. Our approach is implemented and we report on its experimental evaluation.
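
The trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: the cost model, the names `total_cost` and `randomized_selection`, the view and query labels, and all numbers are assumptions made for illustration. The sketch keeps the already materialized views fixed, flips one randomly chosen new candidate view per step, and accepts the move only when the combined query evaluation and maintenance cost drops. A query that no available view can answer gets infinite cost, which models the requirement that queries be answered locally.

```python
import math
import random

def total_cost(new_views, old_views, query_costs, maintenance, weight=1.0):
    """Query evaluation cost plus weighted maintenance cost of the new views."""
    available = old_views | new_views
    evaluation = 0.0
    for per_view in query_costs.values():
        # Cheapest available view for this query; infinite if none can answer it.
        evaluation += min(
            (cost for view, cost in per_view.items() if view in available),
            default=math.inf,
        )
    upkeep = sum(maintenance[v] for v in new_views)
    return evaluation + weight * upkeep

def randomized_selection(candidates, old_views, query_costs, maintenance,
                         steps=200, seed=0):
    """Randomized local search over subsets of the candidate views (a sketch)."""
    rng = random.Random(seed)
    current = frozenset(candidates)  # start from a state that answers every query
    best_cost = total_cost(current, old_views, query_costs, maintenance)
    for _ in range(steps):
        # Add or drop one randomly chosen candidate view.
        flipped = current ^ {rng.choice(candidates)}
        cost = total_cost(flipped, old_views, query_costs, maintenance)
        if cost < best_cost:  # accept only strict improvements
            current, best_cost = flipped, cost
    return current, best_cost

# Hypothetical toy instance: one old view, three candidate views, two new queries.
OLD = frozenset({"v_old"})
QUERY_COSTS = {
    "q1": {"v_old": 5, "v1": 1, "v3": 4},  # q1 can reuse the old view
    "q2": {"v2": 2, "v3": 6},
}
MAINTENANCE = {"v1": 10, "v2": 3, "v3": 2}

views, cost = randomized_selection(["v1", "v2", "v3"], OLD, QUERY_COSTS, MAINTENANCE)
print(sorted(views), cost)
```

In this toy instance the old view already answers `q1` acceptably, so the search has no reason to pay for a dedicated new view for that query. Accepting only improving moves can get stuck in a local minimum on larger instances; techniques such as simulated annealing relax that rule by occasionally accepting worse states.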

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Theodoratos, D., Dalamagas, T., Simitsis, A., Stavropoulos, M. (2001). A Randomized Approach for the Incremental Design of an Evolving Data Warehouse. In: S.Kunii, H., Jajodia, S., Sølvberg, A. (eds) Conceptual Modeling — ER 2001. ER 2001. Lecture Notes in Computer Science, vol 2224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45581-7_25

  • DOI: https://doi.org/10.1007/3-540-45581-7_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42866-4

  • Online ISBN: 978-3-540-45581-3

  • eBook Packages: Springer Book Archive
