A Randomized Approach for the Incremental Design of an Evolving Data Warehouse

  • Dimitri Theodoratos
  • Theodore Dalamagas
  • Alkis Simitsis
  • Manos Stavropoulos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2224)

Abstract

A Data Warehouse (DW) can be used to integrate data from multiple distributed data sources. A DW can be seen as a set of materialized views that determine its schema and its content in terms of the schema and the content of the data sources. DW applications require high query performance. For this reason, the design of a typical DW consists of selecting views to materialize that are able to answer a set of input user queries. However, the cost of answering the queries has to be balanced against the cost of maintaining the materialized views. In an evolving DW application, new queries need to be answered by the DW. An incremental selection of materialized views uses the materialized views already in the DW to answer parts of the new queries, and avoids the re-implementation of the DW from scratch. This incremental design is complex and an exhaustive approach is not feasible. We have developed a randomized approach for incrementally selecting a set of views that are able to answer a set of input user queries locally while minimizing a combination of the query evaluation and view maintenance cost. In this process we exploit “common sub-expressions” among new queries and between new queries and old views. Our approach is implemented and we report on its experimental evaluation.
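
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It is a plain randomized iterative improvement over a toy cost model, and every name and number in it (`total_cost`, `randomized_selection`, the view and query labels, the cost figures) is a hypothetical assumption made for illustration. Old views stay materialized, candidate new views are randomly added or dropped, and a move is accepted only if all queries remain answerable locally and the combined cost decreases.

```python
import random

def total_cost(selected, queries, maintenance):
    """Toy cost (assumption): maintenance of every selected view plus, for
    each query, the cheapest evaluation over a selected view. Returns None
    when some query cannot be answered locally from the selected views."""
    cost = sum(maintenance[v] for v in selected)
    for options in queries.values():
        usable = [c for v, c in options.items() if v in selected]
        if not usable:
            return None
        cost += min(usable)
    return cost

def randomized_selection(old_views, candidates, queries, maintenance,
                         steps=1000, seed=0):
    """Randomized iterative improvement: old views are always kept, and one
    randomly chosen candidate view is toggled per step."""
    rng = random.Random(seed)
    current = set(old_views) | set(candidates)  # start with everything materialized
    current_cost = total_cost(current, queries, maintenance)
    if current_cost is None:
        raise ValueError("queries cannot be answered even with all views")
    pool = sorted(candidates)
    if not pool:
        return current, current_cost
    for _ in range(steps):
        view = rng.choice(pool)
        neighbor = current ^ {view}  # add the view if absent, drop it if present
        cost = total_cost(neighbor, queries, maintenance)
        if cost is not None and cost < current_cost:
            current, current_cost = neighbor, cost
    return current, current_cost

# Hypothetical instance: one old view, three candidate views, three new queries.
old_views = {"V_old"}
candidates = {"V1", "V2", "V3"}
maintenance = {"V_old": 5, "V1": 4, "V2": 6, "V3": 3}
queries = {
    "Q1": {"V_old": 10, "V1": 2},  # Q1 can reuse the old view, at a higher cost
    "Q2": {"V1": 3, "V2": 1},
    "Q3": {"V3": 2, "V2": 4},
}
selected, best_cost = randomized_selection(old_views, candidates, queries, maintenance)
print(sorted(selected), best_cost)
```

In this instance, materializing every view costs 23, and the search can only move to cheaper feasible selections from there. A pure descent like this one may stop in a local minimum, which is one reason randomized schemes are usually combined with restarts or with occasional acceptance of uphill moves, as in simulated annealing.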

Keywords

Sink Node, Transformation Rule, Query Evaluation, Source Relation, Query Node
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Dimitri Theodoratos (1)
  • Theodore Dalamagas (1)
  • Alkis Simitsis (1)
  • Manos Stavropoulos (1)
  1. Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece