Selection of views to materialize in a data warehouse
A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of some resource, e.g., materialization time or storage space.
In this article, we develop a theoretical framework for the general problem of selection of views in a data warehouse. We present competitive polynomial-time heuristics for selection of views to optimize total query response time, for some important special cases of the general data warehouse scenario, viz.: (i) an AND view graph, where each query/view has a unique evaluation, and (ii) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We extend the algorithms to the case when there is a set of indexes associated with each view. Finally, we extend our heuristic to the most general case of AND-OR view graphs.
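To make case (ii) concrete, the following is a minimal sketch of a benefit-per-unit-space greedy heuristic on an OR view graph under a space budget. It is an illustration under stated assumptions, not necessarily the exact algorithm of the article: the function names, the toy two-dimensional data cube, the uniform query frequencies, and the linear cost model (a query costs the size of the smallest materialized view that can answer it) are all hypothetical choices made for this example.

```python
from typing import Dict, Set


def total_cost(materialized: Set[str], size: Dict[str, int],
               answers: Dict[str, Set[str]], freq: Dict[str, int]) -> int:
    """Frequency-weighted sum of query costs (assumed linear cost model).

    A query costs the size of the smallest materialized view that can answer
    it. The base view answers every query, so the min is never empty.
    """
    return sum(
        f * min(size[v] for v in materialized if q in answers[v])
        for q, f in freq.items()
    )


def greedy_select(size: Dict[str, int], answers: Dict[str, Set[str]],
                  freq: Dict[str, int], base: str, budget: int) -> Set[str]:
    """Repeatedly add the view with the largest benefit per unit of space.

    The base view is assumed to be always available and is not charged
    against the budget.
    """
    selected = {base}
    used = 0
    while True:
        current = total_cost(selected, size, answers, freq)
        best, best_ratio = None, 0.0
        for v in size:
            if v in selected or used + size[v] > budget:
                continue
            benefit = current - total_cost(selected | {v}, size, answers, freq)
            ratio = benefit / size[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:  # nothing fits, or nothing reduces the cost
            return selected
        selected.add(best)
        used += size[best]


# Hypothetical data cube over (part, supplier); sizes are made-up row counts.
size = {"ps": 100, "p": 20, "s": 50, "none": 1}
# answers[v] is the set of queries computable from view v (OR semantics).
answers = {
    "ps": {"ps", "p", "s", "none"},
    "p": {"p", "none"},
    "s": {"s", "none"},
    "none": {"none"},
}
freq = {"ps": 1, "p": 1, "s": 1, "none": 1}

chosen = greedy_select(size, answers, freq, base="ps", budget=60)
final_cost = total_cost(chosen, size, answers, freq)
```

In this toy instance, the heuristic first picks `none` (the largest benefit per unit space), then `p`, and stops because `s` no longer fits in the remaining budget; the total query cost drops from 400 to 221.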
Keywords: Greedy Algorithm, Monotonicity Property, Outgoing Edge, Greedy Heuristic, Data Cube