Abstract
Data warehousing requires effective methods for processing and storing large amounts of data. OLAP applications form an additional tier in the data warehouse architecture, and they typically require data to be pre-computed in order to respond to the user acceptably quickly. In such cases compressed representations have the potential to improve both storage and processing efficiency. This paper proposes a compressed database system that aims to provide an effective storage model. We show that compression can also be employed in several other stages of the data warehouse architecture. Novel systems engineering is adopted to ensure that compression/decompression overheads are limited, and that data reorganisations are of controlled complexity and can be carried out incrementally. The basic architecture is described, and experimental results on the TPC-D and other datasets show the performance of our system.
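To make the idea of a compressed storage model concrete, the sketch below shows dictionary encoding of a single column, one common form of compact representation. It is an illustration only and is not taken from the paper: the class `DictionaryColumn` and its methods are hypothetical names, and the abstract does not state which compression scheme the system uses. The sketch assumes that new values can be appended without rewriting existing rows, and that equality predicates can be evaluated on the codes without decompressing them.

```python
from typing import Dict, Hashable, List


class DictionaryColumn:
    """Hypothetical column store: small integer codes plus a value dictionary."""

    def __init__(self) -> None:
        self.code_of: Dict[Hashable, int] = {}  # value -> code
        self.value_of: List[Hashable] = []      # code -> value
        self.codes: List[int] = []              # one code per row

    def append(self, value: Hashable) -> None:
        # Incremental insert: a new distinct value only extends the
        # dictionary; codes of existing rows are never rewritten.
        code = self.code_of.get(value)
        if code is None:
            code = len(self.value_of)
            self.code_of[value] = code
            self.value_of.append(value)
        self.codes.append(code)

    def get(self, row: int) -> Hashable:
        # Decompress a single cell by looking its code up in the dictionary.
        return self.value_of[self.codes[row]]

    def count_equal(self, value: Hashable) -> int:
        # Equality predicate evaluated directly on the codes.
        code = self.code_of.get(value)
        if code is None:
            return 0
        return self.codes.count(code)

    def bits_per_row(self) -> int:
        # Minimum fixed code width needed for the current dictionary size.
        return max(1, (len(self.value_of) - 1).bit_length())


col = DictionaryColumn()
for v in ["FRANCE", "GERMANY", "FRANCE", "BRAZIL", "FRANCE"]:
    col.append(v)
```

With three distinct values each row needs only a 2-bit code instead of a full string, which is the kind of saving that matters for pre-computed OLAP aggregates with low-cardinality dimensions.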
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kotsis, N., McGregor, D.R. (1999). Compact Representation: An Approach to Efficient Implementation for the Data Warehouse Architecture. In: Mohania, M., Tjoa, A.M. (eds) DataWarehousing and Knowledge Discovery. DaWaK 1999. Lecture Notes in Computer Science, vol 1676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48298-9_8
DOI: https://doi.org/10.1007/3-540-48298-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66458-1
Online ISBN: 978-3-540-48298-7
eBook Packages: Springer Book Archive