Journal of Computer Science and Technology, Volume 15, Issue 3, pp 213–229

Efficient aggregation algorithms on very large compressed data warehouses

  • Li Jianzhong
  • Li Yingshu 
  • Jaideep Srivastava
Article

Abstract

Multidimensional aggregation is a dominant operation on data warehouses for on-line analytical processing (OLAP). Many efficient algorithms have been developed to compute multidimensional aggregation on data warehouses built on relational databases. However, to our knowledge, there is nothing to date in the literature about aggregation algorithms on multidimensional data warehouses, which store datasets in multidimensional arrays rather than in tables. This paper presents a set of multidimensional aggregation algorithms for very large, compressed multidimensional data warehouses. These algorithms operate directly on the compressed datasets without first decompressing them, and they are applicable to a variety of data compression methods. Their performance varies with dataset parameters, output sizes, and main memory availability. The algorithms are described and analyzed with respect to I/O and CPU costs. A decision procedure that selects the most efficient algorithm for a given aggregation request is also proposed. The analytical and experimental results show that the algorithms are more efficient than traditional aggregation algorithms.
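
To make the idea of aggregating directly on compressed data concrete, here is a minimal sketch. It is not one of the paper's algorithms, which the abstract does not specify. It assumes a multidimensional array linearized in row-major order and compressed with simple run-length encoding, with 0 standing for an empty cell. The names `rle_encode`, `unravel`, and `aggregate_sum` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import groupby


def rle_encode(flat):
    """Run-length encode a flat sequence into (value, run_length) pairs."""
    return [(v, len(list(g))) for v, g in groupby(flat)]


def unravel(offset, shape):
    """Convert a row-major linear offset into a coordinate tuple."""
    coords = []
    for size in reversed(shape):
        offset, c = divmod(offset, size)
        coords.append(c)
    return tuple(reversed(coords))


def aggregate_sum(runs, shape, group_dims):
    """Sum cell values grouped by the dimensions listed in group_dims.

    The runs are read as stored; the full array is never rebuilt.
    Assumption: the value 0 marks an empty cell.
    """
    totals = defaultdict(int)
    offset = 0
    for value, length in runs:
        if value != 0:
            # Only non-empty runs are expanded position by position.
            for pos in range(offset, offset + length):
                coords = unravel(pos, shape)
                key = tuple(coords[d] for d in group_dims)
                totals[key] += value
        # A run of empty cells is skipped by advancing the offset once.
        offset += length
    return dict(totals)


# A 2 x 3 array: row 0 = [5, 0, 0], row 1 = [0, 2, 2]
runs = rle_encode([5, 0, 0, 0, 2, 2])
by_row = aggregate_sum(runs, (2, 3), group_dims=(0,))
by_col = aggregate_sum(runs, (2, 3), group_dims=(1,))
```

In this sketch a long run of empty cells costs a single step regardless of its length, which is the kind of saving that working on the compressed form can offer. Groups that contain only empty cells do not appear in the result.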

Keywords

OLAP, aggregation, data warehouse

Copyright information

© Science Press, Beijing China and Allerton Press Inc. 2000

Authors and Affiliations

  • Li Jianzhong (1)
  • Li Yingshu (2)
  • Jaideep Srivastava (3)
  1. Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, P.R. China
  2. Beijing Institute of Technology, Beijing, P.R. China
  3. University of Minnesota, USA
