An Efficient Indexing Technique for Computing High Dimensional Data Cubes

Leng, Fangling; Bao, Yubin; Yu, Ge; Wang, Daling; Liu, Yuntao

doi:10.1007/11775300_47

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4016)

Abstract

Computing a data cube is one of the most essential yet challenging problems in data warehousing and OLAP. Partition-based algorithms are among the efficient methods for computing data cubes over datasets of high dimensionality, low cardinality, and moderate size, which arise in real applications such as bioinformatics, statistics, and text processing. To handle such high-dimensional data cubes, we propose an efficient indexing technique consisting of a compressed bitmap index and two algorithms, for cube construction and querying. Experimental results show that our method saves at least 25% of the storage space and about 30% of the computation time compared with the Frag-Cubing algorithm.
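
To make the idea of answering cube queries from a bitmap index concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not describe their compression scheme or algorithms, so the sketch uses plain uncompressed bitmaps (Python integers), and the names `build_bitmap_index` and `count_cell` as well as the sample rows are hypothetical. It only illustrates the general technique of keeping one bitmap per (dimension, value) pair and intersecting bitmaps to obtain a COUNT for a cube cell.

```python
from collections import defaultdict


def build_bitmap_index(rows):
    """Map (dimension position, value) -> int whose bit i is set when row i holds that value."""
    index = defaultdict(int)
    for row_id, row in enumerate(rows):
        for dim, value in enumerate(row):
            index[(dim, value)] |= 1 << row_id
    return dict(index)


def count_cell(index, n_rows, cell):
    """COUNT for one cube cell; `cell` maps dimension position -> required value.

    Dimensions absent from `cell` are treated as aggregated ('*').
    """
    matches = (1 << n_rows) - 1  # start with every row selected
    for dim, value in cell.items():
        # Intersect with the bitmap of rows holding this value (empty if unseen).
        matches &= index.get((dim, value), 0)
    return bin(matches).count("1")  # population count of the surviving rows


# Hypothetical toy dataset: three low-cardinality dimensions, four rows.
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b2", "c1"),
    ("a2", "b1", "c1"),
    ("a1", "b1", "c2"),
]
index = build_bitmap_index(rows)
print(count_cell(index, len(rows), {0: "a1", 1: "b1"}))
```

Low cardinality keeps the number of bitmaps small, which is one reason bitmap indexes suit the datasets the abstract targets. A compressed representation would replace the plain integers used here.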

Supported by the National Natural Science Foundation of China under Grant Nos. 60473073, 60503036, and 60573090.

References

Chaudhuri, S., Dayal, U.: An Overview of Data Warehousing and OLAP Technology. SIGMOD Record 26(1), 65–74 (1997)
Agarwal, S., Agrawal, R., Deshpande, P.M., et al.: On the computation of multidimensional aggregates. In: VLDB, Bombay, India, pp. 506–521 (1996)
Zhao, Y., Deshpande, P.M., Naughton, J.F.: An array-based algorithm for simultaneous multidimensional aggregates. In: SIGMOD, Tucson, Arizona, pp. 159–170 (1997)
Han, J., Pei, J., Dong, G., Wang, K.: Efficient computation of iceberg cubes with complex measures. In: SIGMOD, Santa Barbara, CA, USA, pp. 1–12 (2001)
Xin, D., Han, J., Li, X., Wah, B.W.: Star-cubing: Computing iceberg cubes by top-down and bottom-up integration. In: VLDB, Berlin, Germany, pp. 476–487 (2003)
Harinarayan, V., Rajaraman, A., Ullman, J.D.: Implementing data cubes efficiently. In: SIGMOD, pp. 205–216 (1996)
Wang, W., Lu, H., Feng, J., Yu, J.X.: Condensed cube: An effective approach to reducing data cube size. In: ICDE, Madison, Wisconsin, pp. 464–475 (2002)
Sismanis, Y., Roussopoulos, N., Deligiannakis, A., Kotidis, Y.: Dwarf: Shrinking the petacube. In: SIGMOD, pp. 464–475 (2002)
Lakshmanan, L.V.S., Pei, J., Han, J.: Quotient cube: How to summarize the semantics of a data cube. In: VLDB, Hong Kong, China, pp. 778–789 (2002)
Peng, Z., Li, Q., Feng, L., et al.: Using Object Deputy Model to Prepare Data for Data Warehousing. TKDE 17(9), 1274–1288 (2005)
Li, X.L., Han, J.W., Gonzalez, H.: High-Dimensional OLAP: A Minimal Cubing Approach. In: VLDB, Toronto, Canada, pp. 528–539 (2004)
Sismanis, Y., Roussopoulos, N.: The dwarf data cube eliminates the high dimensionality curse. TR-CS4552, University of Maryland (2003)
Wu, M.C., Buchmann, A.P.: Encoded bitmap indexing for data warehouses. In: ICDE, Orlando, Florida, USA, pp. 220–230 (1998)
Chan, C.Y., Ioannidis, Y.E.: Bitmap index design and evaluation. In: SIGMOD, Seattle, Washington, pp. 355–366 (1998)
KDD CUP 1999 Data (1999), http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

Author information

Authors and Affiliations

School of Information Science & Engineering, Northeastern University, Shenyang, 110004, P.R. China
Fangling Leng, Yubin Bao, Ge Yu, Daling Wang & Yuntao Liu

Editor information

Editors and Affiliations

Chinese University of Hong Kong, Hong Kong, China
Jeffrey Xu Yu
Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, 153-8505, Tokyo, Japan
Masaru Kitsuregawa
Department of Computing, Hong Kong Polytechnic University, Hong Kong
Hong Va Leong

Cite this paper

Leng, F., Bao, Y., Yu, G., Wang, D., Liu, Y. (2006). An Efficient Indexing Technique for Computing High Dimensional Data Cubes. In: Yu, J.X., Kitsuregawa, M., Leong, H.V. (eds) Advances in Web-Age Information Management. WAIM 2006. Lecture Notes in Computer Science, vol 4016. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11775300_47

DOI: https://doi.org/10.1007/11775300_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35225-9
Online ISBN: 978-3-540-35226-6
eBook Packages: Computer Science, Computer Science (R0)
