Exact and Approximate Sizes of Convex Datacubes

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5691)

Abstract

In various approaches, data cubes are pre-computed in order to answer OLAP queries efficiently. The notion of data cube has been explored in various ways: iceberg cubes, range cubes, differential cubes, and emerging cubes. We previously introduced the concept of the convex cube, which generalizes all of these variants. More precisely, the convex cube captures all the tuples satisfying a combination of monotone and/or antimonotone constraints. This paper studies the size of the convex cube. Knowing the size of such a cube before computing it has several advantages. First, storage space can be reserved for it in advance and data warehouse administration can be improved. However, the main benefit of knowing the size is to choose the constraints so that the result is workable. To help calibrate the constraints, we propose a sound characterization, based on the inclusion-exclusion principle, of the exact size of the convex cube, as well as an upper bound that can be computed very quickly. Moreover, we adapt the nearly optimal HyperLogLog algorithm to provide a very good approximation of the exact size of convex cubes. Our analytical results are confirmed by experiments: the approximate size of convex cubes is very close to their exact size and can be computed almost immediately.
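To make the approximation idea concrete, here is a minimal Python sketch of HyperLogLog-style distinct counting applied to cube tuples. It is not the adaptation described in the paper: it estimates the size of a full (unconstrained) data cube by streaming every generalization of every row through the estimator, and it ignores the monotone and antimonotone constraints altogether. The names `hll_estimate`, `generalizations`, `estimate_cube_size`, the `ALL` marker, and the choice of SHA-1 as the hash function are all assumptions made for illustration.

```python
import hashlib
import math
from itertools import product

ALL = "ALL"  # hypothetical marker for an aggregated ("any value") dimension


def hll_estimate(items, p=12):
    """Approximate the number of distinct items using m = 2**p registers."""
    if not 7 <= p <= 16:
        raise ValueError("p must be between 7 and 16 for this sketch")
    m = 1 << p
    registers = [0] * m
    for item in items:
        # 64-bit hash taken from SHA-1 (an illustrative, deterministic choice).
        digest = hashlib.sha1(repr(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - p)                  # first p bits select a register
        rest = h & ((1 << (64 - p)) - 1)     # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit among the remaining bits.
        rank = (64 - p) - rest.bit_length() + 1
        if rank > registers[idx]:
            registers[idx] = rank
    alpha = 0.7213 / (1 + 1.079 / m)         # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction: fall back to linear counting.
        return m * math.log(m / zeros)
    return raw


def generalizations(row):
    """All cube tuples generalizing `row`: each value is kept or replaced by ALL."""
    return product(*[(value, ALL) for value in row])


def estimate_cube_size(rows, p=12):
    """Estimate the number of distinct tuples in the full data cube of `rows`."""
    return hll_estimate((t for row in rows for t in generalizations(row)), p)


if __name__ == "__main__":
    relation = [(i % 10, i % 7, i % 3) for i in range(200)]
    exact = len({t for row in relation for t in generalizations(row)})
    print("exact:", exact, "estimate:", round(estimate_cube_size(relation)))
```

The estimator keeps only `2**p` small registers regardless of how many tuples are streamed, and its typical relative error is about `1.04 / sqrt(2**p)`, which is what makes a size estimate available long before the cube itself is materialized. The sketch assumes that no attribute value equals the string `"ALL"`.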

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nedjar, S. (2009). Exact and Approximate Sizes of Convex Datacubes. In: Pedersen, T.B., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2009. Lecture Notes in Computer Science, vol 5691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03730-6_17

  • DOI: https://doi.org/10.1007/978-3-642-03730-6_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03729-0

  • Online ISBN: 978-3-642-03730-6

  • eBook Packages: Computer Science, Computer Science (R0)
