RoK: Roll-Up with the K-Means Clustering Method for Recommending OLAP Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5690)

Abstract

Dimension hierarchies are a substantial part of the data warehouse model. They allow decision makers to examine data at different levels of detail with On-Line Analytical Processing (OLAP) operators such as drill-down and roll-up. The granularity levels that compose a dimension hierarchy are usually fixed during the design step of the data warehouse, according to the identified analysis needs of the users. In practice, however, users' needs may evolve and grow over time. To take this evolution of analysis needs into account in the data warehouse, we propose to integrate personalization techniques within the OLAP process. We propose two kinds of OLAP personalization in the data warehouse: (1) adaptation and (2) recommendation.

Adaptation allows users to express their own needs in terms of aggregation rules defined from a child level (existing level) to a parent level (new level). The system then adapts itself by including the new hierarchy level in the data warehouse schema. For recommending new OLAP queries, we provide a new OLAP operator based on the K-means method. Users are asked to choose the K-means parameters according to their preferences about the resulting clusters, which may form a new granularity level in the considered dimension hierarchy. We use the K-means clustering method to highlight aggregates that are semantically richer than those provided by classical OLAP operators. In both the adaptation and recommendation techniques, the new data warehouse schema allows new and more elaborate OLAP queries.
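The idea of a K-means-based roll-up can be illustrated with a minimal sketch. It is not the operator defined in the paper: the function names (`kmeans_1d`, `rok_rollup`), the single numeric clustering attribute, the deterministic centroid initialisation, and the toy city data are all assumptions made for illustration. The sketch clusters the members of an existing level, treats each cluster as a member of a new parent level, and then aggregates a measure along that new level.

```python
from collections import defaultdict


def kmeans_1d(values, k, max_iter=100):
    """Plain Lloyd-style K-means on scalar values; returns one label per value."""
    distinct = sorted(set(values))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Assumption: deterministic initialisation, centroids spread evenly
    # over the sorted distinct values (real K-means often starts randomly).
    if k == 1:
        centroids = [float(distinct[0])]
    else:
        step = len(distinct) - 1
        centroids = [float(distinct[i * step // (k - 1)]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


def rok_rollup(facts, member_feature, k):
    """Build a new parent level from clusters, then sum the measure per cluster.

    facts          -- list of (child_member, measure) pairs
    member_feature -- dict: child_member -> numeric attribute used for clustering
    k              -- number of clusters, chosen by the user
    """
    members = sorted(member_feature)
    labels = kmeans_1d([member_feature[m] for m in members], k)
    # Each cluster becomes one member of the new (hypothetical) parent level.
    parent_of = {m: f"cluster_{lab}" for m, lab in zip(members, labels)}
    totals = defaultdict(float)
    for member, measure in facts:
        totals[parent_of[member]] += measure
    return parent_of, dict(totals)


# Hypothetical data: cities described by population (in thousands),
# and sales facts recorded at the city level.
population = {"A": 10, "B": 12, "C": 50, "D": 55, "E": 100}
sales = [("A", 5.0), ("B", 7.0), ("C", 20.0), ("D", 10.0), ("E", 40.0), ("A", 3.0)]

parent_of, totals = rok_rollup(sales, population, k=3)
for cluster in sorted(totals):
    print(cluster, totals[cluster])
```

In this toy setting the user's choice of `k` determines how many members the new level has, which mirrors the role the abstract gives to user-chosen K-means parameters. A real implementation would typically cluster on several attributes or measures and persist the new level in the warehouse schema.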

Our approach for OLAP personalization is implemented within Oracle 10g as a prototype that allows the creation of new granularity levels in the dimension hierarchies of the data warehouse. Moreover, we carried out experiments which validate the relevance of our approach.




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bentayeb, F., Favre, C. (2009). RoK: Roll-Up with the K-Means Clustering Method for Recommending OLAP Queries. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_43


  • DOI: https://doi.org/10.1007/978-3-642-03573-9_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03572-2

  • Online ISBN: 978-3-642-03573-9

  • eBook Packages: Computer Science, Computer Science (R0)
