Mining Top-K Multidimensional Gradients

Alves, Ronnie; Belo, Orlando; Ribeiro, Joel

doi:10.1007/978-3-540-74553-2_35

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4654)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Several business applications, such as market basket analysis, clickstream analysis, fraud detection and churning migration analysis, demand gradient data analysis. By employing gradient data analysis one is able to identify trends and outliers and to answer “what-if” questions over large databases. Gradient queries were first introduced by Imielinski et al. [1] as the cubegrade problem. The main idea is to detect interesting changes in a multidimensional space (MDS): changes in a set of measures (aggregates) are associated with changes in sector characteristics (dimensions). An MDS contains a huge number of cells, which poses a great challenge for mining gradient cells in a useful time. Dong et al. [2] have proposed gradient constraints to reduce the computational costs involved in such queries. Even when such constraints are used on large databases, the number of interesting cases to evaluate is still large. In this work, we are interested in exploring the best cases (Top-K cells) of interesting multidimensional gradients. There are several studies on Top-K queries, but preference queries with multidimensional selection were introduced quite recently by Dong et al. [9]. Furthermore, traditional Top-K methods work well in the presence of convex functions, whereas gradients are non-convex. We have revisited iceberg cubing for complex measures, since it is the basis for mining gradient cells. We also propose a gradient-based cubing strategy to evaluate interesting gradient regions in the MDS. Thus, the main challenge is to find maximum gradient regions (MGRs) that maximize the task of mining Top-K gradient cells. Our performance study indicates that our strategy is effective at finding the most interesting gradients in multidimensional space.
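To make the notion of a Top-K gradient query concrete, the following is a minimal brute-force sketch in Python. It is not the gradient-based cubing strategy or the MGR search proposed in the paper: it simply materializes every cell of a tiny cube, compares cells that differ in exactly one dimension, and keeps the K pairs with the largest ratio of average measures. The function names (`build_cube`, `top_k_gradients`), the `min_support` parameter, the use of an average as the measure, and the sample rows are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict
from itertools import combinations, product


def build_cube(rows, n_dims):
    """Return {cell: (average measure, row count)} for the full cube.

    A cell is a tuple of dimension values; "*" marks a rolled-up dimension.
    """
    acc = defaultdict(lambda: [0.0, 0])
    for *dims, measure in rows:
        # Each row contributes to 2**n_dims cells (keep or roll up each dimension).
        for mask in product((False, True), repeat=n_dims):
            cell = tuple(d if keep else "*" for d, keep in zip(dims, mask))
            acc[cell][0] += measure
            acc[cell][1] += 1
    return {cell: (total / count, count) for cell, (total, count) in acc.items()}


def differs_in_one_dim(a, b):
    """True when two cells differ in exactly one dimension value."""
    return sum(x != y for x, y in zip(a, b)) == 1


def top_k_gradients(rows, n_dims, k, min_support=1):
    """Brute-force Top-K gradient pairs (illustrative only, no pruning).

    Returns up to k tuples (ratio, low_cell, high_cell), largest ratio first.
    """
    cube = build_cube(rows, n_dims)
    # Iceberg-style filter: ignore cells backed by too few rows.
    cells = [c for c, (_, count) in cube.items() if count >= min_support]
    scored = []
    for a, b in combinations(cells, 2):
        if not differs_in_one_dim(a, b):
            continue
        avg_a, avg_b = cube[a][0], cube[b][0]
        if avg_a <= 0 or avg_b <= 0:
            continue  # the ratio is only meaningful for positive measures
        low, high = (a, b) if avg_a <= avg_b else (b, a)
        scored.append((cube[high][0] / cube[low][0], low, high))
    return heapq.nlargest(k, scored)


# Hypothetical sales rows: (region, product, amount).
rows = [
    ("north", "tv", 10.0),
    ("north", "radio", 20.0),
    ("south", "tv", 40.0),
    ("south", "radio", 20.0),
]

for ratio, low, high in top_k_gradients(rows, n_dims=2, k=2):
    print(f"{low} -> {high}: gradient ratio {ratio:.2f}")
```

This exhaustive comparison grows quickly with the number of cells, which is the cost that gradient constraints and a Top-K search over promising regions are meant to avoid.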

References

[1] Imielinski, T., Khachiyan, L., Abdulghani, A.: Cubegrades: Generalizing Association Rules. Data Mining and Knowledge Discovery (2002)
[2] Dong, G., Han, J., Lam, J.M.W., Pei, J., Wang, K., Zou, W.: Mining Constrained Gradients in Large Databases. IEEE Transactions on Knowledge and Data Engineering (2004)
[3] Sarawagi, S., Agrawal, R., Megiddo, N.: Discovery-Driven Exploration of OLAP Data Cubes. In: Proc. Int. Conference on Extending Database Technology (EDBT) (1998)
[4] Sarawagi, S., Sathe, G.: i3: Intelligent, Interactive Investigation of OLAP Data Cubes. In: Proc. Int. Conference on Management of Data (SIGMOD) (2000)
[5] Sathe, G., Sarawagi, S.: Intelligent Rollups in Multidimensional OLAP Data. In: Proc. Int. Conference on Very Large Databases (VLDB) (2001)
[6] Chang, Y., Bergman, L., Castelli, V., Li, M.L.C., Smith, J.: Onion Technique: Indexing for Linear Optimization Queries. In: Proc. Int. Conference on Management of Data (SIGMOD) (2000)
[7] Hristidis, V., Koudas, N., Papakonstantinou, Y.: Prefer: A System for the Efficient Execution of Multi-parametric Ranked Queries. In: Proc. Int. Conference on Management of Data (SIGMOD) (2001)
[8] Bruno, N., Chaudhuri, S., Gravano, L.: Top-k Selection Queries over Relational Databases: Mapping Strategies and Performance Evaluation. ACM Transactions on Database Systems (2002)
[9] Dong, X., Han, J., Cheng, H., Xiaolei, L.: Answering Top-k Queries with Multi-Dimensional Selections: The Ranking Cube Approach. In: Proc. Int. Conference on Very Large Databases (VLDB) (2006)
[10] Han, J., Pei, J., Dong, G., Wang, K.: Efficient Computation of Iceberg Cubes with Complex Measures. In: Proc. Int. Conference on Management of Data (SIGMOD) (2001)
[11] Li, X., Han, J., Gonzalez, H.: High-dimensional OLAP: A Minimal Cubing Approach. In: Proc. Int. Conference on Very Large Databases (VLDB) (2004)

Author information

Authors and Affiliations

Department of Informatics, School of Engineering, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
Ronnie Alves, Orlando Belo & Joel Ribeiro

Editor information

Il Yeal Song, Johann Eder, Tho Manh Nguyen

About this paper

Cite this paper

Alves, R., Belo, O., Ribeiro, J. (2007). Mining Top-K Multidimensional Gradients. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2007. Lecture Notes in Computer Science, vol 4654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74553-2_35

DOI: https://doi.org/10.1007/978-3-540-74553-2_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74552-5
Online ISBN: 978-3-540-74553-2
eBook Packages: Computer Science, Computer Science (R0)
