Abstract
In this paper, we propose a new method for efficiently processing a top-k join query by translating it into a sequence of range queries, which are generated by iteratively refining the domains of the attributes included in the scoring function. In this process, we exploit statistics on the data distributions of the individual attributes, which are available to an RDBMS in the form of histograms. To improve the performance of our method, we use heuristic techniques to minimize the execution cost of the range queries and the number of iterations. We use the PostgreSQL query optimizer to validate our theoretical results. We have conducted an exhaustive set of experiments, varying the input parameters and using cross-checks to verify the results. We ran the experiments on the TPC-H benchmark data sets, and the results confirm the efficiency of our approach.
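The general idea of answering a top-k join through successively wider range queries can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes two in-memory relations of `(key, value)` pairs, an additive scoring function `a + b`, and a simple equi-depth histogram whose bucket boundaries serve as range bounds. The names `range_join`, `quantile_bounds`, and `top_k_join` are hypothetical, and the cost-minimizing heuristics mentioned in the abstract are not modeled.

```python
import heapq


def range_join(R, S, lo_a, lo_b):
    """Range query: join R and S on key, keeping only a >= lo_a and b >= lo_b."""
    s_by_key = {}
    for key, b in S:
        if b >= lo_b:
            s_by_key.setdefault(key, []).append(b)
    return [(a + b, key, a, b)
            for key, a in R if a >= lo_a
            for b in s_by_key.get(key, [])]


def quantile_bounds(values, buckets):
    """Descending bucket boundaries of an equi-depth histogram (assumed statistics)."""
    ordered = sorted(values, reverse=True)
    step = max(1, len(ordered) // buckets)
    bounds = [ordered[i] for i in range(step - 1, len(ordered), step)]
    if bounds[-1] != ordered[-1]:
        bounds.append(ordered[-1])  # last bound always covers the whole domain
    return bounds


def top_k_join(R, S, k, buckets=4):
    """Return the k highest-scoring join results as (score, key, a, b) tuples."""
    a_bounds = quantile_bounds([a for _, a in R], buckets)
    b_bounds = quantile_bounds([b for _, b in S], buckets)
    max_a = max(a for _, a in R)
    max_b = max(b for _, b in S)
    i = j = 0
    while True:
        lo_a, lo_b = a_bounds[i], b_bounds[j]
        top = heapq.nlargest(k, range_join(R, S, lo_a, lo_b))
        # Any join result outside the range has a < lo_a or b < lo_b,
        # so its score is strictly below this bound.
        outside = max(lo_a + max_b, max_a + lo_b)
        covered_all = i == len(a_bounds) - 1 and j == len(b_bounds) - 1
        if (len(top) == k and top[-1][0] >= outside) or covered_all:
            return top
        # Refine: widen both attribute domains by one histogram bucket.
        i = min(i + 1, len(a_bounds) - 1)
        j = min(j + 1, len(b_bounds) - 1)


R = [(1, 10), (2, 8), (3, 6), (4, 4), (5, 2), (6, 1), (7, 9), (8, 3)]
S = [(1, 1), (2, 9), (3, 7), (4, 5), (5, 10), (6, 2), (7, 3), (8, 8)]
best = top_k_join(R, S, k=3)
```

The loop stops as soon as the k-th score found inside the current range is at least the upper bound on any score outside it, so the remaining data never has to be joined. Widening both domains in lockstep is the simplest possible refinement policy and is chosen here only for brevity.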
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sahpaski, D., Dimovski, A.S., Velinov, G., Kon-Popovska, M. (2012). Efficient Processing of Top-K Join Queries by Attribute Domain Refinement. In: Morzy, T., Härder, T., Wrembel, R. (eds) Advances in Databases and Information Systems. ADBIS 2012. Lecture Notes in Computer Science, vol 7503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33074-2_24
DOI: https://doi.org/10.1007/978-3-642-33074-2_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33073-5
Online ISBN: 978-3-642-33074-2
eBook Packages: Computer Science (R0)