Abstract
Criteria that induce a Skyline naturally represent users' preference conditions and are useful for discarding irrelevant data in large datasets. However, in high-dimensional Skyline spaces, the Skyline can still be very large, making it infeasible for users to process this set of points. To identify the best points among the Skyline, the Top-k Skyline approach has been proposed. Top-k Skyline uses discriminatory criteria to induce a total order on the points that comprise the Skyline, and recognizes the best or top-k objects based on these criteria. Different algorithms have been defined to compute the top-k objects among the Skyline; while existing solutions are able to produce the Top-k Skyline, they may be very costly. First, state-of-the-art Top-k Skyline solutions require the computation of the whole Skyline; second, they execute probes of the multicriteria function over all of the Skyline points. Thus, if k is much smaller than the cardinality of the Skyline, these solutions may be very inefficient because a large number of unnecessary probes may be evaluated. In this paper, we propose TKSI, an efficient solution for the Top-k Skyline that overcomes the drawbacks of existing solutions. TKSI is an index-based algorithm that computes only the subset of the Skyline required to produce the top-k objects; thus, TKSI is able to minimize the number of unnecessary probes. We have empirically studied the quality of TKSI, and we report initial experimental results showing that TKSI speeds up the computation of the Top-k Skyline by at least 50% w.r.t. state-of-the-art solutions when k is smaller than the size of the Skyline.
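To make the cost argument concrete, the following is a minimal sketch of the baseline strategy the abstract criticizes: compute the whole Skyline, probe the score function on every Skyline point, then keep the k best. It is not the TKSI algorithm, whose index-based details are not given here. The names `dominates`, `skyline`, `naive_top_k_skyline`, the "larger is better" convention, and the sample data and score function are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is at least as good as b in every dimension and
    strictly better in at least one (assumption: larger values are better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Quadratic-time Skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def naive_top_k_skyline(
    points: Sequence[Point], score: Callable[[Point], float], k: int
) -> Tuple[List[Point], int]:
    """Baseline Top-k Skyline: full Skyline first, then one probe of the
    score function per Skyline point, regardless of how small k is."""
    sky = skyline(points)
    probes = 0
    scored = []
    for p in sky:
        scored.append((score(p), p))  # one probe of the multicriteria function
        probes += 1
    # Sort by score only, highest first, and keep the k best points.
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]], probes


# Hypothetical two-dimensional data and score function.
points = [(9, 1), (7, 6), (5, 8), (1, 9), (4, 4), (6, 5)]
top, probes = naive_top_k_skyline(points, lambda p: 2 * p[0] + p[1], k=2)
print(top, probes)
```

In this sketch the number of probes always equals the size of the Skyline, even when k is much smaller. That gap is the overhead the abstract says TKSI aims to reduce.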
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Goncalves, M., Vidal, ME. (2009). Reaching the Top of the Skyline: An Efficient Indexed Algorithm for Top-k Skyline Queries. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_41
DOI: https://doi.org/10.1007/978-3-642-03573-9_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03572-2
Online ISBN: 978-3-642-03573-9
eBook Packages: Computer Science (R0)