SCSA: Evaluating skyline queries in incomplete data

  • Yonis Gulzar
  • Ali A. Alwan
  • Radhwan Mohamed Abdullah
  • Qin Xin
  • Marwa B. Swidan


Skyline queries have been widely incorporated into contemporary database applications, including but not limited to multi-criteria decision-making systems, decision support systems, and recommendation systems. Owing to their benefits and wide range of applications, many skyline algorithms have been proposed for numerous data settings. Nonetheless, most researchers assume that the data are complete, meaning that all data item values are available. Since this assumption does not hold in a large number of real-world database applications, the existing algorithms cannot be applied directly to a database with incomplete data. In such cases, processing skyline queries on incomplete data incurs exhaustive pairwise comparisons between data items and may lead to the loss of the transitivity property of the skyline technique. Losing the transitivity property may in turn give rise to the problem of cyclic dominance. To address these issues, we propose a new skyline algorithm called the Sorting-based Cluster Skyline Algorithm (SCSA), which combines sorting and partitioning techniques to simplify skyline computation on an incomplete dataset. These two techniques help speed up the skyline process by pruning dominated data items while avoiding many unnecessary pairwise comparisons. Comprehensive experiments carried out on both synthetic and real-life datasets demonstrate the effectiveness and versatility of our approach compared with existing approaches.
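To make the cyclic dominance problem concrete, the following minimal Python sketch may help. It is an illustration only and not the SCSA algorithm. It assumes that smaller values are preferred, that a missing value is represented as None, and that two items are compared only on the dimensions where both have a value. The names dominates and naive_skyline and the three sample items are hypothetical.

```python
from typing import List, Optional, Sequence

# An item is a tuple of values; None marks a missing value (assumption).
Item = Sequence[Optional[float]]


def dominates(p: Item, q: Item) -> bool:
    """True if p dominates q on the dimensions where both have values.

    Assumes smaller is better: p must be no worse on every shared
    dimension and strictly better on at least one.
    """
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        return False  # nothing to compare on, so neither dominates
    no_worse = all(a <= b for a, b in common)
    strictly_better = any(a < b for a, b in common)
    return no_worse and strictly_better


def naive_skyline(items: List[Item]) -> List[Item]:
    """Exhaustive pairwise skyline: keep items that no other item dominates."""
    return [
        p
        for i, p in enumerate(items)
        if not any(dominates(q, p) for j, q in enumerate(items) if j != i)
    ]


# Three hypothetical items, each missing one of three dimensions.
a = (1, None, 5)
b = (2, 3, None)
c = (None, 4, 1)

# a beats b on dimension 0, b beats c on dimension 1, c beats a on dimension 2.
cycle = dominates(a, b) and dominates(b, c) and dominates(c, a)
skyline = naive_skyline([a, b, c])
```

In this sketch a dominates b and b dominates c, yet a does not dominate c, so transitivity fails. Because every item is dominated by some other item, the naive pairwise skyline is empty, which is the kind of outcome that algorithms for incomplete data need to handle.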


Skyline; Skyline queries; Incomplete data; Missing data; Preference queries; Query processing



This research is supported by the project FRGS15-205-0491, Ministry of Education, Malaysia.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, Kulliyyah of Information and Communication Technology, International Islamic University Malaysia, Kuala Lumpur, Malaysia
  2. Division of Basic Sciences, College of Agriculture and Forestry, University of Mosul, Mosul, Iraq
  3. Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia
  4. Faculty of Science and Technology, University of Faroe Islands, Faroe Islands, Denmark
