Collusion-resistant protocols for private processing of aggregated queries in distributed databases

Abstract

Private processing of database queries protects the confidentiality of sensitive data when queries are answered. It is important to design collusion-resistant protocols that keep privacy protected even when a certain number of honest-but-curious participants collude and pool their knowledge to gain unauthorised access to sensitive information. A novel setting arises when aggregated queries need to be answered for a large distributed database, but legal requirements or commercial interests forbid giving other parties access to the records in each subdatabase. For example, a very large number of medical records may be stored in a distributed database that is a union of several separate databases from different hospitals, or even from different countries. The present article introduces and investigates two protocols for collusion-resistant private processing of aggregated queries in this novel setting: the Accelerated Multi-round Iterative Protocol (AMIP) and the Restricted Multi-round Iterative Protocol (RMIP). We define a large collection of query functions and show that AMIP and RMIP can answer all queries in this collection. Our experiments demonstrate that AMIP outperforms all other applicable algorithms, and this improvement is especially significant in terms of communication complexity.
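
To make the setting concrete, the following is a minimal sketch of how several databases could answer a SUM query without any of them revealing its local value. It uses textbook additive secret sharing and is not AMIP or RMIP, whose steps are not described in this abstract. The function names, the modulus, and the hospital counts are all hypothetical, chosen for illustration only.

```python
import secrets

# Assumed public modulus; it must exceed any possible true total.
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties, modulus=MODULUS):
    """Split a non-negative value into n_parties shares that sum to it mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the value.
    shares.append((value - sum(shares)) % modulus)
    return shares


def private_sum(local_values, modulus=MODULUS):
    """Simulate all parties in one process and return the aggregated sum."""
    n = len(local_values)
    # Row i holds the shares produced by party i; entry j would be sent to party j.
    share_matrix = [split_into_shares(v, n, modulus) for v in local_values]
    # Party j adds the shares it received and publishes only this partial sum.
    partial_sums = [
        sum(share_matrix[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Anyone can combine the published partial sums into the final answer.
    return sum(partial_sums) % modulus


# Hypothetical per-hospital counts of records matching some query.
hospital_counts = [120, 75, 310]
total = private_sum(hospital_counts)
```

In this run the three hypothetical hospitals hold 120, 75 and 310 matching records, and the combined answer is 505. Each published partial sum is, on its own, uniformly distributed modulo the chosen modulus, so it reveals nothing about any single hospital's count. How many colluding parties such a scheme tolerates, and how it extends to richer query functions, depends on details that this sketch does not model.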

Acknowledgements

The authors are grateful to two anonymous reviewers for comments that have helped to improve this article.

Funding

This work has been supported by the Australian Research Council, Discovery Grant DP160100913.

Author information

Corresponding author

Correspondence to Leanne Rylands.

About this article

Cite this article

Rylands, L., Seberry, J., Yi, X. et al. Collusion-resistant protocols for private processing of aggregated queries in distributed databases. Distrib Parallel Databases 39, 97–127 (2021). https://doi.org/10.1007/s10619-020-07293-z

Keywords

  • Aggregated queries
  • Multi-round iterative protocol
  • Privacy protection
  • Private query processing
  • Distributed databases