Efficient Two-Party Privacy Preserving Collaborative k-means Clustering Protocol Supporting both Storage and Computation Outsourcing

  • Zoe L. Jiang
  • Ning Guo
  • Yabin Jin
  • Jiazhuo Lv
  • Yulin Wu
  • Yating Yu
  • Xuan Wang
  • S. M. Yiu
  • Junbin Fang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11337)


Privacy-preserving collaborative data mining aims to extract useful knowledge from distributed databases owned by multiple parties while preserving the privacy of both the data and the mining result. Increasingly, companies rely on the cloud to store and process data. In this context, a privacy-preserving collaborative k-means clustering framework was proposed to support both storage and computation outsourcing for two parties. However, its computation cost and communication overhead are too high to be practical. In this paper, we propose that each party encrypt its data once and store the ciphertexts in the cloud. The privacy-preserving collaborative k-means clustering protocol is then executed mainly on the cloud side, with a total of \(O(k(m+n))\) rounds of interaction among the two parties and the cloud, where m and n denote the total numbers of records held by the two parties, respectively. The protocol is secure in the semi-honest model, and it is also secure in the malicious model when at most one party is corrupted during the re-computation of the k centroids. We also implement the protocol in a real cloud environment, using e-health data as the test data.
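For orientation, the computation that such a protocol protects can be sketched in plaintext. The following is a minimal, unencrypted k-means baseline over the union of two parties' records. It is not the protocol proposed in the paper: there is no encryption, no cloud, and no interaction. All function and variable names are hypothetical. The sketch only shows that one assignment pass evaluates k(m+n) record-to-centroid distances, where m and n are the two parties' record counts.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(records: Sequence[Point], centroids: Sequence[Point]) -> Tuple[List[int], int]:
    """Label each record with its nearest centroid; also count distance evaluations."""
    labels: List[int] = []
    evaluations = 0
    for record in records:
        distances = [squared_distance(record, c) for c in centroids]
        evaluations += len(centroids)  # k evaluations per record
        labels.append(distances.index(min(distances)))
    return labels, evaluations


def recompute(records: Sequence[Point], labels: Sequence[int],
              centroids: Sequence[Point]) -> List[Point]:
    """Recompute each centroid as the mean of its members; keep empty clusters unchanged."""
    updated: List[Point] = []
    for j, old in enumerate(centroids):
        members = [r for r, label in zip(records, labels) if label == j]
        if not members:
            updated.append(old)
            continue
        dim = len(members[0])
        updated.append(tuple(sum(m[d] for m in members) / len(members) for d in range(dim)))
    return updated


def plaintext_two_party_kmeans(party_a: Sequence[Point], party_b: Sequence[Point],
                               initial_centroids: Sequence[Point], max_iter: int = 100):
    """Plaintext baseline (assumption: both record sets are simply pooled in the clear)."""
    records = list(party_a) + list(party_b)  # m + n records
    centroids = list(initial_centroids)      # k centroids
    labels: List[int] = []
    evaluations = 0
    for _ in range(max_iter):
        labels, evaluations = assign(records, centroids)
        updated = recompute(records, labels, centroids)
        if updated == centroids:  # converged: centroids no longer move
            break
        centroids = updated
    return centroids, labels, evaluations


# Illustrative data only (made up for this sketch).
a_records = [(1.0, 1.0), (1.0, 2.0)]
b_records = [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
result = plaintext_two_party_kmeans(a_records, b_records, [(0.0, 0.0), (10.0, 10.0)])
print(result)
```

In a privacy-preserving setting, neither party could pool records in the clear as `plaintext_two_party_kmeans` does; the sketch is only a reference point for what the assignment and centroid re-computation steps compute.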


Privacy-preserving data mining, k-means clustering, Storage outsourcing, Computation outsourcing, Secure multiparty computation



This work is supported by the Basic Research Project of Shenzhen, China (No. JCYJ20160318094015947), the National Key Research and Development Program of China (No. 2017YFB0803002), the National Natural Science Foundation of China (No. 61771222), and the Key Technology Program of Shenzhen, China (No. JSGG20160427185010977).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Zoe L. Jiang (1)
  • Ning Guo (1)
  • Yabin Jin (1)
  • Jiazhuo Lv (1)
  • Yulin Wu (1)
  • Yating Yu (1)
  • Xuan Wang (1)
  • S. M. Yiu (2)
  • Junbin Fang (3)

  1. Harbin Institute of Technology (Shenzhen), Shenzhen, China
  2. The University of Hong Kong, Hong Kong, China
  3. Jinan University, Guangzhou, China
