Advertisement

Unsupervised Machine Learning on Encrypted Data

  • Angela JäschkeEmail author
  • Frederik Armknecht
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11349)

Abstract

In the context of Fully Homomorphic Encryption, which allows computations on encrypted data, Machine Learning has been one of the most popular applications in the recent past. All of these works, however, have focused on supervised learning, where there is a labeled training set that is used to configure the model. In this work, we take the first step into the realm of unsupervised learning, which is an important area in Machine Learning and has many real-world applications, by addressing the clustering problem. To this end, we show how to implement the \(K\)-Means-Algorithm. This algorithm poses several challenges in the FHE context, including a division, which we tackle by using a natural encoding that allows division and may be of independent interest. While this theoretically solves the problem, performance in practice is not optimal, so we then propose some changes to the clustering algorithm to make it executable under more conventional encodings. We show that our new algorithm achieves a clustering accuracy comparable to the original \(K\)-Means-Algorithm, but has less than \(5\%\) of its runtime.

Keywords

Machine Learning Clustering Fully Homomorphic Encryption 

Supplementary material

References

  1. 1.
    Aggarwal, C.C., Hinneburg, A., Keim, D.A.: On the surprising behavior of distance metrics in high dimensional space. In: Van den Bussche, J., Vianu, V. (eds.) ICDT 2001. LNCS, vol. 1973, pp. 420–434. Springer, Heidelberg (2001).  https://doi.org/10.1007/3-540-44503-X_27CrossRefGoogle Scholar
  2. 2.
    Armknecht, F., et al.: A guide to fully homomorphic encryption. IACR Cryptology ePrint Archive (2015/1192)Google Scholar
  3. 3.
    Armknecht, F., Katzenbeisser, S., Peter, A.: Group homomorphic encryption: characterizations, impossibility results, and applications. DCC 67, 209–232 (2013)MathSciNetzbMATHGoogle Scholar
  4. 4.
    Barnett, A., et al.: Image classification using non-linear support vector machines on encrypted data. IACR Cryptology ePrint Archive (2017/857)Google Scholar
  5. 5.
    Bonte, C., Vercauteren, F.: Privacy-preserving logistic regression training. IACR Cryptology ePrint Archive 233 (2018)Google Scholar
  6. 6.
    Bos, J.W., Lauter, K.E., Naehrig, M.: Private predictive analysis on encrypted medical data. J. Biomed. Inform. 50, 234–243 (2014)CrossRefGoogle Scholar
  7. 7.
    Bost, R., Popa, R.A., Tu, S., Goldwasser, S.: Machine learning classification over encrypted data. In: NDSS (2015)Google Scholar
  8. 8.
    Brakerski, Z., Gentry, C., Vaikuntanathan, V.: Fully homomorphic encryption without bootstrapping. In: ECCC, vol. 18 (2011)Google Scholar
  9. 9.
    Bunn, P., Ostrovsky, R.: Secure two-party k-means clustering. In: CCS (2007)Google Scholar
  10. 10.
    Chabanne, H., de Wargny, A., Milgram, J., Morel, C., Prouff, E.: Privacy-preserving classification on deep neural network. IACR Cryptology ePrint Archive (2017/035)Google Scholar
  11. 11.
    Chen, H., Laine, K., Player, R.: Simple encrypted arithmetic library - SEAL v2.1. IACR Cryptology ePrint Archive 2017, 224 (2017)Google Scholar
  12. 12.
    Chillotti, I., Gama, N., Georgieva, M., Izabachène, M.: Faster fully homomorphic encryption: bootstrapping in less than 0.1 seconds. In: Cheon, J.H., Takagi, T. (eds.) ASIACRYPT 2016. LNCS, vol. 10031, pp. 3–33. Springer, Heidelberg (2016).  https://doi.org/10.1007/978-3-662-53887-6_1CrossRefzbMATHGoogle Scholar
  13. 13.
    Coron, J.-S., Lepoint, T., Tibouchi, M.: Scale-invariant fully homomorphic encryption over the integers. In: Krawczyk, H. (ed.) PKC 2014. LNCS, vol. 8383, pp. 311–328. Springer, Heidelberg (2014).  https://doi.org/10.1007/978-3-642-54631-0_18CrossRefGoogle Scholar
  14. 14.
    Coron, J.-S., Naccache, D., Tibouchi, M.: Public key compression and modulus switching for fully homomorphic encryption over the integers. In: Pointcheval, D., Johansson, T. (eds.) EUROCRYPT 2012. LNCS, vol. 7237, pp. 446–464. Springer, Heidelberg (2012).  https://doi.org/10.1007/978-3-642-29011-4_27CrossRefGoogle Scholar
  15. 15.
    van Dijk, M., Gentry, C., Halevi, S., Vaikuntanathan, V.: Fully homomorphic encryption over the integers. In: Gilbert, H. (ed.) EUROCRYPT 2010. LNCS, vol. 6110, pp. 24–43. Springer, Heidelberg (2010).  https://doi.org/10.1007/978-3-642-13190-5_2CrossRefGoogle Scholar
  16. 16.
    Ducas, L., Micciancio, D.: FHEW: bootstrapping homomorphic encryption in less than a second. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015. LNCS, vol. 9056, pp. 617–640. Springer, Heidelberg (2015).  https://doi.org/10.1007/978-3-662-46800-5_24CrossRefzbMATHGoogle Scholar
  17. 17.
    Esperança, P.M., Aslett, L.J.M., Holmes, C.C.: Encrypted accelerated least squares regression. In: Singh, A., Zhu, X.J. (eds.) AISTATS (2017)Google Scholar
  18. 18.
    Fan, J., Vercauteren, F.: Somewhat practical fully homomorphic encryption. IACR Cryptology ePrint Archive (2012/144)Google Scholar
  19. 19.
    Gentry, C.: A fully homomorphic encryption scheme. Ph.D. thesis, Stanford University (2009)Google Scholar
  20. 20.
    Gentry, C., Sahai, A., Waters, B.: Homomorphic encryption from learning with errors: conceptually-simpler, asymptotically-faster, attribute-based. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8042, pp. 75–92. Springer, Heidelberg (2013).  https://doi.org/10.1007/978-3-642-40041-4_5CrossRefGoogle Scholar
  21. 21.
    Gilad-Bachrach, R., Dowlin, N., Laine, K., Lauter, K.E., Naehrig, M., Wernsing, J.: CryptoNets: applying neural networks to encrypted data with high throughput and accuracy. In: ICML (2016)Google Scholar
  22. 22.
    Graepel, T., Lauter, K., Naehrig, M.: ML confidential: machine learning on encrypted data. In: Kwon, T., Lee, M.-K., Kwon, D. (eds.) ICISC 2012. LNCS, vol. 7839, pp. 1–21. Springer, Heidelberg (2013).  https://doi.org/10.1007/978-3-642-37682-5_1CrossRefGoogle Scholar
  23. 23.
    Halevi, S., Shoup, V.: Algorithms in HElib. In: Garay, J.A., Gennaro, R. (eds.) CRYPTO 2014. LNCS, vol. 8616, pp. 554–571. Springer, Heidelberg (2014).  https://doi.org/10.1007/978-3-662-44371-2_31CrossRefzbMATHGoogle Scholar
  24. 24.
    Jagannathan, G., Pillaipakkamnatt, K., Wright, R.N., Umano, D.: Communication-efficient privacy-preserving clustering. Trans. Data Priv. 3, 1–25 (2010)MathSciNetGoogle Scholar
  25. 25.
    Jagannathan, G., Wright, R.N.: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. In: SIGKDD (2005)Google Scholar
  26. 26.
    Jäschke, A., Armknecht, F.: Accelerating homomorphic computations on rational numbers. In: Manulis, M., Sadeghi, A.-R., Schneider, S. (eds.) ACNS 2016. LNCS, vol. 9696, pp. 405–423. Springer, Cham (2016).  https://doi.org/10.1007/978-3-319-39555-5_22CrossRefGoogle Scholar
  27. 27.
    Jäschke, A., Armknecht, F.: (Finite) field work: choosing the best encoding of numbers for FHE computation. In: Capkun, S., Chow, S. (eds.) Cryptology and Network Security. CANS 2017, vol. 11261, pp. 482–492. Springer, Cham (2017).  https://doi.org/10.1007/978-3-030-02641-7_23CrossRefGoogle Scholar
  28. 28.
    Jha, S., Kruger, L., McDaniel, P.: Privacy preserving clustering. In: di Vimercati, S.C., Syverson, P., Gollmann, D. (eds.) ESORICS 2005. LNCS, vol. 3679, pp. 397–417. Springer, Heidelberg (2005).  https://doi.org/10.1007/11555827_23CrossRefGoogle Scholar
  29. 29.
    Kim, A., Song, Y., Kim, M., Lee, K., Cheon, J.H.: Logistic regression model training based on the approximate homomorphic encryption. IACR Cryptology ePrint Archive (254) (2018)Google Scholar
  30. 30.
    Kim, M., Song, Y., Wang, S., Xia, Y., Jiang, X.: Secure logistic regression based on homomorphic encryption. IACR Cryptology ePrint Archive (074) (2018)Google Scholar
  31. 31.
    Liu, X., et al.: Outsourcing two-party privacy preserving k-means clustering protocol in wireless sensor networks. In: MSN (2015)Google Scholar
  32. 32.
    Lu, W., Kawasaki, S., Sakuma, J.: Using fully homomorphic encryption for statistical analysis of categorical, ordinal and numerical data. IACR Cryptology ePrint Archive (2016/1163)Google Scholar
  33. 33.
    MacQueen, J., et al.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (1967)Google Scholar
  34. 34.
    Meskine, F., Bahloul, S.N.: Privacy preserving k-means clustering: a survey research. Int. Arab J. Inf. Technol. 9, 194–200 (2012)Google Scholar
  35. 35.
    Naehrig, M., Lauter, K.E., Vaikuntanathan, V.: Can homomorphic encryption be practical? In: CCSW (2011)Google Scholar
  36. 36.
    Phong, L.T., Aono, Y., Hayashi, T., Wang, L., Moriai, S.: Privacy-preserving deep learning via additively homomorphic encryption. IACR Cryptology ePrint Archive (2017/715)Google Scholar
  37. 37.
    Smart, N.P., Vercauteren, F.: Fully homomorphic encryption with relatively small key and ciphertext sizes. In: Nguyen, P.Q., Pointcheval, D. (eds.) PKC 2010. LNCS, vol. 6056, pp. 420–443. Springer, Heidelberg (2010).  https://doi.org/10.1007/978-3-642-13013-7_25CrossRefzbMATHGoogle Scholar
  38. 38.
  39. 39.
    Ultsch, A.: Clustering with SOM: U* c. In: Proceedings of Workshop on Self-Organizing Maps (2005)Google Scholar
  40. 40.
    Vaidya, J., Clifton, C.: Privacy-preserving k-means clustering over vertically partitioned data. In: SIGKDD (2003)Google Scholar
  41. 41.
    Wu, D.J., Feng, T., Naehrig, M., Lauter, K.E.: Privately evaluating decision trees and random forests. PoPETs, (4) (2016)CrossRefGoogle Scholar
  42. 42.
    Xing, K., Hu, C., Yu, J., Cheng, X., Zhang, F.: Mutual privacy preserving \(k\) -means clustering in social participatory sensing. IEEE Trans. Ind. Inform. 13, 2066–2076 (2017)CrossRefGoogle Scholar

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. 1.University of MannheimMannheimGermany

Personalised recommendations