
Privacy-Preserving Multiparty Learning for Logistic Regression

  • Wei Du
  • Ang Li
  • Qinghua Li
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 254)

Abstract

In recent years, machine learning techniques have been widely used in numerous applications, such as weather forecasting, financial data analysis, spam filtering, and medical prediction. Meanwhile, massive data generated from multiple sources further improves the performance of machine learning tools. However, sharing data across sources raises privacy concerns, since sensitive information may be leaked in the process. In this paper, we propose a framework that enables multiple parties to collaboratively and accurately train a learning model over distributed datasets while guaranteeing the privacy of the data sources. Specifically, we consider a logistic regression model and propose two approaches for perturbing the objective function to preserve \( \epsilon \)-differential privacy. The proposed solutions are tested on real datasets, including Bank Marketing and Credit Card Default prediction. Experimental results demonstrate that the proposed multiparty learning framework is highly efficient and accurate.
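To make the idea of perturbing the objective function concrete, here is a minimal single-dataset sketch in Python. It is not the paper's method: the abstract does not specify its two approaches or how parties combine their results. The sketch instead follows the general objective-perturbation recipe from Chaudhuri and Monteleoni's privacy-preserving logistic regression. The function names, the plain gradient-descent solver, and the default hyperparameters are illustrative assumptions. The recipe's privacy argument assumes every feature vector has Euclidean norm at most 1, labels in {-1, +1}, and an exactly minimized objective, none of which this code checks.

```python
import numpy as np


def sample_noise(dim, epsilon, rng):
    """Draw b with density proportional to exp(-(epsilon / 2) * ||b||).

    The norm is Gamma(dim, 2 / epsilon) and the direction is uniform on
    the unit sphere (a normalized Gaussian vector).
    """
    norm = rng.gamma(shape=dim, scale=2.0 / epsilon)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    return norm * direction


def perturbed_grad(w, X, y, lam, b):
    """Gradient of the perturbed, L2-regularized logistic objective:

        mean(log(1 + exp(-y * Xw))) + (lam / 2) * ||w||^2 + b.w / n
    """
    n = X.shape[0]
    margins = y * (X @ w)
    # Numerically stable sigmoid(-margins) = 1 / (1 + exp(margins)).
    sig = 0.5 * (1.0 - np.tanh(0.5 * margins))
    return -(X.T @ (y * sig)) / n + lam * w + b / n


def train_private_logreg(X, y, epsilon, lam=0.01, steps=2000, lr=0.5, seed=0):
    """Approximately minimize the perturbed objective by gradient descent.

    Assumes (unchecked) that each row of X has norm <= 1 and y is in {-1, +1}.
    A fixed number of steps only approximates the exact minimizer.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    b = sample_noise(d, epsilon, rng)  # one noise vector per training run
    w = np.zeros(d)
    for _ in range(steps):
        w -= lr * perturbed_grad(w, X, y, lam, b)
    return w
```

To use it, scale the feature matrix so that every row has norm at most 1, encode the labels as -1 and +1, and call `train_private_logreg(X, y, epsilon=1.0)`. A smaller `epsilon` draws a larger noise vector, which gives stronger privacy at the cost of accuracy.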

Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2018

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, Michigan State University, East Lansing, USA
  2. Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, USA
