Privacy Preserving Client/Vertical-Servers Classification
Abstract
We present a novel client/vertical-servers architecture for the hybrid multi-party classification problem. The model consists of clients whose attributes are distributed across multiple servers and remain secret during training and testing. Our solution builds privacy-preserving random forests and completes them with a special private set intersection protocol that provides a central commodity server with anonymous conditional statistics. Subsequently, the private set intersection protocol can be used to privately classify the queries of new clients using the commodity server’s statistics. The proviso is that the commodity server must not collude with other parties. Where this restriction is acceptable, it permits an effective method that needs no computationally expensive public-key operations, remains secure, and avoids precision losses. We report runtime results on several real-world datasets, discuss different security aspects, and give an outlook on further improvements.
Keywords
Vertically partitioned data, Private evaluation, Secure multi-party computation, Privacy preserving data mining, Random forest