Abstract
In this paper, we consider a new scenario for privacy-preserving data mining, called the two-part partitioned record (TPR) model, and present solutions for a family of frequency-based learning algorithms in this model. In TPR, the dataset is distributed across a large number of users, and each record is owned by two different users: one user knows only the values of a subset of the attributes, and the other knows the values of the remaining attributes. A miner aims to learn, for example, classification rules from the users' data while preserving each user's privacy. We develop a cryptographic solution for frequency-based learning methods in TPR. The crucial step in the proposed solution is the privacy-preserving computation of the frequency of a tuple of values in the users' data, which protects each user's privacy without loss of accuracy. We illustrate the applicability of the method by using it to build a privacy-preserving protocol for naive Bayes classifier learning, and we briefly address its use in other applications. Experimental results show that our protocol is efficient.
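To make concrete why a frequency primitive is enough for naive Bayes learning, the following is a minimal plaintext sketch, not the protocol of the paper. It contains no cryptography: `count_tuple` is a hypothetical stand-in for the privacy-preserving frequency computation and simply joins the two parts of each record in the clear. The toy records, the attribute names, the placement of the class label in the second part, and the use of Laplace smoothing are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical toy data: each record is split into two parts held by
# two different users (assumption: the class label sits in part_b).
Record = namedtuple("Record", ["part_a", "part_b"])

RECORDS = [
    Record({"smoker": "yes"}, {"exercise": "low", "label": "risk"}),
    Record({"smoker": "yes"}, {"exercise": "high", "label": "ok"}),
    Record({"smoker": "no"}, {"exercise": "high", "label": "ok"}),
    Record({"smoker": "no"}, {"exercise": "low", "label": "ok"}),
    Record({"smoker": "yes"}, {"exercise": "low", "label": "risk"}),
]


def count_tuple(records, conditions):
    """Plaintext stand-in for the private frequency primitive.

    Returns the number of records whose two parts, taken together,
    match every (attribute, value) pair in `conditions`.
    """
    total = 0
    for rec in records:
        joined = {**rec.part_a, **rec.part_b}
        if all(joined.get(attr) == val for attr, val in conditions.items()):
            total += 1
    return total


def train_naive_bayes(records, values, labels, class_attr="label"):
    """Estimate naive Bayes parameters using only tuple frequencies."""
    n = len(records)
    priors, cond = {}, {}
    for c in labels:
        n_c = count_tuple(records, {class_attr: c})
        priors[c] = n_c / n
        for attr, domain in values.items():
            for val in domain:
                n_avc = count_tuple(records, {attr: val, class_attr: c})
                # Laplace smoothing (an assumption of this sketch).
                cond[(attr, val, c)] = (n_avc + 1) / (n_c + len(domain))
    return priors, cond


def predict(priors, cond, instance):
    """Return the label with the highest naive Bayes score."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for attr, val in instance.items():
            score *= cond[(attr, val, c)]
        scores[c] = score
    return max(scores, key=scores.get)


VALUES = {"smoker": ["yes", "no"], "exercise": ["low", "high"]}
LABELS = ["risk", "ok"]
PRIORS, COND = train_naive_bayes(RECORDS, VALUES, LABELS)
```

The miner in this sketch touches the data only through `count_tuple`, so replacing that one function with a private frequency computation would leave the learning code unchanged. For example, `predict(PRIORS, COND, {"smoker": "yes", "exercise": "low"})` classifies a new instance from the learned frequencies alone.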
An erratum for this chapter can be found at http://dx.doi.org/10.1007/978-3-319-02821-7_39
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Luong, T.D., Tran, D.H. (2014). Privacy Preserving Frequency-Based Learning Algorithms in Two-Part Partitioned Record Model. In: Huynh, V., Denoeux, T., Tran, D., Le, A., Pham, S. (eds) Knowledge and Systems Engineering. Advances in Intelligent Systems and Computing, vol 245. Springer, Cham. https://doi.org/10.1007/978-3-319-02821-7_29
DOI: https://doi.org/10.1007/978-3-319-02821-7_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02820-0
Online ISBN: 978-3-319-02821-7
eBook Packages: Engineering, Engineering (R0)