Abstract
We propose a new scheme that lets multiple parties carry out data mining computations without disclosing their actual data sets to each other. We then apply the scheme to let multiple parties build a decision tree classifier on their joint data set. We evaluate the scheme through a set of experiments. The empirical results show that a tradeoff between privacy and accuracy can be achieved.
© 2004 Springer Science + Business Media, Inc.
About this paper
Cite this paper
Zhan, J.Z., Chang, L., Matwin, S. (2004). Privacy-Preserving Multi-Party Decision Tree Induction. In: Farkas, C., Samarati, P. (eds) Research Directions in Data and Applications Security XVIII. IFIP International Federation for Information Processing, vol 144. Springer, Boston, MA. https://doi.org/10.1007/1-4020-8128-6_23
DOI: https://doi.org/10.1007/1-4020-8128-6_23
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4020-8127-9
Online ISBN: 978-1-4020-8128-6
eBook Packages: Springer Book Archive