Privacy-Preserving Data Mining
Data mining techniques that use specialized approaches to protect against the disclosure of private information. These may involve anonymizing private data, distorting sensitive values, encrypting data, or other means of ensuring that sensitive data is protected.
The field of privacy-preserving data mining began in 2000 with two papers of that name [1,4]. Both papers addressed the construction of decision trees, approximating the ID3 algorithm while limiting disclosure of data. While the problems appeared similar on the surface, the fundamental difference in their privacy constraints shows the complexity of this field. In [1], the assumption was that individuals were providing their own data to a common server, and noise was added to sensitive values to protect privacy. The key to the technique was to discover the original distribution of the data, enabling successful construction of the decision tree. In [4], the data was presumed to be divided between two (or a small...
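The noise-addition idea described above (individuals perturb their sensitive values, and the server recovers only the overall distribution) can be illustrated with a minimal sketch. This is not the procedure from the cited paper: the Gaussian noise, the fixed bins, the value of `sigma`, the synthetic data, and the function names `perturb` and `reconstruct_distribution` are all assumptions made for illustration. The estimate is obtained with a simple iterative Bayesian update over bin midpoints.

```python
import math
import random


def perturb(values, sigma, rng):
    """Each individual adds zero-mean Gaussian noise before sharing a value."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def reconstruct_distribution(perturbed, sigma, bin_edges, iterations=50):
    """Estimate the per-bin probabilities of the ORIGINAL values.

    Only the perturbed values and the noise scale are used. Each bin is
    approximated by its midpoint (an assumption of this sketch).
    """
    n_bins = len(bin_edges) - 1
    mids = [(bin_edges[k] + bin_edges[k + 1]) / 2.0 for k in range(n_bins)]

    # Unnormalized Gaussian likelihood of each observation given each bin.
    # The normalizing constant cancels in the posterior below.
    likelihood = [
        [math.exp(-((w - m) ** 2) / (2.0 * sigma ** 2)) for m in mids]
        for w in perturbed
    ]

    probs = [1.0 / n_bins] * n_bins  # start from a uniform guess
    for _ in range(iterations):
        updated = [0.0] * n_bins
        for row in likelihood:
            weights = [row[k] * probs[k] for k in range(n_bins)]
            total = sum(weights)
            if total == 0.0:
                continue  # observation too far from every bin to be informative
            for k in range(n_bins):
                # Posterior probability that this observation came from bin k.
                updated[k] += weights[k] / total
        norm = sum(updated)
        probs = [u / norm for u in updated]
    return probs


# Synthetic example: 70% of values lie in [20, 30), 30% lie in [60, 70).
rng = random.Random(0)
original = [rng.uniform(20, 30) for _ in range(1400)]
original += [rng.uniform(60, 70) for _ in range(600)]

shared = perturb(original, sigma=10.0, rng=rng)   # what the server receives
edges = list(range(0, 101, 10))                   # ten bins of width 10
estimated = reconstruct_distribution(shared, sigma=10.0, bin_edges=edges)

for k, p in enumerate(estimated):
    print(f"[{edges[k]:3d}, {edges[k + 1]:3d}): {p:.3f}")
```

In this sketch the server never sees an individual's true value, yet the estimated bin probabilities can still be used for tasks that depend only on the distribution, such as choosing split points when building a decision tree.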
- 1. Agrawal R, Srikant R. Privacy-preserving data mining. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2000. p. 439–50.
- 2. Atallah MJ, Elmongui HG, Deshpande V, Schwarz LB. Secure supply-chain protocols. In: Proceedings of the IEEE International Conference on E-commerce; 2003. p. 293–302.
- 3. Kaski S. Dimensionality reduction by random mapping. In: Proceedings of the International Joint Conference on Neural Networks; 1999. p. 413–8.
- 5. Oliveira SRM, Zaïane OR. Privacy preserving clustering by data transformation. In: Proceedings of the 18th Brazilian Symposium on Databases; 2003.
- 6. Vaidya J, Clifton C. Privacy-preserving outlier detection. In: Proceedings of the 4th IEEE International Conference on Data Mining; 2004. p. 233–40.