Deriving Private Information from Arbitrarily Projected Data
Distance-preserving projection-based perturbation has gained much attention in privacy-preserving data mining in recent years because it mitigates the privacy/accuracy tradeoff by achieving perfect data mining accuracy. An a priori knowledge PCA-based attack was recently investigated to show the vulnerability of this distance-preserving projection-based perturbation approach when a sample dataset is available to attackers. As a result, non-distance-preserving projection was suggested instead, since it is resilient to the PCA attack at the cost of some data mining accuracy. In this paper we investigate how to recover the original data from arbitrarily projected data and propose AK-ICA, an Independent Component Analysis based reconstruction method. Theoretical analysis and experimental results show that both distance-preserving and non-distance-preserving projection approaches are vulnerable to this attack. Our results offer insight into the vulnerabilities of the projection-based approach and suggest careful scrutiny when it is applied in privacy-preserving data mining.
Keywords: Independent Component Analysis; Transformation Matrix; Reconstruction Error; Privacy Preserve
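To make the distinction between the two kinds of projection concrete, here is a minimal NumPy sketch. It is not the AK-ICA method and is not taken from the paper; it only illustrates that an orthogonal projection leaves pairwise Euclidean distances unchanged while an arbitrary random projection does not. The data matrix `X`, the matrices `Q` and `R`, and the helper `pairwise_distances` are all hypothetical names chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical private data: 5 attributes (rows) by 200 records (columns).
X = rng.normal(size=(5, 200))

# Distance-preserving projection: a random orthogonal matrix obtained
# from the QR decomposition of a random square matrix.
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))

# Arbitrary (generally non-distance-preserving) projection:
# a random matrix with no orthogonality constraint.
R = rng.normal(size=(5, 5))


def pairwise_distances(data):
    """Euclidean distances between all pairs of records (columns)."""
    diff = data[:, :, None] - data[:, None, :]
    return np.sqrt((diff ** 2).sum(axis=0))


d_orig = pairwise_distances(X)

# Compare distances before and after each projection.
orthogonal_preserves = np.allclose(d_orig, pairwise_distances(Q @ X))
arbitrary_preserves = np.allclose(d_orig, pairwise_distances(R @ X))

print("orthogonal projection preserves distances:", orthogonal_preserves)
print("arbitrary projection preserves distances:", arbitrary_preserves)
```

Because distance-based mining algorithms depend only on the pairwise distances, the orthogonal case explains why such perturbation can leave mining accuracy intact, while the arbitrary case shows where some accuracy is given up.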