Abstract
We examine the effectiveness of distance preserving transformations in privacy preserving data mining. These techniques are potentially very useful because some important data mining algorithms, such as distance-based clustering and k-nearest neighbor classification, can be applied efficiently to the transformed data and produce exactly the same results as on the original data. However, to our knowledge, how well the original data is hidden has not been carefully studied. We take a step in this direction by assuming the role of an attacker armed with two types of prior information about the original data, and we examine how well the attacker can recover the original data from the transformed data and that prior information. Our results offer insight into the vulnerabilities of distance preserving transformations.
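The claim that distance-based algorithms return identical results on the transformed data can be illustrated with a small sketch. It assumes, purely for illustration, that the transformation is multiplication by a random orthogonal matrix (one common kind of distance preserving map; the abstract does not specify the construction). The data, dimensions, and helper names below are hypothetical.

```python
import numpy as np

def random_orthogonal(dim, rng):
    # Assumption: the Q factor of a Gaussian matrix's QR decomposition
    # serves as an orthogonal (distance preserving) transformation.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def pairwise_distances(points):
    # Euclidean distance between every pair of rows.
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def nearest_neighbor(dist):
    # Index of the closest other record for each record.
    d = dist.copy()
    np.fill_diagonal(d, np.inf)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
original = rng.standard_normal((50, 4))   # hypothetical private data, one record per row
q = random_orthogonal(4, rng)
transformed = original @ q.T              # the data that would be released

d_orig = pairwise_distances(original)
d_trans = pairwise_distances(transformed)

# Pairwise distances agree up to floating-point error.
distances_preserved = np.allclose(d_orig, d_trans)
# Hence each record keeps the same nearest neighbor.
same_neighbors = np.array_equal(nearest_neighbor(d_orig), nearest_neighbor(d_trans))
print(distances_preserved, same_neighbors)
```

Because every pairwise distance is unchanged, any algorithm that depends only on those distances behaves the same on both datasets. That same property is what an attacker with prior information can try to exploit.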
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, K., Giannella, C., Kargupta, H. (2006). An Attacker’s View of Distance Preserving Maps for Privacy Preserving Data Mining. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Knowledge Discovery in Databases: PKDD 2006. Lecture Notes in Computer Science, vol 4213. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871637_30
DOI: https://doi.org/10.1007/11871637_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45374-1
Online ISBN: 978-3-540-46048-0
eBook Packages: Computer Science, Computer Science (R0)