A Survey of Attack Techniques on Privacy-Preserving Data Perturbation Methods

  • Kun Liu
  • Chris Giannella
  • Hillol Kargupta
Part of the Advances in Database Systems book series (ADBS, volume 34)

We focus primarily on the use of additive and matrix multiplicative data perturbation techniques in privacy-preserving data mining (PPDM). We survey a recent body of research aimed at better understanding the vulnerabilities of these techniques. The researchers involved assumed the role of an attacker and developed methods for estimating the original data from the perturbed data and any available prior knowledge. Finally, we briefly discuss research aimed at attacking k-anonymization, another data perturbation technique in PPDM.
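
As a rough illustration of the attacker's viewpoint, the following Python sketch shows a known input-output attack on a toy matrix multiplicative perturbation. It assumes, purely for illustration, that the data are two-dimensional, that the perturbation is a pure rotation by a secret angle, and that the attacker's prior knowledge is a single original record together with its perturbed counterpart. The function names (`rotate`, `perturb`, `estimate_angle`, `attack`) are hypothetical and are not taken from the surveyed work.

```python
import math
import random


def rotate(point, theta):
    """Rotate a 2-D point counter-clockwise by theta radians."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def perturb(records, theta):
    """Data owner's step: release every record rotated by the secret angle."""
    return [rotate(r, theta) for r in records]


def estimate_angle(known_original, known_perturbed):
    """Attacker's step: infer the rotation angle from one known pair.

    Assumption: the known original record is not the origin, so its
    polar angle is well defined.
    """
    a0 = math.atan2(known_original[1], known_original[0])
    a1 = math.atan2(known_perturbed[1], known_perturbed[0])
    return a1 - a0


def attack(perturbed, known_original, known_perturbed):
    """Undo the estimated rotation on all released records."""
    theta_hat = estimate_angle(known_original, known_perturbed)
    return [rotate(p, -theta_hat) for p in perturbed]


# Toy data set (synthetic, for illustration only).
rng = random.Random(0)
original = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(5)]

secret_theta = 1.234  # known only to the data owner
released = perturb(original, secret_theta)

# The attacker is assumed to know original[0] and its released image.
recovered = attack(released, original[0], released[0])

# Largest Euclidean distance between an original record and its estimate.
max_error = max(math.dist(o, r) for o, r in zip(original, recovered))
print(max_error)
```

In this idealized setting one known pair suffices because a two-dimensional rotation has a single free parameter. Higher-dimensional orthogonal transformations, reflections, or added noise would need more prior knowledge and a different estimation procedure.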


Keywords: data perturbation, additive noise, matrix multiplicative noise, attack techniques, k-anonymity





Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Kun Liu (1)
  • Chris Giannella (2)
  • Hillol Kargupta (3)

  1. IBM Almaden Research Center, San Jose, USA
  2. Department of Computer Science, Loyola College in Maryland, Baltimore, USA
  3. Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore, USA
