Publishing Set-Valued Data Against Realistic Adversaries

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Privacy protection in publishing set-valued data is an important problem. However, the privacy notions proposed in prior work either assume that the adversary has unbounded knowledge, and hence over-protect the data at the cost of excessive distortion, or ignore knowledge about the absence of certain items and so fail to prevent attacks based on such knowledge. To address these issues, we propose a new privacy notion, (k, ℓ)^(m,n)-privacy, which prevents both identity disclosure and sensitive item disclosure to a realistic privacy adversary who has bounded knowledge about the presence of items and bounded knowledge about the absence of items. In addition to the new notion, our contribution is an efficient algorithm that finds a near-optimal solution and is applicable to anonymizing real-world databases. Extensive experiments on real-world databases show that our algorithm outperforms the state-of-the-art algorithms.
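
The abstract does not give the formal definition of the notion, so the following is only an illustrative sketch under stated assumptions, not the paper's algorithm. It assumes that the adversary may know up to m non-sensitive items present in a target transaction and up to n non-sensitive items absent from it, that identity disclosure is prevented when every non-empty set of matching transactions has at least k members, and that sensitive item disclosure is prevented when no sensitive item occurs in more than a 1/ℓ fraction of those matches. The function name `satisfies_privacy` and the toy database are hypothetical.

```python
from itertools import combinations

def satisfies_privacy(db, sensitive, k, ell, m, n):
    """Brute-force check of one ASSUMED reading of the privacy notion.

    db        -- list of transactions, each a set of items
    sensitive -- set of sensitive items (assumed unknown to the adversary)
    k, ell    -- bounds on identity / sensitive item disclosure
    m, n      -- max number of items known present / known absent
    """
    public = sorted(set().union(*db) - sensitive)
    for p_size in range(m + 1):
        for present in combinations(public, p_size):
            rest = [i for i in public if i not in present]
            for a_size in range(n + 1):
                for absent in combinations(rest, a_size):
                    # Transactions consistent with the adversary's knowledge.
                    matches = [t for t in db
                               if set(present) <= t and not (set(absent) & t)]
                    if not matches:
                        continue  # this knowledge matches nobody
                    if len(matches) < k:
                        return False  # identity disclosure
                    for s in sensitive:
                        # Fraction of matches containing s must be <= 1/ell.
                        if sum(s in t for t in matches) * ell > len(matches):
                            return False  # sensitive item disclosure
    return True

# Toy database: safe against presence-only knowledge, but knowing that
# item "b" is absent isolates the single transaction containing "hiv".
db = [{"a", "b"}, {"a", "b"}, {"a", "hiv"}]
presence_only = satisfies_privacy(db, {"hiv"}, k=2, ell=2, m=1, n=0)
with_absence = satisfies_privacy(db, {"hiv"}, k=2, ell=2, m=1, n=1)
```

In this toy case the check passes when the adversary knows only about present items and fails once a single absent item is known, which is the kind of attack the abstract says earlier notions overlook. The enumeration is exponential in m and n and is meant only to make the conditions concrete.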

Author information

Corresponding author

Correspondence to Jun-Qiang Liu.

Additional information

This work is supported in part by the Natural Science Foundation of Zhejiang Province of China under Grant No. Y105700, and the Science and Technology Development Plan of Zhejiang Province of China under Grant No. 2006C21034.

About this article

Cite this article

Liu, JQ. Publishing Set-Valued Data Against Realistic Adversaries. J. Comput. Sci. Technol. 27, 24–36 (2012). https://doi.org/10.1007/s11390-012-1203-6

  • DOI: https://doi.org/10.1007/s11390-012-1203-6
