On t-Closeness with KL-Divergence and Semantic Privacy

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5982)

Abstract

In this paper, we study how to sanitize published data containing a sensitive attribute so as to achieve t-closeness and δ-disclosure privacy under the Incognito framework. t-closeness is a privacy measure proposed to counter the skewness attack and the similarity attack, two limitations of l-diversity. Under the t-closeness model, the distance between the distribution of the sensitive attribute in each equivalence class and its global distribution should be under the threshold t. Semantic privacy (δ-disclosure privacy), in contrast, measures the incremental information gain from the anonymized tables. We use the Kullback-Leibler divergence to measure the distance between distributions and discuss the properties of semantic privacy. We also study the relationship between t-closeness with KL-divergence and semantic privacy, and show that both t-closeness with KL-divergence and δ-disclosure privacy satisfy the generalization property and the subset property, which enables us to use the Incognito algorithm. Experiments demonstrate the efficiency and effectiveness of our approaches.
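
As an illustration of the t-closeness check described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the anonymized table is already grouped into equivalence classes, each given as a list of sensitive values, and that the global distribution is the empirical distribution over the whole table. The function names, the use of the natural logarithm, and the non-strict comparison against t are assumptions made for this example.

```python
import math
from collections import Counter


def distribution(values):
    """Empirical distribution of a list of sensitive attribute values."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def kl_divergence(p, q):
    """D(P || Q) = sum over s of P(s) * log(P(s) / Q(s)), natural log.

    Values with P(s) == 0 contribute nothing. Assumes Q(s) > 0 wherever
    P(s) > 0, which holds when P is computed from a subset of the table
    that Q describes.
    """
    return sum(ps * math.log(ps / q[s]) for s, ps in p.items() if ps > 0)


def satisfies_t_closeness(classes, t):
    """Check every equivalence class against the global distribution.

    classes: list of lists of sensitive values (hypothetical input layout).
    t: threshold on the KL-divergence (comparison with <= is an assumption).
    """
    global_dist = distribution([v for cls in classes for v in cls])
    return all(
        kl_divergence(distribution(cls), global_dist) <= t for cls in classes
    )


# Two classes that each hold a single disease are maximally skewed
# relative to the 50/50 global distribution: the divergence is log(2).
skewed = [["flu", "flu"], ["cancer", "cancer"]]
print(satisfies_t_closeness(skewed, 0.5))

# Classes that mirror the global distribution have zero divergence.
balanced = [["flu", "cancer"], ["cancer", "flu"]]
print(satisfies_t_closeness(balanced, 0.0))
```

Because every equivalence class is a subset of the table, the global probability of any value that occurs in a class is non-zero, so the logarithm is always defined in this setting.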

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sha, C., Li, Y., Zhou, A. (2010). On t-Closeness with KL-Divergence and Semantic Privacy. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 5982. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12098-5_12

  • DOI: https://doi.org/10.1007/978-3-642-12098-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12097-8

  • Online ISBN: 978-3-642-12098-5

  • eBook Packages: Computer Science, Computer Science (R0)
