Privacy Risk Diagnosis: Mining l-Diversity

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5667)

Abstract

Most recent work on data privacy has focused on devising algorithms for anonymization and diversification. Our objective lies upstream of these works: we are concerned with the diagnosis of privacy risk and, in this paper, specifically with l-diversity. We show that diagnosing l-diversity, for various definitions of the concept, is a knowledge discovery problem that can be mapped to the framework proposed by Mannila and Toivonen. The problem can therefore be solved with level-wise algorithms such as the apriori algorithm. We introduce and prove the necessary monotonicity property, with respect to the subset operator on attribute sets, for several instantiations of the l-diversity principle. We present and evaluate an algorithm based on the apriori algorithm. This algorithm computes, for instance, “maximum sets of attributes that can safely be published without jeopardizing sensitive attributes”, even if they were quasi-identifiers available from external sources, and “minimum subsets of attributes which jeopardize anonymity”.
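The level-wise idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes one particular instantiation, distinct l-diversity (every group of rows sharing the same values on the candidate attributes must contain at least l distinct sensitive values), and it assumes the whole table already contains at least l distinct sensitive values. The function names `is_l_diverse` and `mine_l_diversity` and the toy table are hypothetical. The sketch relies on the monotonicity property: if an attribute set is l-diverse, so is every subset of it, so candidates can be generated apriori-style.

```python
from collections import defaultdict
from itertools import combinations


def is_l_diverse(rows, attrs, sensitive, l):
    """Distinct l-diversity (assumed definition): every group of rows that
    agree on `attrs` holds at least `l` distinct values of `sensitive`."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        groups[key].add(row[sensitive])
    return all(len(values) >= l for values in groups.values())


def mine_l_diversity(rows, attributes, sensitive, l):
    """Level-wise (apriori-style) search over attribute sets.

    Returns (maximal_safe, minimal_unsafe): the largest attribute sets that
    remain l-diverse and the smallest sets that violate l-diversity.
    Assumes the empty attribute set (the whole table) is l-diverse.
    """
    all_safe = set()
    minimal_unsafe = []
    level = [frozenset([a]) for a in sorted(attributes)]
    while level:
        safe_level = set()
        for candidate in level:
            if is_l_diverse(rows, sorted(candidate), sensitive, l):
                safe_level.add(candidate)
            else:
                # Every proper subset was safe, so this violation is minimal.
                minimal_unsafe.append(candidate)
        all_safe |= safe_level
        # Join safe sets that differ in one attribute; keep a candidate only
        # if all of its one-smaller subsets are safe (monotonicity pruning).
        next_level = set()
        for a, b in combinations(safe_level, 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                union - {x} in safe_level for x in union
            ):
                next_level.add(union)
        level = sorted(next_level, key=sorted)
    maximal_safe = [s for s in all_safe if not any(s < t for t in all_safe)]
    return maximal_safe, minimal_unsafe


# Hypothetical toy table: three candidate attributes, one sensitive attribute.
rows = [
    {"age": 30, "zip": "A", "sex": "F", "disease": "flu"},
    {"age": 30, "zip": "A", "sex": "M", "disease": "cold"},
    {"age": 30, "zip": "B", "sex": "F", "disease": "flu"},
    {"age": 30, "zip": "B", "sex": "M", "disease": "asthma"},
    {"age": 40, "zip": "A", "sex": "F", "disease": "cold"},
    {"age": 40, "zip": "A", "sex": "M", "disease": "asthma"},
    {"age": 40, "zip": "B", "sex": "F", "disease": "asthma"},
    {"age": 40, "zip": "B", "sex": "M", "disease": "flu"},
]

maximal, minimal = mine_l_diversity(rows, ["age", "zip", "sex"], "disease", 2)
print("maximal safe sets:", [sorted(s) for s in maximal])
print("minimal unsafe sets:", [sorted(s) for s in minimal])
```

On this toy table with l = 2, the sketch reports {age, zip} and {sex, zip} as the maximal sets that can be published safely and {age, sex} as the only minimal violating set. The triple {age, sex, zip} is never tested, because one of its subsets already fails; this pruning is what the monotonicity property buys.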

References

  1. Flouvat, F., Marchi, F.D., Petit, J.-M.: iZi: A New Toolkit for Pattern Mining Problems, pp. 131–136. Springer, Heidelberg (2008)

  2. Knobbe, A.J., Ho, E.K.Y.: Maximally informative k-itemsets and their efficient discovery. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 237–244 (2006)

  3. Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In: ACM Conference on Management of Data (SIGMOD), pp. 207–216 (1993)

  4. USA CENSUS DATA

  5. Blake, C., Merz, C.: UCI Repository of Machine Learning Databases (1998)

  6. Byun, J.-W., Kamra, A., Bertino, E., Li, N.: Efficient k-Anonymity Using Clustering Technique. In: CERIAS Tech Report 2006-10, Center for Education and Research in Information Assurance and Security. Purdue University (2006)

  7. Bayardo, R.J., Agrawal, R.: Data Privacy through Optimal k-Anonymization. In: 21st International Conference on Data Engineering, ICDE (2005)

  8. Xiao, X., Tao, Y.: Anatomy: Simple and Effective Privacy Preservation. In: Very Large Data Bases (VLDB) Conference, pp. 139–150 (2006)

  9. Sweeney, L.: k-Anonymity: A Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10, 557–570 (2002)

  10. Aggarwal, G., Feder, T., Kenthapadi, K., Khuller, S., Panigrahy, R., Thomas, D., Zhu, A.: Achieving Anonymity via Clustering. In: Principles of Database Systems (PODS) (2006)

  11. Samarati, P., Sweeney, L.: Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement through Generalization and Suppression. In: Technical Report SRI-CSL-98-04. SRI Computer Science Laboratory (1998)

  12. Iyengar, V.: Transforming Data to Satisfy Privacy Constraints. In: SIGKDD, pp. 279–288 (2002)

  13. Xu, J., Wang, W., Pei, J., Wang, X., Shi, B., Fu, A.W.-C.: Utility-based Anonymization Using Local Recoding. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2006)

  14. LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Incognito: Efficient Full-domain k-Anonymity. In: ACM Conference on Management of Data (SIGMOD), pp. 49–60 (2005)

  15. LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Mondrian Multidimensional k-Anonymity. In: 22nd International Conference on Data Engineering, ICDE (2006)

  16. Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. In: IEEE 22nd International Conference on Data Engineering, ICDE 2006 (2006)

  17. Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. ACM Transactions on Knowledge Discovery from Data 1 (2007)

  18. Wong, R.C.-W., Li, J., Fu, A.W.-C., Wang, K.: (α, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy Preserving Data Publishing. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD (2006)

  19. Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In: IEEE 23rd International Conference on Data Engineering (ICDE), pp. 106–115 (2007)

  20. Wong, R.C.-W., Liu, Y., Yin, J., Huang, Z., Fu, A.W.-C., Pei, J.: (α, k)-anonymity based privacy preservation by lossy join. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds.) APWeb/WAIM 2007. LNCS, vol. 4505, pp. 733–744. Springer, Heidelberg (2007)

  21. Li, J., Wong, R.C.-W., Fu, A.W.-C., Pei, J.: Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures, pp. 405–416. Springer, Heidelberg (2006)

  22. Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast Data Anonymization with Low Information Loss. In: Very Large Data Bases (VLDB) Conference. ACM, New York (2007)

  23. Motwani, R., Xu, Y.: Efficient Algorithms for Masking and Finding Quasi-Identifiers. In: SIAM International Workshop on Practical Privacy-Preserving Data Mining (2008)

  24. Truta, T.M., Vinay, B.: Privacy Protection: p-Sensitive k-Anonymity Property. In: International Workshop of Privacy Data Management (PDM), in conjunction with 22nd International Conference of Data Engineering, ICDE (2006)

  25. Mannila, H., Toivonen, H.: Levelwise Search and Borders of Theories in Knowledge Discovery. In: Data Mining and Knowledge Discovery, vol. 1, pp. 241–258 (1997)

  26. Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Transactions on Knowledge and Data Engineering 13(6), 1010–1027 (2001)

  27. Cover, T.M., Thomas, J.A.: Elements of Information Theory. John Wiley & Sons, Chichester (1991)

  28. Zare Mirakabad, M.R., Jantan, A., Bressan, S.: Towards a Privacy Diagnosis Centre: Measuring k-anonymity. In: The 2008 International Symposium on Computer Science and its Applications. IEEE CS, Los Alamitos (2008)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zare-Mirakabad, MR., Jantan, A., Bressan, S. (2009). Privacy Risk Diagnosis: Mining l-Diversity. In: Chen, L., Liu, C., Liu, Q., Deng, K. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04205-8_19

  • DOI: https://doi.org/10.1007/978-3-642-04205-8_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04204-1

  • Online ISBN: 978-3-642-04205-8

  • eBook Packages: Computer Science, Computer Science (R0)
