Privacy Risk Diagnosis: Mining l-Diversity

Zare-Mirakabad, Mohammad-Reza; Jantan, Aman; Bressan, Stéphane

doi:10.1007/978-3-642-04205-8_19

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5667)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Most recent efforts to address data privacy have focused on devising algorithms for anonymization and diversification. Our objective is upstream of these works: we are concerned with the diagnosis of privacy risk and, more specifically in this paper, with l-diversity. We show that diagnosing l-diversity, for various definitions of the concept, is a knowledge discovery problem that can be mapped to the framework proposed by Mannila and Toivonen. The problem can therefore be solved with level-wise algorithms such as the apriori algorithm. We introduce and prove the necessary monotonicity property with respect to the subset operator on sets of attributes for several instantiations of the l-diversity principle. We present and evaluate an algorithm based on the apriori algorithm. This algorithm computes, for instance, “maximum sets of attributes that can safely be published without jeopardizing sensitive attributes”, even if they were quasi-identifiers available from external sources, and “minimum subsets of attributes which jeopardize anonymity”.
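
The level-wise search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes the simplest instantiation, distinct l-diversity (every group of rows that agree on the chosen attributes contains at least l distinct sensitive values), and the function names, the toy table and the column names are all hypothetical. The pruning relies on the monotonicity property: if a set of attributes is l-diverse, so is each of its subsets, because the groups of the subset are unions of groups of the superset.

```python
from itertools import combinations


def is_l_diverse(rows, attrs, sensitive, l):
    """Distinct l-diversity (assumed definition): every group of rows sharing
    the same values on `attrs` has at least `l` distinct sensitive values."""
    ordered = sorted(attrs)
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in ordered)
        groups.setdefault(key, set()).add(row[sensitive])
    return all(len(values) >= l for values in groups.values())


def mine_l_diversity(rows, attributes, sensitive, l):
    """Apriori-style level-wise search over sets of attributes.

    Returns (maximal_safe, minimal_unsafe): the maximal l-diverse sets and
    the minimal sets that violate l-diversity. The search starts at single
    attributes, i.e. it assumes the table as a whole is l-diverse.
    """
    safe, minimal_unsafe = set(), set()
    level = {frozenset([a]) for a in attributes}
    while level:
        safe_level = set()
        for candidate in level:
            if is_l_diverse(rows, candidate, sensitive, l):
                safe_level.add(candidate)
            else:
                # All proper subsets were safe, so this violation is minimal.
                minimal_unsafe.add(candidate)
        safe |= safe_level
        size = len(next(iter(level))) + 1
        unions = {a | b for a in safe_level for b in safe_level
                  if len(a | b) == size}
        # Keep a candidate only if every subset one attribute smaller is safe.
        level = {c for c in unions
                 if all(frozenset(s) in safe_level
                        for s in combinations(c, size - 1))}
    maximal_safe = {s for s in safe if not any(s < t for t in safe)}
    return maximal_safe, minimal_unsafe


# Hypothetical toy table; "disease" plays the role of the sensitive attribute.
rows = [
    {"zip": "A", "age": 30, "sex": "M", "disease": "flu"},
    {"zip": "A", "age": 30, "sex": "F", "disease": "cold"},
    {"zip": "A", "age": 40, "sex": "M", "disease": "flu"},
    {"zip": "A", "age": 40, "sex": "F", "disease": "cold"},
    {"zip": "B", "age": 30, "sex": "M", "disease": "cold"},
    {"zip": "B", "age": 30, "sex": "F", "disease": "flu"},
    {"zip": "B", "age": 40, "sex": "M", "disease": "cold"},
    {"zip": "B", "age": 40, "sex": "F", "disease": "flu"},
]

maximal_safe, minimal_unsafe = mine_l_diversity(
    rows, ["zip", "age", "sex"], "disease", l=2)
print("maximal safe sets:", sorted(sorted(s) for s in maximal_safe))
print("minimal unsafe sets:", sorted(sorted(s) for s in minimal_unsafe))
```

On this toy table with l = 2, the pairs {zip, age} and {age, sex} can be published together, whereas {zip, sex} is a minimal set that isolates a single sensitive value; the triple is never tested because one of its subsets already fails.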

References

Flouvat, F., Marchi, F.D., Petit, J.-M.: iZi: A New Toolkit for Pattern Mining Problems, pp. 131–136. Springer, Heidelberg (2008)
Knobbe, A.J., Ho, E.K.Y.: Maximally informative k-itemsets and their efficient discovery. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 237–244 (2006)
Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In: ACM Conference on Management of Data (SIGMOD), pp. 207–216 (1993)
USA CENSUS DATA
Blake, C., Merz, C.: UCI Repository of Machine Learning Databases (1998)
Byun, J.-W., Kamra, A., Bertino, E., Li, N.: Efficient k-Anonymity Using Clustering Technique. CERIAS Tech Report 2006-10, Center for Education and Research in Information Assurance and Security, Purdue University (2006)
Bayardo, R.J., Agrawal, R.: Data Privacy through Optimal k-Anonymization. In: 21st International Conference on Data Engineering, ICDE (2005)
Xiao, X., Tao, Y.: Anatomy: Simple and Effective Privacy Preservation. In: Very Large Data Bases (VLDB) Conference, pp. 139–150 (2006)
Sweeney, L.: k-Anonymity: A Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10, 557–570 (2002)
Aggarwal, G., Feder, T., Kenthapadi, K., Khuller, S., Panigrahy, R., Thomas, D., Zhu, A.: Achieving Anonymity via Clustering. In: Principles of Database Systems (PODS) (2006)
Samarati, P., Sweeney, L.: Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement through Generalization and Suppression. Technical Report SRI-CSL-98-04, SRI Computer Science Laboratory (1998)
Iyengar, V.: Transforming Data to Satisfy Privacy Constraints. In: SIGKDD, pp. 279–288 (2002)
Xu, J., Wang, W., Pei, J., Wang, X., Shi, B., Fu, A.W.-C.: Utility-based Anonymization Using Local Recoding. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2006)
LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Incognito: Efficient Full-domain k-Anonymity. In: ACM Conference on Management of Data (SIGMOD), pp. 49–60 (2005)
LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Mondrian Multidimensional k-Anonymity. In: 22nd International Conference on Data Engineering, ICDE (2006)
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. In: IEEE 22nd International Conference on Data Engineering, ICDE 2006 (2006)
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. ACM Transactions on Knowledge Discovery from Data 1 (2007)
Wong, R.C.-W., Li, J., Fu, A.W.-C., Wang, K.: (α, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy Preserving Data Publishing. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD (2006)
Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In: IEEE 23rd International Conference on Data Engineering (ICDE), pp. 106–115 (2007)
Wong, R.C.-W., Liu, Y., Yin, J., Huang, Z., Fu, A.W.-C., Pei, J.: (α, k)-Anonymity Based Privacy Preservation by Lossy Join. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds.) APWeb/WAIM 2007. LNCS, vol. 4505, pp. 733–744. Springer, Heidelberg (2007)
Li, J., Wong, R.C.-W., Fu, A.W.-C., Pei, J.: Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures, pp. 405–416. Springer, Heidelberg (2006)
Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast Data Anonymization with Low Information Loss. In: Very Large Data Bases (VLDB) Conference. ACM, New York (2007)
Motwani, R., Xu, Y.: Efficient Algorithms for Masking and Finding Quasi-Identifiers. In: SIAM International Workshop on Practical Privacy-Preserving Data Mining (2008)
Truta, T.M., Vinay, B.: Privacy Protection: p-Sensitive k-Anonymity Property. In: International Workshop of Privacy Data Management (PDM), in conjunction with the 22nd International Conference on Data Engineering, ICDE (2006)
Mannila, H., Toivonen, H.: Levelwise Search and Borders of Theories in Knowledge Discovery. Data Mining and Knowledge Discovery 1, 241–258 (1997)
Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Transactions on Knowledge and Data Engineering 13(6), 1010–1027 (2001)
Cover, T.M., Thomas, J.A.: Elements of Information Theory. John Wiley & Sons, Chichester (1991)
Zare Mirakabad, M.R., Jantan, A., Bressan, S.: Towards a Privacy Diagnosis Centre: Measuring k-anonymity. In: The 2008 International Symposium on Computer Science and its Applications. IEEE CS, Los Alamitos (2008)

Author information

Authors and Affiliations

School of Computer Sciences, Universiti Sains Malaysia, Malaysia
Mohammad-Reza Zare-Mirakabad & Aman Jantan
School of Computing, National University of Singapore, Singapore
Mohammad-Reza Zare-Mirakabad & Stéphane Bressan

Editor information

Editors and Affiliations

Hong Kong University of Science and Technology, Hong Kong
Lei Chen
Swinburne University of Technology, Melbourne, Australia
Chengfei Liu
CSIRO, Castray Esplanade, 7000, Hobart, TAS, Australia
Qing Liu
School of Information Technology and Electrical Engineering, The University of Queensland, 4072, Brisbane, QLD, Australia
Ke Deng

Cite this paper

Zare-Mirakabad, MR., Jantan, A., Bressan, S. (2009). Privacy Risk Diagnosis: Mining l-Diversity. In: Chen, L., Liu, C., Liu, Q., Deng, K. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04205-8_19

DOI: https://doi.org/10.1007/978-3-642-04205-8_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04204-1
Online ISBN: 978-3-642-04205-8
eBook Packages: Computer Science, Computer Science (R0)
