Clustering-Based Frequency l-Diversity Anonymization

Zare-Mirakabad, Mohammad-Reza; Jantan, Aman; Bressan, Stéphane

doi:10.1007/978-3-642-02617-1_17

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 5576)

Included in the following conference series:

International Conference on Information Security and Assurance

Abstract

Privacy preservation is realized by transforming data into k-anonymous (k-anonymization) and l-diverse (l-diversification) versions while minimizing information loss. Frequency l-diversity is possibly the most practical instance of the generic l-diversity principle for privacy preservation. In this paper, we propose an algorithm for frequency l-diversification. Our primary objective is to minimize information loss. Most studies in privacy preservation have focused on k-anonymization. While algorithms for simple l-diversity principles can be obtained by adapting k-anonymization algorithms, doing so is not straightforward for some other principles. Our algorithm, called Bucket Clustering, adapts k-member Clustering. However, to guarantee termination, we use hashing and buckets as in the Anatomy algorithm. To keep information loss low, we select, during the creation of each cluster, the tuples that minimize it. We empirically show that our algorithm achieves low information loss with acceptable efficiency.
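The abstract does not spell out the algorithm, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. It takes frequency l-diversity to mean that, in every group, no sensitive value accounts for more than a 1/l share of the tuples, and it illustrates Anatomy-style bucketing by hashing tuples on their sensitive value and drawing one tuple from each of the l largest buckets. The function names and the toy records are hypothetical, and the tuples are drawn arbitrarily here, whereas Bucket Clustering as described above would choose the tuples that minimize information loss.

```python
from collections import Counter, defaultdict


def is_frequency_l_diverse(groups, l):
    """Return True if no sensitive value exceeds a 1/l share in any group.

    `groups` is a list of lists of sensitive values (assumed representation).
    """
    for sensitive_values in groups:
        if not sensitive_values:
            return False
        top_count = Counter(sensitive_values).most_common(1)[0][1]
        # Integer form of top_count / size <= 1 / l, avoiding float rounding.
        if top_count * l > len(sensitive_values):
            return False
    return True


def bucket_groups(records, l):
    """Simplified Anatomy-style grouping (illustrative, not the paper's method).

    Records are (quasi_identifier, sensitive_value) pairs. They are hashed
    into buckets by sensitive value; each group takes one record from each
    of the l currently largest buckets. Records that cannot be grouped are
    returned as leftovers instead of being reassigned.
    """
    buckets = defaultdict(list)
    for quasi_identifier, sensitive in records:
        buckets[sensitive].append((quasi_identifier, sensitive))

    groups = []
    while sum(1 for b in buckets.values() if b) >= l:
        non_empty = [b for b in buckets.values() if b]
        largest = sorted(non_empty, key=len, reverse=True)[:l]
        # An arbitrary record is popped; a loss-aware choice would go here.
        groups.append([b.pop() for b in largest])

    leftovers = [r for b in buckets.values() for r in b]
    return groups, leftovers


# Hypothetical toy data: (age, diagnosis).
records = [(25, "flu"), (30, "flu"), (35, "cold"),
           (40, "cold"), (45, "hiv"), (50, "hiv")]
groups, leftovers = bucket_groups(records, l=2)
diverse = is_frequency_l_diverse([[s for _, s in g] for g in groups], l=2)
```

Each pass removes at least l tuples from the buckets, so the loop always terminates, which is the role the abstract attributes to hashing and buckets. Drawing from l distinct buckets also means every group holds l different sensitive values and therefore passes the frequency check by construction.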

References

Byun, J.-W., Kamra, A., Bertino, E., Li, N.: Efficient k-Anonymity Using Clustering Technique. In: CERIAS Tech Report 2006-10, Center for Education and Research in Information Assurance and Security, Purdue University (2006)
Bayardo, R.J., Agrawal, R.: Data Privacy through Optimal k-Anonymization. In: 21st International Conference on Data Engineering (ICDE) (2005)
Xiao, X., Tao, Y.: Anatomy: Simple and Effective Privacy Preservation. In: Very Large Data Bases (VLDB) Conference, pp. 139–150 (2006)
Sweeney, L.: k-Anonymity: A Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10, 557–570 (2002)
Samarati, P., Sweeney, L.: Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement through Generalization and Suppression. In: Technical Report SRI-CSL-98-04, SRI Computer Science Laboratory (1998)
Iyengar, V.: Transforming Data to Satisfy Privacy Constraints. In: SIGKDD, pp. 279–288 (2002)
LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Mondrian Multidimensional k-Anonymity. In: 22nd International Conference on Data Engineering (ICDE) (2006)
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. In: IEEE 22nd International Conference on Data Engineering (ICDE 2006) (2006)
Wong, R.C.-W., Li, J., Fu, A.W.-C., Wang, K.: (alpha, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy Preserving Data Publishing. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (2006)
Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In: IEEE 23rd International Conference on Data Engineering (ICDE), pp. 106–115 (2007)
Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast Data Anonymization with Low Information Loss. In: Very Large Data Bases (VLDB) Conference. ACM Press, New York (2007)
Blake, C., Merz, C.: UCI Repository of Machine Learning Databases (1998)

Author information

Authors and Affiliations

School of Computer Sciences, Universiti Sains Malaysia, Malaysia
Mohammad-Reza Zare-Mirakabad & Aman Jantan
School of Computing, National University of Singapore, Singapore
Mohammad-Reza Zare-Mirakabad & Stéphane Bressan

Editor information

Editors and Affiliations

Computer Engineering Department, Kyungnam University, 449, Wolyong-dong, 631-701, Kyungnam, Masan, Korea
Jong Hyuk Park
Institute of Communications Engineering, National Sun Yat-Sen University, Kaohsiung City, Taiwan
Hsiao-Hwa Chen
School of Computer Science, University of Oklahoma, 200 Felgar Street, 73019, P.O. Box, Norman, OK, USA
Mohammed Atiquzzaman
School of Computer Engineering, Hanshin University, Kyeong-Gi, Osan, Korea
Changhoon Lee
Division of Multimedia, Hannam University, Daejeon, Korea
Tai-hoon Kim
Division of Computer Engineering, Mokwon University, Daejeon, Korea
Sang-Soo Yeo

About this paper

Cite this paper

Zare-Mirakabad, MR., Jantan, A., Bressan, S. (2009). Clustering-Based Frequency l-Diversity Anonymization. In: Park, J.H., Chen, HH., Atiquzzaman, M., Lee, C., Kim, Th., Yeo, SS. (eds) Advances in Information Security and Assurance. ISA 2009. Lecture Notes in Computer Science, vol 5576. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02617-1_17

DOI: https://doi.org/10.1007/978-3-642-02617-1_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02616-4
Online ISBN: 978-3-642-02617-1
eBook Packages: Computer Science, Computer Science (R0)
