Clustering-Based Frequency l-Diversity Anonymization

  • Mohammad-Reza Zare-Mirakabad
  • Aman Jantan
  • Stéphane Bressan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5576)


Privacy preservation is realized by transforming data into k-anonymous (k-anonymization) and l-diverse (l-diversification) versions while minimizing information loss. Frequency l-diversity is possibly the most practical instance of the generic l-diversity principle for privacy preservation. In this paper, we propose an algorithm for frequency l-diversification whose primary objective is to minimize information loss. Most studies in privacy preservation have focused on k-anonymization. While l-diversification algorithms for simple principles can be obtained by adapting k-anonymization algorithms, such an adaptation is not straightforward for some other principles. Our algorithm, called Bucket Clustering, adapts k-member Clustering. However, to guarantee termination, we use hashing and buckets as in the Anatomy algorithm. To minimize information loss, we select, during the creation of each cluster, the tuples that add the least information loss. We empirically show that our algorithm achieves low information loss with acceptable efficiency.
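The abstract does not spell out the algorithm, so the following Python sketch only illustrates the general idea it describes and is not the authors' Bucket Clustering. It assumes the usual reading of frequency l-diversity (in every equivalence class, no sensitive value accounts for more than a 1/l share of the tuples), hashes tuples into buckets by sensitive value as in Anatomy, and builds each cluster from the l largest buckets while greedily picking the tuple that adds the least loss. The names `bucket_cluster`, `span_loss`, and `is_frequency_l_diverse`, the range-based loss measure, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict

def is_frequency_l_diverse(clusters, l):
    """True if, in every cluster, no sensitive value exceeds a 1/l share."""
    for cluster in clusters:
        counts = Counter(s for _, s in cluster)
        if not cluster or max(counts.values()) * l > len(cluster):
            return False
    return True

def span_loss(qis):
    """Toy information loss (an assumption): sum of per-attribute ranges."""
    return sum(max(col) - min(col) for col in zip(*qis))

def bucket_cluster(records, l):
    """Greedy sketch: records are (numeric_qi_tuple, sensitive_value) pairs."""
    # Hash every tuple into a bucket keyed by its sensitive value.
    buckets = defaultdict(list)
    for qi, s in records:
        buckets[s].append((qi, s))

    clusters = []
    # Each round empties one tuple from l buckets, so the loop terminates.
    while sum(1 for b in buckets.values() if b) >= l:
        chosen = sorted((b for b in buckets.values() if b),
                        key=len, reverse=True)[:l]
        cluster = [chosen[0].pop()]  # seed from the largest bucket
        for b in chosen[1:]:
            # Pick the tuple whose addition keeps the cluster's loss lowest.
            best = min(b, key=lambda r: span_loss(
                [q for q, _ in cluster] + [r[0]]))
            b.remove(best)
            cluster.append(best)
        clusters.append(cluster)

    # Leftover tuples join the cheapest cluster lacking their sensitive value.
    for b in buckets.values():
        for rec in b:
            candidates = [c for c in clusters
                          if all(s != rec[1] for _, s in c)]
            if not candidates:
                raise ValueError("no l-diverse assignment found for this l")
            target = min(candidates, key=lambda c: span_loss(
                [q for q, _ in c] + [rec[0]]))
            target.append(rec)
    return clusters

# Hypothetical data: (age, zone) quasi-identifiers and a sensitive diagnosis.
records = [((25, 10), "flu"), ((27, 12), "flu"), ((26, 11), "cold"),
           ((60, 50), "cold"), ((62, 52), "asthma"), ((24, 9), "asthma"),
           ((61, 51), "flu")]
clusters = bucket_cluster(records, 2)
print(is_frequency_l_diverse(clusters, 2))
```

Because every cluster in this sketch holds at least l tuples with pairwise distinct sensitive values, each value's share is at most 1/l by construction; the checker is included only to make that property explicit.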


Equivalence Class, Information Loss, Optimistic Cluster, Privacy Preservation, Sensitive Attribute
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Byun, J.-W., Kamra, A., Bertino, E., Li, N.: Efficient k-Anonymity Using Clustering Technique. In: CERIAS Tech Report 2006-10, Center for Education and Research in Information Assurance and Security, Purdue University (2006)
  2. Bayardo, R.J., Agrawal, R.: Data Privacy through Optimal k-Anonymization. In: 21st International Conference on Data Engineering (ICDE) (2005)
  3. Xiao, X., Tao, Y.: Anatomy: Simple and Effective Privacy Preservation. In: Very Large Data Bases (VLDB) Conference, pp. 139–150 (2006)
  4. Sweeney, L.: k-Anonymity: A Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10, 557–570 (2002)
  5. Samarati, P., Sweeney, L.: Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement through Generalization and Suppression. In: Technical Report SRI-CSL-98-04, SRI Computer Science Laboratory (1998)
  6. Iyengar, V.: Transforming Data to Satisfy Privacy Constraints. In: SIGKDD, pp. 279–288 (2002)
  7. LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Mondrian Multidimensional k-Anonymity. In: 22nd International Conference on Data Engineering (ICDE) (2006)
  8. Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. In: IEEE 22nd International Conference on Data Engineering (ICDE 2006) (2006)
  9. Wong, R.C.-W., Li, J., Fu, A.W.-C., Wang, K.: (alpha, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy Preserving Data Publishing. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (2006)
  10. Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In: IEEE 23rd International Conference on Data Engineering (ICDE), pp. 106–115 (2007)
  11. Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast Data Anonymization with Low Information Loss. In: Very Large Data Bases (VLDB) Conference. ACM Press, New York (2007)
  12. Blake, C., Merz, C.: UCI Repository of Machine Learning Databases (1998)

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Mohammad-Reza Zare-Mirakabad (1, 2)
  • Aman Jantan (1)
  • Stéphane Bressan (2)
  1. School of Computer Sciences, Universiti Sains Malaysia, Malaysia
  2. School of Computing, National University of Singapore, Singapore
