A Multi-phase k-anonymity Algorithm Based on Clustering Techniques

  • Fei Liu
  • Yan Jia
  • Weihong Han
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 320)


We propose a new k-anonymity algorithm for publishing datasets with privacy protection. We improve clustering techniques to lower data distortion and to enhance the diversity of sensitive attribute values. The algorithm has four phases. In phase one, tuples are distributed into several groups so that all tuples in a group share the same sensitive value. In phase two, groups smaller than the threshold are merged and then partitioned into several clusters according to their quasi-identifier attributes; each cluster later becomes an equivalence class. In phase three, the remaining tuples are distributed evenly among the clusters to satisfy L-diversity. Finally, in phase four, the quasi-identifier attribute values in each cluster are generalized to satisfy k-anonymity. We used the OCC dataset to compare our algorithm with a classic clustering-based method. Empirical results show that our algorithm can be used to publish datasets with high security and limited information loss.
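The first and last phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: the records, the attribute names (`age`, `zip`, `disease`), the hand-picked clusters, and all function names are hypothetical, and generalization to a (min, max) range is an assumed scheme for numeric attributes. Phases two and three are omitted because the abstract does not specify the threshold, the distance measure, or the redistribution rule.

```python
from collections import defaultdict

QI = ["age", "zip"]          # assumed quasi-identifier attributes
SENSITIVE = "disease"        # assumed sensitive attribute

def group_by_sensitive(rows, sensitive):
    """Phase one: bucket tuples so each group holds one sensitive value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[sensitive]].append(row)
    return dict(groups)

def generalize_cluster(cluster, quasi_identifiers):
    """Phase four (assumed scheme): replace each numeric quasi-identifier
    with the (min, max) range observed inside the cluster."""
    ranges = {
        q: (min(r[q] for r in cluster), max(r[q] for r in cluster))
        for q in quasi_identifiers
    }
    return [{**r, **ranges} for r in cluster]

def equivalence_classes(rows, quasi_identifiers):
    """Group published rows by their quasi-identifier values."""
    classes = defaultdict(list)
    for r in rows:
        classes[tuple(r[q] for q in quasi_identifiers)].append(r)
    return classes

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class contains at least k rows."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return all(len(c) >= k for c in classes.values())

def min_distinct_sensitive(rows, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values in any equivalence
    class (the 'distinct' reading of L-diversity)."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len({r[sensitive] for r in c}) for c in classes.values())

# Hypothetical records, made up for illustration only.
records = [
    {"age": 23, "zip": 41001, "disease": "flu"},
    {"age": 27, "zip": 41005, "disease": "asthma"},
    {"age": 25, "zip": 41003, "disease": "flu"},
    {"age": 52, "zip": 41020, "disease": "diabetes"},
    {"age": 58, "zip": 41027, "disease": "flu"},
    {"age": 55, "zip": 41022, "disease": "asthma"},
]

groups = group_by_sensitive(records, SENSITIVE)

# Clusters are chosen by hand here; in the paper they result from
# phases two and three.
clusters = [records[:3], records[3:]]
published = [row for c in clusters for row in generalize_cluster(c, QI)]

print(is_k_anonymous(published, QI, 3))
print(min_distinct_sensitive(published, QI, SENSITIVE))
```

Checking k-anonymity and the number of distinct sensitive values on the generalized output, rather than on the clusters, mirrors what a data recipient would actually see.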


privacy protection; k-anonymity; cluster; L-diversity





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Fei Liu (1)
  • Yan Jia (1)
  • Weihong Han (1)
  1. School of Computer Science, National University of Defense Technology, Changsha, China
