K-Anonymity Algorithm Based on Improved Clustering

  • Wantong Zheng
  • Zhongyue Wang
  • Tongtong Lv
  • Yong Ma
  • Chunfu Jia
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11335)


K-anonymity is the most widely used technique in the field of privacy preservation. It performs particularly well at protecting data privacy in data publication, location-based services, and social networks. In this paper, we propose a new algorithm that achieves k-anonymity through improved clustering, optimizing the clustering process by considering the overall distribution of quasi-identifier groups in a multidimensional space. Using locally optimal clustering, we aim to minimize intra-cluster distances and maximize inter-cluster distances, which greatly improves the quality of the anonymized data. Experimental results show that, compared with popular algorithms such as k-member, Mondrian, and one-time k-means, our algorithm effectively reduces information loss while generating equivalence classes: the total information loss of the anonymized dataset is about 20% lower on average than that of the other algorithms. The algorithm also performs well on both numerical and categorical attributes.
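To make the terms in the abstract concrete, the following is a minimal sketch of clustering-based k-anonymization in general. It is not the algorithm proposed in this paper, whose details are not given here. It assumes purely numerical quasi-identifiers, a simple greedy grouping strategy, and a range-based information-loss measure of the kind commonly used in the clustering literature; the function names `spans`, `dist`, `greedy_k_clusters`, `generalize`, and `information_loss` are all hypothetical.

```python
from typing import List, Sequence, Tuple

Record = Sequence[float]  # one row of numerical quasi-identifiers (assumption)


def spans(records: Sequence[Record]) -> List[float]:
    """Global range (max - min) of each attribute; 1.0 if the attribute is constant."""
    out = []
    for d in range(len(records[0])):
        col = [r[d] for r in records]
        out.append((max(col) - min(col)) or 1.0)
    return out


def dist(a: Record, b: Record, span: Sequence[float]) -> float:
    """Squared Euclidean distance after scaling each attribute by its global range."""
    return sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, span))


def greedy_k_clusters(records: Sequence[Record], k: int) -> List[List[int]]:
    """Group record indices into clusters of at least k members (illustrative greedy rule)."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    span = spans(records)
    remaining = list(range(len(records)))
    clusters: List[List[int]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Take the k - 1 records closest to the seed.
        remaining.sort(key=lambda i: dist(records[seed], records[i], span))
        clusters.append([seed] + remaining[:k - 1])
        remaining = remaining[k - 1:]
    # Fewer than k records are left: attach each to the cluster with the nearest seed.
    for i in remaining:
        nearest = min(clusters, key=lambda c: dist(records[i], records[c[0]], span))
        nearest.append(i)
    return clusters


def generalize(records: Sequence[Record],
               clusters: List[List[int]]) -> List[Tuple[Tuple[float, float], ...]]:
    """Replace every value by the (min, max) range of its cluster (its equivalence class)."""
    out: List[Tuple[Tuple[float, float], ...]] = [()] * len(records)
    for c in clusters:
        ranges = tuple(
            (min(records[i][d] for i in c), max(records[i][d] for i in c))
            for d in range(len(records[0]))
        )
        for i in c:
            out[i] = ranges
    return out


def information_loss(records: Sequence[Record], clusters: List[List[int]]) -> float:
    """Sum over clusters of |cluster| * sum over attributes of (cluster range / global range)."""
    span = spans(records)
    total = 0.0
    for c in clusters:
        for d, s in enumerate(span):
            col = [records[i][d] for i in c]
            total += len(c) * (max(col) - min(col)) / s
    return total


if __name__ == "__main__":
    # Hypothetical (age, salary) records, for illustration only.
    data = [(25, 50000), (27, 52000), (26, 51000), (60, 90000), (62, 95000), (61, 92000)]
    groups = greedy_k_clusters(data, k=3)
    print(generalize(data, groups))
    print(information_loss(data, groups))
```

Tighter clusters yield narrower generalized ranges and therefore lower information loss under this measure, which is the intuition behind minimizing intra-cluster distances. Categorical attributes, which the paper also handles, would need a hierarchy-based distance and loss term that this sketch leaves out.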


Keywords: Information loss, Privacy preservation, K-anonymity, Clustering



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Wantong Zheng (1)
  • Zhongyue Wang (1)
  • Tongtong Lv (1)
  • Yong Ma (2)
  • Chunfu Jia (1)
  1. College of Cyberspace Security, Nankai University, Tianjin, China
  2. Civil Aviation University of China, Tianjin, China
