K-Means Clustering Microaggregation for Statistical Disclosure Control

  • Md. Enamul Kabir
  • Abdun Naser Mahmood
  • Abdul K. Mustafa
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 174)


This paper presents a K-means clustering technique for microaggregation that pursues two objectives: minimizing information loss and maintaining k-anonymity. The technique starts with a single cluster and then partitions the dataset into two or more clusters so that the total information loss across all clusters is minimized while the k-anonymity requirement is satisfied. The structure of the K-means clustering problem is defined and investigated, and an algorithm for solving it is developed. The performance of the K-means clustering algorithm is compared against the most recent microaggregation methods. Experimental results show that the K-means clustering algorithm incurs less information loss than the latest microaggregation methods in all of the test situations.
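The abstract does not state how information loss is measured or how clusters are split, so the following is only a minimal sketch under stated assumptions. It assumes the SSE/SST ratio that is commonly used in the microaggregation literature as the information-loss measure, and it uses a plain 2-means bisection as a stand-in for the partitioning step. A split is accepted only when both resulting clusters hold at least k records, which is how k-anonymity is enforced here. All function names are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Record = Sequence[float]


def centroid(records: List[Record]) -> List[float]:
    """Attribute-wise mean of a non-empty list of numeric records."""
    n = len(records)
    return [sum(col) / n for col in zip(*records)]


def sq_dist(a: Record, b: Record) -> float:
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(clusters: List[List[Record]]) -> float:
    """Sum of squared distances from each record to its cluster centroid."""
    total = 0.0
    for cluster in clusters:
        ctr = centroid(cluster)
        total += sum(sq_dist(r, ctr) for r in cluster)
    return total


def information_loss(clusters: List[List[Record]]) -> float:
    """Assumed measure: within-cluster SSE divided by total SST."""
    all_records = [r for cluster in clusters for r in cluster]
    sst = sse([all_records])
    return sse(clusters) / sst if sst > 0 else 0.0


def is_k_anonymous(clusters: List[List[Record]], k: int) -> bool:
    """Every cluster must contain at least k records."""
    return all(len(cluster) >= k for cluster in clusters)


def two_means(records: List[Record], iters: int = 20) -> Tuple[list, list]:
    """Split one cluster in two with a few Lloyd iterations (illustrative)."""
    ctr = centroid(records)
    # Seed with the record farthest from the centroid, then the record
    # farthest from that first seed.
    a = max(records, key=lambda r: sq_dist(r, ctr))
    b = max(records, key=lambda r: sq_dist(r, a))
    centers = [list(a), list(b)]
    left, right = list(records), []
    for _ in range(iters):
        left, right = [], []
        for r in records:
            nearer_a = sq_dist(r, centers[0]) <= sq_dist(r, centers[1])
            (left if nearer_a else right).append(r)
        if not left or not right:
            break  # degenerate split; the caller will reject it
        new_centers = [centroid(left), centroid(right)]
        if new_centers == centers:
            break  # assignments have stabilised
        centers = new_centers
    return left, right


def partition(records: List[Record], k: int) -> List[List[Record]]:
    """Start with one cluster; keep bisecting while both halves have >= k records."""
    clusters = [list(records)]
    changed = True
    while changed:
        changed = False
        for i, cluster in enumerate(clusters):
            if len(cluster) < 2 * k:
                continue  # cannot yield two clusters of size >= k
            left, right = two_means(cluster)
            if len(left) >= k and len(right) >= k:
                clusters[i:i + 1] = [left, right]
                changed = True
                break
    return clusters


def microaggregate(clusters: List[List[Record]]) -> List[List[float]]:
    """Replace every record by the centroid of its cluster."""
    return [centroid(cluster) for cluster in clusters for _ in cluster]


if __name__ == "__main__":
    data = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
    groups = partition(data, k=3)
    print(len(groups), is_k_anonymous(groups, 3), information_loss(groups))
```

In this sketch, splitting a cluster around the centroids of its two halves cannot increase SSE, so every accepted split leaves the assumed information loss unchanged or lower. The size check is what prevents the partition from shrinking clusters below k.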


Information Loss; Means Cluster; Intra Cluster Distance; Statistical Disclosure Control; Anonymization Technique





Copyright information

© Springer India 2013

Authors and Affiliations

  • Md. Enamul Kabir (1)
  • Abdun Naser Mahmood (1)
  • Abdul K. Mustafa (1)
  1. University of New South Wales, Kensington, Australia
