New Multi-dimensional Sorting Based K-Anonymity Microaggregation for Statistical Disclosure Control

  • Conference paper
Security and Privacy in Communication Networks (SecureComm 2012)

Abstract

In recent years, there has been an alarming increase in online identity theft and in attacks that exploit personally identifiable information. The goal of privacy preservation is to dissociate individuals from the sensitive information held about them in microdata. Microaggregation techniques seek to protect microdata so that it can be published and mined without revealing private information that can be linked to specific individuals. Microaggregation works by partitioning the microdata into groups of at least k records and then replacing the records in each group with the centroid of the group. An optimal microaggregation method must minimize the information loss resulting from this replacement, and achieving this is the central challenge. This paper presents a new microaggregation technique for Statistical Disclosure Control (SDC). It consists of two stages. In the first stage, the algorithm sorts all the records in the data set in a particular way to ensure that very dissimilar observations are never placed in the same cluster during microaggregation. In the second stage, an optimal microaggregation method is used to create k-anonymous clusters while minimizing the information loss. It takes the sorted data and simultaneously creates two distant clusters, using the two extreme sorted values as seeds. The performance of the proposed technique is compared against the most recent microaggregation methods. Experimental results on benchmark datasets show that the proposed algorithm has the lowest information loss among a range of techniques from the literature.
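
The grouping and centroid-replacement steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state the sorting criterion or the rule for growing clusters, so the sketch assumes that records are ordered by the sum of their attributes and that fixed-size groups of k records are taken alternately from the two ends of the sorted list. The function names (partition_from_extremes, microaggregate, information_loss) are hypothetical, and information loss is assumed to be the ratio SSE/SST that is commonly used in the microaggregation literature.

```python
from typing import List, Sequence, Tuple

Record = Sequence[float]


def centroid(group: List[Record]) -> List[float]:
    """Attribute-wise mean of the records in a group."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def sse(groups: List[List[Record]]) -> float:
    """Sum of squared distances from each record to its group centroid."""
    total = 0.0
    for group in groups:
        c = centroid(group)
        for record in group:
            total += sum((x - m) ** 2 for x, m in zip(record, c))
    return total


def partition_from_extremes(records: List[Record], k: int) -> List[List[Record]]:
    """Sort the records, then take groups of k from both ends of the order.

    ASSUMPTION: the sort key (sum of attributes) and the fixed group size
    are illustrative choices, not the criteria used in the paper.
    """
    ordered = sorted(records, key=sum)
    groups: List[List[Record]] = []
    lo, hi = 0, len(ordered)
    while hi - lo >= 2 * k:
        groups.append(ordered[lo:lo + k])   # group seeded at the low extreme
        groups.append(ordered[hi - k:hi])   # group seeded at the high extreme
        lo += k
        hi -= k
    rest = ordered[lo:hi]
    if rest:
        if len(rest) >= k or not groups:
            groups.append(rest)             # enough records for their own group
        else:
            groups[-1].extend(rest)         # fewer than k left: merge into last group
    return groups


def microaggregate(records: List[Record], k: int) -> Tuple[List[List[Record]], List[List[float]]]:
    """Return the groups and the released data (each record replaced by its centroid)."""
    groups = partition_from_extremes(records, k)
    released = [centroid(group) for group in groups for _ in group]
    return groups, released


def information_loss(groups: List[List[Record]]) -> float:
    """SSE / SST: within-group variation relative to total variation."""
    all_records = [record for group in groups for record in group]
    sst = sse([all_records])
    return sse(groups) / sst if sst else 0.0


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 1.0), (10.0, 11.0), (11.0, 10.0)]
    groups, released = microaggregate(data, k=2)
    print(groups)
    print(released)
    print(information_loss(groups))
```

Every group returned contains at least k records whenever the input has at least k records, so each released centroid is shared by at least k individuals. A lower SSE/SST ratio means the released data stays closer to the original records.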

Copyright information

© 2013 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Mahmood, A.N., Kabir, M.E., Mustafa, A.K. (2013). New Multi-dimensional Sorting Based K-Anonymity Microaggregation for Statistical Disclosure Control. In: Keromytis, A.D., Di Pietro, R. (eds) Security and Privacy in Communication Networks. SecureComm 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36883-7_16

  • DOI: https://doi.org/10.1007/978-3-642-36883-7_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36882-0

  • Online ISBN: 978-3-642-36883-7

  • eBook Packages: Computer Science (R0)
