Improving Microaggregation for Complex Record Anonymization

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5285)

Abstract

Microaggregation is one of the most commonly used microdata protection methods. It builds clusters of at least k original records and replaces the records in each cluster with the cluster centroid. When records are complex, i.e., when the data set has a large number of attributes, the data set is usually split into smaller blocks of attributes, and microaggregation is applied to each block in turn, independently of the others. This reduces the information loss incurred when several values are collapsed to the centroid of their group, but the k-anonymity property is lost as soon as the intruder knows at least two attributes from different blocks.
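To make the cluster-and-replace step concrete, the following is a minimal sketch of microaggregation on a single numeric attribute. It assumes a simple fixed-size heuristic (sort, cut into consecutive groups of k, let the last group absorb the remainder); the function name `microaggregate` and this grouping rule are illustrative choices, not the specific algorithm evaluated in the paper.

```python
def microaggregate(values, k):
    """Replace each value with the mean of its group of at least k values.

    Illustrative heuristic (an assumption, not the paper's method):
    sort the values, cut them into consecutive groups of k, and let
    the last group absorb any remainder so no group is smaller than k.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")
    # Indices of the input, ordered by ascending value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    protected = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The final group runs to the end of the sorted list.
        end = start + k if g < n_groups - 1 else len(values)
        group = order[start:end]
        centroid = sum(values[i] for i in group) / len(group)
        for i in group:
            protected[i] = centroid  # written back at the original position
    return protected


print(microaggregate([50, 20, 70, 22, 40, 24, 60], k=3))
# → [55.0, 22.0, 55.0, 22.0, 55.0, 22.0, 55.0]
```

Every published value is shared by at least k original values, which is what makes individual records harder to single out on that attribute.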

In this work, we present a new microaggregation method called one-dimension microaggregation (Mic1D-κ). This method gathers all the values of the data set into a single sorted vector, regardless of the attribute they belong to, and then microaggregates all the mixed values together. Our experiments on real data show that our proposal achieves lower disclosure risk than previous approaches while preserving the level of information loss.
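The idea of pooling all values into one sorted vector can be sketched as follows. This is only a rough illustration built from the abstract: the function name `mic1d_sketch`, the parameter `k` (minimum number of values per group), the fixed-size grouping rule, and the assumption that attributes have already been brought to a comparable scale are all hypothetical. The abstract does not say how the partition is computed or how κ is defined.

```python
def mic1d_sketch(records, k):
    """Microaggregate all cells of a numeric table as one sorted vector.

    records: list of equal-length rows of numbers, assumed here to be
    normalized to a comparable scale beforehand (an assumption).
    k: minimum number of values per group (hypothetical parameter).
    """
    # Flatten to (value, row, column) so each value can be written back.
    cells = [(v, r, c)
             for r, row in enumerate(records)
             for c, v in enumerate(row)]
    if k < 1 or len(cells) < k:
        raise ValueError("need k >= 1 and at least k values in total")
    # One sorted vector, ignoring which attribute a value came from.
    cells.sort(key=lambda cell: cell[0])
    n_groups = len(cells) // k
    protected = [list(row) for row in records]
    for g in range(n_groups):
        start = g * k
        # The final group absorbs the remainder.
        end = start + k if g < n_groups - 1 else len(cells)
        group = cells[start:end]
        centroid = sum(v for v, _, _ in group) / len(group)
        for _, r, c in group:
            protected[r][c] = centroid
    return protected


table = [[1.0, 2.0],
         [3.0, 9.0],
         [8.0, 10.0]]
print(mic1d_sketch(table, k=3))
# → [[2.0, 2.0], [2.0, 9.0], [9.0, 9.0]]
```

In the example, the value 2.0 from the second attribute lands in the same group as 1.0 and 3.0 from the first attribute, which is the mixing of attributes that the abstract describes.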

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pont-Tuset, J., Nin, J., Medrano-Gracia, P., Larriba-Pey, J.L., Muntés-Mulero, V. (2008). Improving Microaggregation for Complex Record Anonymization. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2008. Lecture Notes in Computer Science, vol 5285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88269-5_20

  • DOI: https://doi.org/10.1007/978-3-540-88269-5_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88268-8

  • Online ISBN: 978-3-540-88269-5

  • eBook Packages: Computer Science (R0)
