Skip to main content

On Optimizing the k-Ward Micro-aggregation Technique for Secure Statistical Databases

  • Conference paper
Information Security and Privacy (ACISP 2006)

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 4058))

Included in the following conference series:

Abstract

We consider the problem of securing a statistical database by utilizing the well-known micro-aggregation strategy, and in particular, the k-Ward strategy introduced in [1] and utilized in [2]. The latter scheme, which represents the state-of-the-art, coalesces the sorted data attribute values into groups, and on being queried, reports the means of the corresponding groups. We demonstrate that such a scheme can be optimized on two fronts. First of all, we minimize the computations done in evaluating the between-class distance matrix, to require only a constant number of updating distance computations. Secondly, and more importantly, we propose that the data set be partitioned recursively before a k-Ward strategy is invoked, and that the latter be invoked on the “primitive” sub-groups which terminate the recursion. Our experimental results, done on two benchmark data sets, demonstrate a marked improvement. While the information loss is comparable to the k-Ward micro-aggregation technique proposed by Domingo-Ferrer et.al. [2], the computations required to achieve this loss is a fraction of the computations required in the latter – providing a computational advantage which sometimes exceeds 80% if one method is used by itself, and more than 90% if both enhancements are invoked simultaneously.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Ward, J.H.: Hierarchical grouping to optimize an objective function. J. American Statistical Association 58, 236–245 (1963)

    Article  Google Scholar 

  2. Domingo-Ferrer, J., Mateo-Sanz, J.M.: Practical data-oriented microaggregation for statistical disclosure control. IEEE Transactions on Knowledge and Data Engineering 14, 189–201 (2002)

    Article  Google Scholar 

  3. Adam, N.R., Wortmann, J.C.: Security-control methods for statistical databases: A comparative study. ACM Computing Surveys 21, 515–556 (1989)

    Article  Google Scholar 

  4. Baeyens, Y., Defays, D.: Estimation of variance loss following microaggregation by the individual ranking method. In: Proceedings of Statistical Data Protection 1998, pp. 101–108. Office for Official Publications of the Eur. Comm., Luxembourg (1999)

    Google Scholar 

  5. Cuppen, M.: Source Data Perturbation in Statistical Disclosure Control. PhD thesis, Statistics Netherlands (2000)

    Google Scholar 

  6. Mateo-Sanz, J.M., Domingo-Ferrer, J.: A method for data-oriented multivariate microaggregation. In: Proceedings of Statistical Data Protection 1998, pp. 89–99. Office for Official Publications of the European Communities, Luxembourg (1999)

    Google Scholar 

  7. Hansen, S.L., Mukherjee, S.: A polynomial algorithm for univariate optimal microaggregation. IEEE Trans. on Know. and Data Eng. 15, 1043–1044 (2003)

    Article  Google Scholar 

  8. Laszlo, M., Mukherjee, S.: Minimum spanning tree partitioning algorithm for microaggregation. IEEE Trans. on Know. and Data Eng. 17, 902–911 (2005)

    Article  Google Scholar 

  9. Mateo-Sanz, J.M., Domingo-Ferrer, J.: A comparative study of microaggregation methods. Questiio 22, 511–526 (1998)

    MATH  Google Scholar 

  10. Solanas, A., Martínez-Ballesté, A., Domingo-Ferrer, J., Mateo-Sanz, J.: A 2d-tree-based blocking method for microaggregating very large data sets. In: The First International Conference on Availability, Reliability and Security (2006)

    Google Scholar 

  11. Defays, D., Nanopoulos, P.: Panels of enterprises and confidentiality: the small aggregates method. In: Proceedings of 92 Symposium on Design and Analysis of Longitudinal Surveys, pp. 195–204. Statistics Canada, Ottawa (1993)

    Google Scholar 

  12. Defays, D., Anwar, N.: Micro-aggregation: A generic method. In: Proceedings of the 2nd International Symposium on Statistical Confidentiality, pp. 69–78. Office for Official Publications of the European Communities, Luxembourg (1995)

    Google Scholar 

  13. Solanas, A., Martínez-Ballesté, A.: V-mdav: A multivariate microaggregation with variable group size. In: 17th COMPSTAT Symposium of the IASC, Rome (2006)

    Google Scholar 

  14. Li, Y., Zhu, S., Wang, L., Jajodia, S.: A privacy-enhanced microaggregation method. In: Eiter, T., Schewe, K.-D. (eds.) FoIKS 2002. LNCS, vol. 2284, pp. 148–159. Springer, Heidelberg (2002)

    Chapter  Google Scholar 

  15. Domingo-Ferrer, J., Mateo-Sanz, J.M.: Resampling for statistical confidentiality in contingency tables. Comp. and Math. with App. 38, 13–32 (1999)

    Article  MATH  MathSciNet  Google Scholar 

  16. Fayyoumi, E., Oommen, B.J.: (Enhancing k-ward micro-aggregation for secure statistical databases using distance-based and recursive optimizations) Unabridged Version of This Paper

    Google Scholar 

  17. Brucker, P.: On the complexity of clustering problems. In: Hehn, R., Korte, B., Oettli, W. (eds.) Optimization and Operations Research, pp. 45–54 (1977)

    Google Scholar 

  18. Domingo-Ferrer, J., Torra, V.: A quantitative comparison of disclosure control methods for microdata. In: Confidentiality, Disclosure and Data Access: Theory and Practical Applications for Statistical Agencies, pp. 113–134. Springer, Berlin (2002)

    Google Scholar 

  19. Brand, R., Domingo-Ferrer, J., Mateo-Sanz, J.M.: Reference data sets to test and compare SDC methods for protection of numerical microdata. Technical report, CASC PROJECT, Computational Aspects of Statistical Confidentiality (2002)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fayyoumi, E., Oommen, B.J. (2006). On Optimizing the k-Ward Micro-aggregation Technique for Secure Statistical Databases. In: Batten, L.M., Safavi-Naini, R. (eds) Information Security and Privacy. ACISP 2006. Lecture Notes in Computer Science, vol 4058. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780656_27

Download citation

  • DOI: https://doi.org/10.1007/11780656_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35458-1

  • Online ISBN: 978-3-540-35459-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics