Aggregation Methods to Evaluate Multiple Protected Versions of the Same Confidential Data Set
This work addresses disclosure risk for national statistical offices, specifically the case of releasing multiple protected versions of the same microdata file. That is, several copies of a single original data file are released to several data users. Each user receives a protected copy, and the masking method for each copy is chosen according to that user's research interests: it is the method that minimizes information loss for that user's particular research.
Nevertheless, multiple releases of the same data increase the disclosure risk, because coalitions of data users can combine their copies to reconstruct the original (non-masked) data. In this work we propose a tool for evaluating this reconstruction.
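To make the coalition risk concrete, the following is a minimal sketch under stated assumptions, not the tool proposed in this work. It assumes that every copy is masked with independent additive Gaussian noise and that the coalition aggregates its copies with a plain arithmetic mean; the abstract specifies neither the masking methods nor the aggregation operators, and all function names, parameters, and data here are hypothetical.

```python
import random
from statistics import fmean


def mask_with_noise(values, scale, rng):
    """Hypothetical masking method: add independent Gaussian noise to each value."""
    return [v + rng.gauss(0.0, scale) for v in values]


def aggregate_copies(copies):
    """Coalition attack (assumed): average the masked copies record by record."""
    return [fmean(column) for column in zip(*copies)]


def mean_abs_error(original, estimate):
    """Average absolute distance between the original and an estimate of it."""
    return fmean(abs(o - e) for o, e in zip(original, estimate))


rng = random.Random(0)

# Synthetic numeric attribute standing in for a confidential variable.
original = [rng.uniform(20_000, 80_000) for _ in range(200)]

# Ten users each receive an independently masked copy of the same file.
copies = [mask_with_noise(original, 5_000, rng) for _ in range(10)]

single_error = mean_abs_error(original, copies[0])
coalition_error = mean_abs_error(original, aggregate_copies(copies))

print(f"error of one masked copy:      {single_error:.1f}")
print(f"error after aggregating copies: {coalition_error:.1f}")
```

Under these assumptions the aggregated estimate lies closer to the original than any single masked copy is expected to, which is the effect a reconstruction-evaluation tool would need to quantify. Other masking methods and other aggregation operators would behave differently.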
Keywords: Information Fusion, Aggregation Operator, Irrelevant Alternative, Soft Computing Technique, Linguistic Label