Abstract
Masking methods protect data sets against disclosure by perturbing the original values before publication. Masking causes some information loss (masked data are not exactly the same as original data) and does not completely suppress the risk of disclosure for the individuals behind the data set. Information loss can be measured by observing the differences between original and masked data, while disclosure risk can be measured by means of record linkage and confidentiality intervals. Outliers in the original data set are particularly difficult to protect, as they correspond to extreme individuals who stand out from the rest. The objective of our work is to compare, for different masking methods, the information loss and disclosure risk related to outliers. In this way, the protection level offered by different masking methods to extreme individuals can be evaluated.
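As an illustration only, the kind of comparison described above can be sketched in a few lines of Python. This is not the authors' procedure: the masking method (additive Gaussian noise), the outlier definition (the records farthest from the centroid), the loss measure (mean absolute difference) and the linkage rule (nearest neighbour by Euclidean distance) are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import math
import random


def mask_additive_noise(records, rel_sd, rng):
    """Mask records by adding zero-mean Gaussian noise to each attribute.

    The noise standard deviation is rel_sd times the attribute's own
    standard deviation (an assumed, simple masking method).
    """
    n = len(records)
    dims = len(records[0])
    sds = []
    for j in range(dims):
        col = [r[j] for r in records]
        mean = sum(col) / n
        sds.append(math.sqrt(sum((v - mean) ** 2 for v in col) / n))
    return [
        tuple(r[j] + rng.gauss(0.0, rel_sd * sds[j]) for j in range(dims))
        for r in records
    ]


def outlier_indices(records, fraction):
    """Return indices of the records farthest from the centroid.

    Treating the top `fraction` of records by distance as outliers is an
    assumption made for this sketch.
    """
    n = len(records)
    dims = len(records[0])
    centroid = [sum(r[j] for r in records) / n for j in range(dims)]
    by_distance = sorted(
        range(n), key=lambda i: math.dist(records[i], centroid), reverse=True
    )
    return by_distance[: max(1, int(round(fraction * n)))]


def information_loss(original, masked, indices):
    """Mean absolute difference between original and masked values."""
    diffs = [
        abs(o - m)
        for i in indices
        for o, m in zip(original[i], masked[i])
    ]
    return sum(diffs) / len(diffs)


def linkage_risk(original, masked, indices):
    """Share of selected masked records re-identified by record linkage.

    A masked record counts as re-identified when its nearest original
    record (Euclidean distance) is the one it was derived from.
    """
    hits = 0
    for i in indices:
        nearest = min(
            range(len(original)), key=lambda k: math.dist(masked[i], original[k])
        )
        hits += nearest == i
    return hits / len(indices)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    masked = mask_additive_noise(data, 0.5, random.Random(1))
    extreme = outlier_indices(data, 0.05)
    everyone = list(range(len(data)))
    for label, idx in (("all records", everyone), ("outliers", extreme)):
        print(label, information_loss(data, masked, idx), linkage_risk(data, masked, idx))
```

Running the script prints the loss and linkage rate for all records and for the outlier subset, so the two can be compared for one masking method. Repeating the run with a different masking function in place of `mask_additive_noise` gives the per-method comparison the abstract refers to.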
This work was partly supported by the European Commission under project “CASC” (IST-2000-25069) and by the Spanish Ministry of Science and Technology and the FEDER fund under project “STREAMOBILE” (TIC-2001-0633-C03-01).
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mateo-Sanz, J.M., Sebé, F., Domingo-Ferrer, J. (2004). Outlier Protection in Continuous Microdata Masking. In: Domingo-Ferrer, J., Torra, V. (eds) Privacy in Statistical Databases. PSD 2004. Lecture Notes in Computer Science, vol 3050. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25955-8_16
DOI: https://doi.org/10.1007/978-3-540-25955-8_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22118-0
Online ISBN: 978-3-540-25955-8
eBook Packages: Springer Book Archive