Abstract
To reconcile the demands of information dissemination and privacy preservation, a popular approach generalizes the attribute values in a dataset, for example by dropping the last digit of the postal code, so that the published dataset meets certain privacy requirements, such as the notions of k-anonymity and ℓ-diversity. On the other hand, the published dataset should remain useful and not be over-generalized. Hence it is desirable to disseminate a database with high “usefulness”, as measured by a utility function. This leads to a generic framework in which the dataset that is optimal with respect to the utility function, among all generalized datasets that meet the privacy requirements, is chosen for dissemination. In this paper, we observe that the very fact that a generalized dataset is optimal may leak information about the original. Thus, an adversary who is aware of how the dataset was generalized may be able to derive more information than the privacy requirements permit. This observation challenges the widely adopted approach that treats the generalization process as an optimization problem. We illustrate the observation with counter-examples in the context of k-anonymity and ℓ-diversity.
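The privacy notions named in the abstract can be stated concretely. The following is a minimal sketch (not the paper's construction) of the postal-code generalization scheme mentioned above, together with simple checks for k-anonymity (every quasi-identifier value occurs in at least k records) and ℓ-diversity (every equivalence class contains at least ℓ distinct sensitive values). The data values are hypothetical.

```python
from collections import Counter, defaultdict

def generalize_postal(code):
    # Drop the last digit of a postal code, the example
    # generalization scheme mentioned in the abstract.
    return code[:-1] + "*"

def is_k_anonymous(quasi_ids, k):
    # Every quasi-identifier value must occur in at least k records.
    return all(c >= k for c in Counter(quasi_ids).values())

def is_l_diverse(records, l):
    # records: (quasi_id, sensitive_value) pairs. Every equivalence
    # class must contain at least l distinct sensitive values.
    classes = defaultdict(set)
    for qid, sensitive in records:
        classes[qid].add(sensitive)
    return all(len(s) >= l for s in classes.values())

codes = ["10115", "10117", "20095", "20099"]
print(is_k_anonymous(codes, 2))        # False: every code is unique
generalized = [generalize_postal(c) for c in codes]
print(is_k_anonymous(generalized, 2))  # True: two records per class
# 2-anonymous, yet not 2-diverse: class '1011*' has one sensitive value.
print(is_l_diverse(zip(generalized, ["flu", "flu", "asthma", "flu"]), 2))
```

A table can satisfy k-anonymity while failing ℓ-diversity, as the last check shows; this is why the abstract treats the two notions as distinct privacy requirements.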
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Fang, C., Chang, EC. (2008). Information Leakage in Optimal Anonymized and Diversified Data. In: Solanki, K., Sullivan, K., Madhow, U. (eds) Information Hiding. IH 2008. Lecture Notes in Computer Science, vol 5284. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88961-8_3
DOI: https://doi.org/10.1007/978-3-540-88961-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88960-1
Online ISBN: 978-3-540-88961-8
eBook Packages: Computer Science (R0)