Abstract
Most of the recent efforts addressing the issue of data privacy have focused on devising algorithms for anonymization and diversification. Our objective is upstream of these works: we are concerned with the diagnosis of privacy risk and more specifically in this paper with l-diversity. We show that diagnosing l-diversity for various definitions of the concept is a knowledge discovery problem that can be mapped to the framework proposed by Mannila and Toivonen. The problem can therefore be solved with level-wise algorithms such as the apriori algorithm. We introduce and prove the necessary monotonicity property with respect to subset operator on attributes set for several instantiations of the l-diversity principle. We present and evaluate an algorithm based on the apriori algorithm. This algorithm computes, for instance, “maximum sets of attributes that can safely be published without jeopardizing sensitive attributes”, even if they were quasi-identifiers available from external sources, and “minimum subsets of attributes which jeopardize anonymity”.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Flouvat, F., Marchi, F.D., Petit, J.-M.: iZi: A New Toolkit for Pattern Mining Problems, pp. 131–136. Springer, Heidelberg (2008)
Knobbe, A.J., Ho, E.K.Y.: Maximally informative k-itemsets and their efficient discovery. In: 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 237–244 (2006)
Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In: ACM Conference on Management of Data (SIGMOD), pp. 207–216 (1993)
USA CENSUS DATA
Blake, C., Merz, C.: UCI Repository of Machine Learning Databases (1998)
Byun, J.-W., Kamra, A., Bertino, E., Li, N.: Efficient k-Anonymity Using Clustering Technique. In: CERIAS Tech Report 2006-10, Center for Education and Research in Information Assurance and Security. Purdue University (2006)
Bayardo, R.J., Agrawal, R.: Data Privacy through Optimal k-Anonymization. In: 21st International Conference on Data Engineering, ICDE (2005)
Xiao, X., Tao, Y.: Anatomy: Simple and Effective Privacy Preservation. In: Very Large Data Bases (VLDB) Conference, pp. 139–150 (2006)
Sweeney, L.: k-Anonymity: A Model for Protecting Privacy. International Journal on Uncertainty. Fuzziness and Knowledge-based Systems 10, 557–570 (2002)
Aggarwal, G., Feder, T., Kenthapadi, K., Khuller, S., Panigrahy, R., Thomas, D., Zhu, A.: Achieving Anonymity via Clustering. In: Principles of Database Systems(PODS) (2006)
Samarati, P., Sweeney, L.: Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement through Generalization and Suppression. In: Technical Report SRI-CSL-98-04. SRI Computer Science Laboratory (1998)
Iyengar, V.: Transforming Data to Satisfy Privacy Constraints. In: SIGKDD, pp. 279–288 (2002)
Xu, J., Wang, W., Pei, J., Wang, X., Shi, B., Fu, A.W.-C.: Utility-based Anonymization Using Local Recoding. In: 12th ACM SIGKDD international conference on Knowledge discovery and data mining (2006)
LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Incognito: Efficient Full-domain k-Anonymity. In: ACM Conference on Management of Data (SIGMOD), pp. 49–60 (2005)
LeFevre, K., DeWitt, D.J., Ramakrishnan, R.: Mondrian Multidimensional k-Anonymity. In: 22nd International Conference on Data Engineering, ICDE (2006)
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. In: IEEE 22nd International Conference on Data Engineering, ICDE 2006 (2006)
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-Anonymity. ACM Transactions on Knowledge Discovery from Data 1 (2007)
Wong, R.C.-W., Li, J., Fu, A.W.-C., Wang, K.: (alpha, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy Preserving Data Publishing. In: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD (2006)
Li, N., Li, T., Venkatasubramanian, S.: t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In: IEEE 23rd International Conference on Data Engineering (ICDE), pp. 106–115 (2007)
Wong, R.C.-W., Liu, Y., Yin, J., Huang, Z., Fu, A.W.-c., Pei, J.: (α, k)-anonymity based privacy preservation by lossy join. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds.) APWeb/WAIM 2007. LNCS, vol. 4505, pp. 733–744. Springer, Heidelberg (2007)
Li, J., Wong, R.C.-W., Fu, A.W.-C., Pei, J.: Achieving k-Anonymity by Clustering in Attribute Hierarchical Structures, pp. 405–416. Springer, Heidelberg (2006)
Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast Data Anonymization with Low Information Loss. In: Very Large Data Bases (VLDB) Conference. ACM, New York (2007)
Motwani, R., Xu, Y.: Efficient Algorithms for Masking and Finding Quasi-Identifiers. In: SIAM International Workshop on Practical Privacy-Preserving Data Mining (2008)
Truta, T.M., Vinay, B.: Privacy Protection: p-Sensitive k-Anonymity Property. In: International Workshop of Privacy Data Management (PDM) Conjunction with 22th International Conference of Data Engineering, ICDE (2006)
Mannila, H., Toivonen, H.: Levelwise Search and Borders of Theories in Knowledge Discovery. In: Data Mining and Knowledge Discovery, vol. 1, pp. 241–258 (1997)
Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Transactions on Knowledge and Data Engineering 13(6), 1010–1027 (2001)
Cover, T.M., Thomas, J.A.: Elements of Information Theory. John Wiley & Sons, Chichester (1991)
Zare Mirakabad, M.R., Jantan, A., Bressan, S.: Towards a Privacy Diagnosis Centre: Measuring k-anonymity. In: The 2008 International Symposium on Computer Science and its Applications. IEEE CS, Los Alamitos (2008)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zare-Mirakabad, MR., Jantan, A., Bressan, S. (2009). Privacy Risk Diagnosis: Mining l-Diversity. In: Chen, L., Liu, C., Liu, Q., Deng, K. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04205-8_19
Download citation
DOI: https://doi.org/10.1007/978-3-642-04205-8_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04204-1
Online ISBN: 978-3-642-04205-8
eBook Packages: Computer ScienceComputer Science (R0)