Clustering-Based k-Anonymity

He, Xianmang; Chen, HuaHui; Chen, Yefang; Dong, Yihong; Wang, Peng; Huang, Zhenhua

doi:10.1007/978-3-642-30217-6_34

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7301)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Privacy is one of the major concerns when data containing sensitive information needs to be released for ad hoc analysis, and it has attracted wide research interest in privacy-preserving data publishing over the past few years. One strategy for anonymizing data is generalization. In a typical generalization approach, the tuples in a table are first divided into QI (quasi-identifier) groups such that the size of each QI-group is no less than k. Clustering partitions the tuples into clusters such that the points within a cluster are more similar to each other than to points in different clusters. The two methods share a common feature: both distribute the tuples into many small groups. Motivated by this observation, we propose a clustering-based k-anonymity algorithm, which achieves k-anonymity through clustering. Extensive experiments on real data sets show that our approach improves utility.
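
The idea of reaching k-anonymity through clustering can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper, whose details are not given in this abstract; it is a simple greedy nearest-neighbour grouping chosen only for illustration. The sketch assumes purely numeric quasi-identifiers (here, hypothetical age and ZIP code values), squared Euclidean distance without attribute normalization, and generalization of each group to per-attribute (min, max) ranges. All function names and the sample records are invented for this example.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]            # numeric quasi-identifier values (assumption)
Box = Tuple[Tuple[float, float], ...]  # one (min, max) range per attribute


def squared_distance(a: Record, b: Record) -> float:
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(group: Sequence[Record]) -> Record:
    """Per-attribute mean of a group of records."""
    return tuple(sum(col) / len(group) for col in zip(*group))


def greedy_k_groups(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Partition records into groups of size >= k (illustrative greedy clustering)."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(records)
    groups: List[List[Record]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Take the k - 1 records closest to the seed to form a group of size k.
        remaining.sort(key=lambda r: squared_distance(seed, r))
        groups.append([seed] + remaining[: k - 1])
        remaining = remaining[k - 1:]
    # Fewer than k records are left: attach each to the group with the nearest centroid.
    for r in remaining:
        nearest = min(groups, key=lambda g: squared_distance(centroid(g), r))
        nearest.append(r)
    return groups


def anonymize(records: Sequence[Record], k: int) -> List[Box]:
    """Generalize every record to the bounding ranges of its group."""
    table: List[Box] = []
    for group in greedy_k_groups(records, k):
        box = tuple((min(col), max(col)) for col in zip(*group))
        table.extend(box for _ in group)
    return table


def is_k_anonymous(table: Sequence[Box], k: int) -> bool:
    """True if every generalized row is shared by at least k rows."""
    return all(count >= k for count in Counter(table).values())


# Hypothetical (age, ZIP code) records, for illustration only.
records = [(25, 47677), (27, 47602), (29, 47678), (43, 47905),
           (47, 47906), (52, 47909), (55, 47901)]
table = anonymize(records, k=3)
for row in table:
    print(row)
print("3-anonymous:", is_k_anonymous(table, 3))
```

Tighter groups give narrower ranges and therefore less information loss, which is why the quality of the grouping step matters for the utility of the published table.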

Author information

Authors and Affiliations

School of Information Science and Technology, NingBo University, No.818, Fenghua Road, Ning Bo, 315122, P.R. China
Xianmang He, HuaHui Chen, Yefang Chen & Yihong Dong
School of Computer Science and Technology, Fudan University, No.220, Handan Road, Shanghai, 200433, P.R. China
Peng Wang
School of Electronic and Information Engineering, Tongji University, No.1239, Siping Road, Shanghai, 200433, P.R. China
Zhenhua Huang

Editor information

Editors and Affiliations

Department of Computer Science, Michigan State University, 428 S. Shaw Lane, 48824-1226, East Lansing, MI, USA
Pang-Ning Tan
School of Information Technologies, University of Sydney, 1 Cleveland St., 2006, Sydney, NSW, Australia
Sanjay Chawla
Faculty of Computing and Informatics, Jalan Multimedia, Multimedia University, 63100, Cyberjaya, Selangor, Malaysia
Chin Kuan Ho
Department of Computing and Information Systems, The University of Melbourne, 111 Barry Street, 3053, Melbourne, VIC, Australia
James Bailey

About this paper

Cite this paper

He, X., Chen, H., Chen, Y., Dong, Y., Wang, P., Huang, Z. (2012). Clustering-Based k-Anonymity. In: Tan, PN., Chawla, S., Ho, C.K., Bailey, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol 7301. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30217-6_34

DOI: https://doi.org/10.1007/978-3-642-30217-6_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30216-9
Online ISBN: 978-3-642-30217-6
eBook Packages: Computer Science, Computer Science (R0)
