Journal of Computer Science and Technology, Volume 33, Issue 6, pp 1231–1242

Privacy-Preserving Algorithms for Multiple Sensitive Attributes Satisfying t-Closeness

  • Rong Wang
  • Yan Zhu
  • Tung-Shou Chen
  • Chin-Chen Chang
Regular Paper

Abstract

Although k-anonymity is a good way of publishing microdata for research purposes, it cannot resist several common attacks, such as attribute disclosure and the similarity attack. To resist these attacks, many refinements of k-anonymity have been proposed, with t-closeness being one of the strictest privacy models. While most existing t-closeness models address the case in which the original data have a single sensitive attribute, data with multiple sensitive attributes are more common in practice. In this paper, we fill this gap with two algorithms that handle multiple sensitive attributes and make the published data satisfy t-closeness. Both algorithms build on the observation that, for the published data to satisfy t-closeness, the sensitive attribute values in any equivalence class must be spread as widely as possible over the entire data. They differ in how they partition records into groups in terms of the sensitive attributes: one uses a clustering method, while the other leverages principal component analysis. Then, according to the similarity of quasi-identifier attributes, records are selected from different groups to construct an equivalence class, which reduces the loss of information during anonymization as much as possible. Our proposed algorithms are evaluated using a real dataset. The results show that the first proposed algorithm is slower on average than the second, but it preserves more of the original information. In addition, compared with related approaches, both proposed algorithms achieve stronger privacy protection and lose less information.
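
The observation about spreading sensitive values across an equivalence class can be illustrated with a small check. The sketch below is not the paper's algorithm: it only tests whether a given partition satisfies t-closeness, under two assumptions that the abstract does not state. First, each sensitive attribute is treated as categorical with equal ground distance, in which case the Earth Mover's Distance reduces to half the L1 distance between the two distributions. Second, a class is accepted only if every sensitive attribute, taken separately, stays within t. The toy table and all function names are hypothetical.

```python
from collections import Counter


def distribution(values):
    """Return the relative frequency of each value in a list."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def equal_distance_emd(p, q):
    """EMD between two categorical distributions when every pair of
    distinct values is at ground distance 1 (half the L1 distance)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def max_distance(eq_class, table, sensitive_attrs):
    """Largest per-attribute distance between one class and the whole table.
    Checking attributes one by one is an assumption of this sketch."""
    return max(
        equal_distance_emd(
            distribution([r[a] for r in eq_class]),
            distribution([r[a] for r in table]),
        )
        for a in sensitive_attrs
    )


def satisfies_t_closeness(classes, table, sensitive_attrs, t):
    """True if every equivalence class is within t for every attribute."""
    return all(max_distance(c, table, sensitive_attrs) <= t for c in classes)


# Hypothetical toy table: "age" is a quasi-identifier,
# "disease" and "salary" are the sensitive attributes.
table = [
    {"age": 25, "disease": "flu", "salary": "low"},
    {"age": 27, "disease": "flu", "salary": "low"},
    {"age": 41, "disease": "cancer", "salary": "high"},
    {"age": 45, "disease": "cancer", "salary": "high"},
]
sensitive = ["disease", "salary"]

# Classes built from records with identical sensitive values.
concentrated = [table[:2], table[2:]]
# Classes built by taking one record from each sensitive-value group.
spread = [[table[0], table[2]], [table[1], table[3]]]

concentrated_ok = satisfies_t_closeness(concentrated, table, sensitive, t=0.2)
spread_ok = satisfies_t_closeness(spread, table, sensitive, t=0.2)
print(concentrated_ok, spread_ok)
```

In this toy case the concentrated partition fails the check while the spread partition passes, which matches the intuition in the abstract: drawing the members of a class from different sensitive-value groups keeps the class distribution close to the global one. The spread classes pair records that are far apart in age, so they would need heavier generalization of the quasi-identifier, which is the information loss the abstract says the similarity-based selection aims to reduce.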

Keywords

data privacy, k-anonymity, t-closeness, multiple sensitive attribute

Supplementary material

11390_2018_1884_MOESM1_ESM.pdf (186 kb)
ESM 1 (PDF 185 kb)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Rong Wang (1)
  • Yan Zhu (1, email author)
  • Tung-Shou Chen (2)
  • Chin-Chen Chang (3)

  1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu, China
  2. Department of Computer Science and Information Engineering, “National” Taichung University of Science and Technology, Taichung, China
  3. Department of Information Engineering and Computer Science, Feng Chia University, Taichung, China
