A generalization model for multi-record privacy preservation

  • Xinning Li
  • Zhiping Zhou (corresponding author)
Original Research


Privacy preservation is an increasingly serious problem in data publication and has drawn considerable attention in research and development. Several privacy preservation models and algorithms have recently been proposed for publishing data. However, most previous methods suffer from more than one of the following drawbacks: (i) they cannot be applied to multi-record datasets; (ii) they guarantee only one-way generalization; (iii) they ignore user privacy preferences. To satisfy higher privacy requirements and to support the publication of multi-record datasets, this paper proposes a bidirectional personalized generalization (BP-generalization) model. The rationale is to anonymize both relational and set-valued information. First, we merge tuples with the same attribute values in multi-record datasets to ensure the validity of quasi-identifier anonymity. Second, by enforcing l-diversity on equivalence groups and k-anonymity on fingerprint buckets, the model can resist bidirectional chain attacks. Finally, a new hierarchical generalization strategy is proposed for personalized privacy preservation of sensitive attributes, so that different generalization rules can be adopted for different levels of sensitive values. Extensive experimental results on two datasets show that our method outperforms state-of-the-art techniques in terms of efficiency and information loss.
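
The first two steps of the abstract can be illustrated with a minimal sketch. It is not the BP-generalization algorithm itself: it only shows how rows sharing the same quasi-identifier values might be merged, and how the textbook definitions of k-anonymity and distinct l-diversity can be checked on a given grouping. The function names, the field names (`age`, `zip`, `disease`), the toy records, and the assumption that identical quasi-identifier values denote the same individual are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def merge_records(records, qi_keys, sensitive_key):
    """Merge rows with identical quasi-identifier values into one tuple.

    Assumption (not from the paper): rows with the same quasi-identifier
    values belong to the same individual, so their sensitive values are
    collected into a single set.
    """
    merged = defaultdict(set)
    for row in records:
        qi = tuple(row[key] for key in qi_keys)
        merged[qi].add(row[sensitive_key])
    return dict(merged)


def is_k_anonymous(groups, k):
    """Textbook k-anonymity: every group holds at least k merged tuples."""
    return all(len(group) >= k for group in groups)


def is_distinct_l_diverse(groups, merged, l):
    """Distinct l-diversity: every group holds at least l sensitive values."""
    for group in groups:
        values = set()
        for qi in group:
            values |= merged[qi]
        if len(values) < l:
            return False
    return True


# Hypothetical multi-record data: the first individual appears twice.
records = [
    {"age": 25, "zip": "214000", "disease": "flu"},
    {"age": 25, "zip": "214000", "disease": "asthma"},
    {"age": 31, "zip": "214001", "disease": "flu"},
    {"age": 40, "zip": "214002", "disease": "diabetes"},
]
merged = merge_records(records, ["age", "zip"], "disease")

# Two hypothetical groupings of the merged tuples.
split_groups = [[(25, "214000"), (31, "214001")], [(40, "214002")]]
single_group = [[(25, "214000"), (31, "214001"), (40, "214002")]]
```

Without the merge step, the first individual would count twice toward the group size, which is one way to read the abstract's remark that merging protects the validity of quasi-identifier anonymity. In this toy data, `split_groups` fails the check for k = 2 because its second group contains a single individual, while `single_group` passes for k = 2 and l = 3.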


Privacy preservation; Data publication; Multi-record microdata; Generalization


Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, China
  2. Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, Jiangnan University, Wuxi, China
