(k, l)-Clustering for Transactional Data Streams Anonymization

  • Jimmy Tekli
  • Bechara Al Bouna
  • Youssef Bou Issa
  • Marc Kamradt
  • Ramzi Haraty
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11125)


In this paper, we address the correlation problem in the anonymization of transactional data streams. We propose a bucketization-based technique, called (k, l)-clustering, that prevents correlation-based privacy breaches by ensuring that the same k individuals remain grouped together over the entire anonymized stream. We evaluate the utility of our algorithm under two different (k, l)-clustering approaches.
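
The grouping invariant described above (the same k individuals stay together across the whole stream) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names `build_groups` and `bucketize`, the alphabetical grouping order, the `(individual, item)` window layout, and the assumption that all individuals are known in advance are hypothetical choices made for illustration only. The sketch also does not model the l parameter, which the abstract does not define.

```python
from collections import defaultdict


def build_groups(individuals, k):
    """Partition individuals once into fixed groups of at least k members.

    Hypothetical helper: groups are formed in sorted order purely for
    determinism; a real scheme would pick groups to preserve utility.
    """
    ids = sorted(set(individuals))
    if len(ids) < k:
        raise ValueError("need at least k distinct individuals")
    groups = [ids[i:i + k] for i in range(0, len(ids), k)]
    # Fold an undersized trailing group into the previous one so that
    # every group keeps at least k members.
    if len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def bucketize(window, groups):
    """Publish one window of (individual, item) pairs as per-group buckets.

    Each bucket lists the full, fixed group membership and the items seen
    in this window, without linking any item to a specific member.
    """
    group_of = {ind: g for g, members in enumerate(groups) for ind in members}
    items = defaultdict(list)
    for individual, item in window:
        items[group_of[individual]].append(item)
    return {
        g: {"members": list(groups[g]), "items": sorted(found)}
        for g, found in items.items()
    }


if __name__ == "__main__":
    groups = build_groups(["a", "b", "c", "d", "e"], k=2)
    # The same `groups` object is reused for every window of the stream.
    print(bucketize([("a", "milk"), ("c", "bread")], groups))
    print(bucketize([("b", "beer"), ("a", "eggs")], groups))
```

Because `groups` is computed once and reused for every window, an observer who joins buckets across windows always sees the same membership set for a given group, which is the property the abstract attributes to (k, l)-clustering.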


Data privacy, Data stream, Correlation, Anonymization



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. BMW Group, Munich, Germany
  2. TICKET Lab., Antonine University, Baabda, Lebanon
  3. Université de Franche Comté, Belfort, France
  4. Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
