A Noise Generation Scheme Based on Huffman Coding for Preserving Privacy

  • Iuon-Chang Lin
  • Li-Cheng Yang
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 733)


Cloud computing has grown rapidly in recent years, owing to its low cost, robustness, flexibility, and ubiquity. As a result, the volume of data held by organizations is increasing quickly. Such data can support many data analysis applications in business, medicine, and government. It also raises privacy issues: a dealer who wants to understand customer behavior for marketing purposes may release data to a third-party data analysis company. To preserve privacy in databases, this paper proposes an efficient noise generation scheme based on the Huffman coding algorithm. In Huffman coding, a character with a lower occurrence frequency receives a longer code, and vice versa. This property suits privacy protection in databases: a tuple with a lower occurrence frequency receives more noise. Based on this concept, the paper presents a noise matrix, that is, a set of noise values. Although the scheme distorts the data by replacing original values, it does not affect data analysis. In the experiments, we consider the running time of noise generation for integers and real numbers. Overall, this paper offers a different concept for perturbing original values and proposes an efficient data perturbation scheme.
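The abstract does not say how the noise matrix is constructed, so the following is only a minimal sketch of the underlying idea, not the authors' scheme. It assumes that the noise added to a value is bounded by that value's Huffman code length, so rarer values are perturbed more. The names `huffman_code_lengths` and `perturb`, the `scale` parameter, and the uniform noise distribution are all hypothetical choices made for illustration.

```python
import heapq
import random
from collections import Counter


def huffman_code_lengths(values):
    """Return {value: Huffman code length} derived from occurrence frequencies."""
    freq = Counter(values)
    if len(freq) == 1:
        # A single distinct value still needs a one-bit code.
        return {next(iter(freq)): 1}
    # Heap entries: (total frequency, unique tie-breaker, {value: depth so far}).
    heap = [(count, i, {v: 0}) for i, (v, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf in them one level deeper.
        merged = {v: depth + 1 for v, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


def perturb(values, scale=1.0, seed=0):
    """Assumed rule: noise is uniform in [-scale * L, scale * L], L = code length."""
    rng = random.Random(seed)
    lengths = huffman_code_lengths(values)
    return [v + rng.uniform(-1.0, 1.0) * scale * lengths[v] for v in values]


# Frequencies 8, 4, 2, 1, 1: the rarest values get the longest codes.
data = [10] * 8 + [20] * 4 + [30] * 2 + [40] + [50]
print(huffman_code_lengths(data))
print(perturb(data, scale=0.5)[:3])
```

In this sketch the most frequent value (10) gets a code length of 1 and the two rarest values (40 and 50) get a code length of 4, so their noise bound is four times larger.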


Huffman coding, Noise matrix, Numerical database, Privacy preserving



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Management Information Systems, National Chung Hsing University, Taichung, Taiwan
