Reviewing the Methods of Estimating the Density Function Based on Masked Data

  • Yan-Xia Lin
  • Pavel N. Krivitsky
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11126)


Data privacy is an issue of increasing importance for big data mining, especially for micro-level data. A popular approach to protecting such data is perturbation. Therefore, techniques for recovering the statistical information of the original data from the perturbed data become indispensable in data mining.

This paper reviews and examines the existing techniques for estimating (alternatively, reconstructing) the density function of the original data from data perturbed by the additive or multiplicative noise method. Our studies show that the techniques developed for noise-added data cannot replace the techniques for noise-multiplied data, even though the two types of masked data can be converted into each other through data transformation. This conclusion may be of interest to data providers.
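The two masking schemes and the transformation that links them can be sketched in a few lines. This is a minimal illustration only: the log-normal data, the normal and uniform noise distributions, their parameters, and all variable names are assumptions made for the example and are not taken from the paper. It shows the algebraic conversion (a logarithm turns multiplicative noise into additive noise for positive data), not any of the density estimation techniques under review.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical positive-valued confidential data (log-normal is an assumption).
original = [rng.lognormvariate(3.0, 0.5) for _ in range(1000)]

# Additive noise masking: y = x + e, with e ~ N(0, 2^2) (assumed noise law).
additive_noise = [rng.gauss(0.0, 2.0) for _ in original]
noise_added = [x + e for x, e in zip(original, additive_noise)]

# Multiplicative noise masking: y* = x * c, with c ~ Uniform(0.5, 1.5) (assumed).
multiplicative_noise = [rng.uniform(0.5, 1.5) for _ in original]
noise_multiplied = [x * c for x, c in zip(original, multiplicative_noise)]

# Log transform: log(y*) = log(x) + log(c), i.e. additive noise on the log scale.
log_masked = [math.log(y) for y in noise_multiplied]
log_sum = [math.log(x) + math.log(c)
           for x, c in zip(original, multiplicative_noise)]

# The two sides agree up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(log_masked, log_sum))
print(f"largest gap between log(x*c) and log(x)+log(c): {max_gap:.2e}")
```

The conversion only applies when both the data and the noise are strictly positive, and an estimate obtained on the log scale still has to be mapped back to the original scale. The abstract's conclusion is that, despite this conversion, techniques built for noise-added data do not replace those built for noise-multiplied data.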


Keywords: Confidential data; Masked data; Multiplicative noise method; Additive noise method



Part of the R code for implementing the AS2000 Approach was developed by Miss A. Fernando, supported by the Winter Project Scholarship 2016, School of Mathematics and Applied Statistics, UoW.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Mathematics and Applied Statistics, National Institute for Applied Statistics Research Australia, University of Wollongong, Wollongong, Australia
