Efficiency and Sample Size Determination of Protected Data

  • Bradley Wakefield
  • Yan-Xia Lin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11126)

Abstract

This paper assesses the usefulness of a proposed multiplicative perturbation method by contrasting the statistical efficiency achieved in point hypothesis testing of simple proportions with that of the differentially private aggregated Laplace mechanism. This efficiency is evaluated by obtaining an analytical expression that determines the sample size required for protected data to retain a given significance level and power.
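The sample-size idea can be illustrated with a minimal sketch. It is not the analytical expression derived in the paper; it assumes a two-sided one-sample z-test of a proportion under the normal approximation, and a Laplace mechanism that adds noise of scale 1/epsilon to the aggregated count (sensitivity 1). The function names and the normal approximation of the Laplace noise are illustrative assumptions, and the multiplicative perturbation method is not sketched because this page gives no details of it.

```python
import math
from statistics import NormalDist


def classical_sample_size(p0, p1, alpha=0.05, power=0.8):
    """Smallest n for a two-sided z-test of H0: p = p0 against p = p1
    on unprotected data (textbook normal-approximation formula)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    num = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / (p1 - p0)) ** 2)


def laplace_sample_size(p0, p1, epsilon, alpha=0.05, power=0.8):
    """Smallest n keeping the same alpha and power when Laplace(1/epsilon)
    noise is added to the count. ASSUMPTION: the noisy proportion is treated
    as approximately normal with variance p(1-p)/n + 2/(epsilon*n)^2."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    delta = abs(p1 - p0)

    def detectable(n):
        noise_var = 2.0 / (epsilon * n) ** 2  # variance of Laplace noise / n^2
        sd0 = math.sqrt(p0 * (1 - p0) / n + noise_var)
        sd1 = math.sqrt(p1 * (1 - p1) / n + noise_var)
        return z_a * sd0 + z_b * sd1 <= delta

    # The left-hand side decreases in n, so bracket and then bisect.
    hi = 1
    while not detectable(hi):
        hi *= 2
    lo = hi // 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if detectable(mid):
            hi = mid
        else:
            lo = mid
    return hi


n_plain = classical_sample_size(0.5, 0.6)
n_protected = laplace_sample_size(0.5, 0.6, epsilon=0.1)
# Relative efficiency in the sample-size sense: values below 1 mean the
# protected data need more observations for the same test performance.
relative_efficiency = n_plain / n_protected
```

Under these assumptions, the ratio of the two sample sizes gives one simple way to express how much efficiency is lost to protection at a given privacy parameter.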

Keywords

Noise multiplied data; Hypothesis test; Confidential data; Data protection; Multiplicative perturbation; Proportions

Notes

Acknowledgement

This research has been conducted with the support of the Australian Government Research Training Program Scholarship.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. National Institute for Applied Statistics Research Australia, School of Mathematics and Applied Statistics, University of Wollongong, Wollongong, Australia
