Multiparty Computation with Statistical Input Confidentiality via Randomized Response
We explore a setting in which a number of subjects want to compute on their pooled data while preserving the statistical confidentiality of their inputs. Statistical confidentiality differs from the cryptographic confidentiality guaranteed by cryptographic secure multiparty computation: whereas the latter discloses nothing about the input, statistical input confidentiality discloses a noise-added version of the input, which allows more flexible computations. We propose a protocol based on local anonymization via randomized response, whereby the empirical distribution of the subjects' data is approximated. From that distribution, most statistical calculations can be approximated as well. Ceteris paribus, the accuracy of the approximation improves with the number of subjects. Large dimensionality (that is, a large number of attributes) decreases accuracy, and we propose a strategy to mitigate the dimensionality problem. We show how to characterize the privacy guarantee for each subject in terms of differential privacy. Experimental work is reported on the attained accuracy as a function of the number of respondents, the number of attributes and the randomized response parameters.
Keywords: Multiparty anonymous computation, Randomized response, Local anonymization, Big data, Privacy
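To make the idea in the abstract concrete, the following is a minimal sketch of randomized response for a single categorical attribute. It is not the protocol of the paper. It assumes a simple symmetric design in which each subject reports the true category with probability `p` and otherwise a uniformly chosen other category. The function names (`randomize`, `estimate_distribution`, `epsilon`) and the parameter values are hypothetical, chosen only for illustration. The sketch shows how the empirical distribution can be approximated from the randomized reports, and how the differential privacy level follows from the ratio of report probabilities.

```python
import math
import random
from collections import Counter


def randomize(value, categories, p, rng):
    """Report the true value with probability p, else a uniformly chosen other category."""
    if rng.random() < p:
        return value
    others = [c for c in categories if c != value]
    return rng.choice(others)


def estimate_distribution(reports, categories, p):
    """Unbiased estimate of the true category frequencies from randomized reports.

    With q = (1 - p) / (k - 1), the expected reported frequency of category c is
    lambda_c = pi_c * p + (1 - pi_c) * q, so pi_c = (lambda_c - q) / (p - q).
    Requires p != 1/k; individual estimates may fall slightly outside [0, 1].
    """
    k = len(categories)
    q = (1.0 - p) / (k - 1)
    n = len(reports)
    counts = Counter(reports)
    return {c: (counts[c] / n - q) / (p - q) for c in categories}


def epsilon(k, p):
    """Differential privacy level of this symmetric design (assumes 0 < p < 1).

    It is the log of the largest ratio between the probabilities of producing
    the same report from two different true values.
    """
    q = (1.0 - p) / (k - 1)
    return abs(math.log(p / q))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    categories = ["a", "b", "c"]
    # Hypothetical true data: 60% "a", 30% "b", 10% "c" over 20000 subjects.
    true_data = ["a"] * 12000 + ["b"] * 6000 + ["c"] * 2000
    p = 0.7
    reports = [randomize(v, categories, p, rng) for v in true_data]
    print(estimate_distribution(reports, categories, p))
    print(epsilon(len(categories), p))
```

In this sketch, increasing the number of subjects reduces the sampling error of the estimate, while moving `p` closer to `1/k` gives a smaller epsilon (stronger privacy) at the cost of a noisier estimate. With several attributes, the number of category combinations grows multiplicatively, which is one way to see why dimensionality hurts accuracy.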
Acknowledgments and Disclaimer
The following funding sources are gratefully acknowledged: European Commission (H2020-700540 “CANVAS”), Government of Catalonia (ICREA Acadèmia Prize to J. Domingo-Ferrer) and Spanish Government (projects TIN2014-57364-C2-1-R “SmartGlacis” and TIN2015-70054-REDC). The views in this paper are the authors’ own and do not necessarily reflect the views of UNESCO or any of the funders.