Abstract
Privacy-preserving set intersection (PPSI) of very large data sets is increasingly required in real application areas such as health care, national security, and law enforcement. Various techniques have been developed to address this problem, most of which rely on computationally expensive cryptographic techniques. Moreover, conventional data structures cannot efficiently provide count estimates for the elements in the intersection of very large data sets. We consider the problem of efficient PPSI over sets from multiple (three or more) sources, integrating them into a global synopsis that results from intersecting efficient data structures known as Count-Min sketches. This global synopsis also provides count estimates of the intersected elements. We propose two protocols for creating this global synopsis, based on homomorphic computations, a secure distributed summation scheme, and a symmetric noise addition technique. Experiments on large synthetic and real data sets show that our protocols are efficient and accurate while preserving privacy under the Honest-but-Curious model.
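To make the idea of intersecting Count-Min sketches concrete, the following is a minimal plaintext sketch in Python. It is not the protocols proposed in the paper: it omits the homomorphic computations, the secure summation, and the noise addition entirely. The combination rule (a global cell keeps the sum of the parties' counters only if every party's counter is non-zero), the class name `CountMinSketch`, the function `intersect_sketches`, and the SHA-256 based hashing are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """A small Count-Min sketch: `depth` rows of `width` counters."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Hypothetical hashing choice: one SHA-256 digest per (row, item).
        # All parties must use the same functions so that cells line up.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Standard Count-Min query: the minimum over the item's cells.
        # It never underestimates the true count.
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.depth)
        )


def intersect_sketches(sketches):
    """Combine per-party sketches into one global synopsis.

    Assumed rule (illustrative only): a global cell holds the sum of the
    parties' counters if all of them are non-zero, and 0 otherwise.
    """
    depth, width = sketches[0].depth, sketches[0].width
    if any(s.depth != depth or s.width != width for s in sketches):
        raise ValueError("all sketches must share the same dimensions")
    result = CountMinSketch(depth, width)
    for r in range(depth):
        for c in range(width):
            cells = [s.table[r][c] for s in sketches]
            result.table[r][c] = sum(cells) if all(cells) else 0
    return result


if __name__ == "__main__":
    parties = [CountMinSketch() for _ in range(3)]
    for sketch, items in zip(parties, (["a", "a", "b"], ["a", "c"], ["a", "b"])):
        for item in items:
            sketch.add(item)
    synopsis = intersect_sketches(parties)
    # "a" occurs at every party, so its estimate is at least 2 + 1 + 1.
    print("estimate for 'a':", synopsis.estimate("a"))
    # "b" is absent at one party, so its estimate is 0 unless hash
    # collisions occur in every row.
    print("estimate for 'b':", synopsis.estimate("b"))
```

Under this assumed rule, an element held by every party keeps an estimate no smaller than its total count, while an element missing from at least one party is zeroed out unless collisions occur in every row of that party's sketch.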
This research was partially funded by the Australian Research Council under Discovery Project DP130101801.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Karapiperis, D., Vatsalan, D., Verykios, V.S., Christen, P. (2015). Large-Scale Multi-party Counting Set Intersection Using a Space Efficient Global Synopsis. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9050. Springer, Cham. https://doi.org/10.1007/978-3-319-18123-3_20
DOI: https://doi.org/10.1007/978-3-319-18123-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18122-6
Online ISBN: 978-3-319-18123-3
eBook Packages: Computer Science, Computer Science (R0)