Large-Scale Multi-party Counting Set Intersection Using a Space Efficient Global Synopsis

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9050)

Abstract

Privacy-preserving set intersection (PPSI) of very large data sets is increasingly required in many real application areas, including healthcare, national security, and law enforcement. Various techniques have been developed to address this problem, most of which rely on computationally expensive cryptographic techniques. Moreover, conventional data structures cannot efficiently provide count estimates for the elements in the intersection of very large data sets. We consider the problem of efficient PPSI by integrating sets from multiple (three or more) sources to create a global synopsis, which is the result of intersecting efficient data structures known as Count-Min sketches. This global synopsis also provides count estimates of the intersected elements. We propose two protocols for creating this global synopsis, based on homomorphic computations, a secure distributed summation scheme, and a symmetric noise addition technique. Experiments on large synthetic and real data sets demonstrate the efficiency and accuracy of our protocols, while privacy is preserved under the Honest-but-Curious model.
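
To make the idea of a global synopsis built from Count-Min sketches more concrete, the following is a minimal plain-text sketch in Python. It is not the paper's protocol: it omits the homomorphic computations, the secure summation, and the noise addition entirely, and the cell-wise combination rule (keep the sum of a cell only when that cell is non-zero for every party) is an assumption made purely for illustration. The class name `CountMinSketch`, the helper `intersect_sketches`, the sketch dimensions, and the sample data are all hypothetical.

```python
import hashlib
from typing import List


class CountMinSketch:
    """Plain Count-Min sketch: `depth` rows of `width` counters."""

    def __init__(self, width: int = 64, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.table: List[List[int]] = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # One hash function per row, derived by salting SHA-256 with the row number.
        # All parties must use the same functions so that cells are comparable.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item: str) -> int:
        # Taking the minimum over rows limits the effect of hash collisions;
        # with non-negative updates it never underestimates the true count.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))


def intersect_sketches(sketches: List[CountMinSketch]) -> CountMinSketch:
    """Assumed combination rule (illustrative only): a cell of the global
    synopsis holds the sum of the parties' cells if every party's cell is
    non-zero, and zero otherwise."""
    first = sketches[0]
    result = CountMinSketch(first.width, first.depth)
    for r in range(first.depth):
        for c in range(first.width):
            cells = [s.table[r][c] for s in sketches]
            result.table[r][c] = sum(cells) if all(cells) else 0
    return result


# Three hypothetical parties; only "alice" appears in all three data sets.
party_data = [
    ["alice", "alice", "bob"],
    ["alice", "carol"],
    ["alice", "alice", "alice", "dave"],
]
party_sketches = []
for records in party_data:
    sketch = CountMinSketch()
    for record in records:
        sketch.add(record)
    party_sketches.append(sketch)

global_synopsis = intersect_sketches(party_sketches)
print("alice:", global_synopsis.estimate("alice"))
print("bob:", global_synopsis.estimate("bob"))
```

Under this assumed rule, an element held by every party keeps non-zero cells in every row, so its estimate is at least its total count across the parties. An element missing from at least one party is reported as zero unless hash collisions produce a false positive. In this simplified version the sketches are combined in the clear, which is precisely what a privacy-preserving protocol would need to avoid.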

This research was partially funded by the Australian Research Council under Discovery Project DP130101801.

Author information

Corresponding author

Correspondence to Dimitrios Karapiperis.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Karapiperis, D., Vatsalan, D., Verykios, V.S., Christen, P. (2015). Large-Scale Multi-party Counting Set Intersection Using a Space Efficient Global Synopsis. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9050. Springer, Cham. https://doi.org/10.1007/978-3-319-18123-3_20

  • DOI: https://doi.org/10.1007/978-3-319-18123-3_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18122-6

  • Online ISBN: 978-3-319-18123-3

  • eBook Packages: Computer Science, Computer Science (R0)
