Simple multi-party set reconciliation

Mitzenmacher, Michael; Pagh, Rasmus

doi:10.1007/s00446-017-0316-0

Simple multi-party set reconciliation

Published: 23 October 2017

Volume 31, pages 441–453, (2018)
Cite this article

Distributed Computing Aims and scope Submit manuscript

Michael Mitzenmacher¹ &
Rasmus Pagh²

516 Accesses
14 Citations
4 Altmetric
Explore all metrics

Abstract

Many distributed cloud-based services use multiple loosely consistent replicas of user information to avoid the high overhead of more tightly coupled synchronization. Periodically, the information must be synchronized, or reconciled. One can place this problem in the theoretical framework of set reconciliation: two parties \(A_1\) and \(A_2\) each hold a set of keys, named \(S_1\) and \(S_2\) respectively, and the goal is for both parties to obtain \(S_1 \cup S_2\). Typically, set reconciliation is interesting algorithmically when sets are large but the set difference \(|S_1-S_2|+|S_2-S_1|\) is small. In this setting the focus is on accomplishing reconciliation efficiently in terms of communication; ideally, the communication should depend on the size of the set difference, and not on the size of the sets. In this paper, we extend recent approaches using Invertible Bloom Lookup Tables (IBLTs) for set reconciliation to the multi-party setting. There are three or more parties \(A_1,A_2,\ldots ,A_n\) holding sets of keys \(S_1,S_2,\ldots ,S_n\) respectively, and the goal is for all parties to obtain \(\cup _i S_i\). While this could be done by pairwise reconciliations, we seek more effective methods. Our general approach can function even if the number of parties is not exactly known in advance, and with some additional cost can be used to determine which other parties hold missing keys. Our methodology uses network coding techniques in conjunction with IBLTs, allowing efficiency in network utilization along with efficiency obtained by passing messages of size \(O(|\cup _i S_i - \cap _i S_i|)\). By connecting reconciliation with network coding, we can provide efficient reconciliation methods for a number of natural distributed settings.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Searchable Attribute-Based Proxy Re-encryption: Keyword Privacy, Verifiable Expressive Search, and Outsourced Decryption

Article 23 April 2024

Data replication schemes in cloud computing: a survey

Article 16 April 2021

Sharding for Scalable Blockchain Networks

Article 15 October 2022

Notes

See also https://redd.it/2hchs0 for further discussion.
Some preliminary results for multi-party settings, also using the IBLT framework but based on pairwise reconciliations, were provided to us by Goyal and Varghese [22]. Also, after the appearance of this work on the arxiv, multi-party set reconciliation using characteristic polynomials and repeated pairwise reconciliations was examined in [4]; their conclusion is that is while it is possible, it currently seems much less efficient than the methods considered here.
To obtain structures where m / t is very close to 1, one must use irregular IBLTs, where different keys utilize a different number of hash functions. We simplify our description here and use regular IBLTs, whics use the same number of hash functions for each key. See [20, 34] for more discussion.
We note that it is possible to show that \(H(x) = (a^x \bmod p) \bmod q\), where \(1<a<p-1\) is a random positive integer and \(p>q^2\), has the needed properties when all keys are smaller than q.
For this problem we follow the standard notation for gossip problems and use n for the number of vertices.
These experiments were performed by Marco Gentili, who we thank for allowing their use in this paper.

References

Andresen, G.: \(O(1)\) Block Propagation. https://gist.github.com/gavinandresen/e20c3b5a1d4b97f79ac2
Appleby, A.: MurmurHash2 Implementation. https://github.com/aappleby/smhasher/wiki/MurmurHash2
Bailis, P., Kingsbury, K.: The network is reliable: an informal survey of real-world communications failures. Queue 12(7), 20:20–20:32 (2014)
Google Scholar
Boral, A., Mitzenmacher, M.: Multi-party set reconciliation using characteristic polynomials. In: Proceedings of the 52nd Allerton Conference, pp. 1182–1187 (2014)
Botelho, F., Wormald, N., Ziviani, N.: Cores of random \(r\)-partite hypergraphs. Inf. Process. Lett. 112(8), 314–319 (2012)
Article MathSciNet MATH Google Scholar
Broder, A., Charikar, M., Frieze, A., Mitzenmacher, M.: Min-wise independent permutations. J. Comput. Syst. Sci. 60(3), 630–659 (2000)
Article MathSciNet MATH Google Scholar
Byers, J., Considine, J., Mitzenmacher, M., Rost, S.: Informed content delivery across adaptive overlay networks. IEEE/ACM Trans. Netw. 12(5), 767–780 (2004)
Article Google Scholar
Calder, B., Wang, J., Ogus, A., Nilakantan, N., Skjolsvold, A., McKelvie, S., Xu, Y., Srivastav, S., Wu, J., Simitci, H., et al.: Windows Azure Storage: a highly available cloud storage service with strong consistency. In: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, pp. 143–157 (2011)
Chen, D., Konrad, C., Yi, Ke, Yu, W., Zhang, Q.: Robust set reconciliation. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pp. 135–146 (2014)
Chierichetti, F., Lattanzi, S., Panconesi, A.: Almost tight bounds for rumour spreading with conductance. In: Proceedings of the Forty-Second ACM Symposium on Theory of Computing, pp. 399–408 (2010)
Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. J. Comput. Syst. Sci. 55(3), 441–453 (1997)
Article MathSciNet MATH Google Scholar
Deb, S., Médard, M., Choute, C.: Algebraic gossip: a network coding approach to optimal multiple rumor mongering. IEEE Trans. Inf. Theory 52(6), 2486–2507 (2006)
Article MATH Google Scholar
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. In: Proceedings of Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, pp. 205–220 (2007)
Demers, A., Greene, D., Hauser, C., Irish, W., Larson, J., Shenker, S., Sturgis, H., Swinehart, D., Terry, D.: Epidemic algorithms for replicated database maintenance. In: Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, pp. 1–12 (1987)
Dodis, Y., Ostrovsky, R., Reyzin, L., Smith, A.: Fuzzy extractors: how to generate strong keys from biometrics and other noisy data. SIAM J. Comput. 38(1), 97–139 (2008)
Article MathSciNet MATH Google Scholar
Eppstein, D., Goodrich, M.: Straggler identification in round-trip data streams via Newton’s identities and invertible bloom filters. IEEE Trans. Knowl. Data Eng. 23(2), 297–306 (2011)
Article MATH Google Scholar
Eppstein, D., Goodrich, M., Uyeda, F., Varghese, G.: What’s the difference? Efficient set reconciliation without prior context. In: Proceedings of the ACM SIGCOMM Conference, pp. 218–229 (2011)
Giakkoupis, G.: Tight bounds for rumor spreading in graphs of a given conductance. In: Proceedings of the 28th International Symposium on Theoretical Aspects of Computer Science, pp. 57–68 (2011)
Gabrys, R., Farnoud, F.: Reconciling similar sets of data. In: Proceedings of the International Symposium on Information Theory, pp. 2837–2341 (2014)
Goodrich, M., Mitzenmacher, M.: Invertible bloom lookup tables. In: Proceedings of the 49th Annual Allerton Conference, pp. 792–799 (2011)
Gorale, A.: Bitcoin in Bloom: How IBLTs Allow Bitcoin to Scale. Cryptocoin News, October 2, 2014. https://www.cryptocoinsnews.com/bitcoin-in-bloom-how-iblts-allow-bitcoin-scale
Goyal, N., Varghese, G.: Personal communication (2013)
Haeupler, B.: Analyzing network coding gossip made easy. In: Proceedings of the 43rd Annual ACM Symposium on Theory of Computing, pp. 293–302 (2011)
Hedetniemi, S., Hedetniemi, S., Liestman, A.: A survey of gossiping and broadcasting in communication networks. Networks 18(4), 319–349 (1988)
Article MathSciNet MATH Google Scholar
Huang, Z., Yi, Ke, Zhang, Q.: Randomized algorithms for tracking distributed count, frequencies, and ranks. In: Proceedings of the 31st Symposium on Principles of Database Systems, pp. 295–306 (2012)
Juels, A., Sudan, M.: A fuzzy vault scheme. Des. Codes Crypt. 38(2), 237–257 (2006)
Article MathSciNet MATH Google Scholar
Katti, S., Rahul, H., Hu, W., Katabi, D., Médard, M., Crowcroft, J.: XORs in the air: practical wireless network coding. ACM SIGCOMM Comput. Commun. Rev. 36(4), 243–254 (2006)
Article Google Scholar
Knödel, W.: New gossips and telephones. Discrete Math. 13(1), 95 (1975)
Article MathSciNet MATH Google Scholar
Kostić, D., Snoeren, A., Vadhat, A., Braud, R., Killian, C., Anderson, J., Albrecht, J., Rodriguesz, A., Vandkieft, E.: High-bandwidth data dissemination for large-scale distributed systems. ACM Trans. Comput. Syst. 26(1), Article 3 (2008)
Luby, M., Mitzenmacher, M., Shokrollahi, A., Spielman, D.: Efficient erasure correcting codes. IEEE Trans. Inf. Theory 47(2), 569–584 (2001)
Article MathSciNet MATH Google Scholar
Minsky, Y., Trachtenberg, A., Zippel, R.: Set reconciliation with nearly optimal communication complexity. IEEE Trans. Inf. Theory 49(9), 2213–2218 (2003)
Article MathSciNet MATH Google Scholar
Mitzenmacher, M., Varghese, G.: The complexity of object reconciliation, and open problems related to set difference and coding. In: Proceedings of the 50th Annual Allerton Conference, pp. 1126–1132 (2012)
Molloy, M.: The pure literal rule threshold and cores in random hypergraphs. In: Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 672–681 (2004)
Rink, M.: Mixed hypergraphs for linear-time construction of denser hashing-based data structures. In: SOFSEM 2013: Theory and Practice of Computer Science, pp. 356–368 (2013)
Schwartz, J.T.: Fast probabilistic algorithms for verification of polynomial identities. J. ACM 27(4), 701–717 (1980)
Article MathSciNet MATH Google Scholar
Skachek, V., Rabbat, M.: Subspace synchronization: a network-coding approach to object reconciliation. In: Proceedings of the International Symposium on Information Theory, pp. 2301–2305 (2014)
Skjegstad, M., Maseng, T.: Low complexity set reconciliation using Bloom filters. In: Proceedings of the 7th ACM ACM SIGACT/SIGMOBILE International Workshop on Foundations of Mobile Computing, pp. 33–41 (2011)
Shah, D.: Gossip algorithms. Found. Trends Netw. 3(1), 1–125 (2008)
Article MATH Google Scholar
Zippel, R.: Probabilistic algorithms for sparse polynomials. In: Proceedings on the International Symposium on Symbolic and Algebraic Computation, pp. 216–226 (1979)

Download references

Acknowledgements

The first author thanks George Varghese for suggesting the problem of multi-party set synchronization, and Marco Gentili for allowing us to report on his experiments in this paper.

Author information

Authors and Affiliations

School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
Michael Mitzenmacher
IT University of Copenhagen, Copenhagen, Denmark
Rasmus Pagh

Authors

Michael Mitzenmacher
View author publications
You can also search for this author in PubMed Google Scholar
Rasmus Pagh
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Michael Mitzenmacher.

Additional information

Michael Mitzenmacher was supported in part by NSF Grants CCF-1563710, CCF-1535795, CCF-1320231, CNS-1228598, IIS-0964473, and CCF-0915922. Rasmus Pagh was supported by the Danish National Research Foundation under the Sapere Aude program.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Mitzenmacher, M., Pagh, R. Simple multi-party set reconciliation. Distrib. Comput. 31, 441–453 (2018). https://doi.org/10.1007/s00446-017-0316-0

Download citation

Received: 12 August 2015
Accepted: 25 September 2017
Published: 23 October 2017
Issue Date: November 2018
DOI: https://doi.org/10.1007/s00446-017-0316-0

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Simple multi-party set reconciliation

Abstract

Access this article

Similar content being viewed by others

Searchable Attribute-Based Proxy Re-encryption: Keyword Privacy, Verifiable Expressive Search, and Outsourced Decryption

Data replication schemes in cloud computing: a survey

Sharding for Scalable Blockchain Networks

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Simple multi-party set reconciliation

Abstract

Access this article

Similar content being viewed by others

Searchable Attribute-Based Proxy Re-encryption: Keyword Privacy, Verifiable Expressive Search, and Outsourced Decryption

Data replication schemes in cloud computing: a survey

Sharding for Scalable Blockchain Networks

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation