Algorithmica

pp 1–18

A Faster Algorithm for Cuckoo Insertion and Bipartite Matching in Large Graphs

  • Megha Khosla
  • Avishek Anand
Article

Abstract

Hash tables are ubiquitous in computer science for efficient access to large datasets. However, there is always a need for approaches that offer compact memory utilisation without substantial degradation of lookup performance. Cuckoo hashing is an efficient technique for creating hash tables with high space utilisation that offers guaranteed constant access time. We are given n locations and m items. Each item has to be placed in one of the \(k\ge 2\) locations chosen by k random hash functions. By allowing more than one choice for a single item, cuckoo hashing resembles multiple-choice allocation schemes. In addition, it supports dynamically changing the location of an item among its possible locations. We propose and analyse an insertion algorithm for cuckoo hashing that runs in linear time with high probability and in expectation. Previous work on total allocation time analysed breadth-first search, which was shown to be linear only in expectation. Our algorithm finds an assignment (with probability 1) whenever one exists. In contrast, the other known insertion method, random walk insertion, may run indefinitely even for a solvable instance. We also present experimental results comparing the performance of our algorithm with the random walk method, including the case when each location can hold more than one item. As a corollary, we obtain a linear-time algorithm (with high probability and in expectation) for finding perfect matchings in a special class of sparse random bipartite graphs. We support this with experiments on a large real-world dataset for finding maximum matchings in general large bipartite graphs. We report an order of magnitude improvement in running time compared to the Hopcroft–Karp matching algorithm.
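
To make the setting concrete, the following is a minimal sketch of the random walk insertion baseline that the abstract contrasts with, not the insertion algorithm proposed in the article. It assumes one item per location; the names `candidate_locations`, `random_walk_insert` and `build_table` are hypothetical, the k hash functions are simulated with a seeded pseudo-random generator that yields k distinct locations, and the `max_steps` cap is an added safeguard, since an uncapped random walk may not terminate.

```python
import random


def candidate_locations(item, k, n):
    # Hypothetical stand-in for k random hash functions:
    # k distinct locations in [0, n), fixed per item.
    rng = random.Random(f"cuckoo:{item}")
    return rng.sample(range(n), k)


def random_walk_insert(table, item, k, rng, max_steps=1000):
    """Place `item` in one of its k locations, evicting occupants at random.

    Returns True on success. On False, the item still held in `current`
    after max_steps evictions is left unplaced (assumed cap, see lead-in).
    """
    n = len(table)
    current = item
    for _ in range(max_steps):
        locs = candidate_locations(current, k, n)
        # Use a free candidate location if one exists.
        for loc in locs:
            if table[loc] is None:
                table[loc] = current
                return True
        # All candidates are occupied: evict a random occupant and
        # continue the walk with the evicted item.
        victim = rng.choice(locs)
        table[victim], current = current, table[victim]
    return False


def build_table(items, n, k, seed=0):
    # Insert items one by one into n locations; count successful insertions.
    rng = random.Random(seed)
    table = [None] * n
    placed = sum(random_walk_insert(table, x, k, rng) for x in items)
    return table, placed
```

For example, `build_table(range(50), n=100, k=3)` attempts to place m = 50 items into n = 100 locations with k = 3 choices each. Every occupied location then holds an item that lists that location among its candidates, which is the assignment property the abstract refers to.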

Keywords

Cuckoo hashing, Bipartite matching, Load balancing

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. L3S Research Center, Leibniz University, Hannover, Germany
