A Fast PQ Hash Code Indexing

  • Jingsong Shan (Email author)
  • Yongjun Zhang
  • Mingxin Jiang
  • Chunhua Jin
  • Zhengwei Zhang
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 773)

Abstract

This paper presents Compressed PQ Indexing (CPQI), a data structure that further compresses a product quantization (PQ) index by no longer storing sparse (null) entries, while requiring only sub-linear search time. CPQI saves storage space and is suitable for in-memory computing on large-scale data. It uses a minimal perfect hash to hash the PQ codes, so that only non-null entries are preserved and the structure is stored very compactly; the compressed index no longer stores the PQ codes themselves. Sub-linear-time search is implemented by combining a Bloom filter with the minimal perfect hash function.
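The abstract does not spell out the construction, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes PQ codes are tuples of centroid ids in the range 0 to 255, builds a minimal perfect hash with a simple level-by-level collision-resolution scheme, and places a Bloom filter in front of it to reject most codes that were never indexed. All names (`seeded_hash`, `MinimalPerfectHash`, `BloomFilter`, `CompressedPQIndex`) and all parameters (`gamma`, 16 bits per code, 4 hash functions) are hypothetical choices made for illustration.

```python
import hashlib
from itertools import accumulate


def seeded_hash(code, seed, size):
    """Hash a PQ code (tuple of centroid ids, each 0..255) into range(size)."""
    data = seed.to_bytes(4, "big") + bytes(code)
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big") % size


class MinimalPerfectHash:
    """Maps n distinct codes onto 0..n-1 without storing the codes (assumed scheme)."""

    def __init__(self, codes, gamma=2.0, max_levels=64):
        self.levels = []  # one (flags, prefix_counts, offset) triple per level
        remaining = list(set(codes))
        offset = 0
        while remaining:
            if len(self.levels) >= max_levels:
                raise RuntimeError("too many levels; try a larger gamma")
            level = len(self.levels)
            size = max(1, int(gamma * len(remaining)))
            hits = [0] * size
            for code in remaining:
                hits[seeded_hash(code, level, size)] += 1
            # A position is claimed only if exactly one code landed on it.
            flags = [1 if count == 1 else 0 for count in hits]
            # prefix[p] = number of claimed positions before p. A compact
            # implementation would use a succinct rank structure instead.
            prefix = [0] + list(accumulate(flags))
            self.levels.append((flags, prefix, offset))
            offset += prefix[-1]
            # Codes that collided are retried on the next level.
            remaining = [c for c in remaining
                         if not flags[seeded_hash(c, level, size)]]
        self.size = offset

    def index(self, code):
        for level, (flags, prefix, offset) in enumerate(self.levels):
            pos = seeded_hash(code, level, len(flags))
            if flags[pos]:
                return offset + prefix[pos]
        return None  # only possible for codes that were never indexed


class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.bits = [0] * max(1, num_bits)
        self.num_hashes = num_hashes

    def _positions(self, code):
        # Seeds are offset so they differ from those used by the MPH levels.
        return [seeded_hash(code, 1000 + i, len(self.bits))
                for i in range(self.num_hashes)]

    def add(self, code):
        for pos in self._positions(code):
            self.bits[pos] = 1

    def __contains__(self, code):
        return all(self.bits[pos] for pos in self._positions(code))


class CompressedPQIndex:
    """Stores one bucket of vector ids per non-empty PQ code, and nothing else."""

    def __init__(self, postings):
        self.mph = MinimalPerfectHash(postings)
        self.bloom = BloomFilter(num_bits=16 * len(postings), num_hashes=4)
        self.buckets = [None] * self.mph.size
        for code, ids in postings.items():
            self.bloom.add(code)
            self.buckets[self.mph.index(code)] = list(ids)

    def lookup(self, code):
        if code not in self.bloom:  # most absent codes are rejected here
            return []
        slot = self.mph.index(code)
        return [] if slot is None else self.buckets[slot]


# Hypothetical postings: PQ code -> ids of the vectors quantized to it.
postings = {
    (3, 17, 200, 5): [0, 42],
    (3, 17, 201, 5): [7],
    (90, 1, 0, 255): [13, 21, 34],
}
index = CompressedPQIndex(postings)
print(index.lookup((3, 17, 201, 5)))  # ids stored for that code
```

In this sketch a lookup costs a few hash evaluations regardless of how many codes are indexed. Because the codes themselves are not stored, a code that was never indexed but passes the Bloom filter as a false positive cannot be distinguished from a real entry; the filter size and number of hash functions control how often that happens.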

Notes

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant 61403060, the Huaian Natural Science Foundation under Grant HAB201704, the Six Talent Peaks Project in Jiangsu Province under Grant 2016XYDXXJS-012, the Natural Science Foundation of Jiangsu Province under Grant BK20171267, and the 533 Talents Engineering Project in Huaian under Grant HAA201738.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  • Jingsong Shan (1), Email author
  • Yongjun Zhang (1)
  • Mingxin Jiang (1)
  • Chunhua Jin (1)
  • Zhengwei Zhang (1)

  1. Huaiyin Institute of Technology, Huaian, People’s Republic of China
