A Vote-and-Verify Strategy for Fast Spatial Verification in Image Retrieval

  • Johannes L. Schönberger
  • True Price
  • Torsten Sattler
  • Jan-Michael Frahm
  • Marc Pollefeys
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10111)


Spatial verification is a crucial part of every image retrieval system, as it accounts for the fact that geometric feature configurations are typically ignored by the Bag-of-Words representation. Since spatial verification quickly becomes the bottleneck of the retrieval process, runtime efficiency is extremely important. At the same time, spatial verification should be able to reliably distinguish between related and unrelated images. While methods based on RANSAC’s hypothesize-and-verify framework achieve high accuracy, they are not particularly efficient. Conversely, verification approaches based on Hough voting are extremely efficient but not as accurate. In this paper, we develop a novel spatial verification approach that uses an efficient voting scheme to identify promising transformation hypotheses that are subsequently verified and refined. Through comprehensive experiments, we show that our method is able to achieve a verification accuracy similar to state-of-the-art hypothesize-and-verify approaches while providing faster runtimes than state-of-the-art voting-based methods.
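The abstract does not spell out the voting scheme, so the following is only a generic sketch of the vote-then-verify idea, not the authors' method. It assumes each local feature carries a position, scale, and orientation, so that a single correspondence implies a similarity transformation. The bin sizes, thresholds, and all function names (`similarity_from_match`, `transform_point`, `count_inliers`, `vote_and_verify`) are hypothetical choices made for illustration, and the refinement step mentioned above is omitted.

```python
import math
from collections import defaultdict


def similarity_from_match(f1, f2):
    """Similarity transform (scale, angle, tx, ty) implied by one match.

    Assumption: each feature is a tuple (x, y, scale, orientation_in_radians).
    """
    x1, y1, s1, a1 = f1
    x2, y2, s2, a2 = f2
    s = s2 / s1
    a = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    c, d = s * math.cos(a), s * math.sin(a)
    return s, a, x2 - (c * x1 - d * y1), y2 - (d * x1 + c * y1)


def transform_point(T, x, y):
    """Apply the similarity transform T = (scale, angle, tx, ty) to (x, y)."""
    s, a, tx, ty = T
    c, d = s * math.cos(a), s * math.sin(a)
    return c * x - d * y + tx, d * x + c * y + ty


def count_inliers(T, matches, max_error):
    """Count matches whose transformed position lands within max_error."""
    n = 0
    for f1, f2 in matches:
        px, py = transform_point(T, f1[0], f1[1])
        if math.hypot(px - f2[0], py - f2[1]) <= max_error:
            n += 1
    return n


def vote_and_verify(matches, top_k=3, trans_bin=20.0, angle_bins=8, max_error=5.0):
    """Vote in a coarse 4D transformation space, then verify the top bins."""
    # Voting: every match casts one vote for its quantized transformation.
    votes = defaultdict(list)
    for f1, f2 in matches:
        T = similarity_from_match(f1, f2)
        s, a, tx, ty = T
        key = (
            round(math.log2(s)),  # one bin per octave of scale change
            int((a + math.pi) / (2 * math.pi) * angle_bins) % angle_bins,
            math.floor(tx / trans_bin),
            math.floor(ty / trans_bin),
        )
        votes[key].append(T)

    # Verification: only the most-voted bins are checked against all matches.
    best_T, best_inliers = None, 0
    ranked = sorted(votes.values(), key=len, reverse=True)[:top_k]
    for hypotheses in ranked:
        T = hypotheses[0]  # first vote in the bin serves as the hypothesis
        n = count_inliers(T, matches, max_error)
        if n > best_inliers:
            best_T, best_inliers = T, n
    return best_T, best_inliers
```

Calling `vote_and_verify(matches)` with a list of `(feature_in_query, feature_in_database)` pairs returns the best-supported transformation and its inlier count. Voting costs one pass over the matches, and verification runs only for `top_k` hypotheses, which is why this style of scheme can avoid the many random samples drawn by a RANSAC loop.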


Keywords: Image Retrieval, Visual Word, Query Image, Query Expansion, Vote Scheme



True Price and Jan-Michael Frahm were supported in part by NSF grants No. IIS-1349074 and No. CNS-1405847.




Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Johannes L. Schönberger (1)
  • True Price (2)
  • Torsten Sattler (1)
  • Jan-Michael Frahm (2)
  • Marc Pollefeys (1, 3)

  1. ETH Zürich, Zürich, Switzerland
  2. UNC Chapel Hill, Chapel Hill, USA
  3. Microsoft, Redmond, USA
