Avoiding Confusing Features in Place Recognition

  • Jan Knopp
  • Josef Sivic
  • Tomas Pajdla
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6311)


We seek to recognize the place depicted in a query image using a database of “street side” images annotated with geolocation information. This is a challenging task due to changes in scale, viewpoint and lighting between the query and the images in the database. One of the key problems in place recognition is the presence of objects such as trees or road markings, which frequently occur in the database and hence cause significant confusion between different places. As the main contribution, we show how to avoid features leading to confusion of particular places by using geotags attached to database images as a form of supervision. We develop a method for automatic detection of image-specific and spatially-localized groups of confusing features, and demonstrate that suppressing them significantly improves place recognition performance while reducing the database size. We show that the method combines well with the state-of-the-art bag-of-features model including query expansion, and demonstrate place recognition that generalizes over a wide range of viewpoints and lighting conditions. Results are shown on a geotagged database of over 17K images of Paris downloaded from Google Street View.
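The idea of using geotags as supervision can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the authors' method: each database image is reduced to a set of visual-word IDs plus a 2-D position in metres, and a word is flagged as confusing for an image when it also occurs in an image at least a hypothetical `min_dist` metres away. The names `find_confusing_words`, `suppress`, and `min_dist` are invented for illustration. The abstract describes detection of spatially-localized groups of features, which this sketch does not model.

```python
import math


def find_confusing_words(images, min_dist=50.0):
    """For each image, collect the visual words that also occur in an image
    located at least `min_dist` metres away (assumed to be a different place).

    `images` maps an image name to (set of visual-word IDs, (x, y) position).
    """
    confusing = {}
    for name, (words, pos) in images.items():
        flagged = set()
        for other, (other_words, other_pos) in images.items():
            if other == name:
                continue
            # The geotag distance acts as the supervision signal (assumption).
            if math.dist(pos, other_pos) >= min_dist:
                flagged |= words & other_words
        confusing[name] = flagged
    return confusing


def suppress(images, confusing):
    """Return a copy of the database with the flagged words removed."""
    return {
        name: (words - confusing[name], pos)
        for name, (words, pos) in images.items()
    }


# Toy database: "a" and "b" show the same place, "c" is far away.
images = {
    "a": ({1, 2, 3}, (0.0, 0.0)),
    "b": ({2, 3, 4}, (5.0, 0.0)),
    "c": ({3, 5, 6}, (500.0, 0.0)),
}
confusing = find_confusing_words(images)
cleaned = suppress(images, confusing)
```

In this toy database, word 3 (think of a tree or a road marking) occurs at both places and is removed from every image, while word 2 is shared only by the nearby images "a" and "b" and is kept. Dropping flagged words also shrinks the index, which matches the abstract's remark about reducing the database size.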


Visual Word, Query Image, Query Expansion, Object Retrieval, Place Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jan Knopp (1)
  • Josef Sivic (2)
  • Tomas Pajdla (3)
  1. VISICS, ESAT-PSI, K.U. Leuven, Belgium
  2. INRIA, WILLOW project, Ecole Normale Superieure, Paris, France
  3. Center for Machine Perception, Czech Technical University in Prague
