DisLocation: Scalable Descriptor Distinctiveness for Location Recognition

Arandjelović, Relja; Zisserman, Andrew

doi:10.1007/978-3-319-16817-3_13

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9006)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

The objective of this paper is to improve large-scale visual object retrieval for visual place recognition. Geo-localization based on a visual query is made difficult by the many non-distinctive features that commonly occur in imagery of urban environments, such as generic modern windows, doors, cars and trees. The focus of this work is to adapt the standard Hamming Embedding retrieval system to account for varying descriptor distinctiveness. To this end, we propose a novel method for efficiently estimating the distinctiveness of all database descriptors, based on estimating local descriptor density everywhere in the descriptor space. In contrast to all competing methods, the (unsupervised) training time for our method (DisLoc) is linear in the number of database descriptors and takes only 100 s on a single CPU core for a 1 million image database. Furthermore, the added memory requirements are negligible (1 %).
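The idea of scoring distinctiveness by local density can be illustrated with a toy sketch. This is not the DisLoc algorithm: it counts neighbours by brute force, which is quadratic within each visual word rather than linear, and the function names, the Hamming radius and the weighting 1 / (1 + density) are all assumptions made for illustration.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary signatures."""
    return bin(a ^ b).count("1")


def local_density(signatures, words, radius):
    """Count, for each descriptor, the other descriptors assigned to the
    same visual word whose signature lies within `radius` bits.

    Brute force (quadratic per visual word); an illustrative stand-in,
    not the linear-time estimator described in the abstract.
    """
    by_word = defaultdict(list)
    for idx, word in enumerate(words):
        by_word[word].append(idx)

    density = [0] * len(signatures)
    for members in by_word.values():
        for i in members:
            density[i] = sum(
                1
                for j in members
                if j != i and hamming(signatures[i], signatures[j]) <= radius
            )
    return density


def distinctiveness(density):
    """Hypothetical weighting: sparser neighbourhoods get larger weights."""
    return [1.0 / (1.0 + d) for d in density]


# Toy data: 8-bit signatures and their visual word assignments.
signatures = [0b00000000, 0b00000001, 0b00000011, 0b11110000, 0b10101010]
words = [0, 0, 0, 0, 1]

density = local_density(signatures, words, radius=2)
weights = distinctiveness(density)
print(density)
print(weights)
```

In this toy data the first three signatures sit close together and so receive a lower weight than the isolated fourth one, which mirrors the intuition that features from crowded regions of descriptor space are less informative.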

The method is evaluated on standard publicly available large-scale place recognition benchmarks containing street-view imagery of Pittsburgh and San Francisco. DisLoc is shown to outperform all baselines and sets a new state of the art on both benchmarks. The method is compatible with spatial reranking, which further improves recognition results.

Finally, we also demonstrate that 7 % of the least distinctive features can be removed, therefore reducing storage requirements and improving retrieval speed, without any loss in place recognition accuracy.
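Pruning by distinctiveness can be sketched as follows, assuming each database descriptor already has a scalar distinctiveness score where higher means more distinctive. The function name, the score list and the tie-breaking behaviour are hypothetical and not taken from the paper.

```python
def prune_least_distinctive(scores, fraction):
    """Return the indices of descriptors to keep, in their original order,
    after dropping the `fraction` with the lowest distinctiveness score.

    The number dropped is rounded down; ties are broken by index order.
    """
    n_drop = int(len(scores) * fraction)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    dropped = set(order[:n_drop])
    return [i for i in range(len(scores)) if i not in dropped]


# Toy scores for eight descriptors; drop the lowest quarter.
scores = [0.9, 0.1, 0.5, 0.05, 0.7, 0.3, 0.8, 0.6]
kept = prune_least_distinctive(scores, 0.25)
print(kept)
```

Setting `fraction` to 0.07 would correspond to the 7 % figure quoted above. Only the retained indices would then need to be stored in the inverted index.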

Acknowledgement

We thank A. Torii and D. Chen for sharing their datasets, and are grateful for financial support from ERC grant VisRec no. 228180 and a Royal Society Wolfson Research Merit Award.

Author information

Authors and Affiliations

Department of Engineering Science, University of Oxford, Oxford, UK
Relja Arandjelović & Andrew Zisserman

Corresponding author

Correspondence to Relja Arandjelović.

Editor information

Editors and Affiliations

Technische Universität München, Garching, Bayern, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

About this paper

Cite this paper

Arandjelović, R., Zisserman, A. (2015). DisLocation: Scalable Descriptor Distinctiveness for Location Recognition. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9006. Springer, Cham. https://doi.org/10.1007/978-3-319-16817-3_13

DOI: https://doi.org/10.1007/978-3-319-16817-3_13
Published: 17 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16816-6
Online ISBN: 978-3-319-16817-3
eBook Packages: Computer Science, Computer Science (R0)