A Memory Efficient Discriminative Approach for Location-Aided Recognition

Sinha, Sudipta N.; Hedau, Varsha; Zitnick, C. Lawrence; Szeliski, Richard

doi:10.1007/978-3-319-25781-5_15

Chapter
First Online: 06 July 2016

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

In this chapter, we describe a technique for fast visual recognition of urban landmarks on a GPS-enabled mobile device. Most existing methods offload their computation to a server by uploading the query image. Over a slow network, this can cause a latency of several seconds. In contrast, our approach requires uploading only the approximate GPS location to a server, after which a compact, location-specific classifier is downloaded to the device and all subsequent computation is performed on the device. Our approach is supervised and involves training compact random decision forest (RDF) classifiers on a database of geo-tagged images. The feature vector for the RDF is computed by densely searching the image for the presence of selected discriminative local image patches extracted from the training images. The images are rectified using detected vanishing points. Binary descriptors allow for an efficient search for the discriminative patches, a step that is further accelerated using min-hash. We have evaluated the performance of our approach on representative urban datasets, where it outperforms traditional methods based on bag-of-visual-words features or direct matching of local feature descriptors, neither of which is a feasible approach when processing must occur on a low-power mobile device.
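
The patch search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the descriptor type, how min-hash is applied, or how the forest consumes the result. The sketch assumes binary descriptors stored as integers, uses the smallest Hamming distance to each discriminative patch as one feature-vector entry, and shows a textbook min-hash signature over the set bits of a descriptor as one possible way to shortlist candidates. All function names (hamming, patch_responses, make_permutations, minhash, signature_agreement) are hypothetical.

```python
import random


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def patch_responses(image_descriptors, patch_descriptors):
    """One entry per discriminative patch: the smallest Hamming distance to any
    descriptor extracted densely from the query image (brute-force search)."""
    return [min(hamming(p, d) for d in image_descriptors)
            for p in patch_descriptors]


def make_permutations(n_bits, n_hashes, seed=0):
    """Random orderings of the bit positions, one per min-hash function."""
    rng = random.Random(seed)
    perms = []
    for _ in range(n_hashes):
        order = list(range(n_bits))
        rng.shuffle(order)
        perms.append(order)
    return perms


def minhash(descriptor, perms):
    """Min-hash signature of the set of set bits in the descriptor: for each
    ordering, the rank of the first bit position that is set.
    An all-zero descriptor gets the sentinel rank len(order)."""
    return tuple(
        next((rank for rank, bit in enumerate(order) if (descriptor >> bit) & 1),
             len(order))
        for order in perms
    )


def signature_agreement(sig_a, sig_b):
    """Fraction of matching signature entries; for min-hash this estimates the
    Jaccard similarity of the two sets of set bits."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


# Toy 8-bit example (assumed data, for illustration only).
patches = [0b11110000, 0b00001111]
image = [0b11110001, 0b10101010, 0b00111100]
features = patch_responses(image, patches)  # would be the input to a classifier

perms = make_permutations(n_bits=8, n_hashes=16)
agreement = signature_agreement(minhash(patches[0], perms),
                                minhash(image[0], perms))
```

In a sketch like this, signatures that agree on many entries mark descriptors worth comparing exactly, so the brute-force Hamming search only needs to run on a shortlist. The resulting feature vector would then be passed to a trained forest classifier, which is omitted here.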

Author information

Authors and Affiliations

Microsoft Research, Redmond, WA, USA
Sudipta N. Sinha
Apple, Cupertino, CA, USA
Varsha Hedau
Facebook AI Research, Palo Alto, CA, USA
C. Lawrence Zitnick
Facebook, Seattle, WA, USA
Richard Szeliski

Corresponding author

Correspondence to Sudipta N. Sinha.

Editor information

Editors and Affiliations

Computer Science Department, Stanford University, Stanford, California, USA
Amir R. Zamir
Decisive Analytics Corporation, Arlington, Virginia, USA
Asaad Hakeem
ETH Zürich, Zürich, Switzerland
Luc Van Gool
University of Central Florida, Orlando, Florida, USA
Mubarak Shah
Facebook, Seattle, Washington, USA
Richard Szeliski

About this chapter

Cite this chapter

Sinha, S.N., Hedau, V., Zitnick, C.L., Szeliski, R. (2016). A Memory Efficient Discriminative Approach for Location-Aided Recognition. In: Zamir, A., Hakeem, A., Van Gool, L., Shah, M., Szeliski, R. (eds) Large-Scale Visual Geo-Localization. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-25781-5_15

DOI: https://doi.org/10.1007/978-3-319-25781-5_15
Published: 06 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25779-2
Online ISBN: 978-3-319-25781-5
eBook Packages: Computer Science, Computer Science (R0)
