A Memory Efficient Discriminative Approach for Location-Aided Recognition

Large-Scale Visual Geo-Localization

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

In this chapter, we describe a visual recognition technique for fast recognition of urban landmarks on a GPS-enabled mobile device. Most existing methods offload their computation to a server by uploading the query image, which can add several seconds of latency over a slow network. In contrast, our approach uploads only the approximate GPS location to a server, after which a compact, location-specific classifier is downloaded to the device and all subsequent computation is performed there. Our approach is supervised and involves training compact random decision forest (RDF) classifiers on a database of geo-tagged images. The feature vector for the RDF is computed by densely searching the image for the presence of selected discriminative local image patches extracted from the training images. The images are rectified using detected vanishing points, and binary descriptors allow an efficient search for the discriminative patches, a step that is further accelerated using min-hash. We have evaluated our approach on representative urban datasets, where it outperforms traditional methods based on bag-of-visual-words features or direct matching of local feature descriptors, neither of which is feasible when processing must occur on a low-power mobile device.
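The patch-search step in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract gives no descriptor size, distance threshold, or hashing scheme, so the 8-bit toy descriptors, the function names (`hamming`, `patch_response_vector`, `minhash_signature`), and the choice of "minimum Hamming distance per patch" as the feature value are all assumptions made for illustration. The sketch shows two ideas: comparing binary descriptors with the Hamming distance, and computing a min-hash signature over the set bits of a descriptor, which could serve as a cheap candidate filter before the exact comparison.

```python
import random


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def patch_response_vector(image_descriptors, discriminative_patches):
    """One feature per discriminative patch (hypothetical definition):
    the smallest Hamming distance to any descriptor sampled from the image.
    A small value suggests the patch is present somewhere in the image."""
    return [
        min(hamming(patch, d) for d in image_descriptors)
        for patch in discriminative_patches
    ]


def minhash_signature(descriptor: int, n_bits: int, permutations):
    """Min-hash of the set of 'on' bit positions of a descriptor.
    For each permutation, keep the smallest permuted index among the set bits.
    Two descriptors agree on one signature entry with probability equal to
    the Jaccard similarity of their sets of 'on' bits."""
    on_bits = [i for i in range(n_bits) if (descriptor >> i) & 1]
    if not on_bits:
        # Assumed convention for an all-zero descriptor.
        return tuple(n_bits for _ in permutations)
    return tuple(min(perm[i] for i in on_bits) for perm in permutations)


# Toy data: 8-bit descriptors (real binary descriptors are much longer).
N_BITS = 8
rng = random.Random(0)
permutations = [rng.sample(range(N_BITS), N_BITS) for _ in range(4)]

patches = [0b11110000, 0b10101010]             # selected discriminative patches
image = [0b11110001, 0b00001111, 0b10101010]   # descriptors sampled from a query image

features = patch_response_vector(image, patches)
print(features)  # [1, 0]
```

In a pipeline of the kind the abstract describes, a vector like `features` would be the input to the downloaded forest classifier, and descriptors whose min-hash signatures share no entry with a patch's signature could be skipped before computing the exact Hamming distance.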

Author information

Corresponding author

Correspondence to Sudipta N. Sinha.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Sinha, S.N., Hedau, V., Zitnick, C.L., Szeliski, R. (2016). A Memory Efficient Discriminative Approach for Location-Aided Recognition. In: Zamir, A., Hakeem, A., Van Gool, L., Shah, M., Szeliski, R. (eds) Large-Scale Visual Geo-Localization. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-25781-5_15

  • DOI: https://doi.org/10.1007/978-3-319-25781-5_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25779-2

  • Online ISBN: 978-3-319-25781-5

  • eBook Packages: Computer Science (R0)
