Visual Localization for Micro Aerial Vehicles in Urban Outdoor Environments

  • Chapter
Advanced Topics in Computer Vision

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

Accurate localization of a micro aerial vehicle (MAV) with respect to a scene is important for a wide range of applications, in particular autonomous navigation, surveillance, and inspection. In this context, visual localization in urban outdoor environments is gaining importance as common methods such as GPS positioning are often not accurate enough or even fail. We present recent approaches and results for robust 3D reconstruction of suitable visual landmarks, for the alignment in a world coordinate system, and for fast, high-accuracy monocular localization. We introduce a scalable representation of the prior knowledge about the scene and demonstrate how in-flight information can be integrated to facilitate long-term operation. Our method outperforms a state-of-the-art visual SLAM approach and achieves localization accuracies comparable to differential GPS.

Notes

  1. http://aerial.icg.tugraz.at

Acknowledgements

This work has been supported by the Austrian Research Promotion Agency (FFG) projects FIT-IT Pegasus (825841), Construct (830035), and Holistic (830044). The authors would like to thank Arnold Irschara, Christof Hoppe, Michael Maurer, Markus Rumpler, and Christian Mostegel for contributions to earlier work on this topic.

Author information

Corresponding author

Correspondence to Andreas Wendel.

Copyright information

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Wendel, A., Bischof, H. (2013). Visual Localization for Micro Aerial Vehicles in Urban Outdoor Environments. In: Farinella, G., Battiato, S., Cipolla, R. (eds) Advanced Topics in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5520-1_7

  • DOI: https://doi.org/10.1007/978-1-4471-5520-1_7

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-5519-5

  • Online ISBN: 978-1-4471-5520-1

  • eBook Packages: Computer Science, Computer Science (R0)
