Encyclopedia of Robotics

Living Edition
| Editors: Marcelo H. Ang, Oussama Khatib, Bruno Siciliano

Aerial Robots, Visual-Inertial Odometry of

Living reference work entry
DOI: https://doi.org/10.1007/978-3-642-41610-1_71-1

Definitions

Visual-inertial odometry (VIO) is the process of estimating the state (pose and velocity) of an agent (e.g., an aerial robot) using only the input of one or more cameras plus one or more inertial measurement units (IMUs) attached to it. VIO is the only viable alternative to GPS and lidar-based odometry for achieving accurate state estimation. Since both cameras and IMUs are inexpensive, these sensor types are ubiquitous in today's aerial robots.
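To make the definition concrete, the estimated state can be written out. The following is a minimal sketch of a common VIO state parameterization; the exact choice varies across systems, and this particular form is an assumption here, not taken from this entry:

```latex
% Illustrative VIO state at time k: pose and velocity of the body
% frame B in the world frame W, plus the IMU biases, which must be
% co-estimated because they drift over time.
\mathbf{x}_k = \bigl(\mathbf{p}^{W}_{B_k},\;\mathbf{q}^{W}_{B_k},\;
                     \mathbf{v}^{W}_{B_k},\;\mathbf{b}^{g}_{k},\;\mathbf{b}^{a}_{k}\bigr)
```

Here p, q, and v denote position, orientation (as a quaternion), and velocity, while b^g and b^a denote the gyroscope and accelerometer biases. Both filtering-based and smoothing-based VIO systems estimate a state of roughly this form; they differ mainly in how many past poses are retained and in how the visual measurements constrain them.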

Overview

Cameras and IMUs are complementary sensor types. A camera accumulates photons during the exposure time to form a 2D image. Cameras are therefore accurate during slow motion and provide rich information, which is useful for other perception tasks, such as place recognition. However, they have a limited output rate (∼100...
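An IMU, by contrast, measures angular velocity and linear acceleration at a high rate (typically hundreds of Hz up to 1 kHz), but its measurements are corrupted by noise and slowly varying biases, so integrating them alone causes the estimate to drift quickly. The following Python sketch (all names and frame conventions are illustrative assumptions, not code from this entry) shows standard strapdown integration of bias-corrected IMU samples over one camera interval, which makes the drift behavior, and hence the need for visual corrections, explicit:

```python
import numpy as np

# Minimal sketch, assuming standard strapdown IMU kinematics: integrate
# high-rate IMU samples between two camera frames. All names (gyro,
# accel, bias_g, bias_a, dt) are illustrative assumptions.

GRAVITY = np.array([0.0, 0.0, -9.81])  # world-frame gravity (m/s^2)

def skew(w):
    """Skew-symmetric matrix such that skew(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def expm_so3(phi):
    """Rodrigues' formula: map a rotation vector to a rotation matrix."""
    angle = np.linalg.norm(phi)
    if angle < 1e-10:
        return np.eye(3) + skew(phi)  # first-order approximation
    K = skew(phi / angle)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def propagate(p, v, R, imu_samples, bias_g, bias_a, dt):
    """Euler-integrate (gyro, accel) samples over one camera interval.

    p, v : position and velocity in the world frame
    R    : 3x3 body-to-world rotation matrix
    Without visual corrections, a constant gyro bias makes the
    orientation error grow linearly and, through misaligned gravity
    compensation, the position error roughly cubically in time.
    """
    for gyro, accel in imu_samples:
        a_world = R @ (accel - bias_a) + GRAVITY  # specific force -> world accel.
        p = p + v * dt + 0.5 * a_world * dt**2
        v = v + a_world * dt
        R = R @ expm_so3((gyro - bias_g) * dt)    # incremental rotation
    return p, v, R
```

In a complete VIO pipeline, this propagation runs between visual updates; the camera measurements then constrain the accumulated drift and allow the biases themselves to be estimated.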



Authors and Affiliations

  1. Department of Informatics, University of Zurich, Zurich, Switzerland
  2. Department of Neuroinformatics, ETH Zurich and University of Zurich, Zurich, Switzerland

Section editors and affiliations

  • Aníbal Ollero
    1. GRVC Robotics Labs, Escuela Técnica Superior de Ingeniería, Universidad de Sevilla, Sevilla, Spain