Multimedia Tools and Applications, Volume 77, Issue 7, pp 7955–7976

Improved visual SLAM: a novel approach to mapping and localization using visual landmarks in consecutive frames

  • Kajal Sharma


Pathfinding is increasingly used in autonomous vehicle navigation, robot localization, and other computer vision applications. This paper presents a novel approach to mapping and localization that extracts visual landmarks from a robot dataset acquired by a Kinect sensor. The visual landmarks are detected and recognized using the improved scale-invariant feature transform (I-SIFT) method. The methodology is based on detecting stable and invariant landmarks in consecutive red-green-blue-depth (RGB-D) frames of the robot dataset. These landmarks are then used to determine the robot path and to construct a map. A number of experiments were performed on various datasets in an indoor environment. The proposed method detects landmarks efficiently under varying conditions, including changes in rotation and illumination. The experimental results show that the proposed method can solve the simultaneous localization and mapping (SLAM) problem using stable visual landmarks with reduced computation time.


Keywords: SLAM; Localization; Landmarks; Video processing
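
As a rough illustration of the idea summarized in the abstract (matching landmarks between consecutive RGB-D frames and placing them in 3D), the sketch below may help. It is not the paper's I-SIFT method: it uses a generic nearest-neighbour ratio test on descriptor vectors and a standard pinhole back-projection. The function names, the toy three-dimensional descriptors, the ratio threshold of 0.8, and the camera intrinsics (fx, fy, cx, cy) are all assumptions chosen for illustration only.

```python
import math

def match_landmarks(desc_a, desc_b, ratio=0.8):
    """Return (i, j) index pairs that pass a nearest-neighbour ratio test.

    desc_a, desc_b: descriptor vectors from two consecutive frames.
    A match is kept only if the best distance is clearly smaller than
    the second-best one, which discards ambiguous landmarks.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from descriptor i in frame A to every descriptor in frame B.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches

def back_project(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with depth in metres to 3D."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# Toy descriptors (hypothetical; real SIFT descriptors have 128 dimensions).
desc_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.0)]
desc_b = [(0.0, 0.9, 0.1), (0.9, 0.1, 0.0), (0.0, 0.0, 1.0)]

matches = match_landmarks(desc_a, desc_b)

# Assumed intrinsics for illustration, not calibrated values.
point = back_project(845, 240, 2.0, fx=525.0, fy=525.0, cx=320.0, cy=240.0)
```

In this toy data the third descriptor of the first frame lies about equally close to two candidates in the second frame, so the ratio test rejects it; only unambiguous landmarks would then be back-projected and passed on to pose estimation.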



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Independent Researcher, Changwon, South Korea
