Comparison of Point and Line Features and Their Combination for Rigid Body Motion Estimation

  • Florian Pilz
  • Nicolas Pugeault
  • Norbert Krüger
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5604)


This paper discusses the use of different image features, and of their combination, for estimating the motion of rigid bodies (RBM estimation). From stereo image sequences, we extract line features at local edges (coded in so-called multi-modal primitives) as well as point features (by means of SIFT descriptors). All features are then matched across stereo and time, and we use these correspondences to estimate the RBM by solving the 3D-2D pose estimation problem. We test different feature sets on various stereo image sequences recorded in realistic outdoor and indoor scenes. We evaluate and compare the results obtained with line and point features as 3D-2D constraints, and we discuss the qualitative advantages and disadvantages of both feature types for RBM estimation. We also demonstrate that combining these features improves robustness on large data sets from the driver assistance and robotics domains. In particular, we report total failures of motion estimation on relevant data sets when only one type of feature is used.
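To illustrate how point and line correspondences can enter a single 3D-2D pose estimate, the following is a minimal sketch and not the authors' implementation. It assumes normalized image coordinates, a linearized rotation update solved by least squares, and noise-free synthetic data; the function name `estimate_rbm` and all parameters are hypothetical. A point correspondence requires the moved 3D point to lie on the viewing ray of its image point (three equations, two of them independent), while a line correspondence only requires it to lie in the plane spanned by the image line and the optical centre (one equation).

```python
import numpy as np

def skew(v):
    """Cross-product matrix, so that skew(a) @ b == np.cross(a, b)."""
    x, y, z = v
    return np.array([[0, -z, y], [z, 0, -x], [-y, x, 0]], dtype=float)

def rodrigues(w):
    """Rotation matrix for the axis-angle vector w."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    K = skew(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * K @ K

def estimate_rbm(points3d, rays, line_points3d, line_normals, iters=20):
    """Estimate R, t from 3D-2D point and line constraints (illustrative only).

    rays:         image points (u, v, 1) in normalized coordinates
    line_normals: normals of the planes through the optical centre and
                  the observed image lines
    """
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        A, b = [], []
        # Point constraint: x cross (Xc + w cross Xc + dt) = 0
        for X, x in zip(points3d, rays):
            Xc = R @ X + t
            Sx = skew(x)
            A.append(np.hstack([-Sx @ skew(Xc), Sx]))
            b.append(-Sx @ Xc)
        # Line constraint: n dot (Xc + w cross Xc + dt) = 0
        for X, n in zip(line_points3d, line_normals):
            Xc = R @ X + t
            A.append(np.hstack([np.cross(Xc, n), n])[None, :])
            b.append([-n @ Xc])
        sol = np.linalg.lstsq(np.vstack(A), np.hstack(b), rcond=None)[0]
        dR = rodrigues(sol[:3])
        # Compose the incremental motion with the current estimate
        R, t = dR @ R, dR @ t + sol[3:]
    return R, t

# Synthetic example (assumed data, no noise, no outliers)
rng = np.random.default_rng(0)
R_true = rodrigues(np.array([0.05, -0.10, 0.08]))
t_true = np.array([0.2, -0.1, 0.3])
P = rng.uniform([-2, -2, 4], [2, 2, 8], size=(8, 3))  # points with point matches
L = rng.uniform([-2, -2, 4], [2, 2, 8], size=(8, 3))  # points with line matches

rays = []
for X in P:
    Y = R_true @ X + t_true
    rays.append(Y / Y[2])

normals = []
for X in L:
    Y = R_true @ X + t_true
    n = np.cross(Y, rng.normal(size=3))  # some plane through origin and Y
    normals.append(n / np.linalg.norm(n))

R_est, t_est = estimate_rbm(P, rays, L, normals)
```

Both feature types contribute rows to the same linear system, which is one simple way of combining them. With real matches, an outlier rejection step would be needed in addition; the sketch leaves it out.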


Motion Estimation, Line Feature, Scale Invariant Feature Transform, Stereo Match, Point Correspondence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Florian Pilz (1)
  • Nicolas Pugeault (2, 3)
  • Norbert Krüger (3)

  1. Department of Medialogy and Engineering Science, Aalborg University Copenhagen, Denmark
  2. School of Informatics, University of Edinburgh, United Kingdom
  3. The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Denmark
