Object Recognition and Modeling Using SIFT Features

  • Alessandro Bruno
  • Luca Greco
  • Marco La Cascia
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8192)


In this paper we present a technique for object recognition and modeling based on local image feature matching. Given a complete set of views of an object, the goal of our technique is to recognize the same object in an image of a cluttered environment containing it and to estimate its pose. The method is based on visual modeling of the object from a multi-view representation. The first step creates the object model by selecting a subset of the available views, using SIFT descriptors to evaluate image similarity and relevance. The selected views are then taken as the model of the object, and we show that they can effectively be used to visually represent its main aspects.
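The view-selection step can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so the greedy rule below is only an assumption: a view joins the model when it is not already well covered by a selected view. The function name `select_model_views`, the `threshold` parameter, and the toy similarity matrix are hypothetical; in practice the similarity score might be a count of matched SIFT keypoints between two views.

```python
def select_model_views(similarity, threshold):
    """Greedily pick a subset of view indices to act as the object model.

    similarity[i][j] is a symmetric score between views i and j (for
    example, a count of matched SIFT keypoints). A view is kept only if
    its similarity to every already selected view is below the threshold.
    This rule is an illustrative assumption, not the paper's criterion.
    """
    selected = []
    for i in range(len(similarity)):
        if all(similarity[i][j] < threshold for j in selected):
            selected.append(i)
    return selected


# Toy data: 6 views around an object. Neighbouring views share many
# matches, distant views share few.
sim = [
    [99, 40, 10, 2, 1, 8],
    [40, 99, 35, 9, 2, 1],
    [10, 35, 99, 45, 8, 3],
    [2, 9, 45, 99, 38, 7],
    [1, 2, 8, 38, 99, 42],
    [8, 1, 3, 7, 42, 99],
]
model_views = select_model_views(sim, threshold=30)
print(model_views)  # [0, 2, 4]
```

With this toy matrix every second view is kept, since each skipped view is strongly similar to a neighbour that is already in the model. A lower threshold yields a sparser model; a higher one keeps more views.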

Recognition is performed by comparing the image containing an object in a generic position with the views selected as object models. Once an object has been recognized, its pose can be estimated by searching the complete set of views of the object. Experimental results are very encouraging on both a private dataset we acquired in our lab and a publicly available dataset.
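The two-stage procedure described above (recognition against the model views, then pose estimation over the complete view set) can be sketched as follows. Several things here are assumptions rather than the authors' method: views are scored by counting descriptor matches that pass the nearest/second-nearest distance ratio test used with SIFT, pose is reported as the acquisition angle of the best-matching view, and descriptors are 2-D toy vectors instead of 128-D SIFT descriptors. All names (`count_ratio_matches`, `recognize_and_estimate_pose`, the `mug` and `box` data) are hypothetical.

```python
import math


def count_ratio_matches(query, reference, ratio=0.8):
    """Count query descriptors whose nearest reference descriptor is
    clearly closer than the second nearest (distance ratio test)."""
    matches = 0
    for q in query:
        dists = sorted(math.dist(q, r) for r in reference)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            matches += 1
    return matches


def recognize_and_estimate_pose(query, model_views, all_views):
    """Stage 1 picks the object whose model view matches best;
    stage 2 searches every view of that object for the pose."""
    # Stage 1: model_views maps (object, angle) to a descriptor list.
    scores = {k: count_ratio_matches(query, d) for k, d in model_views.items()}
    obj, _ = max(scores, key=scores.get)
    # Stage 2: all_views[obj] maps angle to a descriptor list.
    pose = {a: count_ratio_matches(query, d) for a, d in all_views[obj].items()}
    return obj, max(pose, key=pose.get)


# Toy 2-D descriptors; the angles (degrees) are assumed pose labels.
mug_views = {
    0: [(0, 0), (10, 0), (0, 10)],
    30: [(5, 5), (10, 0), (20, 20)],
    60: [(5, 5), (20, 20), (15, 5)],
}
box_views = {0: [(50, 50), (60, 50), (50, 60)]}
all_views = {"mug": mug_views, "box": box_views}

# Only the selected views form the models; the 30 degree view is left out.
model_views = {
    ("mug", 0): mug_views[0],
    ("mug", 60): mug_views[60],
    ("box", 0): box_views[0],
}

# Query descriptors: a noisy copy of the mug seen at 30 degrees.
query = [(5.1, 5.0), (10.1, 0.2), (20.0, 19.9)]
result = recognize_and_estimate_pose(query, model_views, all_views)
print(result)  # ('mug', 30)
```

In this toy run the query is recognized through the 60 degree model view, and the search over all mug views then finds the 30 degree view, which is not part of the model. This mirrors the idea that a small model suffices for recognition while the full view set refines the pose.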


Keywords: Object Recognition, Pose Estimation, Object Model, SIFT





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Alessandro Bruno (1)
  • Luca Greco (1)
  • Marco La Cascia (1)
  1. DICGIM, Università degli Studi di Palermo, Italy
