International Journal of Computer Vision, Volume 126, Issue 1, pp 1–20

Learning to Detect Good 3D Keypoints

  • Alessio Tonioni
  • Samuele Salti
  • Federico Tombari
  • Riccardo Spezialetti
  • Luigi Di Stefano

Abstract

The established approach to 3D keypoint detection consists of defining effective handcrafted saliency functions based on geometric cues, with the aim of maximizing keypoint repeatability. In contrast, the idea behind our work is to learn a descriptor-specific keypoint detector so as to optimize the end-to-end performance of the feature matching pipeline. Accordingly, we cast 3D keypoint detection as a classification problem between surface patches that can or cannot be matched correctly by a given 3D descriptor, i.e. those that are either good or not with respect to that descriptor. We propose a machine learning framework that allows for defining examples of good surface patches from the training data and leverages Random Forest classifiers to realize both fixed-scale and adaptive-scale 3D keypoint detectors. Through extensive experiments on standard datasets, we show that feature matching performance improves significantly when 3D descriptors are deployed together with companion detectors learned by our methodology, compared with established state-of-the-art 3D detectors based on handcrafted saliency functions.
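
The classification formulation above can be illustrated with a minimal sketch of how training labels might be defined. This is not the authors' implementation. It assumes that a point counts as "good" when its nearest neighbor in descriptor space is its true correspondence in another view. The function name `label_good_points`, the toy descriptors, and the ground-truth array `true_match` are all hypothetical.

```python
import math

def label_good_points(desc_a, desc_b, true_match):
    """Label each point of view A as good (1) or not good (0).

    desc_a, desc_b: descriptor vectors (lists of floats) for two views.
    true_match[i]: index in desc_b of the point that truly corresponds
                   to point i of view A (hypothetical ground truth).
    Assumed criterion: a point is good when its nearest neighbor in
    descriptor space is its true correspondence.
    """
    labels = []
    for i, d in enumerate(desc_a):
        # Index of the closest descriptor in view B (Euclidean distance).
        nearest = min(range(len(desc_b)), key=lambda j: math.dist(d, desc_b[j]))
        labels.append(1 if nearest == true_match[i] else 0)
    return labels

# Toy 2D descriptors; real 3D descriptors have many more dimensions.
desc_a = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
desc_b = [[0.1, 0.0], [5.1, 5.0], [0.9, 1.2], [5.0, 4.95]]
true_match = [0, 2, 1]

labels = label_good_points(desc_a, desc_b, true_match)
print(labels)
```

In this toy case the third point is labeled not good, because a distractor descriptor lies closer to it than its true correspondence does. Labels of this kind could then serve as targets for a classifier, such as the Random Forest mentioned in the abstract, trained on features of the surrounding surface patch.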

Keywords

3D Keypoint Detection, 3D Descriptors, Machine Learning, Surface Matching

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. DISI, University of Bologna, Bologna, Italy
  2. Fleetmatics Research, Florence, Italy
  3. CAMP, Technical University of Munich, Munich, Germany
