Abstract
Object recognition is a fundamental capability for an autonomous robot interacting with the physical world. New sensing technologies that provide RGB-D data increase object recognition capabilities dramatically. Although object recognition has been well studied, known object classifiers usually generalize poorly and therefore adapt only to a limited extent to different application domains. Some domain adaptation approaches have been presented for RGB data, but little work has been done on understanding the effects of applying RGB-D object classification algorithms across different domains. To address this problem, we propose and comprehensively investigate an approach for object recognition in RGB-D data that uses adaptive Support Vector Machines (aSVM) and, in this way, achieves impressive robustness in cross-domain adaptivity. For evaluation, two datasets from different application domains were used. Moreover, a study of state-of-the-art RGB-D feature extraction techniques and object classification methods was performed to identify which combinations of object representation and classification algorithm are least affected in terms of performance when switching between application domains.
Notes
We controlled the expected proportion of false discoveries using the Benjamini-Hochberg (BH) step-up procedure for False Discovery Rate control [3], which is applicable here because the tests are independent (between objects) or positively correlated (between the 40, 50 and 60 % adaptations).
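As an illustration of the BH step-up procedure mentioned in this note, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `benjamini_hochberg`, the significance level `alpha=0.05`, and the p-values are all assumptions chosen for illustration and are not taken from the article.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null hypothesis is rejected.

    Hypothetical helper for illustration; alpha=0.05 is an assumed level.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the hypotheses belonging to the k_max smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Illustrative p-values only (not results from the article).
example_a = benjamini_hochberg(
    [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
)
example_b = benjamini_hochberg([0.02, 0.03, 0.035, 0.04])
```

In the second call, the smallest p-value (0.02) exceeds its own threshold (0.0125), yet it is still rejected because a larger-ranked p-value meets its threshold. This is what distinguishes a step-up rule from comparing each p-value against its threshold in isolation.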
References
Aldoma A, Marton ZC, Tombari F, Wohlkinger W, Potthast C, Zeisl B, Rusu RB, Gedikli S, Vincze M (2012) Point cloud library: Three-dimensional object recognition and 6 dof pose estimation. IEEE Robot Autom Mag 19(3):80–91
Alexandre LA (2012) 3d descriptors for object and category recognition: a comparative evaluation. In: Workshop on Color-Depth Camera Fusion in Robotics at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura, Portugal
Benjamini Y, Hochberg Y (1995) Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J R Stat Soc Ser B (Methodological) 57(1):289–300. doi:10.2307/2346101
Blitzer J, Dredze M, Pereira F (2007) Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In: ACL, vol 7, pp 440–447
Chang CC, Lin CJ LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2:27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Daumé III H (2007) Frustratingly easy domain adaptation. In: ACL, vol 1785, pp 1787
Daumé III H, Marcu D (2006) Domain adaptation for statistical classifiers. J Artif Intell Res (JAIR) 26:101–126
Fischler MA, Bolles RC (1981) Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381–395. doi:10.1145/358669.358692
Grzegorzek M (2010) A System for 3D Texture-Based Probabilistic Object Recognition and Its Applications. International Journal on Pattern Analysis and Applications 13(3):333–348
Grzegorzek M, Deinzer F, Reinhold M, Denzler J, Niemann H (2003) How Fusion of Multiple Views Can Improve Object Recognition in Real-World Environments. In: Ertl T, Girod B, Greiner G, Niemann H, Seidel HP, Steinbach E, Westermann R (eds) Vision, Modeling, and Visualization 2003, pp 553–560. Aka/IOS Press, Berlin, Amsterdam, Munich, Germany
Grzegorzek M, Sav S, Izquierdo E, O’Connor NE (2010) Local Wavelet Features for Statistical Object Classification and Localisation. IEEE Multimedia 17(1):56–66
Hoffman J, Kulis B, Darrell T, Saenko K (2012) Discovering latent domains for multisource domain adaptation. In: Computer Vision–ECCV 2012, pp 702–715. Springer
Jiang J, Zhai C (2007) Instance weighting for domain adaptation in nlp. In: ACL, vol 2007, pp 22
Kulis B, Saenko K, Darrell T (2011) What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In: 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 1785–1792. IEEE
Lai K, Bo L, Ren X, Fox D (2011) A large-scale hierarchical multi-view RGB-D object dataset. In: 2011 IEEE international conference on robotics and automation (ICRA), pp 1817–1824. IEEE
Lai K, Fox D (2009) 3d laser scan classification using web data and domain adaptation. In: Robotics: Science and Systems
Leung T, Malik J (2001) Representing and recognizing the visual appearance of materials using three-dimensional textons. Int J Comput Vis 43(1):29–44
Liu L, Shao L (2013) Learning discriminative representations from RGB-D video data. In: Proceedings of the Twenty-Third international joint conference on Artificial Intelligence, pp 1493–1500. AAAI Press
Madry M, Song D, Kragic D (2011) 2D/3D Object Categorization for Task Based Grasping. In: European Robotics Forum 2011: RGB-D Workshop on 3D Perception in Robotics. Extended abstract
Malisiewicz T, Efros AA (2008) Recognition by association via learning per-exemplar distances. In: IEEE conference on computer vision and pattern recognition, CVPR 2008, pp 1–8. IEEE
Marton ZC, Seidel F, Balint-Benczedi F, Beetz M (2012) Ensembles of Strong Learners for Multi-cue Classification. Pattern Recognition Letters (PRL), Special Issue on Scene Understandings and Behaviours Analysis
Richtsfeld A, Mörwald T, Prankl J, Zillich M, Vincze M (2014) Learning of perceptual grouping for object segmentation on RGB-D data. J Vis Commun Image Represent 25(1):64–73
Roark B, Bacchiani M (2003) Supervised and unsupervised pcfg adaptation to novel domains. In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1, pp 126–133. Association for Computational Linguistics
Rusu RB, Blodow N, Marton ZC, Beetz M (2008) Aligning Point Cloud Views using Persistent Feature Histograms. In: Proceedings of the 21st IEEE/RSJ international conference on intelligent robots and systems (IROS), Nice, France
Rusu RB, Bradski G, Thibaux R, Hsu J (2010) Fast 3D recognition and pose using the Viewpoint Feature Histogram. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp 2155–2162, doi:10.1109/IROS.2010.5651280, (to appear in print)
Saenko K, Kulis B, Fritz M, Darrell T (2010) Adapting visual category models to new domains. In: Computer Vision–ECCV 2010, pp 213–226. Springer
Shirahama K, Grzegorzek M (2014) Towards Large-Scale Multimedia Retrieval Enriched by Knowledge about Human Interpretation - Retrospective Survey. Multimedia Tools and Applications
Spinello L, Arras KO (2012) Leveraging RGB-D data: Adaptive fusion and domain adaptation for object detection. In: 2012 IEEE international conference on robotics and automation (ICRA), pp 4469–4474. IEEE
Wahl E, Hillenbrand U, Hirzinger G (2003) Surflet-Pair-Relation Histograms: A Statistical 3D-Shape Representation for Rapid Classification. In: 3D-Digital Imaging and Modeling (3DIM). Banff, Canada
Yang J, Yan R, Hauptmann AG (2007) Cross-domain video concept detection using adaptive svms. In: Proceedings of the 15th international conference on Multimedia, pp 188–197. ACM
Cite this article
Nuricumbo, J.R., Ali, H., Márton, ZC. et al. Improving object classification robustness in RGB-D using adaptive SVMs. Multimed Tools Appl 75, 6829–6847 (2016). https://doi.org/10.1007/s11042-015-2612-7
DOI: https://doi.org/10.1007/s11042-015-2612-7