Abstract
Object categorization and manipulation are critical tasks for a robot operating in a household environment. In this chapter, we propose new methods for visual recognition and categorization. We describe 2D object databases and 3D point clouds with 2D/3D local descriptors, which we quantize with the k-means clustering algorithm to obtain a bag of words (BOW). Moreover, we develop a new global descriptor called VFH-Color that combines the original Viewpoint Feature Histogram (VFH) descriptor with a color quantization histogram, adding appearance information that improves the recognition rate. The acquired 2D and 3D features are used to train a Deep Belief Network (DBN) classifier. In our experiments on object recognition and categorization, the average recognition rate ranges between 91% and 99%, which makes the approach well suited to robot-assisted tasks.
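The following is a minimal sketch of two steps named in the abstract: quantizing local descriptors with k-means to obtain a bag-of-words histogram, and appending a color quantization histogram to a shape histogram in the spirit of VFH-Color. It is not the chapter's implementation. The function names, the toy descriptors, the number of bins per color channel, the L1 normalization, and the placeholder shape histogram are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, point):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(centroids[i], point))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; the k centroids act as the visual vocabulary."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def bow_histogram(descriptors, vocabulary):
    """L1-normalized count of the visual word assigned to each descriptor."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest(vocabulary, d)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def color_histogram(rgb_points, bins_per_channel=2):
    """Quantize 8-bit RGB values into a joint histogram (assumed scheme)."""
    counts = [0] * bins_per_channel ** 3
    for rgb in rgb_points:
        idx = 0
        for channel in rgb:
            idx = idx * bins_per_channel + channel * bins_per_channel // 256
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Toy local descriptors (hypothetical; real ones would come from 2D/3D features).
descriptors = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 4.9]]
vocabulary = kmeans(descriptors, k=2)
hist = bow_histogram(descriptors, vocabulary)

# Placeholder standing in for a VFH-style shape histogram (not computed here).
shape_hist = [0.25, 0.75]
colors = [(255, 0, 0), (250, 10, 5), (0, 0, 255)]
combined = shape_hist + color_histogram(colors)
```

In this sketch, `hist` or `combined` would be the fixed-length feature vector handed to a classifier. The classifier itself (a DBN in the chapter) is outside the scope of the example.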
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Zrira, N., Hannat, M., Bouyakhf, E.H., Ahmad Khan, H. (2018). 2D/3D Object Recognition and Categorization Approaches for Robotic Grasping. In: Hassanien, A., Oliva, D. (eds) Advances in Soft Computing and Machine Learning in Image Processing. Studies in Computational Intelligence, vol 730. Springer, Cham. https://doi.org/10.1007/978-3-319-63754-9_26
DOI: https://doi.org/10.1007/978-3-319-63754-9_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63753-2
Online ISBN: 978-3-319-63754-9
eBook Packages: Engineering, Engineering (R0)