Abstract
Local description of images is a common technique in many areas of computer vision research. Owing to recent improvements in RGB-D cameras, local description of 3D data has also become practical, and the number of studies that make use of this extra information is increasing. However, their applicability is limited by the lack of generic methods for combining descriptors. In this paper, we propose combining textural and geometrical descriptors for scene recognition on RGB-D data. The methods proposed here, together with their normalization stages, can be applied to combine any descriptors obtained from the 2D and 3D domains. This study presents and evaluates different ways of combining multi-modal descriptors within the bag-of-words (BoW) approach in the context of indoor scene localization. The rough location of a query is determined from pre-recorded images and depth maps by unsupervised image matching.
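As an illustration of the general idea only, the following is a minimal sketch of one generic way to combine two BoW representations: each modality is quantized against its own codebook, each histogram is L1-normalized, and the two are concatenated with a weight before nearest-neighbor matching. The abstract does not specify which combination or normalization schemes the paper uses, so the function names, the weight `w_tex`, and the choice of L1 distance are all assumptions made for this example.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against a codebook (rows = visual words)
    and return an L1-normalized word-count histogram."""
    # Squared Euclidean distance from every descriptor to every visual word.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def combine_histograms(h_tex, h_geo, w_tex=0.5):
    """Weighted concatenation of a textural and a geometrical histogram.
    Assumes both inputs are already L1-normalized (hypothetical scheme)."""
    return np.concatenate([w_tex * h_tex, (1.0 - w_tex) * h_geo])

def match_query(query_vec, database_vecs):
    """Return the index of the database vector closest to the query (L1 distance)."""
    dists = np.abs(database_vecs - query_vec).sum(axis=1)
    return int(dists.argmin())
```

In this sketch, normalizing each histogram before concatenation keeps the modality with more local features from dominating the combined vector; `w_tex` then sets the relative influence of texture and geometry explicitly.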
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bayramoğlu, N., Heikkilä, J., Pietikäinen, M. (2012). Combining Textural and Geometrical Descriptors for Scene Recognition. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33868-7_4
DOI: https://doi.org/10.1007/978-3-642-33868-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33867-0
Online ISBN: 978-3-642-33868-7
eBook Packages: Computer Science (R0)