Analysis of Compact Features for RGB-D Visual Search

  • Alioscia Petrelli
  • Danilo Pau
  • Luigi Di Stefano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9280)


Anticipating the upcoming integration of depth sensing into mobile devices, we experimentally compare different compact features for representing RGB-D images in mobile visual search. Experiments on three state-of-the-art datasets, addressing both category and instance recognition, show that Deep Features provided by Convolutional Neural Networks represent appearance information better, whereas shape is encoded more effectively by Kernel Descriptors. Moreover, our evaluation suggests that learning to weight the relative contribution of depth and appearance is key to deploying depth sensing effectively in forthcoming mobile visual search scenarios.
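To make the idea of weighting depth against appearance concrete, the following minimal sketch fuses two Hamming distances computed on binary hash codes. It is an illustration only, not the method evaluated in the paper: the function names, the `w_depth` parameter, the convex-combination rule and the toy 8-bit codes are all assumptions introduced here.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def fused_distance(query, item, w_depth):
    """Convex combination of appearance and depth Hamming distances.

    query and item are (appearance_code, depth_code) pairs; w_depth in
    [0, 1] is a hypothetical weight given to the depth channel.
    """
    d_app = hamming(query[0], item[0])
    d_depth = hamming(query[1], item[1])
    return (1.0 - w_depth) * d_app + w_depth * d_depth


def nearest(query, database, w_depth):
    """Return the label of the database entry closest to the query."""
    best = min(database, key=lambda e: fused_distance(query, e["codes"], w_depth))
    return best["label"]


# Toy 8-bit codes, invented for illustration only.
database = [
    {"label": "mug", "codes": (0b10110010, 0b00001111)},
    {"label": "bowl", "codes": (0b10110011, 0b11110000)},
]
query = (0b10110011, 0b00001110)

print(nearest(query, database, w_depth=0.0))  # appearance only
print(nearest(query, database, w_depth=0.5))  # appearance and depth equally
```

In this toy setting the retrieved label changes with `w_depth`, which is why such a weight would plausibly be learned from data rather than fixed by hand.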


Keywords: RGB-D visual search, Binary hash codes, Deep learning


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Alioscia Petrelli (1)
  • Danilo Pau (2)
  • Luigi Di Stefano (1)

  1. University of Bologna, Bologna, Italy
  2. STMicroelectronics, Agrate Brianza, Italy
