Cluster Computing, Volume 22, Supplement 4, pp 8181–8191

Multi-view CSPMPR-ELM feature learning and classifying for RGB-D object recognition

  • Yunhua Yin
  • Huifang Li


To fully exploit the feature information in RGB-D images, currently popular algorithms mainly use a convolutional neural network (CNN) for both feature extraction and classification. Such methods can achieve impressive results, but usually only with an extremely large and complex network. Moreover, the fully connected layers of a CNN form a classical neural network classifier trained by gradient descent, so its generalization ability is limited and sub-optimal. To address these problems, this paper introduces a multi-view CNN-SPMP-RNN-ELM (MCSPMPR-ELM) model for RGB-D object recognition, which combines the representational power of MCSPMPR with the fast training of an extreme learning machine (ELM). The MCSPMPR algorithm extracts discriminative features from raw RGB images and depth images separately. The extracted features are then fed to a nonlinear ELM classifier, which yields better generalization performance and faster learning. Finally, co-training is used to learn from unlabeled data in a semi-supervised manner, using the two distinct feature sets as views. Experimental results on widely used RGB-D object datasets show that the method achieves competitive performance compared with other state-of-the-art algorithms designed specifically for RGB-D data.
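The last two stages described above (an ELM classifier on per-view features, then co-training across the RGB and depth views) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the MCSPMPR feature extractor is replaced by synthetic Gaussian features, the ELM uses a tanh hidden layer with ridge-regularised output weights, and every name here (`ELM`, `co_train`, `X_rgb`, `X_depth`, `n_hidden`, `reg`, `rounds`, `per_round`) is a hypothetical choice made for illustration.

```python
import numpy as np


class ELM:
    """Single-hidden-layer ELM: random, fixed hidden weights; closed-form output weights."""

    def __init__(self, n_hidden=64, reg=1e-2, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Nonlinear random feature map (the hidden layer is never trained).
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y, n_classes):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]  # one-hot targets
        # Ridge-regularised least squares: beta = (H^T H + reg * I)^-1 H^T T
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def scores(self, X):
        return self._hidden(X) @ self.beta

    def predict(self, X):
        return self.scores(X).argmax(axis=1)


def co_train(X_rgb, X_depth, y, labeled, n_classes, rounds=3, per_round=5):
    """Each view in turn pseudo-labels its most confident unlabeled samples
    and adds them to a labeled pool shared by both views."""
    y, labeled = y.copy(), labeled.copy()
    for _ in range(rounds):
        for X_view in (X_rgb, X_depth):
            unlabeled = np.flatnonzero(~labeled)
            if unlabeled.size == 0:
                break
            clf = ELM().fit(X_view[labeled], y[labeled], n_classes)
            s = clf.scores(X_view[unlabeled])
            top = np.argsort(-s.max(axis=1))[:per_round]  # highest scores first
            pick = unlabeled[top]
            y[pick] = s[top].argmax(axis=1)
            labeled[pick] = True
    return y, labeled


# Synthetic stand-in for the two feature sets (assumption: two well-separated classes).
rng = np.random.default_rng(0)
y_true = np.repeat([0, 1], 100)
sign = (2 * y_true - 1)[:, None]
X_rgb = 2.0 * sign + rng.normal(size=(200, 4))
X_depth = 2.0 * sign + rng.normal(size=(200, 3))

labeled = np.zeros(200, dtype=bool)
labeled[[*range(10), *range(100, 110)]] = True  # 10 labeled samples per class
y_init = np.where(labeled, y_true, -1)          # -1 marks "no label"

y_new, labeled_new = co_train(X_rgb, X_depth, y_init, labeled, n_classes=2)
clf = ELM().fit(X_rgb[labeled_new], y_new[labeled_new], n_classes=2)
accuracy = (clf.predict(X_rgb) == y_true).mean()
print(f"labels: {labeled.sum()} -> {labeled_new.sum()}, accuracy: {accuracy:.2f}")
```

The only trained parameters are the output weights `beta`, obtained from a single linear solve; that is the source of the fast training attributed to ELM, in contrast to iterative gradient descent on fully connected layers.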


Object recognition, Extreme learning machine, ELM, CNN, SPMP, Co-training, RGB-D



This work is funded by the National Natural Science Foundation of China (Grant No. 61402368) and the Science and Technology on Transient Impact Laboratory Foundation (Grant No. 61426060103162606007). The authors thank the anonymous reviewers for their very helpful comments, which improved the paper.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Electronics and Information, Northwestern Polytechnical University, Xi’an, China
  2. Science and Technology on Transient Impact Laboratory, Beijing, China
