The assessment of 3D model representation for retrieval with CNN-RNN networks

  • Weizhi Nie
  • Kun Wang
  • Hongtao Wang
  • Yuting Su


In this paper, we propose a novel method for assessing 3D model representation with CNN and RNN networks. First, a visualization tool developed with OpenGL renders virtual views of each 3D model from different angles; the views are captured at 10-degree intervals around the model. Second, a CNN extracts a feature vector from each virtual view. Third, these feature vectors are fed to an RNN, which fuses them into a single feature that represents the 3D model. Finally, the Euclidean distance between the fused features of two models serves as the similarity measure for retrieval. We evaluate the proposed method on the NTU, PSB and ShapeNet datasets and compare it against several classic 3D model retrieval and classification methods. The experiments demonstrate the superiority of our approach.
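The four steps in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the rendered views and the CNN are replaced by a hypothetical `placeholder_cnn_features` function that returns random vectors, the RNN is assumed to be a plain tanh recurrence with untrained random weights (the abstract does not say which cell type is used), and all function names and dimensions are invented for the example.

```python
import math
import random


def view_angles(step_deg=10):
    """Camera azimuths (degrees) for one full circle around a model."""
    return list(range(0, 360, step_deg))


def placeholder_cnn_features(model_id, angles, feat_dim=8):
    """Hypothetical stand-in for CNN features: one random vector per view."""
    rng = random.Random(model_id)
    return [[rng.random() for _ in range(feat_dim)] for _ in angles]


def make_rnn(feat_dim, hidden_dim, seed=0):
    """Untrained random weights for a simple tanh RNN (assumption)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(feat_dim)]
            for _ in range(hidden_dim)]
    w_h = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
           for _ in range(hidden_dim)]
    return w_in, w_h


def rnn_fuse(view_features, w_in, w_h):
    """Feed the view sequence through the RNN; return the last hidden state."""
    h = [0.0] * len(w_h)
    for x in view_features:
        h = [
            math.tanh(
                sum(wi * xi for wi, xi in zip(w_in[j], x))
                + sum(wh * hi for wh, hi in zip(w_h[j], h))
            )
            for j in range(len(h))
        ]
    return h


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query, gallery):
    """Gallery keys sorted by ascending distance to the query descriptor."""
    return sorted(gallery, key=lambda name: euclidean(query, gallery[name]))


# Usage: three toy "models", each described by 36 views at 10-degree steps.
angles = view_angles(10)
w_in, w_h = make_rnn(feat_dim=8, hidden_dim=4)
descriptors = {
    model_id: rnn_fuse(placeholder_cnn_features(model_id, angles), w_in, w_h)
    for model_id in (1, 2, 3)
}
ranking = rank_by_distance(descriptors[1], descriptors)
```

Using the final hidden state as the fused descriptor is one common choice; in a real system the weights would be trained and the features would come from a CNN applied to the rendered views.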


Keywords: 3D model retrieval; RNN; Deep learning; Information retrieval



This work was supported in part by the National Natural Science Foundation of China (61502337, 61872267).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Electrical and Information Engineering, Tianjin University, Tianjin, China
