Multimedia Tools and Applications

Volume 77, Issue 17, pp 22035–22049

View-wised discriminative ranking for 3D object retrieval

  • Wenhui Li
  • Yang An


In this paper, we propose a new framework, named View-wised Discriminative Ranking (VDR), which captures the latent relative information within the multiple views of a 3D model. Unlike existing view-based methods, which treat the multiple views as independent pieces of information, we model the relative information among the views. By placing the views of a model in a certain order, we learn the parameters of a ranking function and use them as a new, robust model representation. We evaluate our proposal on several challenging datasets for 3D retrieval, and the comparison experiments demonstrate the superiority of the proposed method in both retrieval accuracy and efficiency.
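The idea of using the parameters of a ranking function as the model representation can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the view features, the ordering rule, or the learner, so the example assumes each view is already a feature vector, takes the row order as the view order, and substitutes a plain ridge regression onto the view index for the ranking function. The names `ranking_descriptor` and `cosine_similarity` and the parameter `reg` are hypothetical.

```python
import numpy as np


def ranking_descriptor(view_features, reg=1.0):
    """Fit a linear scoring function w so that w @ x_t grows with the view index t.

    Assumption: a ridge regression onto the index stands in for the ranking
    function; the fitted parameter vector w is returned as the descriptor.
    """
    X = np.asarray(view_features, dtype=float)      # shape (n_views, dim)
    n_views, dim = X.shape
    t = np.arange(1, n_views + 1, dtype=float)      # target rank of each view
    Xc = X - X.mean(axis=0)                         # centre features
    tc = t - t.mean()                               # centre targets
    # Closed-form ridge solution: (Xc^T Xc + reg * I) w = Xc^T tc
    return np.linalg.solve(Xc.T @ Xc + reg * np.eye(dim), Xc.T @ tc)


def cosine_similarity(a, b):
    """Cosine similarity between two descriptors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


# Synthetic stand-in data: 12 ordered views with 8-dimensional features.
rng = np.random.default_rng(0)
direction = rng.normal(size=8)
steps = np.arange(12)[:, None]
model_a = steps * direction + 0.01 * rng.normal(size=(12, 8))
model_b = steps * direction + 0.01 * rng.normal(size=(12, 8))  # similar model

w_a = ranking_descriptor(model_a)
w_b = ranking_descriptor(model_b)
w_rev = ranking_descriptor(model_a[::-1])           # same views, reversed order

sim_ab = cosine_similarity(w_a, w_b)                # similar models
sim_rev = cosine_similarity(w_a, w_rev)             # order flipped
```

In this sketch two models whose views evolve in the same way get nearly parallel descriptors, while reversing the view order of a model flips the sign of its descriptor. That is the sense in which such a representation depends on the relation between views rather than on each view independently.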


Keywords: 3D model retrieval, View ranking, Model representation



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. School of Electronic Information Engineering, Tianjin University, Tianjin, China
