Robust joint learning network: improved deep representation learning for person re-identification

  • Yumin Tian
  • Qiang Li
  • Di Wang
  • Bo Wan


Existing person re-identification (ReID) methods based on deep representation learning mostly focus on either global features or local features alone. This ignores the joint advantages of, and the correlation between, global and local features. In this paper, we test and verify the benefits of jointly learning local and global features in a network based on the Convolutional Neural Network (CNN). Specifically, we assign distinct weights to the global loss and the local loss to reflect their different influence, and then combine the two losses into a single loss. In addition, we propose a novel and strong network that learns part-level features with unified partition. Experimental results on three person ReID data sets show that our method outperforms existing deep learning methods.
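
The two ideas named in the abstract, a weighted sum of a global loss and a local loss, and part-level features taken from a partitioned feature map, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the weights `w_global` and `w_local`, the use of softmax cross-entropy, the averaging of the local loss over parts, and the reading of "unified partition" as equal horizontal stripes are all assumptions made for illustration.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def uniform_partition(feature_rows, num_parts):
    """Split the rows of a feature map into equal horizontal stripes
    and average-pool each stripe into one part-level vector.
    (Assumed interpretation of the abstract's partition step.)"""
    if len(feature_rows) % num_parts != 0:
        raise ValueError("row count must be divisible by num_parts")
    step = len(feature_rows) // num_parts
    dim = len(feature_rows[0])
    parts = []
    for p in range(num_parts):
        stripe = feature_rows[p * step:(p + 1) * step]
        parts.append([sum(row[d] for row in stripe) / step for d in range(dim)])
    return parts


def joint_loss(global_logits, part_logits, label, w_global=1.0, w_local=1.0):
    """Weighted sum of a global loss and a local loss.
    The local loss is averaged over parts (an assumption); the weights
    are hypothetical hyperparameters, not values from the paper."""
    global_loss = cross_entropy(global_logits, label)
    local_loss = sum(cross_entropy(l, label) for l in part_logits) / len(part_logits)
    return w_global * global_loss + w_local * local_loss


# Toy usage: a 4-row feature map split into 2 parts, 2 identity classes.
parts = uniform_partition([[1, 2], [3, 4], [5, 6], [7, 8]], num_parts=2)
loss = joint_loss([0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]], label=0,
                  w_global=0.5, w_local=2.0)
```

In a real network the logits would come from classifier heads on top of the global and part-level features, and the two weights would be tuned on validation data.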


Person re-identification · Deep learning · Representation learning · Joint learning



This paper was supported in part by the National Natural Science Foundation of China under Grant 61702394, Grant 61572385 and Grant 61711530248, in part by the Postdoctoral Science Foundation of China under Grant 2018T111021 and Grant 2017M613082, in part by the Science and Technology Project of Shaanxi Province under Grant 2016GY-033, in part by the Shaanxi Key Research and Development Program under Grant 2017ZDXM-GY-002, in part by the Aeronautical Science Foundation of China under Grant 20171981008, and in part by the Fundamental Research Funds for the Central Universities under Grant JBX170313, Grant XJS17063 and Grant JBF180301.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Technology, Xidian University, Xi’an, China
