Multimedia Tools and Applications, Volume 77, Issue 3, pp 3049–3069

A loss combination based deep model for person re-identification

  • Fuqing Zhu
  • Xiangwei Kong
  • Qun Wu
  • Haiyan Fu
  • Ming Li


Convolutional neural networks (CNNs) have significantly improved the state of the art in person re-identification (re-ID). Existing identification CNN models are trained with the softmax loss as the supervision signal. However, the softmax loss only encourages the learned deep features of different identities to be separable; intra-class variations are not considered during training. To minimize the intra-class variations and thereby improve the discriminative ability of the CNN model, this paper combines a new supervision signal with the original softmax loss for person re-ID. Specifically, during training, a center of the deep features is learned for each pedestrian identity and, simultaneously, the distance between each deep feature and its corresponding identity center is penalized, so that the deep features of the same identity are pulled toward their center. With this combination of loss functions, inter-class dispersion and intra-class aggregation are constrained jointly. In this way, a more discriminative CNN model with two key learning objectives can be learned to extract deep features for the person re-ID task. We evaluate our method on two identification CNN models (CaffeNet and ResNet-50). The method consistently improves over the baseline and yields performance competitive with state-of-the-art person re-ID methods on three important person re-ID benchmarks (Market-1501, CUHK03 and MARS).
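The loss combination described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the squared Euclidean form of the center term, the 0.5 scaling, and the balancing weight `lam` are assumptions made for illustration, since the abstract does not give the exact formulation or how the centers are updated.

```python
import numpy as np

def softmax_loss(logits, labels):
    """Mean cross-entropy over a batch; logits has shape (N, K)."""
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def center_loss(features, labels, centers):
    """Mean of half the squared distance between each feature and its identity center.

    The 0.5 factor and the squared Euclidean distance are assumptions.
    """
    diffs = features - centers[labels]
    return 0.5 * (diffs ** 2).sum(axis=1).mean()

def combined_loss(logits, features, labels, centers, lam=0.01):
    """Softmax loss plus a weighted center term; lam is a hypothetical weight."""
    return softmax_loss(logits, labels) + lam * center_loss(features, labels, centers)
```

In this sketch the softmax term pushes identities apart while the center term pulls features of the same identity together; in actual training the centers would also be learned alongside the network weights, which is omitted here.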


  • Convolutional neural network
  • Loss combination
  • Person re-identification



This work is supported by the Foundation for Innovative Research Groups of the NSFC (Grant no. 71421001), the National Natural Science Foundation of China (Grant no. 61502073), and the Open Projects Program of the National Laboratory of Pattern Recognition (No. 201407349).



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. School of Information and Communication Engineering, Dalian University of Technology, Dalian, China
  2. General Design Institute, Zhejiang Sci-Tech University, Hangzhou, China
  3. Taizhou Research Institute, Zhejiang University, Taizhou, China
