Independent metric learning with aligned multi-part features for video-based person re-identification

  • Jingjing Wu
  • Jianguo Jiang
  • Meibin Qi
  • Hao Liu

Abstract

Video-based person re-identification has attracted wide attention because it plays a crucial role in many video surveillance applications. The task is to match image sequences of a pedestrian recorded by non-overlapping cameras. As in many visual recognition problems, variations in pose, viewpoint, and illumination, together with occlusion, make this task non-trivial. To increase the robustness of features to such variations and to occlusion, this paper designs an aligned multi-part image model inspired by the human visual attention mechanism. The model first applies pose estimation to align the pedestrians and then divides the images into parts to extract multi-part appearance features. In addition, we present independent metric learning to combine the multi-part appearance features with spatial-temporal features: each feature is fed into distance metric learning separately, which yields several metric kernels. These kernels are fused using weights learned by the attention measure. This novel fusion scheme can achieve better functional complementarity among the features. In the experiments, we analyze the effectiveness of the major components. Extensive experiments on two public benchmark datasets, iLIDS-VID and PRID-2011, demonstrate the effectiveness of the proposed method.
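The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each independently learned metric has already produced a probe-by-gallery distance matrix, that the attention-derived weights are given, and that fusion is a normalized weighted sum. The function names, the channel names (`upper_body`, `lower_body`, `motion`), and all numbers are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def fuse_distances(distances: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Weighted sum of per-feature distance matrices (probe x gallery).

    Assumption: one matrix per feature channel, each produced by its own
    separately learned metric; weights are normalized to sum to 1.
    """
    if len(distances) != len(weights):
        raise ValueError("need exactly one weight per distance matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normalized = [w / total for w in weights]

    rows, cols = len(distances[0]), len(distances[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for matrix, weight in zip(distances, normalized):
        for i in range(rows):
            for j in range(cols):
                fused[i][j] += weight * matrix[i][j]
    return fused


def rank_gallery(fused: Matrix) -> List[List[int]]:
    """For each probe, return gallery indices by ascending fused distance."""
    return [sorted(range(len(row)), key=row.__getitem__) for row in fused]


# Hypothetical toy data: 2 probe sequences x 3 gallery sequences,
# with two appearance channels and one spatial-temporal channel.
upper_body = [[0.2, 0.9, 0.5], [0.8, 0.3, 0.6]]
lower_body = [[0.4, 0.7, 0.3], [0.9, 0.2, 0.5]]
motion = [[0.3, 0.8, 0.9], [0.7, 0.4, 0.1]]
weights = [0.5, 0.3, 0.2]  # stand-in for attention-derived weights

fused = fuse_distances([upper_body, lower_body, motion], weights)
ranking = rank_gallery(fused)
print(ranking)
```

Keeping one metric per channel means a part that is occluded or badly aligned can be down-weighted at fusion time without retraining the other metrics, which is the kind of complementarity the abstract refers to.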

Keywords

Video-based person re-identification; Aligned multi-part image model; Independent metric learning; Pedestrian alignment

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China
