
Multimedia Tools and Applications, Volume 78, Issue 22, pp 31605–31616

FMT: fusing multi-task convolutional neural network for person search

  • Sulan Zhai
  • Shunqiang Liu
  • Xiao Wang
  • Jin Tang
Article

Abstract

Person search aims to detect all persons in an image and to identify the query persons among the detections, without relying on given proposals or bounding boxes, which distinguishes it from person re-identification. In this paper, we propose a fusing multi-task convolutional neural network (FMT-CNN) that handles the correlation and heterogeneity of detection and re-identification within a single convolutional neural network. We focus on how the interplay between person detection and person re-identification affects overall performance. We employ person labels in the region proposal network to produce features for both the person re-identification network and the person detection network, which improves the accuracy of detection and re-identification simultaneously. We also use a multiple loss to train our re-identification network. Experimental results on the CUHK-SYSU Person Search dataset show that the proposed method outperforms state-of-the-art approaches in both mAP and top-1 accuracy.
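For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mAP and top-1 accuracy are commonly computed from ranked retrieval results. It is an illustration only, not the authors' evaluation code: the function names and the input format (one list of booleans per query, ordered by descending similarity) are assumptions, and the sketch assumes every true match of a query appears in its ranked list, whereas person search benchmarks typically also account for matches missed by the detector.

```python
from typing import List, Sequence, Tuple


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """Average precision for one query.

    ranked_matches[i] is True when the i-th ranked gallery detection
    shows the query identity (hypothetical input format).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            # Precision measured at the rank of each correct match.
            precision_sum += hits / rank
    # Assumption: all true matches are present in the ranked list.
    return precision_sum / hits if hits else 0.0


def evaluate(all_ranked_matches: List[Sequence[bool]]) -> Tuple[float, float]:
    """Return (mAP, top-1 accuracy) averaged over all queries."""
    if not all_ranked_matches:
        raise ValueError("at least one query is required")
    n = len(all_ranked_matches)
    mean_ap = sum(average_precision(r) for r in all_ranked_matches) / n
    # Top-1 counts a query as correct when its best-ranked result matches.
    top1 = sum(1.0 for r in all_ranked_matches if len(r) > 0 and r[0]) / n
    return mean_ap, top1


if __name__ == "__main__":
    # Two toy queries: the first is matched at ranks 1 and 3, the second at rank 2.
    queries = [[True, False, True], [False, True]]
    mean_ap, top1 = evaluate(queries)
    print(f"mAP = {mean_ap:.4f}, top-1 = {top1:.4f}")
```

In this toy input, the first query contributes to top-1 because its best-ranked result is correct, while the second does not; mAP additionally rewards correct matches that appear lower in the ranking.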

Keywords

Person search · Heterogeneous task · Multiple loss · Region proposal network · Person labels

Notes

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61872005, in part by the Natural Science Research Project of Anhui Universities of China under Grants KJ2019A0005 and KJ2019A0032, and in part by the open project of Anhui University (KF2019A03).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Mathematical Sciences, Anhui University, Hefei, China
  2. School of Computer Science and Technology, Anhui University, Hefei, China
