
Deep feature representation and multiple metric ensembles for person re-identification in security surveillance system

Multimedia Tools and Applications

Abstract

With growing concern about public security and the development of large-scale data storage technology, person re-identification in security surveillance systems has become a hot topic. Large variations in viewpoint and lighting across camera views can substantially change a person's appearance, which keeps person re-identification a challenging problem. Developing robust feature descriptors and designing discriminative distance metrics to measure the similarity between pedestrian images are therefore two key aspects of person re-identification. In this paper, we propose a method that uses both deep learning and multiple metric ensembles to improve re-identification performance. First, we jointly use several datasets to train a general Convolutional Neural Network (CNN), which is then employed to extract deep features from the training and testing sets. The deep architecture makes it possible to learn more abstract and internal features that are robust to variations in viewpoint and lighting. Then we use the deep features of the training set to learn a dataset-specific distance metric and combine it with the Cosine distance metric; the resulting multiple metric ensemble measures the similarity between images in a more comprehensive way. Finally, extensive experiments demonstrate that our method effectively improves recognition performance compared with state-of-the-art methods.
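To make the idea of a metric ensemble concrete, the following is a minimal sketch rather than the method of the paper. The abstract does not say how the learned metric and the Cosine distance are combined, so the sketch assumes a simple weighted sum. It also assumes the learned metric takes a Mahalanobis form with a matrix `M`, and that deep features are already available as plain lists of floats. All function names and the `weight` parameter are hypothetical.

```python
import math

def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def mahalanobis_distance(a, b, M):
    """Return the squared distance (a - b)^T M (a - b).

    M stands in for a learned metric matrix (assumed positive
    semi-definite); how it is learned is outside this sketch.
    """
    d = [x - y for x, y in zip(a, b)]
    Md = [sum(m_ij * d_j for m_ij, d_j in zip(row, d)) for row in M]
    return sum(d_i * md_i for d_i, md_i in zip(d, Md))

def ensemble_distance(a, b, M, weight=0.5):
    """Weighted sum of the two distances (an assumed combination rule)."""
    learned = mahalanobis_distance(a, b, M)
    cosine = cosine_distance(a, b)
    return weight * learned + (1.0 - weight) * cosine

def rank_gallery(probe, gallery, M, weight=0.5):
    """Return gallery indices ordered from most to least similar."""
    scores = [ensemble_distance(probe, g, M, weight) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i])

if __name__ == "__main__":
    # Toy 2-D "features"; real deep features would be far longer.
    probe = [1.0, 0.0]
    gallery = [[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned M
    print(rank_gallery(probe, gallery, identity))
```

With the identity matrix standing in for `M`, the learned term reduces to squared Euclidean distance, so the second gallery item, which is closest to the probe in both direction and position, is ranked first. In practice the two distances have different scales, so some normalization before weighting would likely be needed.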

Author information

Corresponding author

Correspondence to Jingxian Han.

About this article

Cite this article

Qi, M., Han, J., Jiang, J. et al. Deep feature representation and multiple metric ensembles for person re-identification in security surveillance system. Multimed Tools Appl 78, 27029–27043 (2019). https://doi.org/10.1007/s11042-017-4649-2

  • DOI: https://doi.org/10.1007/s11042-017-4649-2
