Two Stream Deep CNN-RNN Attentive Pooling Architecture for Video-Based Person Re-identification
Person re-identification (re-ID) is the task of associating images of a person captured by different cameras with non-overlapping fields of view. A fundamental yet open problem in re-ID is the extraction of powerful features from low-resolution surveillance videos. To address this, a novel two-stream convolutional-recurrent model with an attentive pooling mechanism is presented for person re-ID in videos. Each stream of the model is a Siamese network that aims to extract and match the most discriminative feature maps. Attentive pooling is used to select the most informative video frames. The outputs of the two streams are fused into one combined feature map, which helps to deal with major challenges of re-ID such as pose and illumination variation, cluttered backgrounds, and occlusion. The proposed technique is evaluated on three challenging datasets: MARS, PRID-2011, and iLIDS-VID. Experimental evaluation shows that the proposed technique performs better than existing state-of-the-art supervised video-based person re-ID models. The implementation is available at https://github.com/re-identification/Person_RE-ID.git.
Keywords: Person re-identification, Spatial stream, Temporal stream
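The attentive pooling and stream fusion steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how frame scores are computed or how the streams are fused, so the dot-product scoring against a `query` vector, the softmax weighting, and fusion by concatenation are all assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(frames: Sequence[Vector], query: Vector) -> Vector:
    """Pool per-frame features into one vector.

    Assumption: each frame is scored by its dot product with a learned
    `query` vector, and the softmax of those scores weights the frames,
    so more informative frames contribute more to the pooled feature.
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]


def fuse_streams(spatial: Vector, temporal: Vector) -> Vector:
    """Fuse the two stream features by concatenation (an assumption)."""
    return list(spatial) + list(temporal)


# Toy usage: two frames per stream, two-dimensional features.
spatial_feature = attentive_pool([[1.0, 0.0], [3.0, 0.0]], query=[0.0, 1.0])
temporal_feature = attentive_pool([[0.5, 0.5], [0.5, 0.5]], query=[1.0, 1.0])
combined = fuse_streams(spatial_feature, temporal_feature)
```

In a Siamese setting, the `combined` vectors of two video sequences would then be compared with a distance function; a small distance suggests the same identity.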
This research was supported by the development project of leading technology for future vehicle of the business of Daegu metropolitan city (No. 20180910). We also thank NVIDIA Corporation for donating the TitanX GPU used in this research.