Abstract
In many person re-identification applications, only a small number of labeled image pairs are available for training. To address this practical issue, we propose a novel semi-supervised ranking method that uses unlabeled data to improve re-identification performance. We show that the low density separation assumption and the graph propagation assumption are not valid under some conditions in person re-identification. We therefore propose to iteratively select the most confident matched (positive) image pairs from the unlabeled data. Since positive matches are far fewer than negative ones, we increase the positive prior by selecting positive data from the top-ranked matching subset of all unlabeled data. The optimal model is learnt by solving a regression-based ranking problem. Experimental results show that our method significantly outperforms state-of-the-art distance learning algorithms on three publicly available datasets using only a few labeled matched image pairs for training.
Notes
Note that the feature used in our experiments is different from those in existing methods. It is very discriminative for the VIPeR dataset, so it can achieve 70% rank-one accuracy using 316 matched image pairs for training. Such good performance may be due to the combination of foreground detection and global feature extraction (on a large region of an image), which is very effective for the VIPeR dataset. Further investigation of this issue would be interesting, but it is not the focus of this paper.
Acknowledgement
The work is supported in part by ONR-N00014-13-1-0764, NSF-III-1360971, AFOSR-FA9550-13-1-0137, and NSF-Bigdata-1419210.
Appendix: Proof of Equation (12)
Suppose there are \(N_i^a\) images for person \(i\) under camera \(a\) and \(N_j^b\) images for person \(j\) under camera \(b\). The number of positive matches for person \(i\) across the two camera views is \(N_i^a N_i^b\). Since the total numbers of images are \(\sum _{i=1}^{J^a} N_i^a\) under camera view \(a\) and \(\sum _{j=1}^{J^b} N_j^b\) under camera view \(b\), the positive prior \(\tau \) is calculated by
\[ \tau = \frac{\sum _{i=1}^{J} N_i^a N_i^b}{\left( \sum _{i=1}^{J^a} N_i^a \right) \left( \sum _{j=1}^{J^b} N_j^b \right)} \tag{17} \]
The total number of image pairs in \(E_1\) is equal to the total number of groups \(G_{m\cdot }\) and \(G_{\cdot n}\), i.e., \(\sum _{i=1}^{J^a} N_i^a + \sum _{j=1}^{J^b} N_j^b\). Of these, \(\sum _{i=1}^{J} N_i^a\) groups \(G_{m\cdot }\) and \(\sum _{j=1}^{J} N_j^b\) groups \(G_{\cdot n}\) contain at least one positive ADV. However, the classification function \(f\) may wrongly select a negative ADV from a group \(G_{m\cdot }\) or \(G_{\cdot n}\) that contains positive ADV(s). Thus, the number of positive ADVs in \(E_1\) is \((\sum _{i=1}^{J} N_i^a + \sum _{j=1}^{J} N_j^b) c_1\), where \(c_1\) is the rank-one accuracy measuring the performance of \(f\). Then, the positive prior \(\tau _1\) in \(E_1\) is given by the following equation,
\[ \tau _1 = \frac{\left( \sum _{i=1}^{J} N_i^a + \sum _{j=1}^{J} N_j^b \right) c_1}{\sum _{i=1}^{J^a} N_i^a + \sum _{j=1}^{J^b} N_j^b} \tag{18} \]
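As an illustration of how \(E_1\) might be assembled, the sketch below takes a matrix of matching scores, keeps the top-ranked pair from every row group and every column group, and measures the fraction of selected pairs that are true matches. The function names, the toy score matrix, and the reading of \(G_{m\cdot }\) and \(G_{\cdot n}\) as the rows and columns of that matrix are assumptions made for this example, not details confirmed by the appendix.

```python
def select_top_ranked(scores):
    """Return the (m, n) pairs kept in E_1 for a score matrix.

    scores[m][n] is the score a ranking function gives to the pair
    (image m under camera a, image n under camera b); higher is better.
    Assumption: G_{m.} is row m and G_{.n} is column n of this matrix.
    """
    num_a, num_b = len(scores), len(scores[0])
    # One pair per row group: the best camera-b match for each image m.
    row_pairs = [(m, max(range(num_b), key=lambda n: scores[m][n]))
                 for m in range(num_a)]
    # One pair per column group: the best camera-a match for each image n.
    col_pairs = [(max(range(num_a), key=lambda m: scores[m][n]), n)
                 for n in range(num_b)]
    return row_pairs + col_pairs


def positive_fraction(pairs, ids_a, ids_b):
    """Fraction of pairs whose two images show the same person."""
    hits = sum(ids_a[m] == ids_b[n] for m, n in pairs)
    return hits / len(pairs)


# Hypothetical toy data: three images per view, persons 0 and 1 shared.
ids_a = [0, 1, 2]
ids_b = [0, 1, 3]
scores = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.4],
    [0.5, 0.1, 0.6],
]

all_pairs = [(m, n) for m in range(3) for n in range(3)]
selected = select_top_ranked(scores)

print(positive_fraction(all_pairs, ids_a, ids_b))  # prior over all pairs, about 0.22
print(positive_fraction(selected, ids_a, ids_b))   # prior inside E_1, about 0.67
```

The selection returns one pair per group, so its size equals the number of rows plus the number of columns, which matches the count of image pairs in \(E_1\) stated above.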
Since it is difficult to compare \(\tau \) and \(\tau _1\) by (17) and (18) directly, we approximate them by assuming \(N_i^a \approx \sum _{i'=1}^{J^a} N_{i'}^a / J^a\) and \(N_j^b \approx \sum _{j'=1}^{J^b} N_{j'}^b / J^b\). Substituting the approximations of \(N_i^a\) and \(N_j^b\) into (17) and (18), respectively, \(\tau \) and \(\tau _1\) become
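The substitution can be written out as follows. This is a reconstruction from the definitions in this appendix (\(\tau \) as positive pairs over all pairs, \(\tau _1\) as correctly selected pairs over all selected pairs), not a quotation of the published equations; the shorthand \(N^a\) and \(N^b\) for the total image counts per view is introduced here for brevity.

```latex
\begin{align*}
N^a &= \sum_{i=1}^{J^a} N_i^a, \qquad N^b = \sum_{j=1}^{J^b} N_j^b \\
\tau &\approx \frac{J \,(N^a / J^a)(N^b / J^b)}{N^a N^b} = \frac{J}{J^a J^b} \\
\tau_1 &\approx \frac{J \,(N^a / J^a + N^b / J^b)\, c_1}{N^a + N^b} \\
\frac{\tau}{\tau_1} &\approx \frac{N^a + N^b}{c_1 \,(N^a J^b + N^b J^a)}
  \le \frac{1}{c_1 \min(J^a, J^b)} = \frac{\max(1/J^a,\, 1/J^b)}{c_1}
\end{align*}
```

The last line uses \(N^a J^b + N^b J^a \ge \min (J^a, J^b)(N^a + N^b)\), so under this reading the ratio \(\tau / \tau _1\) is small whenever \(\max (1 / J^a, 1 / J^b)\) is small relative to \(c_1\).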
If \(\max (1 / J^a, 1 / J^b) \ll c_1\), multiplying both sides by \(J^a J^b\) gives \(\max (J^a, J^b) \ll J^a J^b c_1\). Thus, \(\tau \ll \tau _1\), which leads to (12).
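To make the inequality concrete, the sketch below evaluates the two priors from per-person image counts, using the ratios described in this appendix. The function name `positive_priors`, the convention that the shared persons come first in each list, and the example counts and accuracy are hypothetical choices for illustration only.

```python
def positive_priors(n_a, n_b, num_shared, c1):
    """Return (tau, tau_1) from per-person image counts.

    n_a, n_b   : images per person under camera a and camera b
    num_shared : J, persons seen in both views (assumed listed first)
    c1         : rank-one accuracy of the current ranking function
    """
    total_a, total_b = sum(n_a), sum(n_b)
    # tau: positive pairs over all cross-view pairs.
    positives = sum(n_a[i] * n_b[i] for i in range(num_shared))
    tau = positives / (total_a * total_b)
    # tau_1: correctly selected top-ranked pairs over all selected pairs.
    groups_with_positive = sum(n_a[:num_shared]) + sum(n_b[:num_shared])
    tau_1 = groups_with_positive * c1 / (total_a + total_b)
    return tau, tau_1


# Hypothetical setting: 100 persons per view, all shared, 2 images each,
# and a weak ranker with 30% rank-one accuracy.
tau, tau_1 = positive_priors([2] * 100, [2] * 100, num_shared=100, c1=0.3)
print(tau, tau_1)  # tau is about 0.01, tau_1 is about 0.3
```

In this setting \(\max (1 / J^a, 1 / J^b) = 0.01\) is much smaller than \(c_1 = 0.3\), and the prior inside the top-ranked subset is about thirty times the prior over all pairs.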
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ma, A.J., Li, P. (2015). Semi-Supervised Ranking for Re-identification with Few Labeled Image Pairs. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9006. Springer, Cham. https://doi.org/10.1007/978-3-319-16817-3_39
DOI: https://doi.org/10.1007/978-3-319-16817-3_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16816-6
Online ISBN: 978-3-319-16817-3
eBook Packages: Computer Science, Computer Science (R0)