Abstract
As a fundamental technique underlying several vision tasks such as image parsing, action recognition and clothing retrieval, human pose estimation (HPE) has been extensively investigated in recent years. It is well recognized that clothing attributes are useful for accurate and reliable estimation of the human pose and should be exploited properly. Most previous approaches, however, require manual annotation of the clothing attributes and are therefore very costly. In this paper, we propose and explore a latent clothing attribute approach for HPE. Unlike previous approaches, our approach models the clothing attributes as latent variables and thus requires no explicit labeling of the clothing attributes. Inference of the latent variables is accomplished within the framework of latent structured support vector machines (LSSVM). We employ an alternating direction strategy to train the LSSVM model: in each iteration, one kind of variable (e.g., human pose or clothing attribute) is fixed and the others are optimized. Our extensive experiments on two real-world benchmarks show that the proposed approach achieves state-of-the-art performance.
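The alternating scheme mentioned in the abstract (fix one kind of variable, optimize the other, repeat) can be illustrated with a toy sketch. This is not the authors' LSSVM training procedure: the function names `joint_score` and `alternate`, the score tables, and the additive form of the score are all assumptions made for illustration, and the loop only shows the fix-one, optimize-the-other pattern on two small discrete variables.

```python
# Toy illustration of alternating optimization over two kinds of variables.
# All names and numbers here are hypothetical, not taken from the paper.

def joint_score(pose, attr, pose_scores, attr_scores, compat):
    """Assumed additive score: two unary terms plus a pose-attribute compatibility term."""
    return pose_scores[pose] + attr_scores[attr] + compat[pose][attr]


def alternate(pose_scores, attr_scores, compat, max_iters=20):
    """Alternately maximize the joint score over pose and over attribute."""
    pose, attr = 0, 0  # arbitrary initialization
    for _ in range(max_iters):
        # Step 1: keep the attribute fixed, pick the best pose.
        new_pose = max(
            range(len(pose_scores)),
            key=lambda p: joint_score(p, attr, pose_scores, attr_scores, compat),
        )
        # Step 2: keep the pose fixed, pick the best attribute.
        new_attr = max(
            range(len(attr_scores)),
            key=lambda a: joint_score(new_pose, a, pose_scores, attr_scores, compat),
        )
        # Stop once a full round changes nothing.
        if (new_pose, new_attr) == (pose, attr):
            break
        pose, attr = new_pose, new_attr
    return pose, attr


# Hypothetical scores: two candidate poses, two candidate clothing attributes.
pose_scores = [0.0, 1.0]
attr_scores = [0.0, 0.5]
compat = [[0.0, 0.0], [0.0, 2.0]]

best_pose, best_attr = alternate(pose_scores, attr_scores, compat)
```

Each step can only keep or raise the joint score, so the loop terminates, but in general it reaches a local optimum that depends on the initialization.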
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, W., Shen, J., Liu, G., Yu, Y. (2015). A Latent Clothing Attribute Approach for Human Pose Estimation. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_10
DOI: https://doi.org/10.1007/978-3-319-16865-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer Science (R0)