Abstract
The effectiveness of appearance-based person models relies strongly on a sufficiently large number of high-quality training samples. Generating training data in the form of bounding boxes is already a time-consuming task. If more complex person models are used, such as part-based models or models suitable for human pose estimation, the labeling process becomes infeasible. In the context of pose estimation, motion capture is often used to generate ground truth data. A major problem with this approach is that motion capture is usually done in artificial environments with only a few persons. It is therefore difficult to generate classifiers that are able to localize anatomical landmarks on a moving person. To solve this problem, we propose a semi-automatic workflow for generating annotations of anatomical landmarks, based on tracking and automatic scale selection.
The contribution of the paper is twofold. First, different tracking methods are evaluated with respect to their ability to follow anatomical structures on a moving person. Second, to determine the spatial extent of anatomical landmarks, several simple but effective scale selection methods are proposed. The resulting person models are intended to provide a suitable basis for learning regression models for monocular pose estimation, as well as for training part-based models directly. We present the results of a comprehensive quantitative evaluation on the UMPM dataset and show qualitative results on two challenging YouTube sequences.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Krah, S.B., Brauer, J., Hübner, W., Arens, M. (2014). Supporting Annotation of Anatomical Landmarks Using Automatic Scale Selection. In: Perales, F.J., Santos-Victor, J. (eds) Articulated Motion and Deformable Objects. AMDO 2014. Lecture Notes in Computer Science, vol 8563. Springer, Cham. https://doi.org/10.1007/978-3-319-08849-5_7
DOI: https://doi.org/10.1007/978-3-319-08849-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08848-8
Online ISBN: 978-3-319-08849-5
eBook Packages: Computer Science, Computer Science (R0)