Abstract
Viewpoint-informed keypoint prediction from 2D images is an essential task in computer vision because it captures the fine details of rigid objects. However, when the convolutional neural network predicts an ambiguous viewpoint, especially one with two high-confidence peaks among the viewpoint proposals, the prediction may select an erroneous set of keypoints. To address this issue, we present multiscale convolutional neural networks and propose a filter that ensures only high-confidence viewpoint information is used, which provides a global perspective for keypoint prediction. Leveraging this global precedence, we combine a multiscale, local-appearance-based keypoint likelihood with the filtered viewpoint-conditioned likelihood, which yields a considerable performance gain. Experimentally, we show that our framework outperforms state-of-the-art methods on the PASCAL 3D benchmark.
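The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the filter or the fusion rule, so the margin-based ambiguity test, the log-likelihood sum, and every name below (`filtered_viewpoint`, `combine_likelihoods`, `margin`, `eps`) are assumptions made purely for illustration.

```python
import math


def filtered_viewpoint(view_probs, margin=0.2):
    """Return the index of the most confident viewpoint bin, or None.

    Hypothetical filter: the prediction is treated as ambiguous (None)
    when the two most confident bins are closer than `margin`.
    """
    ranked = sorted(range(len(view_probs)),
                    key=lambda i: view_probs[i], reverse=True)
    best, second = ranked[0], ranked[1]
    if view_probs[best] - view_probs[second] < margin:
        return None
    return best


def combine_likelihoods(appearance_maps, view_prior=None, eps=1e-12):
    """Sum log-likelihoods over scales, plus the viewpoint prior if given.

    appearance_maps: list of 2D grids (one per scale, all the same size).
    view_prior: 2D grid of viewpoint-conditioned likelihoods, or None
    when the viewpoint was filtered out as ambiguous.
    """
    rows, cols = len(appearance_maps[0]), len(appearance_maps[0][0])
    score = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            s = sum(math.log(m[r][c] + eps) for m in appearance_maps)
            if view_prior is not None:
                s += math.log(view_prior[r][c] + eps)
            score[r][c] = s
    return score


def argmax2d(grid):
    """Return the (row, col) position of the largest entry."""
    cells = ((r, c) for r in range(len(grid)) for c in range(len(grid[0])))
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])


# Toy data: two scales of local appearance likelihood on a 2x2 grid.
coarse = [[0.1, 0.4], [0.4, 0.1]]
fine = [[0.1, 0.45], [0.35, 0.1]]
# Toy viewpoint-conditioned likelihood favouring the lower-left cell.
prior = [[0.1, 0.1], [0.7, 0.1]]

confident = filtered_viewpoint([0.7, 0.2, 0.1])     # clear single peak
ambiguous = filtered_viewpoint([0.45, 0.4, 0.15])   # two close peaks

with_prior = argmax2d(combine_likelihoods([coarse, fine], prior))
without_prior = argmax2d(combine_likelihoods([coarse, fine]))
```

In this toy setup the appearance maps alone favour one cell, while a confident viewpoint prior shifts the decision to another; when the viewpoint is ambiguous, the sketch falls back to appearance only rather than trusting a possibly wrong viewpoint.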
Acknowledgments
This work was partly supported by the National High Technology Research and Development Program of China (863 Program) (No. 2015AA016306), the National Natural Science Foundation of China (No. 61231015), the EU FP7 QUICK project under Grant Agreement No. PIRSES-GA-2013-612652, the National Natural Science Foundation of China (No. 61502348), the Hubei Province Technological Innovation Major Project (No. 2016AAA015), and the Science and Technology Program of Shenzhen (No. JCYJ20150422150029092).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Li, Q., Hu, R., Chen, Y., Yan, J., Xiao, J. (2018). A Fine-Grained Filtered Viewpoint Informed Keypoint Prediction from 2D Images. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10736. Springer, Cham. https://doi.org/10.1007/978-3-319-77383-4_17
DOI: https://doi.org/10.1007/978-3-319-77383-4_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77382-7
Online ISBN: 978-3-319-77383-4
eBook Packages: Computer Science, Computer Science (R0)