
Deep Semantic Matching with Foreground Detection and Cycle-Consistency

  • Conference paper
Computer Vision – ACCV 2018 (ACCV 2018)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11363)


Abstract

Establishing dense semantic correspondences between object instances remains a challenging problem due to background clutter, significant scale and pose differences, and large intra-class variations. In this paper, we present an end-to-end trainable network for learning semantic correspondences using only matching image pairs, without manual keypoint correspondence annotations. To facilitate network training with this weaker form of supervision, we (1) explicitly estimate the foreground regions to suppress the effect of background clutter and (2) develop cycle-consistent losses to enforce the predicted transformations across multiple images to be geometrically plausible and consistent. We train the proposed model using the PF-PASCAL dataset and evaluate the performance on the PF-PASCAL, PF-WILLOW, and TSS datasets. Extensive experimental results show that the proposed approach achieves favorable performance compared to the state-of-the-art. The code and model will be available at https://yunchunchen.github.io/WeakMatchNet/.
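The cycle-consistency idea in the abstract can be illustrated with a minimal sketch: if the network predicts a transformation from image A to image B and another from B back to A, their composition should be close to the identity. The sketch below assumes transformations are represented as 3x3 homogeneous affine matrices; the function name and the sum-of-squares penalty are illustrative choices, not the paper's actual loss or implementation.

```python
import numpy as np

def cycle_consistency_loss(T_ab, T_ba):
    """Penalize deviation of the composed transformation T_ba ∘ T_ab
    from the identity. T_ab maps A -> B, T_ba maps B -> A; both are
    3x3 homogeneous affine matrices."""
    composed = T_ba @ T_ab
    return float(np.sum((composed - np.eye(3)) ** 2))

# A geometrically consistent forward/backward pair composes to identity,
# so the loss vanishes; an inconsistent pair is penalized.
T_ab = np.array([[1.0, 0.0,  2.0],
                 [0.0, 1.0, -1.0],
                 [0.0, 0.0,  1.0]])
T_ba = np.linalg.inv(T_ab)

print(cycle_consistency_loss(T_ab, T_ba))  # -> 0.0 (up to float precision)
print(cycle_consistency_loss(T_ab, T_ab) > 0.0)
```

In the paper's setting the transformations are predicted by the network for each image pair, and the same composition argument is extended across multiple images; the matrix form here is only the simplest case that makes the constraint concrete.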



Acknowledgement

This work is supported in part by the Ministry of Science and Technology under grants MOST 105-2221-E-001-030-MY2 and MOST 107-2628-E-001-005-MY3.

Author information

Corresponding author

Correspondence to Yun-Chun Chen.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Chen, Y.-C., Huang, P.-H., Yu, L.-Y., Huang, J.-B., Yang, M.-H., Lin, Y.-Y. (2019). Deep Semantic Matching with Foreground Detection and Cycle-Consistency. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol. 11363. Springer, Cham. https://doi.org/10.1007/978-3-030-20893-6_22

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-20893-6_22


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20892-9

  • Online ISBN: 978-3-030-20893-6

  • eBook Packages: Computer Science, Computer Science (R0)
