Deep Homography Estimation with Pairwise Invertibility Constraint

  • Conference paper
Structural, Syntactic, and Statistical Pattern Recognition (S+SSPR 2018)

Abstract

Recent work has shown that deep learning can improve homography estimation, owing to the better features extracted by convolutional networks. However, these methods are supervised and rely heavily on labeled training data: they aim to make the estimated homography as close to the ground truth as possible, which may cause overfitting. In this paper, we propose a Siamese network with a pairwise invertibility constraint for supervised homography estimation. We use spatial pyramid pooling modules to improve the quality of the features extracted from each image by exploiting context information. Observing that the two homographies relating a given image pair, one in each direction, are inverse matrices of each other, we propose the invertibility constraint to avoid overfitting. To apply the constraint, we adopt the matrix representation of the homography rather than the 4-point parameterization commonly used by other methods. Experiments on a synthetic dataset generated from the MSCOCO dataset show that the proposed method outperforms several state-of-the-art approaches.
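The invertibility idea in the abstract can be illustrated with a small sketch. The abstract does not give the loss, so the form below is an assumption: it measures how far the product of the two estimated 3x3 homographies is from the identity, after removing the arbitrary scale. The names `matmul3` and `invertibility_residual` are hypothetical and are not taken from the paper.

```python
# Sketch only: function names and the exact residual form are assumptions,
# not the authors' implementation.

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def invertibility_residual(h_ab, h_ba):
    """Squared Frobenius distance between H_ab * H_ba and the identity.

    A homography is defined only up to scale, so the product is divided by
    its bottom-right entry (assumed non-zero) before the comparison.
    """
    p = matmul3(h_ab, h_ba)
    scale = p[2][2]
    return sum((p[i][j] / scale - (1.0 if i == j else 0.0)) ** 2
               for i in range(3) for j in range(3))


# A pure translation and its inverse satisfy the constraint exactly.
h_ab = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
h_ba = [[1.0, 0.0, -2.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

consistent = invertibility_residual(h_ab, h_ba)        # pair is mutually inverse
inconsistent = invertibility_residual(h_ab, identity)  # pair is not inverse
```

A residual of this kind can be formed directly only when the network outputs a full matrix, which is presumably why the matrix representation is preferred here over the 4-point parameterization.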

Acknowledgement

This work was supported by the National Natural Science Foundation of China (project no. 61772057), in part by the Beijing Natural Science Foundation (project no. 4162037), and by funding from the State Key Lab. of Software Development Environment.

Author information

Corresponding author

Correspondence to Xiao Bai.

Copyright information

© 2018 Springer Nature Switzerland AG

Cite this paper

Wang, X., Wang, C., Bai, X., Liu, Y., Zhou, J. (2018). Deep Homography Estimation with Pairwise Invertibility Constraint. In: Bai, X., Hancock, E., Ho, T., Wilson, R., Biggio, B., Robles-Kelly, A. (eds) Structural, Syntactic, and Statistical Pattern Recognition. S+SSPR 2018. Lecture Notes in Computer Science(), vol 11004. Springer, Cham. https://doi.org/10.1007/978-3-319-97785-0_20

  • DOI: https://doi.org/10.1007/978-3-319-97785-0_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-97784-3

  • Online ISBN: 978-3-319-97785-0

  • eBook Packages: Computer Science (R0)
