Abstract
Planar homography estimation refers to the problem of computing a bijective linear mapping of pixels between two images. While this problem has been studied with convolutional neural networks (CNNs), existing methods simply regress the location of the four corners using a dense layer preceded by a fully-connected layer. This vector representation damages the spatial structure of the corners since they have a clear spatial order. Moreover, four points are the minimum required to compute the homography, and so such an approach is susceptible to perturbation. In this paper, we propose a conceptually simple, reliable, and general framework for homography estimation. In contrast to previous works, we formulate this problem as a perspective field (PF), which models the essence of the homography - pixel-to-pixel bijection. The PF is naturally learned by the proposed fully convolutional residual network, PFNet, to keep the spatial order of each pixel. Moreover, since every pixels’ displacement can be obtained from the PF, it enables robust homography estimation by utilizing dense correspondences. Our experiments demonstrate the proposed method outperforms traditional correspondence-based approaches and state-of-the-art CNN approaches in terms of accuracy while also having a smaller network size. In addition, the new parameterization of this task is general and can be implemented by any fully convolutional network (FCN) architecture.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
ImgAug toolbox: https://github.com/aleju/imgaug.
References
Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: OSDI, vol. 16, pp. 265–283 (2016)
Bay, H., Tuytelaars, T., Van Gool, L.: SURF: speeded up robust features. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3951, pp. 404–417. Springer, Heidelberg (2006). https://doi.org/10.1007/11744023_32
Chollet, F., et al.: Keras (2015)
Chum, O., Matas, J.: Planar affine rectification from change of scale. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010. LNCS, vol. 6495, pp. 347–360. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19282-1_28
DeTone, D., Malisiewicz, T., Rabinovich, A.: Deep image homography estimation. arXiv preprint arXiv:1606.03798 (2016)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
Gong, M., Zhao, S., Jiao, L., Tian, D., Wang, S.: A novel coarse-to-fine scheme for automatic image registration based on sift and mutual information. IEEE Trans. Geosci. Remote Sens. 52(7), 4328–4338 (2014). https://doi.org/10.1109/TGRS.2013.2281391
Ha, H., Perdoch, M., Alismail, H., Kweon, I.S., Sheikh, Y.: Deltille grids for geometric camera calibration. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 5344–5352 (2017)
Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge (2003)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (2017)
Japkowicz, N., Nowruzi, F.E., Laganiere, R.: Homography estimation from image pairs with hierarchical convolutional networks. In: 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), pp. 904–911, October 2017. https://doi.org/10.1109/ICCVW.2017.111
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Laina, I., Rupprecht, C., Belagiannis, V., Tombari, F., Navab, N.: Deeper depth prediction with fully convolutional residual networks. In: Proceedings of IEEE International Conference on 3D Vision, pp. 239–248. IEEE (2016)
Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
Ma, J., Zhou, H., Zhao, J., Gao, Y., Jiang, J., Tian, J.: Robust feature matching for remote sensing image registration via locally linear transforming. IEEE Trans. Geosci. Remote Sens. 53(12), 6469–6481 (2015). https://doi.org/10.1109/TGRS.2015.2441954
Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27(10), 1615–1630 (2005)
Monasse, P., Morel, J.M., Tang, Z.: Three-step image rectification. In: Proceedings of British Machine Vision Conference, p. 89-1. BMVA Press (2010)
Nguyen, T., Chen, S.W., Shivakumar, S.S., Taylor, C.J., Kumar, V.: Unsupervised deep homography: a fast and robust homography estimation model. IEEE Rob. Autom. Lett. 3(3), 2346–2353 (2018). https://doi.org/10.1109/LRA.2018.2809549
Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to sift or surf. In: Proceedings of International Conference on Computer Vision, pp. 2564–2571. IEEE (2011)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Staranowicz, A.N., Brown, G.R., Morbidi, F., Mariottini, G.L.: Practical and accurate calibration of RGB-D cameras using spheres. Comput. Vis. Image Underst. 137, 102–114 (2015). https://doi.org/10.1016/j.cviu.2015.03.013. http://www.sciencedirect.com/science/article/pii/S1077314215000703
Sutskever, I., Martens, J., Dahl, G., Hinton, G.: On the importance of initialization and momentum in deep learning. In: Proceedings of International conference on Machine Learning, pp. 1139–1147 (2013)
Zhang, H., Patel, V.M.: Densely connected pyramid dehazing network. arXiv preprint arXiv:1803.08396 (2018)
Zhu, Y., Newsam, S.: DenseNet for dense flow. arXiv preprint arXiv:1707.06316 (2017)
Zitova, B., Flusser, J.: Image registration methods: a survey. Image Vis. Comput. 21(11), 977–1000 (2003)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
1 Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zeng, R., Denman, S., Sridharan, S., Fookes, C. (2019). Rethinking Planar Homography Estimation Using Perspective Fields. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11366. Springer, Cham. https://doi.org/10.1007/978-3-030-20876-9_36
Download citation
DOI: https://doi.org/10.1007/978-3-030-20876-9_36
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20875-2
Online ISBN: 978-3-030-20876-9
eBook Packages: Computer ScienceComputer Science (R0)