Abstract
Homography estimation is an important way to compute the transformation between images. Embedded terminal devices in particular need a homography estimation algorithm that is both efficient and robust. In this paper, we design a compressed convolutional neural network that estimates homographies effectively. The model size of the network is less than 10 MB, which is small enough for use on mobile devices. In addition, to improve estimation accuracy in challenging environments, we present a novel loss function for training the network. Finally, we compare our algorithm with traditional methods and other learning-based methods. Experiments demonstrate that the compressed network achieves better accuracy than other learning-based algorithms and is more robust to illumination changes than traditional algorithms.
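For readers unfamiliar with the quantity being estimated: a homography is a 3x3 matrix, defined up to scale, that maps points between two views of a plane in homogeneous coordinates. The sketch below is not the network proposed in this paper; it only illustrates how an estimated homography would be applied to image points. The function name `apply_homography`, the example matrix, and the 128-pixel patch corners are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]
Point = Tuple[float, float]


def apply_homography(H: Matrix3, points: Sequence[Point]) -> List[Point]:
    """Map 2D points through a 3x3 homography using homogeneous coordinates."""
    mapped = []
    for x, y in points:
        # Lift (x, y) to (x, y, 1) and multiply by H.
        xh = H[0][0] * x + H[0][1] * y + H[0][2]
        yh = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if abs(w) < 1e-12:
            raise ValueError("point maps to infinity under this homography")
        # Divide by the third coordinate to return to pixel coordinates.
        mapped.append((xh / w, yh / w))
    return mapped


# Hypothetical example: a pure translation by (5, -2) written as a homography.
H_translate = [
    [1.0, 0.0, 5.0],
    [0.0, 1.0, -2.0],
    [0.0, 0.0, 1.0],
]

# Corners of an assumed 128x128 image patch.
corners = [(0.0, 0.0), (128.0, 0.0), (128.0, 128.0), (0.0, 128.0)]
warped = apply_homography(H_translate, corners)
```

Because the result is divided by the third homogeneous coordinate, multiplying every entry of the matrix by the same nonzero constant leaves the mapping unchanged, which is why a homography has eight degrees of freedom rather than nine.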
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under Grant 61571285, and by the Shanghai Science and Technology Commission under Grants 17DZ2292400 and 18XD1423900.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, G., You, Z., An, P., Yu, J., Chen, Y. (2019). Efficient and Robust Homography Estimation Using Compressed Convolutional Neural Network. In: Zhai, G., Zhou, J., An, P., Yang, X. (eds) Digital TV and Multimedia Communication. IFTC 2018. Communications in Computer and Information Science, vol 1009. Springer, Singapore. https://doi.org/10.1007/978-981-13-8138-6_13
DOI: https://doi.org/10.1007/978-981-13-8138-6_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8137-9
Online ISBN: 978-981-13-8138-6
eBook Packages: Computer Science (R0)