UAV First View Landmark Localization via Deep Reinforcement Learning

Wang, Xinran; Ren, Peng; Yu, Leijian; Han, Lirong; Deng, Xiaogang

doi:10.1007/978-3-319-97785-0_8

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11004)

Included in the following conference series:

Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)

Abstract

In recent years, autonomous landing of Unmanned Aerial Vehicles (UAVs) has become an active research topic. Computer vision algorithms perform well at UAV landmark localization, and within computer vision, deep learning methods are widely employed for object detection and localization. However, these methods rely heavily on the size and quality of the training datasets. In this paper, we propose the Landmark-Localization Network (LLNet), which addresses the UAV landmark localization problem with a deep reinforcement learning strategy that needs only small training datasets. The LLNet learns how to transform the bounding box into the correct position through a sequence of actions. To train a robust landmark localization model, we combine the policy gradient method from deep reinforcement learning with supervised learning in the training stage. The experimental results show that the LLNet is able to locate the landmark precisely.
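
The idea of moving a bounding box to its target through a sequence of actions can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not list the action set, the step size, or the network, so the names `ACTIONS`, `apply_action`, `greedy_policy`, and `localize` are hypothetical, and the learned LLNet policy is replaced by a greedy stand-in that peeks at the ground-truth box.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h): top-left corner plus size

# Hypothetical discrete action set; the abstract does not specify the real one.
ACTIONS = ("left", "right", "up", "down", "bigger", "smaller", "stop")


def apply_action(box: Box, action: str, step: float = 0.1) -> Box:
    """Shift or rescale the box by a fraction `step` of its current size."""
    x, y, w, h = box
    dx, dy = step * w, step * h
    if action == "left":
        x -= dx
    elif action == "right":
        x += dx
    elif action == "up":
        y -= dy
    elif action == "down":
        y += dy
    elif action == "bigger":  # grow around the box center
        x, y, w, h = x - dx / 2, y - dy / 2, w + dx, h + dy
    elif action == "smaller":  # shrink around the box center
        x, y, w, h = x + dx / 2, y + dy / 2, w - dx, h - dy
    return (x, y, w, h)  # "stop" leaves the box unchanged


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def greedy_policy(box: Box, target: Box) -> str:
    """Stand-in for a learned policy: choose the move that most improves IoU."""
    best = max(ACTIONS[:-1], key=lambda a: iou(apply_action(box, a), target))
    if iou(apply_action(box, best), target) <= iou(box, target):
        return "stop"  # no move helps any more
    return best


def localize(box: Box, target: Box, max_steps: int = 50) -> Tuple[Box, List[str]]:
    """Apply actions one at a time until "stop" or the step budget runs out."""
    trajectory: List[str] = []
    for _ in range(max_steps):
        action = greedy_policy(box, target)
        trajectory.append(action)
        if action == "stop":
            break
        box = apply_action(box, action)
    return box, trajectory
```

In a trained system the policy would choose actions from image features rather than from the ground-truth box; the greedy rule here only makes the sequential structure of the localization visible.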

Author information

Authors and Affiliations

College of Information and Control Engineering, China University of Petroleum (East China), Qingdao, 266580, China
Xinran Wang, Peng Ren, Leijian Yu, Lirong Han & Xiaogang Deng

Corresponding author

Correspondence to Peng Ren.

Editor information

Editors and Affiliations

Beihang University, Beijing, China
Xiao Bai
University of York, York, United Kingdom
Edwin R. Hancock
IBM Research – Thomas J. Watson Research, Yorktown Heights, New York, USA
Tin Kam Ho
University of York, Heslington, York, United Kingdom
Richard C. Wilson
University of Cagliari, Cagliari, Italy
Battista Biggio
Data61, CSIRO, Canberra, Australian Capital Territory, Australia
Antonio Robles-Kelly

About this paper

Cite this paper

Wang, X., Ren, P., Yu, L., Han, L., Deng, X. (2018). UAV First View Landmark Localization via Deep Reinforcement Learning. In: Bai, X., Hancock, E., Ho, T., Wilson, R., Biggio, B., Robles-Kelly, A. (eds) Structural, Syntactic, and Statistical Pattern Recognition. S+SSPR 2018. Lecture Notes in Computer Science, vol 11004. Springer, Cham. https://doi.org/10.1007/978-3-319-97785-0_8

DOI: https://doi.org/10.1007/978-3-319-97785-0_8
Published: 02 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97784-3
Online ISBN: 978-3-319-97785-0
eBook Packages: Computer Science, Computer Science (R0)