UAV First View Landmark Localization via Deep Reinforcement Learning

  • Conference paper
Structural, Syntactic, and Statistical Pattern Recognition (S+SSPR 2018)

Abstract

Autonomous landing of Unmanned Aerial Vehicles (UAVs) has been an active research topic in recent years, and computer vision algorithms perform well at locating the landing landmark. Within computer vision, deep learning methods are widely employed for object detection and localization. However, these methods rely heavily on the size and quality of the training datasets. In this paper, we propose the Landmark-Localization Network (LLNet), which addresses UAV landmark localization with a deep reinforcement learning strategy that requires only small training datasets. The LLNet learns to move the bounding box to the correct position through a sequence of actions. To train a robust landmark localization model, we combine a policy gradient method from deep reinforcement learning with supervised learning in the training stage. The experimental results show that the LLNet is able to locate the landmark precisely.
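
The idea of moving a bounding box to the landmark through a sequence of actions can be illustrated with a minimal sketch. It is not the LLNet itself: the action set, the 10% step size, and the greedy IoU-based chooser are all assumptions made for illustration, since the abstract does not specify them. The greedy chooser stands in for a learned policy and resembles the kind of per-step label a supervised training stage might use.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h): top-left corner plus size

# Hypothetical discrete action set; the abstract does not list the real one.
ACTIONS = ("left", "right", "up", "down", "grow", "shrink", "stop")


def apply_action(box: Box, action: str, step: float = 0.1) -> Box:
    """Shift or rescale the box by a fraction `step` of its current size."""
    x, y, w, h = box
    dx, dy = step * w, step * h
    if action == "left":
        x -= dx
    elif action == "right":
        x += dx
    elif action == "up":
        y -= dy
    elif action == "down":
        y += dy
    elif action == "grow":      # enlarge around the box center
        x, y, w, h = x - dx / 2, y - dy / 2, w + dx, h + dy
    elif action == "shrink":    # reduce around the box center
        x, y, w, h = x + dx / 2, y + dy / 2, w - dx, h - dy
    return (x, y, w, h)         # "stop" leaves the box unchanged


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def greedy_localize(box: Box, target: Box, max_steps: int = 50) -> Box:
    """Repeatedly take the action that most increases IoU with the target.

    Assumed stand-in for a trained policy: it needs the ground-truth box,
    so it can only serve as a training-time oracle, not as a localizer.
    """
    for _ in range(max_steps):
        moves = ACTIONS[:-1]  # every action except "stop"
        best = max(moves, key=lambda a: iou(apply_action(box, a), target))
        candidate = apply_action(box, best)
        if iou(candidate, target) <= iou(box, target):
            break             # no move helps, which amounts to "stop"
        box = candidate
    return box


start, landmark = (0.0, 0.0, 10.0, 10.0), (5.0, 5.0, 10.0, 10.0)
final = greedy_localize(start, landmark)
print(f"IoU before: {iou(start, landmark):.3f}, after: {iou(final, landmark):.3f}")
```

In a reinforcement learning setting, the change in IoU after each action is a natural candidate for the reward signal, though the reward actually used by the authors is not stated in the abstract.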

Corresponding author

Correspondence to Peng Ren.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, X., Ren, P., Yu, L., Han, L., Deng, X. (2018). UAV First View Landmark Localization via Deep Reinforcement Learning. In: Bai, X., Hancock, E., Ho, T., Wilson, R., Biggio, B., Robles-Kelly, A. (eds) Structural, Syntactic, and Statistical Pattern Recognition. S+SSPR 2018. Lecture Notes in Computer Science, vol 11004. Springer, Cham. https://doi.org/10.1007/978-3-319-97785-0_8

  • DOI: https://doi.org/10.1007/978-3-319-97785-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-97784-3

  • Online ISBN: 978-3-319-97785-0

  • eBook Packages: Computer Science, Computer Science (R0)
