Onboard CNN-Based Processing for Target Detection and Autonomous Landing for MAVs
In this work, we address the problem of target detection involved in an autonomous landing task for a Micro Aerial Vehicle (MAV). The challenge is to detect a flag located somewhere in the environment. The flag is mounted on a pole, and a landing platform is located to its right. Thus, the MAV has to detect the flag, fly towards it and, once it is close enough, locate the landing platform nearby, aiming to centre over it to perform the landing; all of this has to be carried out autonomously. In this context, the main problem is the detection of both the flag and the landing platform, whose shapes are known in advance. Traditional computer vision algorithms could be used; however, the main challenges in this task are the changes in illumination, rotation and scale, and the fact that the flight controller uses the detection to perform the autonomous flight; hence, the detection has to be stable and continuous on every camera frame. Motivated by this, we propose to use a Convolutional Neural Network optimised to run on a small computer with a limited processing budget. The MAV carries this computer, which is used to process everything on board. To validate our system, we tested it with rotated images, changes in scale and low illumination. Our method is compared against two conventional computer vision methods, namely template matching and feature matching. In addition, we tested our system's performance in a wide corridor, executing everything on board the MAV. We achieved successful detection with a confidence metric of 0.9386 for the flag and 0.9826 for the landing platform. In total, all the onboard computations ran at an average of 13.01 fps.
Keywords: CNN, SSD, Target detection, Autonomous landing
Micro Aerial Vehicles (MAVs) have become popular in the research community owing to their easy control and manipulation, and to the use of GPS devices and RGB cameras for solving multiple problems such as inspection, detection, surveillance, rescue and localisation in indoor and outdoor environments. These tasks have been carried out with vision methods such as optical flow, segmentation, edge detection, morphological operations, feature extraction, feature matching and template matching. Besides, some approaches combine two or more techniques to achieve suitable detection while a MAV performs an autonomous flight in an unknown environment. Also, combining different types of cameras, such as depth, thermal and stereo cameras, makes it possible to capture other types of information useful for detection. Nevertheless, using this information can be too computationally expensive to perform detection on board the MAV in real time, which affects speed performance. Likewise, detection can be strongly affected by changes in illumination and environment, including oblique views, scale and rotation, or by partial occlusion of the object.
In response, several events around the world have organised robotics competitions focused on the use of MAVs to solve tasks in real time. The International Micro Air Vehicle Conference and Competition (IMAV) is an event focused on aerial robotics, comprising a conference and a competition in outdoor and indoor environments. The event promotes the development of new systems and methods to solve problems such as detection, control, pose estimation and autonomous navigation.
Deep learning has become a useful tool for classification, segmentation and detection without having to explicitly design the detector, descriptor and matcher components typical of traditional computer vision techniques. Convolutional Neural Networks (CNNs) are trained on a dataset to learn features that allow multiple objects to be recognised in a single pass, regardless of viewpoint, occlusion and changes in illumination. YOLO, FRCN and the Single Shot Detector (SSD) are CNNs that detect classes of objects in an image, learning their features without much computational cost.
Therefore, motivated by the effectiveness of deep learning for detection tasks, in this work we present a detection system to solve one of the missions included in the indoor competition of IMAV 2019. This mission consisted of detecting a given flag, which indicates the position of a landing platform. The goal is for a MAV to navigate autonomously, detecting the flag to fly towards its location and then identifying the landing platform. Once the landing platform is detected, the MAV has to maintain the detection while flying autonomously to centre itself with respect to the platform, seeking to secure an autonomous landing on the platform (see Fig. 1).
Our detection system is based on the Single Shot Detector architecture with seven convolutional layers (SSD7). We manually generated a training dataset of the flag and the platform covering several views, environments and changes in illumination to obtain improved results before carrying out the autonomous landing. The SSD network was chosen for its fast performance on microcomputer boards with low processing power and no GPU. In our tests, we observed that detection can be performed at an average processing speed of 15 fps; this includes the controller responsible for the autonomous flight and landing routines.
The rest of this paper is organised as follows. Section 2 reviews related work on object detection and autonomous landing using deep learning and vision methods. Section 3 describes the dataset generation, the hardware used for training and experiments, and our approach for detection. Section 4 presents the experimental design and the comparison of our approach with other methods for flag and platform detection. In Sect. 5, we present the results obtained running on board the MAV. Finally, conclusions and future work are outlined in Sect. 6.
2 Related Work
Object detection is a problem that has been addressed for a long time in image processing, pattern recognition and robotics using multiple recognition techniques. In aerial robotics, recent works have sought new techniques for target detection using sensors or vision during autonomous flight. However, since MAVs carry onboard cameras, vision methods have been used to perform detection, search and tracking tasks, with visual descriptors being the most widely used owing to their fast application. For instance, one work detects regions of interest on the runway for fixed-wing UAVs by applying sparse coding spatial pyramid matching (ScSPM), while others create a keypoint database for feature matching or improve a descriptor using CamShift based on colour information. Others prefer to use patterns or marks to detect a landing platform [2, 3], or template-based matching in an image pyramid scheme for target detection at multiple scales. Likewise, methods based on RANSAC allow the search and detection of landing sites with multi-scale features, using 3D maps for pose estimation of landing sites [21, 22].
On the one hand, machine learning and Artificial Neural Networks have been leveraged to detect and recognise landing targets using several methods in combination. For instance, nearest neighbour search has been combined with CNN layers for effective recognition, and category maps built with counter propagation networks (CPNs) have been used to identify multiple objects in aerial images. These techniques are also suitable for learning the skills of pilots through models generated from datasets, and even for cooperative detection and tracking on board. Likewise, deep reinforcement learning can identify the position at which to land the UAV on uniform textures, using Deep Q-Networks (DQNs) for vertical descent in a variety of simulated and real-world environments, or in several simulated environments with relevant noise. Some works employ different deep reinforcement learning methods for autonomous landing: one presents an improved deep reinforcement learning (DRL) approach trained in Gazebo simulation, another uses DQNs to land autonomously on the deck of a USV subject to sea-induced perturbations, and a third uses a Gazebo-based reinforcement learning framework for UAV landing on a moving platform.
On the other hand, target detection on board MAVs using deep learning networks such as YOLO, FRCN and SSD has shown promising results. Trained CNN models are an alternative for target detection, estimating heading angles to guide the aircraft to a runway landing or producing high-level commands for the MAV directly with respect to the target. Furthermore, some CNNs allow broad zones for autonomous landing to be detected in real environments using depth estimation networks trained on a simulated dataset. However, it is necessary to take into account that some sites are not wide and require a precise landing, which calls for a bounding box of the landing target. Hence, detecting the targets is one of the main tasks in aerial robotics before performing an autonomous landing; one work uses YOLO and SqueezeNet to detect marks on the landing zone in synthesised and real-world scenarios. Finally, another work performs deep learning-based reconstruction and marker detection for MAV landing with YOLOv2, and another uses lightDenseYOLO in combination with a Kalman filter to detect markers and estimate the direction for the autonomous landing.
Although these works detect targets and landing zones with deep learning, they perform onboard detection using computers with GPU architectures such as the Nvidia TX1, Nvidia TX2 and Snapdragon. Therefore, in this paper, we present a detection system that uses an SSD network for target detection and autonomous landing on board a MAV without a GPU-equipped computer.
3.1 Single Shot Detector (SSD)
To cover more bounding box shapes, the SSD uses multi-scale feature maps together with data augmentation (flipping, cropping and distorting the colour of the image) to improve accuracy and handle objects of various sizes and shapes. Our SSD architecture makes 6340 predictions for better coverage of location, scale and aspect ratio, more than many other detection methods. Besides, the predictions are classified according to the intersection over union (IoU), which measures the ratio between the intersected area and the joined area of two regions. This strategy drives each prediction towards a shape close to the corresponding ground truth (Fig. 4); the IoU ranges from 0.0 to 1.0, with 1.0 indicating a perfect match.
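The intersection over union described above can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as `(xmin, ymin, xmax, ymax)` tuples; the function name `iou` and this box format are illustrative choices and are not taken from the SSD7 implementation used in this work.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (xmin, ymin, xmax, ymax); this format is an assumption
    made for the example, not the format used by the authors.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    # Union = sum of both areas minus the part counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    ground_truth = (0, 0, 2, 2)
    prediction = (1, 1, 3, 3)
    print(iou(ground_truth, prediction))  # overlap 1, union 7
```

Identical boxes give 1.0 and disjoint boxes give 0.0, which matches the range stated above.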
3.2 System Overview
3.3 Dataset Generation
4.1 Mexican Flag Detection
Mexican flag detection with different methods.
Feature matching achieves 43.04% owing to the lack of features in the template, which causes the flag to be missed in some cases. In contrast, template matching obtains a better result of 74.47% by using cross-correlation and a pyramidal scale scheme, detecting the flag more often than feature matching. However, that method has problems detecting the flag in images rotated at different angles. Our system, implemented with the SSD network, detects the flag in the majority of images regardless of illumination, scale and rotation.
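To make the template matching baseline more concrete, the following is a minimal sketch of single-scale matching with zero-mean normalised cross-correlation on small greyscale images stored as nested lists. It is not the implementation evaluated in this work: the function names `ncc` and `match_template` are hypothetical, and the pyramidal scale scheme is omitted for brevity.

```python
import math


def ncc(patch, template):
    """Zero-mean normalised cross-correlation of two equal-sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p)
                    * sum((b - mean_t) ** 2 for b in t))
    # A flat patch or template has zero variance; treat it as no match.
    return num / den if den > 0 else 0.0


def match_template(image, template):
    """Slide the template over every position and keep the best score.

    Returns ((x, y), score), where (x, y) is the top-left corner of the
    best-matching window.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_score, best_pos = -1.0, (0, 0)
    for y in range(img_h - tpl_h + 1):
        for x in range(img_w - tpl_w + 1):
            patch = [row[x:x + tpl_w] for row in image[y:y + tpl_h]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_pos, best_score


if __name__ == "__main__":
    template = [[1, 2],
                [3, 4]]
    image = [[0, 0, 0, 0, 0],
             [0, 0, 1, 2, 0],
             [0, 0, 3, 4, 0],
             [0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0]]
    print(match_template(image, template))
```

Every window position is visited and compared pixel by pixel, so the cost grows with the product of image and template sizes. The comparison is also made at a fixed orientation, which is consistent with the sensitivity to rotated images noted above.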
4.2 Landing Platform Detection
Landing platform detection with different methods.
The second result shows that feature matching is not suitable for this test because it does not find enough features. The template matching method achieves 48.94% by sweeping the whole input image to localise the search template, obtaining a better result than the feature matching method. Notwithstanding this result, sweeping the whole image makes the method slow; therefore, it is not suitable for real-time tasks. In contrast, our system achieves 96.09%, finding the landing platform under the different image conditions and faster than the other methods.
5 Autonomous Landing Results
Autonomous landing results offboard and onboard the MAV (table rows: average flag confidence, average platform confidence).
We have presented a target detection system using a deep learning implementation based on the SSD network to detect a flag and a landing platform. This work is motivated by the challenge of performing an autonomous landing as part of a mission included in the indoor competition of IMAV 2019. The mission represents an existing problem in aerial robotics: target detection while a MAV navigates autonomously, where the place to land has to be located by detecting a flag, and the landing then has to be performed by centring on a landing platform and executing a landing routine autonomously. Thus, we have presented a detection system using the SSD7 network running on an Intel Compute Stick without a GPU, carried by the MAV, which enables onboard processing. This allowed the MAV to detect a flag and, later on, the landing platform while performing an autonomous flight. We validated our detection system with image datasets under multiple illumination conditions, even when the object is scaled or rotated, obtaining success rates of 96.59% for flag detection and 96.09% for landing platform detection. We compared our approach against other methods based on traditional computer vision techniques, namely template matching and feature matching. We also tested our system in real time with offboard and onboard flights, obtaining a confidence metric of 0.9386 for the flag and 0.9826 for the landing platform, with everything running on the Intel Compute Stick at an average of 13.01 fps.
Future work involves using this framework for more sophisticated tasks, such as object tracking during autonomous flight, involving many more targets and outdoor environments.
- 1. Baomar, H., Bentley, P.J.: Autonomous navigation and landing of airliners using artificial neural networks and learning by imitation. In: 2017 IEEE Symposium Series on Computational Intelligence (SSCI) (2017)
- 2. Bartak, R., Hraško, A., Obdržálek, D.: A controller for autonomous landing of AR. Drone. In: The 26th Chinese Control and Decision Conference (2014 CCDC), pp. 329–334. IEEE (2014)
- 3. Barták, R., Hrasko, A., Obdrzalek, D.: On autonomous landing of AR. Drone: hands-on experience. In: The Twenty-Seventh International Flairs Conference (2014)
- 4. Bicer, Y., Moghadam, M., Sahin, C., Eroglu, B., Üre, N.K.: Vision-based UAV guidance for autonomous landing with deep neural networks. In: AIAA SciTech 2019 Forum, p. 0140 (2019)
- 5. Cabrera-Ponce, A.A., Martinez-Carranza, J.: A vision-based approach for autonomous landing. In: 2017 Workshop on Research, Education and Development of Unmanned Aerial Systems (RED-UAS), pp. 126–131. IEEE (2017)
- 10. Polvara, R., et al.: Autonomous quadrotor landing using deep reinforcement learning. arXiv preprint arXiv:1709.03339 (2017)
- 11. Polvara, R., et al.: Toward end-to-end control for UAV autonomous landing via deep reinforcement learning. In: 2018 International Conference on Unmanned Aircraft Systems (ICUAS), pp. 115–123. IEEE (2018)
- 14. Recker, S., Gribble, C., Butkiewicz, M.: Autonomous precision landing for the joint tactical aerial resupply vehicle. In: 2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR), pp. 1–8. IEEE (2018)
- 16. Rojas-Perez, L.O., Munguia-Silva, R., Martinez-Carranza, J.: Real-time landing zone detection for UAVs using single aerial images. In: Watkins, S. (ed.) 10th International Micro Air Vehicle Competition and Conference, Melbourne, Australia, pp. 243–248, November 2018
- 19. Xu, Y., Liu, Z., Wang, X.: Monocular vision based autonomous landing of quadrotor through deep reinforcement learning. In: 2018 37th Chinese Control Conference (CCC), pp. 10014–10019. IEEE (2018)
- 20. Xu, Y., Zhang, Y., Liu, H., Wang, X.: Deep learning for UAV autonomous landing based on self-built image dataset. In: Eleventh International Conference on Machine Vision (ICMV 2018), vol. 11041, p. 110412I. International Society for Optics and Photonics (2019)
- 21. Yang, S., Scherer, S.A., Schauwecker, K., Zell, A.: Onboard monocular vision for landing of an MAV on a landing site specified by a single reference image. In: 2013 International Conference on Unmanned Aircraft Systems (ICUAS), pp. 318–325. IEEE (2013)