Data Augmentation in Deep Learning-Based Obstacle Detection System for Autonomous Navigation on Aquatic Surfaces
Deep learning-based frameworks have been widely used in object recognition, perception, and autonomous navigation tasks, where they show outstanding feature extraction capabilities. Nevertheless, the effectiveness of such detectors usually depends on large amounts of training data. For specific object-recognition tasks, it is often difficult and time-consuming to gather enough valuable data. Data augmentation has been broadly adopted to overcome these difficulties, as it increases the amount of training data and introduces variation in qualitative elements such as color, illumination, distortion, and orientation. In this paper, we use the object detection framework YOLOv2 to evaluate the behavior of an obstacle detection system for an autonomous boat designed for the International RoboBoat Competition. We focus on how the overall performance of a model changes with different augmentation techniques. To that end, we analyze the features that the network learns when geometric and pixel-wise transformations are used to augment our data. Our instances of interest are buoys and sea markers; to generate training data comprising these classes, we simulated the aquatic surface on which the boat navigates and collected data from the COCO dataset. Finally, we discuss the significant generalization achieved in the learning process across our experiments with different augmentation techniques.
Keywords: Data augmentation, Synthesized images, Deep learning, Object detection, Computer vision
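To make the distinction between geometric and pixel-wise transformations concrete, the following is a minimal sketch and not the pipeline used in this work. It assumes grayscale images stored as nested Python lists and YOLO-style labels of the form (class_id, x_center, y_center, width, height) with coordinates normalized to [0, 1]. The function names `horizontal_flip` and `adjust_brightness` are hypothetical. A geometric transformation must update the bounding boxes together with the image, whereas a pixel-wise transformation leaves them unchanged.

```python
def horizontal_flip(image, boxes):
    """Geometric transform: mirror the image and its boxes left to right.

    image: list of rows, each a list of grayscale values in [0, 255].
    boxes: list of (class_id, x_center, y_center, width, height) tuples,
           normalized to [0, 1] (YOLO-style labels, assumed for illustration).
    """
    flipped_image = [row[::-1] for row in image]
    # Only the horizontal center moves; box size and vertical position stay.
    flipped_boxes = [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in boxes]
    return flipped_image, flipped_boxes


def adjust_brightness(image, factor):
    """Pixel-wise transform: scale every intensity and clip to [0, 255].

    Bounding boxes are unaffected, so they are not passed in.
    """
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


# Toy 2x3 image with one labeled object.
image = [[10, 20, 30],
         [40, 50, 60]]
boxes = [(0, 0.25, 0.5, 0.2, 0.4)]

flipped_image, flipped_boxes = horizontal_flip(image, boxes)
brighter_image = adjust_brightness(image, 1.5)
```

Plain lists keep the sketch free of dependencies; in practice the same operations would normally be applied to image arrays, with random parameters drawn for each training sample.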
We would like to thank Tecnológico de Monterrey, WritingLabs and TecLabs for providing the equipment used in our experiments and financial support in the production of this work. Additionally, we extend our gratitude to VANTEC, the student group from ITESM that invited us to participate in the International RoboBoat Competition.