VisDrone-DET2018: The Vision Meets Drone Object Detection in Image Challenge Results

  • Pengfei Zhu
  • Longyin Wen
  • Dawei Du
  • Xiao Bian
  • Haibin Ling
  • Qinghua Hu
  • Qinqin Nie
  • Hao Cheng
  • Chenfeng Liu
  • Xiaoyu Liu
  • Wenya Ma
  • Haotian Wu
  • Lianjie Wang
  • Arne Schumann
  • Chase Brown
  • Chen Qian
  • Chengzheng Li
  • Dongdong Li
  • Emmanouil Michail
  • Fan Zhang
  • Feng Ni
  • Feng Zhu
  • Guanghui Wang
  • Haipeng Zhang
  • Han Deng
  • Hao Liu
  • Haoran Wang
  • Heqian Qiu
  • Honggang Qi
  • Honghui Shi
  • Hongliang Li
  • Hongyu Xu
  • Hu Lin
  • Ioannis Kompatsiaris
  • Jian Cheng
  • Jianqiang Wang
  • Jianxiu Yang
  • Jingkai Zhou
  • Juanping Zhao
  • K. J. Joseph
  • Kaiwen Duan
  • Karthik Suresh
  • Bo Ke
  • Ke Wang
  • Konstantinos Avgerinakis
  • Lars Sommer
  • Lei Zhang
  • Li Yang
  • Lin Cheng
  • Lin Ma
  • Liyu Lu
  • Lu Ding
  • Minyu Huang
  • Naveen Kumar Vedurupaka
  • Nehal Mamgain
  • Nitin Bansal
  • Oliver Acatay
  • Panagiotis Giannakeris
  • Qian Wang
  • Qijie Zhao
  • Qingming Huang
  • Qiong Liu
  • Qishang Cheng
  • Qiuchen Sun
  • Robert Laganière
  • Sheng Jiang
  • Shengjin Wang
  • Shubo Wei
  • Siwei Wang
  • Stefanos Vrochidis
  • Sujuan Wang
  • Tiaojio Lee
  • Usman Sajid
  • Vineeth N. Balasubramanian
  • Wei Li
  • Wei Zhang
  • Weikun Wu
  • Wenchi Ma
  • Wenrui He
  • Wenzhe Yang
  • Xiaoyu Chen
  • Xin Sun
  • Xinbin Luo
  • Xintao Lian
  • Xiufang Li
  • Yangliu Kuai
  • Yali Li
  • Yi Luo
  • Yifan Zhang
  • Yiling Liu
  • Ying Li
  • Yong Wang
  • Yongtao Wang
  • Yuanwei Wu
  • Yue Fan
  • Yunchao Wei
  • Yuqin Zhang
  • Zexin Wang
  • Zhangyang Wang
  • Zhaoyue Xia
  • Zhen Cui
  • Zhenwei He
  • Zhipeng Deng
  • Zhiyao Guo
  • Zichen Song
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11133)

Abstract

Object detection is a hot topic with various applications in computer vision, e.g., image understanding, autonomous driving, and video surveillance. Much of the progress has been driven by the availability of object detection benchmark datasets, including PASCAL VOC, ImageNet, and MS COCO. However, object detection on the drone platform is still a challenging task, due to various factors such as viewpoint change, occlusion, and scale variation. To narrow the gap between current object detection performance and real-world requirements, we organized the Vision Meets Drone (VisDrone2018) Object Detection in Image challenge in conjunction with the 15th European Conference on Computer Vision (ECCV 2018). Specifically, we release a large-scale drone-based dataset of 8,599 images (6,471 for training, 548 for validation, and 1,580 for testing) with rich annotations, including object bounding boxes, object categories, occlusion, truncation ratios, etc. Featuring diverse real-world scenarios, the dataset was collected using various drone models, in different scenarios (across 14 different cities separated by thousands of kilometres), and under various weather and lighting conditions. We mainly focus on ten object categories in object detection, i.e., pedestrian, person, car, van, bus, truck, motor, bicycle, awning-tricycle, and tricycle. Some rarely occurring special vehicles (e.g., machineshop truck, forklift truck, and tanker) are ignored in evaluation. The dataset is extremely challenging due to various factors, including large scale and pose variations, occlusion, and background clutter. We present the evaluation protocol of the VisDrone-DET2018 challenge and the comparison results of 38 detectors on the released dataset, which are publicly available on the challenge website: http://www.aiskyeye.com/. We expect the challenge to substantially boost research and development in object detection in images on drone platforms.

Keywords

Performance evaluation, Drone, Object detection in images

Notes

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61502332 and Grant 61732011, in part by Natural Science Foundation of Tianjin under Grant 17JCZDJC30800, in part by US National Science Foundation under Grant IIS-1407156 and Grant IIS-1350521, and in part by Beijing Seetatech Technology Co., Ltd and GE Global Research.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Pengfei Zhu
    • 1
    Email author
  • Longyin Wen
    • 2
  • Dawei Du
    • 3
  • Xiao Bian
    • 4
  • Haibin Ling
    • 5
  • Qinghua Hu
    • 1
  • Qinqin Nie
    • 1
  • Hao Cheng
    • 1
  • Chenfeng Liu
    • 1
  • Xiaoyu Liu
    • 1
  • Wenya Ma
    • 1
  • Haotian Wu
    • 1
  • Lianjie Wang
    • 1
  • Arne Schumann
    • 31
  • Chase Brown
    • 6
  • Chen Qian
    • 28
  • Chengzheng Li
    • 29
  • Dongdong Li
    • 27
  • Emmanouil Michail
    • 20
  • Fan Zhang
    • 14
  • Feng Ni
    • 22
  • Feng Zhu
    • 21
  • Guanghui Wang
    • 10
  • Haipeng Zhang
    • 13
  • Han Deng
    • 25
  • Hao Liu
    • 27
  • Haoran Wang
    • 14
  • Heqian Qiu
    • 36
  • Honggang Qi
    • 18
  • Honghui Shi
    • 9
  • Hongliang Li
    • 36
  • Hongyu Xu
    • 7
  • Hu Lin
    • 11
  • Ioannis Kompatsiaris
    • 20
  • Jian Cheng
    • 34
  • Jianqiang Wang
    • 33
  • Jianxiu Yang
    • 14
  • Jingkai Zhou
    • 11
  • Juanping Zhao
    • 28
  • K. J. Joseph
    • 23
  • Kaiwen Duan
    • 18
  • Karthik Suresh
    • 6
  • Bo Ke
    • 12
  • Ke Wang
    • 14
  • Konstantinos Avgerinakis
    • 20
  • Lars Sommer
    • 31
    • 32
  • Lei Zhang
    • 19
  • Li Yang
    • 14
  • Lin Cheng
    • 14
  • Lin Ma
    • 26
  • Liyu Lu
    • 1
  • Lu Ding
    • 28
  • Minyu Huang
    • 16
  • Naveen Kumar Vedurupaka
    • 24
  • Nehal Mamgain
    • 23
  • Nitin Bansal
    • 6
  • Oliver Acatay
    • 31
  • Panagiotis Giannakeris
    • 20
  • Qian Wang
    • 14
  • Qijie Zhao
    • 22
  • Qingming Huang
    • 18
  • Qiong Liu
    • 11
  • Qishang Cheng
    • 36
  • Qiuchen Sun
    • 14
  • Robert Laganière
    • 30
  • Sheng Jiang
    • 14
  • Shengjin Wang
    • 33
  • Shubo Wei
    • 14
  • Siwei Wang
    • 14
  • Stefanos Vrochidis
    • 20
  • Sujuan Wang
    • 34
  • Tiaojio Lee
    • 25
  • Usman Sajid
    • 10
  • Vineeth N. Balasubramanian
    • 23
  • Wei Li
    • 36
  • Wei Zhang
    • 25
  • Weikun Wu
    • 16
  • Wenchi Ma
    • 10
  • Wenrui He
    • 21
  • Wenzhe Yang
    • 14
  • Xiaoyu Chen
    • 36
  • Xin Sun
    • 17
  • Xinbin Luo
    • 28
  • Xintao Lian
    • 14
  • Xiufang Li
    • 14
  • Yangliu Kuai
    • 27
  • Yali Li
    • 33
  • Yi Luo
    • 11
  • Yifan Zhang
    • 34
    • 35
  • Yiling Liu
    • 15
  • Ying Li
    • 15
  • Yong Wang
    • 30
  • Yongtao Wang
    • 22
  • Yuanwei Wu
    • 10
  • Yue Fan
    • 25
  • Yunchao Wei
    • 8
  • Yuqin Zhang
    • 16
  • Zexin Wang
    • 14
  • Zhangyang Wang
    • 6
  • Zhaoyue Xia
    • 33
  • Zhen Cui
    • 29
  • Zhenwei He
    • 19
  • Zhipeng Deng
    • 27
  • Zhiyao Guo
    • 16
  • Zichen Song
    • 36
  1. Tianjin University, Tianjin, China
  2. JD Finance, Mountain View, USA
  3. University at Albany, SUNY, Albany, USA
  4. GE Global Research, Niskayuna, USA
  5. Temple University, Philadelphia, USA
  6. Texas A&M University, College Station, USA
  7. University of Maryland, College Park, USA
  8. University of Illinois at Urbana-Champaign, Urbana-Champaign, USA
  9. Thomas J. Watson Research Center, Yorktown Heights, USA
  10. University of Kansas, Lawrence, USA
  11. South China University of Technology, Guangzhou, China
  12. Sun Yat-sen University, Guangzhou, China
  13. Jiangnan University, Wuxi, China
  14. Xidian University, Xi’an, China
  15. Northwestern Polytechnical University, Xi’an, China
  16. Xiamen University, Xiamen, China
  17. Ocean University of China, Qingdao, China
  18. University of Chinese Academy of Sciences, Beijing, China
  19. Chongqing University, Chongqing, China
  20. Centre for Research and Technology Hellas, Thessaloniki, Greece
  21. Beijing University of Telecommunication and Post, Beijing, China
  22. Peking University, Beijing, China
  23. Indian Institute of Technology, Hyderabad, India
  24. NIT Trichy, Tiruchirappalli, India
  25. Shandong University, Jinan, China
  26. Tencent AI Lab, Bellevue, China
  27. National University of Defense Technology, Changsha, China
  28. Shanghai Jiao Tong University, Shanghai, China
  29. Nanjing University of Science and Technology, Nanjing, China
  30. University of Ottawa, Ottawa, Canada
  31. Fraunhofer IOSB, Karlsruhe, Germany
  32. Karlsruhe Institute of Technology, Karlsruhe, Germany
  33. Tsinghua University, Beijing, China
  34. Nanjing Artificial Intelligence Chip Research, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  35. Institute of Automation, Chinese Academy of Sciences, Beijing, China
  36. University of Electronic Science and Technology of China, Chengdu, China