Proposal-Refined Weakly Supervised Object Detection in Underwater Images

  • Xiaoqian Lv
  • An Wang
  • Qinglin Liu
  • Jiamin Sun
  • Shengping Zhang (Email author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11901)

Abstract

Recently, Convolutional Neural Networks (CNNs) have achieved great success in object detection due to their outstanding ability to learn powerful features from large-scale training datasets. One critical factor in this success is accurate and complete annotation of the training data. However, accurate annotation is difficult and time-consuming in some applications, such as object detection in underwater images, because of severe foreground clustering and occlusion. In this paper, we study the problem of object detection in underwater images with incomplete annotation. To solve this problem, we propose a proposal-refined weakly supervised object detection method consisting of two stages. The first stage is a weakly-fitted segmentation network for foreground-background segmentation. The second stage is a proposal-refined detection network that uses the segmentation results of the first stage to refine the proposals and thereby improve detection performance. Experiments are conducted on the Underwater Robot Picking Contest 2017 (URPC2017) dataset, which contains 19,967 underwater images of three kinds of objects: sea cucumber, sea urchin and scallop; the annotation of its training set is incomplete. Experimental results show that the proposed method greatly improves detection performance compared to several baseline methods.
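To make the second stage concrete, one simple way a foreground-background mask can refine box proposals is to discard proposals dominated by background and tighten the remaining boxes around the foreground pixels they contain. The sketch below is a hypothetical illustration of this idea, not the authors' actual refinement procedure; the function name, the coverage threshold, and the tightening rule are all assumptions.

```python
import numpy as np

def refine_proposals(proposals, fg_mask, min_coverage=0.3):
    """Illustrative proposal refinement with a segmentation mask (assumed logic).

    proposals: (N, 4) array of [x1, y1, x2, y2] boxes in pixel coordinates.
    fg_mask:   (H, W) binary array, 1 = foreground.
    Returns the surviving boxes, each shrunk to the tight bounding box
    of the foreground pixels it contains.
    """
    refined = []
    for x1, y1, x2, y2 in proposals.astype(int):
        patch = fg_mask[y1:y2, x1:x2]
        area = max((y2 - y1) * (x2 - x1), 1)
        coverage = patch.sum() / area        # fraction of the box that is foreground
        if coverage < min_coverage:
            continue                         # drop background-dominated proposals
        ys, xs = np.nonzero(patch)
        # Tighten the box around the foreground pixels inside it.
        refined.append([x1 + xs.min(), y1 + ys.min(),
                        x1 + xs.max() + 1, y1 + ys.max() + 1])
    return np.array(refined, dtype=int)
```

Under this toy rule, a proposal mostly covering background is removed entirely, while a loose proposal around an object is snapped to the object's extent in the mask, which is the kind of correction that would help a detector trained with incomplete annotations.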

Keywords

Underwater image · Weakly supervised · Object detection · Semantic segmentation


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Xiaoqian Lv (1)
  • An Wang (1)
  • Qinglin Liu (1)
  • Jiamin Sun (1)
  • Shengping Zhang (1), Email author

  1. School of Computer Science and Technology, Harbin Institute of Technology, Weihai, China