A Modified PSRoI Pooling with Spatial Information

  • Yiqing Zheng
  • Xiaolu Hu
  • Ning Bi
  • Jun Tan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11257)

Abstract

Position-sensitive RoI (PSRoI) pooling in R-FCN [5] is an RoI pooling operation that encodes position information: each RoI rectangle is divided into \(K \times K\) bins by a regular grid. In this paper, we present a modified PSRoI pooling that also incorporates spatial information. Every bin in PSRoI pooling is rescaled to \(2\times\), so that the spatial information around the bin contributes to the classification prediction. The weight of the region outside the original bin is simply set to 0.5 manually. We test our model with a ResNet-101 backbone on PASCAL VOC and MS COCO, and obtain some improvement over R-FCN [5] on both datasets.
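The bin enlargement described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "rescaled to 2×" means each bin side is doubled about the bin centre, that the pooled value is a weighted average (weight 1 inside the original bin, 0.5 in the surrounding ring), and that pooling runs on a single score-map channel given as a nested list. The names `bin_bounds`, `enlarge`, and `weighted_bin_pool` are hypothetical.

```python
def bin_bounds(roi, k, i, j):
    """Return (y0, x0, y1, x1) of bin (i, j) in a k x k grid laid over roi."""
    ry0, rx0, ry1, rx1 = roi
    bin_h = (ry1 - ry0) / k
    bin_w = (rx1 - rx0) / k
    return (ry0 + i * bin_h, rx0 + j * bin_w,
            ry0 + (i + 1) * bin_h, rx0 + (j + 1) * bin_w)


def enlarge(box, scale=2.0):
    """Scale a box about its centre (assumed reading of 'rescaled to 2x')."""
    y0, x0, y1, x1 = box
    cy, cx = (y0 + y1) / 2, (x0 + x1) / 2
    half_h = (y1 - y0) * scale / 2
    half_w = (x1 - x0) * scale / 2
    return (cy - half_h, cx - half_w, cy + half_h, cx + half_w)


def weighted_bin_pool(feature, roi, k, i, j, outer_weight=0.5):
    """Weighted average over the enlarged bin.

    Cells whose centre lies in the original bin get weight 1; cells in the
    surrounding ring get outer_weight. The averaging rule is an assumption.
    """
    inner = bin_bounds(roi, k, i, j)
    outer = enlarge(inner)
    total, weight_sum = 0.0, 0.0
    for y in range(len(feature)):
        for x in range(len(feature[0])):
            cy, cx = y + 0.5, x + 0.5  # centre of feature-map cell (y, x)
            if not (outer[0] <= cy < outer[2] and outer[1] <= cx < outer[3]):
                continue  # cell lies outside the enlarged bin
            inside = inner[0] <= cy < inner[2] and inner[1] <= cx < inner[3]
            w = 1.0 if inside else outer_weight
            total += w * feature[y][x]
            weight_sum += w
    return total / weight_sum if weight_sum else 0.0


# Example: an 8 x 8 map that is 1 in the top-left 4 x 4 block and 0 elsewhere.
feature = [[1.0 if (y < 4 and x < 4) else 0.0 for x in range(8)] for y in range(8)]
pooled = weighted_bin_pool(feature, roi=(0, 0, 8, 8), k=2, i=0, j=0)
```

Setting `outer_weight=0.0` in this sketch reduces it to plain average pooling over the original bin, which makes the effect of the surrounding context easy to compare.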

Keywords

Modified PSRoI pooling, Spatial information

References

  1. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS (2015)
  2. Girshick, R.: Fast R-CNN. In: ICCV (2015)
  3. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: ICCV (2017)
  4. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR (2014)
  5. Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: NIPS (2016)
  6. Zhu, Y., Zhao, C., Wang, J., et al.: CoupleNet: coupling global structure with local parts for object detection. In: ICCV (2017)
  7. Cai, Z., Fan, Q., Feris, R.S., Vasconcelos, N.: A unified multi-scale deep convolutional neural network for fast object detection. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 354–370. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_22
  8. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: CVPR (2017)
  9. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: CVPR (2016)
  10. Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
  11. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8691, pp. 346–361. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10578-9_23

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Sun Yat-sen University, Guangzhou, China
