Revisiting Faster R-CNN: A Deeper Look at Region Proposal Network

Han, Guangxing; Zhang, Xuan; Li, Chongrong

doi:10.1007/978-3-319-70090-8_2

Guangxing Han¹⁸,
Xuan Zhang¹⁸ &
Chongrong Li¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10636))

Included in the following conference series:

International Conference on Neural Information Processing

4604 Accesses
11 Citations
3 Altmetric

Abstract

Currently, state-of-the-art object detectors are based on Faster R-CNN. We firstly revisit Faster R-CNN and explore problems in it, e.g., coarseness of feature maps for accurate localization, fixed-window feature extraction in RPN and insensitivity for small scale objects. Then a novel object detection network is proposed to address these problems. Specifically, we utilize a two-stage cascade multi-scale proposal generation network to get high accurate proposals: an original RPN is adopted to initially generate coarse proposals, then another network with multi-layer features and RoI pooling layer are introduced to refine these proposals. We also generate small scale proposals in the second stage simultaneously. After that, a detection network with multi-layer features further classifies and refines proposals. A novel 3-step joint training algorithm is introduced to optimize our model. Experiments on PASCAL VOC 2007 and 2012 demonstrate the effectiveness of our network.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Everingham, M., Eslami, S.A., Van Gool, L., et al.: The Pascal visual object classes challenge: a retrospective. Int. J. Comput. Vis. 111, 98–136 (2015). LNCS. Springer
Article Google Scholar
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: Common Objects in Context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). doi:10.1007/978-3-319-10602-1_48
Google Scholar
Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems 28, pp. 91–99. Curran Associates, Montréal (2015)
Google Scholar
Girshick, R., Donahue, J., Darrell, T., et al.: Region-based convolutional networks for accurate object detection and segmentation. In: IEEE Computer Vision and Pattern Recognition, pp. 580–587. IEEE Press, Columbus (2014)
Google Scholar
Uijlings, J.R., Van De Sande, K.E., Gevers, T., et al.: Selective search for object recognition. Int. J. Comput. Vis. 104, 154–171 (2013)
Article Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25, pp. 1097–1105. Curran Associates, South Lake Tahoe (2012)
Google Scholar
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Girshick, R.: Fast R-CNN. In: IEEE International Conference on Computer Vision, pp. 1440–1448. IEEE Press, Santiago (2015)
Google Scholar
Yu, W., Yang, K., Bai, Y., et al.: Visualizing and comparing convolutional neural networks. arXiv preprint arXiv:1412.6631 (2014)
Kong, T., Yao, A., Chen, Y., et al.: HyperNet: towards accurate region proposal generation and joint object detection. In: IEEE Computer Vision and Pattern Recognition, pp. 845–853. IEEE Press, Las Vegas (2016)
Google Scholar
Zhang, L., Lin, L., Liang, X., He, K.: Is Faster R-CNN doing well for pedestrian detection? In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9906, pp. 443–457. Springer, Cham (2016). doi:10.1007/978-3-319-46475-6_28
Chapter Google Scholar
Yang, B., Yan, J., Lei, Z., et al.: Craft objects from images. In: IEEE Computer Vision and Pattern Recognition, pp. 6043–6051. IEEE Press, Las Vegas (2016)
Google Scholar
Gidaris, S., Komodakis, N.: Attend refine repeat: active box proposal generation via in-out localization. arXiv preprint arXiv:1606.04446 (2016)
Cai, Z., Fan, Q., Feris, R.S., Vasconcelos, N.: A unified multi-scale deep convolutional neural network for fast object detection. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 354–370. Springer, Cham (2016). doi:10.1007/978-3-319-46493-0_22
Chapter Google Scholar
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). doi:10.1007/978-3-319-46448-0_2
Chapter Google Scholar
Lin, T.Y., Dollár, P., Girshick, R., et al.: Feature pyramid networks for object detection. arXiv preprint arXiv:1612.03144 (2016)
Bell, S., Lawrence Zitnick, C., Bala, K., et al.: Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks. In: IEEE Computer Vision and Pattern Recognition, pp. 2874–2883. IEEE Press, Las Vegas (2016)
Google Scholar
Ghodrati, A., Diba, A., Pedersoli, M., et al.: DeepProposal: hunting objects by cascading deep convolutional layers. In: IEEE International Conference on Computer Vision, pp. 2578–2586. IEEE Press, Santiago (2015)
Google Scholar

Download references

Author information

Authors and Affiliations

Tsinghua National Laboratory for Information Science and Technology (TNList), Institute for Network Sciences and Cyberspace (INSC), Tsinghua University, Beijing, 100084, China
Guangxing Han, Xuan Zhang & Chongrong Li

Authors

Guangxing Han
View author publications
You can also search for this author in PubMed Google Scholar
Xuan Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Chongrong Li
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xuan Zhang .

Editor information

Editors and Affiliations

Guangdong University of Technology, Guangzhou, China
Derong Liu
Guangdong University of Technology, Guangzhou, China
Shengli Xie
South China University of Technology, Guangzhou, China
Yuanqing Li
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Dongbin Zhao
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
El-Sayed M. El-Alfy

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Han, G., Zhang, X., Li, C. (2017). Revisiting Faster R-CNN: A Deeper Look at Region Proposal Network. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science(), vol 10636. Springer, Cham. https://doi.org/10.1007/978-3-319-70090-8_2

Download citation

DOI: https://doi.org/10.1007/978-3-319-70090-8_2
Published: 28 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70089-2
Online ISBN: 978-3-319-70090-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics