Regular and Small Target Detection

Wang, Wenzhe; Wu, Bin; Lv, Jinna; Dai, Pilin

doi:10.1007/978-3-030-05716-9_37

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11296)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Although object detection has achieved remarkable results, detecting small objects remains challenging. Low resolution and noisy representations make small objects difficult to detect, and recognizing them is harder still. Targeting small objects with regular positions, shapes, colors, or other features, this paper proposes Regular and Small Target Detection based on Faster R-CNN (RSTD), an approach for detecting and recognizing regular and small targets such as traffic signs. In this approach, a regular and small target feature extraction layer is designed to automatically extract the surrounding background and internal key information of the proposal objects, which benefits both detection and recognition. Extensive evaluations on the Tsinghua-Tencent 100K and GTSDB datasets demonstrate the superiority of our approach in detecting traffic signs over well-established state-of-the-art methods. The source code and model introduced in this paper are publicly available at https://github.com/zhezheey/RSTD/.
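
To make the idea of combining a proposal's surrounding background with its internal key information concrete, the following is a minimal sketch, not the authors' implementation. It derives an enlarged "context" box and a shrunken "inner" box from a single proposal, from which features could then be pooled. The function names and the scale factors 1.5 and 0.5 are hypothetical choices made for illustration only; the abstract does not specify them.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def scale_box(box: Box, factor: float, img_w: float, img_h: float) -> Box:
    """Scale a box about its center by `factor`, then clip it to the image."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * factor / 2.0
    half_h = (y2 - y1) * factor / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(img_w, cx + half_w),
        min(img_h, cy + half_h),
    )


def proposal_regions(
    box: Box,
    img_w: float,
    img_h: float,
    context_factor: float = 1.5,  # assumed value, not taken from the paper
    inner_factor: float = 0.5,    # assumed value, not taken from the paper
) -> Dict[str, Box]:
    """Return the original proposal plus a context box and an inner box."""
    return {
        "proposal": scale_box(box, 1.0, img_w, img_h),
        "context": scale_box(box, context_factor, img_w, img_h),
        "inner": scale_box(box, inner_factor, img_w, img_h),
    }


if __name__ == "__main__":
    regions = proposal_regions((40.0, 40.0, 60.0, 60.0), 100.0, 100.0)
    for name, region in regions.items():
        print(name, region)
```

In a detector of this kind, each of the three boxes would typically be passed through RoI pooling and the resulting features combined before classification; that step is omitted here because its details are not given in the abstract.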

Acknowledgement

This work is partially supported by the National Key R&D Program of China (No. 2018YFC0831500), the National Social Science Foundation of China (No. 16ZDA055), the National Natural Science Foundation of China (No. 61772082), and the Special Fund for Beijing Common Construction Project.

Author information

Authors and Affiliations

Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, 100876, China
Wenzhe Wang, Bin Wu, Jinna Lv & Pilin Dai

Corresponding authors

Correspondence to Wenzhe Wang or Bin Wu.

Editor information

Editors and Affiliations

Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Ioannis Kompatsiaris
EURECOM, Sophia Antipolis, France
Benoit Huet
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Vasileios Mezaris
Dublin City University, Dublin, Ireland
Cathal Gurrin
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Stefanos Vrochidis

About this paper

Cite this paper

Wang, W., Wu, B., Lv, J., Dai, P. (2019). Regular and Small Target Detection. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, WH., Vrochidis, S. (eds) MultiMedia Modeling. MMM 2019. Lecture Notes in Computer Science, vol 11296. Springer, Cham. https://doi.org/10.1007/978-3-030-05716-9_37

DOI: https://doi.org/10.1007/978-3-030-05716-9_37
Published: 11 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05715-2
Online ISBN: 978-3-030-05716-9
eBook Packages: Computer Science, Computer Science (R0)
