Graph Convolution and Self Attention Based Non-maximum Suppression

  • Zhe Qiu
  • Xiaodong Gu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11554)


Non-maximum suppression (NMS) is an integral and final step of object detection. The traditional NMS algorithm sorts the detection boxes by their class scores. The box with the maximum score is always selected, while all other boxes that sufficiently overlap a preserved box are discarded. This strategy is simple and effective. However, the process leaves room for improvement because the algorithm makes a ‘hard’ decision (accept or reject) for each box. In this paper, we formulate non-maximum suppression as a rescoring process and construct a network called NmsNet, which uses graph convolution and a self-attention mechanism to classify each box as an object or a redundant detection. We evaluate our method on the VOC2007 dataset. The experimental results show that our method achieves a higher mAP than both traditional greedy NMS and Soft-NMS.
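The traditional greedy procedure described above can be sketched as follows. This is a minimal illustration of the baseline, not the NmsNet method or the authors' code. It assumes boxes are given as `[x1, y1, x2, y2]` corners with positive area, and the names `iou_one_to_many`, `greedy_nms`, and the default `iou_threshold=0.5` are hypothetical choices made for this example.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as [x1, y1, x2, y2]."""
    # Corners of the intersection rectangle.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so disjoint boxes get zero intersection area.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first.

    Assumed convention for this sketch: a box is discarded when its IoU
    with an already kept box is strictly greater than iou_threshold.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores)  # candidate indices, descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))  # the top-scoring candidate is always selected
        rest = order[1:]
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        # The 'hard' decision: overlapping candidates are rejected outright.
        order = rest[overlaps <= iou_threshold]
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))
```

In this usage example the second box overlaps the first with an IoU of about 0.68, so it is rejected no matter how high its own score is. A rescoring formulation would lower that score instead of discarding the box.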


Graph convolution · Self attention · Non-maximum suppression



This work was supported in part by the National Natural Science Foundation of China under grants 61771145 and 61371148.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Electronic Engineering, Fudan University, Shanghai, China
