Iterative Maximum Clique Clustering Based Detection Filter

  • Xinyu Zhang
  • Hao Sheng (corresponding author)
  • Yang Zhang
  • Jiahui Chen
  • Yubin Wu
  • Guangtao Xue
  • Quanrui Wei
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11304)

Abstract

Object detection is an important research field of computer vision, but obtaining accurate detections from a large number of detection candidates remains a challenge. Most current algorithms use a Greedy Non-Maximum Suppression (NMS) strategy, which is insufficient because it relies heavily on the confidence scores of the detection candidates. This paper proposes the Iterative Detection Filter (IDF) approach, which considers more information about the detection candidates, including their overlap, the confidence produced by the detector, and the ground-position perception information of the scene. Through this approach, the detection candidates are mapped to more accurate detections. Our method achieves a significant improvement on the MOT16 and MOT17 datasets, which are widely used in video tracking and detection.
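For context, the greedy NMS baseline the abstract refers to can be sketched as follows. This is the standard algorithm, not the paper's IDF method; the box format and the IoU threshold of 0.5 are illustrative assumptions.

```python
# Sketch of the greedy Non-Maximum Suppression (NMS) baseline that IDF is
# contrasted with. Boxes are (x1, y1, x2, y2) tuples; the 0.5 IoU threshold
# is an illustrative choice, not a value taken from the paper.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring box, suppress every remaining box that
    overlaps it above iou_thresh, and repeat on what is left."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

Note that the selection here depends only on confidence ordering and pairwise overlap, which is precisely the limitation the abstract points out: IDF additionally exploits detector confidence jointly with scene ground-position information.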

Keywords

Detection filter · Maximum clique · Iterative clustering · Detection candidate · Non-Maximum Suppression

Notes

Acknowledgement

This study is partially supported by the National Key R&D Program of China (No. 2016QY01W0200), the National Natural Science Foundation of China (No. 61472019), the Macao Science and Technology Development Fund (No. 138/2016/A3), the Open Fund of the State Key Laboratory of Software Development Environment under grant SKLSDE-2017ZX-09, the Project of Experimental Verification of the Basic Commonness and Key Technical Standards of the Industrial Internet Network Architecture, and the Technology Innovation Fund of China Electronic Technology Group Corporation. We also thank the HAWKEYE Group for its support.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Xinyu Zhang (1)
  • Hao Sheng (1, corresponding author)
  • Yang Zhang (1)
  • Jiahui Chen (1)
  • Yubin Wu (1)
  • Guangtao Xue (2)
  • Quanrui Wei (3)
  1. State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing, People's Republic of China
  2. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, People's Republic of China
  3. The 15th Research Institute of China Electronics Technology Group Corporation, Beijing, People's Republic of China