Robust Pedestrian Detection: Faster Deployments with Fusion of Models

  • Chan Tong Lam
  • Jose Gaspar
  • Wei Ke
  • Xu Yang
  • Sio Kei Im
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12046)


Pedestrian detection has a wide range of critical real-world applications, including security and the management of emergency scenarios. In such applications, detection recall and precision are both essential to ensure that all pedestrians are correctly detected. Developing and deploying vision-based object detection models is time-consuming, because top performance depends on long training and fine-tuning processes. We propose an alternative approach based on a fusion of pre-trained, off-the-shelf, state-of-the-art object detection models, which exploits divergences between the base models to quickly deploy robust ensembles with improved performance. Our approach promotes model reuse and does not require additional learning algorithms, making it suitable for rapid deployment of critical systems. Experimental results on the PASCAL VOC07 test set show mean average precision (mAP) improvements over the base detection models, regardless of the set of models selected. Improvements in mAP were observed starting from just two detection models and reached 3.53% for a fusion of four detection models, resulting in an absolute fusion mAP of 83.65%. Moreover, the hyperparameters of our ensemble model may be adjusted to set a tradeoff between precision and recall that fits the requirements of a given application.
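
The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one plausible late-fusion scheme, not the authors' method. It assumes each pre-trained detector returns pedestrian boxes with confidence scores, pools them, discards low-confidence boxes, and merges overlapping boxes with greedy non-maximum suppression. The names `fuse_detections`, `score_thresh` and `iou_thresh`, and all the numbers in the example, are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(per_model: List[List[Detection]],
                    score_thresh: float = 0.5,
                    iou_thresh: float = 0.5) -> List[Detection]:
    """Pool the detections of several models and de-duplicate them.

    Assumed fusion rule (hypothetical): keep boxes scoring at least
    score_thresh, then run greedy non-maximum suppression over the pool.
    """
    # Pool every sufficiently confident box from every model.
    pooled = [d for dets in per_model for d in dets if d[1] >= score_thresh]
    # Visit boxes from most to least confident.
    pooled.sort(key=lambda d: d[1], reverse=True)
    kept: List[Detection] = []
    for box, score in pooled:
        # Keep a box only if it does not overlap a box that was already kept.
        if all(iou(box, k[0]) < iou_thresh for k in kept):
            kept.append((box, score))
    return kept


# Made-up outputs of two detectors on the same image.
model_a = [((10, 10, 50, 100), 0.9), ((200, 20, 240, 110), 0.4)]
model_b = [((12, 11, 51, 98), 0.8), ((300, 30, 340, 120), 0.7)]

strict = fuse_detections([model_a, model_b], score_thresh=0.5)
lenient = fuse_detections([model_a, model_b], score_thresh=0.3)
print(len(strict), len(lenient))
```

In this sketch the two hyperparameters play the role described above: lowering `score_thresh` admits more candidate boxes, which tends to favour recall, while raising it tends to favour precision. The pedestrian missed by one model (the box only `model_b` reports) survives fusion, which illustrates how divergences between base models can help an ensemble.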


Keywords: Deep learning, Pedestrian detection, Fusion, Ensemble learning



This work was funded by the Science and Technology Development Fund of Macau SAR (File no. 138/2016/A3).



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Chan Tong Lam (1), Email author
  • Jose Gaspar (1)
  • Wei Ke (1)
  • Xu Yang (1)
  • Sio Kei Im (1)
  1. School of Applied Sciences, Macao Polytechnic Institute, Macao S.A.R., China
