Adaptive Structural Model for Video Based Pedestrian Detection

Yan, Junjie; Yang, Bin; Lei, Zhen; Li, Stan Z.

doi:10.1007/978-3-319-16865-4_14

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9003)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

The performance of generic pedestrian detectors usually declines seriously for videos in novel scenes, which is one of the major bottlenecks of current pedestrian detection techniques. Conventional approaches improve pedestrian detection in video by mining new instances from detections and adapting the detector to the collected instances. However, when the two tasks are treated separately, detector adaptation suffers from the defective output of instance mining. In this paper, we propose to handle instance mining and detector adaptation jointly using an adaptive structural model. The regularization function of the model is applied to the detector to prevent overfitting during adaptation, and the loss function is designed to evaluate the combination of the mined instance set and the detector. In particular, we extend the Deformable Part Model (DPM) to an adaptive DPM, in which an adaptive feature transformation defined on low-level HOG cells is learned to reduce the domain shift, and the regularization function for the detector is imposed on the transformation. The loss of the instance set and detector is measured by a cost-flow network structure that incorporates both the appearance of frame-wise detections and their spatio-temporal continuity. We present an alternating minimization procedure to optimize the model. The proposed method is evaluated on the ETHZ, PETS2009 and Caltech datasets, and outperforms the baseline DPM by 7% in terms of mean miss rate.
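
As a reading aid, the joint formulation described in the abstract can be written schematically. All symbols here are assumptions introduced for illustration and are not taken from the paper: T stands for the feature transformation, S for the mined instance set, R for the regularizer on the transformation, L for the cost-flow loss, and λ for a trade-off weight. The paper's own notation and exact terms may differ.

```latex
% Schematic joint objective: regularizer on the transformation plus
% a loss on the (instance set, detector) combination.
\[
  \min_{T,\,S} \; \lambda\, R(T) + L(S, T)
\]

% One round of alternating minimization: fix T and update S, then fix S and update T.
\[
  S^{(k+1)} = \arg\min_{S} \; L\bigl(S, T^{(k)}\bigr),
  \qquad
  T^{(k+1)} = \arg\min_{T} \; \lambda\, R(T) + L\bigl(S^{(k+1)}, T\bigr)
\]
```

Under this reading, each half-step can only lower or keep the objective value, which is the usual motivation for alternating schemes of this kind.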

Notes

1. http://www.cvg.rdg.ac.uk/PETS2009/.
2. The mean miss rate defined in P. Dollár’s toolbox is used here, which is the average of the miss rates at 0.0100, 0.0178, 0.0316, 0.0562, 0.1000, 0.1778, 0.3162, 0.5623 and 1.0000 false positives per image.
3. The two videos are selected because they contain more people than the other videos.
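
The metric in note 2 can be sketched in a few lines of Python. This is a minimal illustration, not the toolbox's code: the function names are hypothetical, the nine reference points are generated as evenly log-spaced values between 0.01 and 1, and the plain arithmetic mean follows the wording of the note. Taking the last curve point at or below each reference, and counting a miss rate of 1.0 when the curve has no such point, are assumptions made here.

```python
import numpy as np


def reference_fppi(num_points=9):
    """Evenly log-spaced false-positives-per-image values from 0.01 to 1."""
    return np.logspace(-2.0, 0.0, num_points)


def mean_miss_rate(fppi, miss_rate, ref_points=None):
    """Average the miss rate sampled at the reference FPPI values.

    `fppi` must be sorted in ascending order, with `miss_rate[i]` the miss
    rate reached at `fppi[i]`.
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)
    if ref_points is None:
        ref_points = reference_fppi()

    sampled = []
    for ref in ref_points:
        # Index of the last curve point whose FPPI does not exceed `ref`.
        idx = np.searchsorted(fppi, ref, side="right") - 1
        # Assumption: if the curve never reaches this FPPI, count a full miss.
        sampled.append(miss_rate[idx] if idx >= 0 else 1.0)
    return float(np.mean(sampled))


if __name__ == "__main__":
    # The generated reference points match the values listed in note 2.
    print(np.round(reference_fppi(), 4))
    # A toy curve, for illustration only (not data from the paper).
    print(mean_miss_rate([0.005, 0.05, 0.5], [0.9, 0.6, 0.3]))
```

Sampling on a log scale gives the low-FPPI end of the curve as much weight as the high-FPPI end.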

Acknowledgement

This work was supported by the Chinese National Natural Science Foundation Projects #61105023, #61103156, #61105037, #61203267, #61375037, #61473291, National Science and Technology Support Program Project #2013BAK02B01, Chinese Academy of Sciences Project No. KGZD-EW-102-2, and AuthenMetric R&D Funds.

Author information

Authors and Affiliations

Center for Biometrics and Security Research and National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Junjie Yan, Bin Yang, Zhen Lei & Stan Z. Li

Corresponding author

Correspondence to Zhen Lei.

Editor information

Editors and Affiliations

Technische Universität München, Garching, Bayern, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

Cite this paper

Yan, J., Yang, B., Lei, Z., Li, S.Z. (2015). Adaptive Structural Model for Video Based Pedestrian Detection. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_14

DOI: https://doi.org/10.1007/978-3-319-16865-4_14
Published: 16 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer Science, Computer Science (R0)
