
Predicting Driver Attention in Critical Situations

  • Conference paper
Computer Vision – ACCV 2018 (ACCV 2018)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11365)


Abstract

Robust driver attention prediction in critical situations is a challenging computer vision problem, yet essential for autonomous driving. Because critical driving moments are rare, collecting enough data for these situations is difficult with the conventional in-car data collection protocol of tracking eye movements during driving. Here, we first propose a new in-lab driver attention collection protocol and introduce a new driver attention dataset, the Berkeley DeepDrive Attention (BDD-A) dataset, built upon braking-event videos selected from a large-scale, crowd-sourced driving video dataset. We further propose the Human Weighted Sampling (HWS) method, which uses human gaze behavior to identify the crucial frames of a driving dataset and weights them heavily during model training. With our dataset and HWS, we built a driver attention prediction model that outperforms the state of the art and demonstrates sophisticated behaviors, such as attending to crossing pedestrians while not raising false alarms for pedestrians walking safely on the sidewalk. Its predictions are nearly indistinguishable from ground truth to human observers. Although trained only on our in-lab attention data, the model also predicts in-car driver attention during routine driving with state-of-the-art accuracy. This result not only demonstrates the performance of our model but also supports the validity and usefulness of our dataset and data collection protocol.
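To make the sampling idea concrete, below is a minimal sketch of how HWS-style training could be set up. Everything here is an illustrative assumption rather than the paper's exact recipe: the deviation measure (KL divergence of each frame's gaze map from the dataset-average map as a proxy for "crucial"), the weight normalization, the function name `frame_weights`, and the random stand-in data are all hypothetical.

```python
# Sketch: weight frames by how atypical their human gaze maps are, then
# oversample the heavily weighted (presumed crucial) frames during training.
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

def frame_weights(gaze_maps: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Per-frame sampling weights from human gaze maps of shape (N, H, W)."""
    mean_map = gaze_maps.mean(axis=0)
    mean_map = mean_map / (mean_map.sum() + eps)
    weights = np.empty(len(gaze_maps), dtype=np.float64)
    for i, g in enumerate(gaze_maps):
        g = g / (g.sum() + eps)
        # KL divergence from the dataset-average gaze map: large when the
        # driver looks somewhere unusual, which we take as a proxy for a
        # crucial (e.g., braking-relevant) frame. This proxy is an assumption.
        weights[i] = float(np.sum(g * np.log((g + eps) / (mean_map + eps))))
    weights = np.maximum(weights, eps)  # guard against tiny negative values
    return weights / weights.sum()

# Hypothetical usage with random stand-in frames and gaze maps.
gaze = np.random.rand(1000, 36, 64).astype(np.float32)
frames = torch.randn(1000, 3, 72, 128)
dataset = TensorDataset(frames, torch.from_numpy(gaze))
sampler = WeightedRandomSampler(
    weights=torch.from_numpy(frame_weights(gaze)),
    num_samples=len(dataset),
    replacement=True,  # lets crucial frames be drawn several times per epoch
)
loader = DataLoader(dataset, batch_size=16, sampler=sampler)
```

The design point is that the attention model itself is unchanged; only the training distribution shifts, so rare critical moments contribute many more gradient updates than their raw frequency in the data would allow.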



Author information

Corresponding author

Correspondence to Ye Xia.

Electronic supplementary material


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Xia, Y., Zhang, D., Kim, J., Nakayama, K., Zipser, K., Whitney, D. (2019). Predicting Driver Attention in Critical Situations. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol. 11365. Springer, Cham. https://doi.org/10.1007/978-3-030-20873-8_42


  • DOI: https://doi.org/10.1007/978-3-030-20873-8_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20872-1

  • Online ISBN: 978-3-030-20873-8

  • eBook Packages: Computer Science, Computer Science (R0)
