MSTN: Multistage Spatial-Temporal Network for Driver Drowsiness Detection

  • Conference paper
Computer Vision – ACCV 2016 Workshops (ACCV 2016)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10118)

Abstract

Recent surveys have shown that drowsy driving is one of the main factors in fatal motor vehicle crashes. In this paper, we propose a Multistage Spatial-Temporal Network (MSTN) that detects driver drowsiness efficiently and accurately from visual information of the driver alone. The proposed MSTN consists of three stages: a spatial CNN, a temporal LSTM, and a final temporal smoothing step. First, the spatial CNN extracts drowsiness-related features from the face region detected in each video frame. Next, we model the temporal variation of the drowsiness status by feeding the sequence of frame-level features into a Long Short-Term Memory (LSTM) network. Finally, we apply temporal smoothing to the predicted drowsiness scores to suppress noisy predictions. We evaluate the proposed MSTN on the NTHU Drowsy Driver Detection Video Dataset and achieve 82.61% overall accuracy on the testing set.
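The final smoothing stage can be illustrated with a small sketch. The abstract does not say which smoothing method MSTN uses, so the centered moving average below is an assumption chosen for illustration. The function names `smooth_scores` and `to_labels`, the window size, the 0.5 threshold, and the sample scores are all hypothetical.

```python
from typing import List, Sequence


def smooth_scores(scores: Sequence[float], window: int = 5) -> List[float]:
    """Centered moving average over per-frame drowsiness scores.

    Assumption: a plain moving average stands in for the paper's
    smoothing step, whose exact form the abstract does not specify.
    The window is truncated at the sequence boundaries, so the output
    has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    half = window // 2  # frames taken on each side of the current frame
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def to_labels(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map scores to binary labels (1 = drowsy); the threshold is hypothetical."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical per-frame scores standing in for the LSTM output,
# with one isolated spike at frame 3.
raw = [0.1, 0.2, 0.1, 0.9, 0.1, 0.2, 0.1]

print(to_labels(raw))                    # labels before smoothing
print(to_labels(smooth_scores(raw, 3)))  # labels after smoothing
```

In this toy input the single-frame spike is labeled drowsy before smoothing and not after, which is the kind of noisy prediction the last stage is meant to suppress. A larger window removes longer spurious runs at the cost of slower reaction to genuine state changes.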

Author information

Corresponding author

Correspondence to Chiou-Ting Hsu.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Shih, TH., Hsu, CT. (2017). MSTN: Multistage Spatial-Temporal Network for Driver Drowsiness Detection. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science, vol 10118. Springer, Cham. https://doi.org/10.1007/978-3-319-54526-4_11

  • DOI: https://doi.org/10.1007/978-3-319-54526-4_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54525-7

  • Online ISBN: 978-3-319-54526-4

  • eBook Packages: Computer Science, Computer Science (R0)
