Driver Fatigue Detection Using Multitask Cascaded Convolutional Networks

Liu, Xiaoshuang; Fang, Zhijun; Liu, Xiang; Zhang, Xiangxiang; Gu, Jianrong; Xu, Qi

doi:10.1007/978-3-319-68121-4_15

Part of the book series: IFIP Advances in Information and Communication Technology ((IFIPAICT,volume 510))

Included in the following conference series:

International Conference on Intelligence Science

Abstract

Driving fatigue is one of the main causes of traffic accidents. In this paper, we apply multitask cascaded convolutional networks to face detection and alignment in order to ensure both the accuracy and the real-time performance of the algorithm. Afterwards, another convolutional neural network (CNN) is used for eye state recognition. Finally, we calculate the percentage of eyelid closure (PERCLOS) to detect fatigue. The experimental results show that the proposed method achieves high eye state recognition accuracy and can detect fatigue effectively in real time.

1 Introduction

Along with the development of the automobile and transportation industries, traffic accidents have caused great loss of property and harm to society. More than 20% of these accidents are caused by fatigued driving. Safe driving has become a prominent issue in today’s society. It is therefore of great significance to develop a real-time and accurate fatigue detection system that issues a warning when the driver is tired, which can effectively reduce the occurrence of traffic accidents.

At present, fatigue detection follows three main directions. The first is based on the vehicle state: it uses signals such as the steering angle and driving speed to judge whether the driver is fatigued. This approach is subject to external interference, which strongly affects detection accuracy. The second is based on the driver’s physiological information [7]: it measures the heart rate, pulse, and other physiological signals to determine whether the driver is fatigued. This approach requires the driver to wear a lot of cumbersome measuring equipment, which interferes considerably with driving. The third is based on computer vision [6, 8, 9, 10]: it is non-intrusive, and fatigue-related features such as eye closure duration and yawning are computed by analyzing changes in facial expression.

In fatigue detection, driver face detection and alignment are important. Joint face detection and alignment using multitask cascaded convolutional networks [1] has proven to be an effective method. Another very important step is the detection of the eye state. Compared with the traditional active infrared radiation method [2], imaging with a normal camera is a safer, passive approach. There are many methods for detecting the state of the eyes, such as the AdaBoost classifier [3] and the SVM classifier [4]. However, their ability to express features is limited. Recently, convolutional neural networks (CNNs) have achieved remarkable progress in a variety of computer vision tasks. In this paper, we design a driver fatigue detection system using multitask cascaded convolutional networks. As shown in Fig. 1, the method consists of five main parts: joint face detection and alignment using multitask cascaded convolutional networks, normalization of the current image and ground-truth shape according to the scaled mean shape, extraction of the eye area, eye state recognition, and fatigue detection.

2 Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks

A fatigue detection system should have high recognition accuracy and detect fatigue effectively in real time. The main difficulties are detecting the driver’s face and aligning the eyes quickly and accurately, and overcoming the influence of varying illumination. Zhang et al. [1] propose a cascaded CNN-based framework for joint face detection and alignment, with a carefully designed lightweight CNN architecture for real-time performance. The overall pipeline is shown in Fig. 2. The image is first resized to different scales to build an image pyramid, which is the input of the following three-stage cascaded framework.

Stage 1: A fully convolutional network, called the proposal network (P-Net), is used to obtain candidate facial windows and their bounding box regression vectors. The candidates are then calibrated based on the estimated bounding box regression vectors. After that, non-maximum suppression (NMS) is employed to merge highly overlapped candidates.

Stage 2: All candidates are fed to another CNN, called the refine network (R-Net), which further rejects a large number of false candidates, performs calibration with bounding box regression, and conducts NMS.

Stage 3: This stage is similar to the second stage, but it aims to identify face regions with more supervision. In particular, the network outputs the positions of five facial landmarks.
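
The NMS step used in Stages 1 and 2 can be illustrated with a minimal NumPy sketch. It is not the implementation from [1]: the function name `nms`, the `(x1, y1, x2, y2)` box layout, and the default overlap threshold of 0.5 are assumptions made for this example, and boxes are assumed to have positive area.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    boxes: array-like of shape (n, 4) as (x1, y1, x2, y2); assumed layout.
    scores: array-like of shape (n,), higher means more confident.
    Returns the indices of the boxes that are kept, best score first.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    if boxes.size == 0:
        return []

    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(scores)[::-1]  # candidate indices, best first

    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))

        # Intersection of the best box with every remaining candidate.
        iw = np.clip(np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]), 0, None)
        ih = np.clip(np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]), 0, None)
        inter = iw * ih
        iou = inter / (areas[best] + areas[rest] - inter)

        # Drop candidates that overlap the best box too strongly.
        order = rest[iou <= iou_threshold]
    return keep
```

For example, two heavily overlapping windows and one distant window reduce to the higher-scoring overlapping window plus the distant one.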

3 Extraction of the Eye Area

3.1 Face Normalization

In order to extract the eye areas accurately, we first calculate the average face. We then normalize the current image and ground-truth shape according to the scaled mean shape. This process is a 2D affine transformation, which changes the rotation angle, scale, and location of a shape. The transformation can be represented as Eq. (1).

$$ \left\{ \begin{array}{l} x_{i} = a x_{i}^{\prime} + b y_{i}^{\prime} + c \\ y_{i} = d x_{i}^{\prime} + e y_{i}^{\prime} + f \end{array} \right. $$

(1)

where $ (x_{i}, y_{i})^{T} $ is the coordinate of the i-th feature point on the average face and $ (x_{i}^{\prime}, y_{i}^{\prime})^{T} $ is the coordinate of the i-th feature point on the detected face. The transformation has the matrix representation shown in Eq. (2).

$$ \left[ \begin{array}{c} x_{i} \\ y_{i} \\ 1 \end{array} \right] = \left[ \begin{array}{ccc} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{c} x_{i}^{\prime} \\ y_{i}^{\prime} \\ 1 \end{array} \right] = M \left[ \begin{array}{c} x_{i}^{\prime} \\ y_{i}^{\prime} \\ 1 \end{array} \right] $$

(2)

For convenience, Eq. (2) can be rewritten as Eq. (3).

$$ U = Kh $$

(3)

where U is the feature point matrix of the average face, K is the feature point matrix of the detected face, and h is the affine transformation matrix. Because the landmarks generally give more equations than unknowns, h is estimated in the least-squares sense, and the solution is given by Eq. (4).

$$ h = \left( {K^{T} K} \right)^{ - 1} K^{T} U $$

(4)
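
Eqs. (3) and (4) can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the authors' code: the function names `fit_affine` and `apply_affine` are hypothetical, K is assumed to stack one row $ (x_{i}^{\prime}, y_{i}^{\prime}, 1) $ per landmark, U is assumed to stack one row $ (x_{i}, y_{i}) $ per landmark, and `numpy.linalg.lstsq` is used in place of the explicit inverse in Eq. (4) because it returns the same least-squares solution with better numerical behavior.

```python
import numpy as np

def fit_affine(detected, mean_shape):
    """Least-squares estimate of h in U = K h.

    detected:   (n, 2) landmark coordinates on the detected face.
    mean_shape: (n, 2) landmark coordinates on the average face.
    Returns h with shape (3, 2); its columns hold (a, b, c) and (d, e, f).
    """
    detected = np.asarray(detected, dtype=float)
    U = np.asarray(mean_shape, dtype=float)
    # Homogeneous coordinates: one row (x', y', 1) per landmark.
    K = np.hstack([detected, np.ones((len(detected), 1))])
    # Same solution as (K^T K)^{-1} K^T U when K has full column rank.
    h, *_ = np.linalg.lstsq(K, U, rcond=None)
    return h

def apply_affine(points, h):
    """Map (n, 2) points with the fitted transformation."""
    points = np.asarray(points, dtype=float)
    K = np.hstack([points, np.ones((len(points), 1))])
    return K @ h
```

With five landmarks, as output by the cascade in Sect. 2, K has five rows and three columns, so the system is overdetermined and the least-squares fit averages out small landmark errors.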

The current image and the ground-truth shape are then normalized according to the scaled mean shape, which corrects the rotation angle, scale, and location of the detected face, as shown in Fig. 3.

3.2 Eye Area Extraction

We extract the eye area based on the facial landmarks after normalization, as shown in Fig. 4. The eye area has a size of 32 × 32 pixels.

4 Eye State Recognition

A CNN expresses features better and avoids manual feature selection, so we use a convolutional neural network to detect the state of the eyes.

4.1 Convolutional Neural Network

To achieve high eye state recognition accuracy while detecting fatigue in real time, our proposed network uses three convolutional layers, as shown in Fig. 5. Each convolutional layer is followed by a pooling layer: the first by max pooling and the last two by average pooling. The ReLU layers introduce non-linearity, and the Dropout layers prevent overfitting in the network.
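
The difference between the two pooling types mentioned above can be seen in a small NumPy sketch. The helper `pool2d`, the 2 × 2 window, and the non-overlapping stride are assumptions for illustration; the source does not state the pooling sizes used in the network.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling over size x size windows of a 2D feature map."""
    fm = np.asarray(feature_map, dtype=float)
    rows, cols = fm.shape[0] // size, fm.shape[1] // size
    # Crop to a multiple of the window size, then expose each window on axes 1 and 3.
    windows = fm[:rows * size, :cols * size].reshape(rows, size, cols, size)
    if mode == "max":
        return windows.max(axis=(1, 3))   # keeps the strongest response per window
    if mode == "avg":
        return windows.mean(axis=(1, 3))  # keeps the mean response per window
    raise ValueError("mode must be 'max' or 'avg'")
```

Max pooling keeps only the strongest activation in each window, whereas average pooling smooths over the whole window, so the two produce different summaries of the same feature map.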

4.2 Activation Functions

The sigmoid and tanh functions are commonly used non-linear activation functions, but they suffer from vanishing gradients. We therefore use the ReLU (rectified linear unit) function, which is defined as Eq. (5).

$$ f(x) = \left\{ \begin{array}{ll} x & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{array} \right. $$

(5)

ReLU effectively alleviates the vanishing gradient problem, so the deep neural network can be trained directly in a supervised manner. After the ReLU function the network obtains a sparse representation, with the advantage of one-sided suppression.
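
A short NumPy sketch makes the gradient argument concrete. The function names are chosen for this example, and taking the ReLU derivative at exactly x = 0 to be 0 is a convention assumed here, since Eq. (5) is not differentiable at that point.

```python
import numpy as np

def relu(x):
    """Eq. (5): pass non-negative inputs through, zero out negative ones."""
    return np.maximum(np.asarray(x, dtype=float), 0.0)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise (assumed 0 at x = 0)."""
    return (np.asarray(x, dtype=float) > 0).astype(float)

def sigmoid_grad(x):
    """Derivative of the sigmoid, s(x) * (1 - s(x)); it never exceeds 0.25."""
    s = 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))
    return s * (1.0 - s)
```

Because the sigmoid derivative is at most 0.25, the product of such factors across many layers shrinks quickly, while the ReLU derivative is exactly 1 on every active unit. The zeros produced for negative inputs are the source of the sparse representation.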

5 Fatigue Detection Based on PERCLOS

After eye state recognition, the next step is to detect driver fatigue based on PERCLOS (percentage of eyelid closure over the pupil over time). PERCLOS is an established parameter for detecting driver fatigue [5], and the level of drowsiness can be judged against a PERCLOS threshold. It is calculated as Eq. (6).

$$ f_{PERCLOS} = \frac{{n_{close} }}{{N_{total} }} \times 100\% $$

(6)

Here $ n_{close} $ is the number of closed-eye frames over a period of time and $ N_{total} $ is the total number of frames over the same period. When the driver is fatigued, the PERCLOS value is higher than normal. We set a PERCLOS threshold; when the driver’s PERCLOS value exceeds this threshold, the driver is considered fatigued.
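
Eq. (6) and the threshold test can be sketched as a sliding-window monitor. The class name `PerclosMonitor` and the window length are hypothetical and chosen for illustration; the default threshold of 0.30 is the value reported in Sect. 6.3, and PERCLOS is kept as a fraction in [0, 1] rather than a percentage.

```python
from collections import deque

class PerclosMonitor:
    """Tracks PERCLOS over the most recent `window_size` frames."""

    def __init__(self, window_size, threshold=0.30):
        # A deque with maxlen discards the oldest frame automatically.
        self.frames = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, eye_closed):
        """Record one frame (True if the eyes are closed) and return PERCLOS."""
        self.frames.append(bool(eye_closed))
        return self.perclos()

    def perclos(self):
        """n_close / N_total over the frames currently in the window."""
        if not self.frames:
            return 0.0
        return sum(self.frames) / len(self.frames)

    def is_fatigued(self):
        """True when PERCLOS is strictly above the threshold."""
        return self.perclos() > self.threshold
```

In use, the per-frame output of the eye state classifier is passed to `update`, and `is_fatigued` is checked after each frame to decide whether to raise a warning.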

6 Experiment and Results

The experiments were implemented in VS2013 on a Windows 7 system with an Intel(R) Core(TM) i7-6700HQ CPU (3.40 GHz), 32 GB of memory, and an NVIDIA GeForce GTX 1070 GPU.

6.1 Training Data

To overcome the influence of illumination on the images, the training data must contain samples captured under different light intensities, which enhances the robustness of the network, as shown in Fig. 6.

Since we perform eye state recognition, we use the following two kinds of data annotation in our training process:

(1) Negatives: 36 × 36 regions randomly cropped near the eye area whose intersection-over-union (IoU) ratio with every ground-truth eye is less than 0.4, as shown in Fig. 7.

Fig. 7. Negative training samples.

(2) Positives: positive samples are divided into two types, open-eye samples and closed-eye samples, with an IoU above 0.6 to a ground-truth face, as shown in Fig. 8.

Fig. 8. Positive training samples.
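
The two annotation rules can be expressed as a small labeling routine. This is a sketch under assumptions, not the authors' tooling: the function names, the `(x1, y1, x2, y2)` box layout, measuring overlap against ground-truth eye boxes for both rules, and the `"ignored"` label for patches whose IoU falls between 0.4 and 0.6 are all choices made for this example.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_patch(patch, gt_boxes, neg_thr=0.4, pos_thr=0.6):
    """Assign a training label to a cropped patch from its best overlap."""
    best = max((iou(patch, gt) for gt in gt_boxes), default=0.0)
    if best < neg_thr:
        return "negative"   # overlap with every ground-truth box is below 0.4
    if best > pos_thr:
        return "positive"   # overlap with some ground-truth box is above 0.6
    return "ignored"        # assumption: in-between patches are not used
```

A positive patch would additionally carry an open or closed label, which comes from the manual annotation rather than from the overlap.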

6.2 Training Results

We select eye images, both open and closed, as positive samples, and randomly crop several patches to collect negative samples. In total, 120,000 images are used as training samples. The eye state recognition rate of the network increases with the number of training iterations; the result is shown in Fig. 9.

As the number of iterations increases, the accuracy gradually rises, and the final accuracy fluctuates between 0.995 and 0.996. To test the performance of the network, we collected three video sequences; the accuracy for each is shown in Table 1.

Table 1. The test result of eye state.


We ran the method on 5 test videos comprising 1239 frames of 320 × 240 images and computed the average time consumed by each module and by the overall pipeline. Table 2 gives the timing results. The method meets the real-time requirement.

Table 2. The test of time consuming.


6.3 Fatigue Detection Based on PERCLOS

When the driver is fatigued, the PERCLOS value is higher than normal, so the driver is considered fatigued when the PERCLOS value exceeds a set threshold. In this paper, the PERCLOS threshold is set to 0.30: when the driver is fatigued, the PERCLOS value is greater than 0.30. Fig. 10 shows the PERCLOS result.

Figure 11 shows sample images of the detection results.

7 Conclusion

In this paper, we propose a driver fatigue detection system. The system applies multitask cascaded convolutional networks to face detection and alignment, and then uses another convolutional neural network (CNN) for eye state recognition. Finally, we calculate the percentage of eyelid closure (PERCLOS) to detect fatigue. The eye state recognition method provides high accuracy, and the system can detect fatigue effectively in real time. Tests show that the implementation is successful and that the system infers fatigue reliably.

References

1. Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Sig. Process. Lett. 10, 1499–1503 (2016)
2. Zhang, F., Su, J., Geng, L., Xiao, Z.: Driver fatigue detection based on eye state recognition. In: 2017 International Conference on Machine Vision and Information Technology (CMVIT), Singapore, pp. 105–110 (2017)
3. Xu, S.-F., Zeng, Y.: Eyes state detection based on adaboost algorithm. Comput. Simul. 7, 214–217 (2007)
4. Yao, S., Li, X., Zhang, W., Zhou, J.: Eyes state detection method based on LBP. Appl. Res. Comput. 6, 1897–1901 (2015)
5. Wierwille, W.W., Ellsworth, L.A., Wreggit, S.S., Fairbanks, R.J., Kim, C.L.: Research on vehicle-based driver status performance monitoring: development, validation, and refinement of algorithms for detection of driver drowsiness. National Highway Traffic Safety Administration (1994)
6. Liu, A., Li, Z., Wang, L., Zhao, Y.: A practical driver fatigue detection algorithm based on eye state. In: 2010 Asia Pacific Conference on Postgraduate Research in Microelectronics and Electronics (PrimeAsia), Shanghai, pp. 235–238 (2010)
7. Wang, Y., Liu, X., Zhang, Y., Zhu, Z., Liu, D., Sun, J.: Driving fatigue detection based on EEG signal. In: 2015 Fifth International Conference on Instrumentation and Measurement, Computer, Communication and Control (IMCCC), Qinhuangdao, pp. 715–718 (2015)
8. Zhao, G.F., Han, A.X.: Method of detecting logistics driver’s fatigue state based on computer vision. In: 2015 International Conference on Computer Science and Applications (CSA), Wuhan, pp. 60–63 (2015)
9. Li, X., Wu, Q., Kou, Y., Hou, L., Xie, H.: Driver’s eyes state detection based on adaboost algorithm and image complexity. In: 2015 Sixth International Conference on Intelligent Systems Design and Engineering Applications (ISDEA), Guiyang, pp. 349–352 (2015)
10. Guohui, H., Wanying, W.: An algorithm for fatigue driving face detection and location. In: 2015 8th International Conference on Intelligent Computation Technology and Automation (ICICTA), Nanchang, pp. 130–132 (2015)

Acknowledgments

This research was partially supported by the National Natural Science Foundation of China (No. 61461021) and Local Colleges Faculty Construction of Shanghai MSTC (No. 15590501300).

Author information

Authors and Affiliations

School of Electronic and Electric Engineering, Shanghai University of Engineering Science, Shanghai, China
Xiaoshuang Liu, Zhijun Fang, Xiang Liu & Xiangxiang Zhang
Information Center, Shanghai University of Engineering Science, Shanghai, China
Jianrong Gu
College of Information Engineering, Shanghai Maritime University, Shanghai, China
Qi Xu

Corresponding author

Correspondence to Zhijun Fang.

Editor information

Editors and Affiliations

Chinese Academy of Sciences, Beijing, China
Zhongzhi Shi
Machine Intelligence Research Institute, Rockville, Maryland, USA
Ben Goertzel
Shanghai Maritime University, Shanghai, China
Jiali Feng

About this paper

Cite this paper

Liu, X., Fang, Z., Liu, X., Zhang, X., Gu, J., Xu, Q. (2017). Driver Fatigue Detection Using Multitask Cascaded Convolutional Networks. In: Shi, Z., Goertzel, B., Feng, J. (eds) Intelligence Science I. ICIS 2017. IFIP Advances in Information and Communication Technology, vol 510. Springer, Cham. https://doi.org/10.1007/978-3-319-68121-4_15

DOI: https://doi.org/10.1007/978-3-319-68121-4_15
Published: 27 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68120-7
Online ISBN: 978-3-319-68121-4
eBook Packages: Computer Science, Computer Science (R0)
