Computer vision based working environment monitoring to analyze Generalized Anxiety Disorder (GAD)

Abstract

Ongoing advances in computer vision and deep learning have increased the efficacy of smart monitoring systems that analyze and predict physical abnormalities and generate time-sensitive results. Building on these principles of smart monitoring and data processing, a novel computer-vision-assisted, deep-learning-based posture monitoring system is proposed to predict physical abnormalities associated with Generalized Anxiety Disorder (GAD) from an individual's working environment. A 3D Convolutional Neural Network (CNN) performs spatio-temporal feature extraction, and a Gated Recurrent Unit (GRU) model exploits the extracted temporal dynamics to determine an adversity scale. Alert-based decisions that report the subject's physical state increase the utility of the proposed system in the healthcare and assistive-care domains. The system's usefulness is further enhanced by storing the predicted anomaly scores in a local database, where they remain available for therapeutic purposes. To validate prediction performance, extensive experiments were conducted on three challenging datasets, NTU RGB+D, UTD-MHAD, and HMDB51, on which the proposed methodology achieved mean accuracies of 91.88%, 94.28%, and 70.33%, respectively. Furthermore, the average prediction time of approximately 1.13 seconds demonstrates the system's real-time monitoring efficiency. These results show that the proposed methodology performs better than other contemporary studies in terms of activity prediction, data-processing cost, error rate, and time complexity.
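The GRU stage summarized above maintains a running state over the per-frame features produced by the 3D CNN. The paper does not publish its implementation, so the following is only a minimal illustrative sketch of the standard GRU update (update gate, reset gate, candidate state) for a scalar input and state; the weight names `wz`, `uz`, `wr`, `ur`, `wh`, `uh` are hypothetical placeholders, not the authors' parameters.

```python
import math

def sigmoid(v):
    """Logistic function used by both GRU gates."""
    return 1.0 / (1.0 + math.exp(-v))

def gru_cell(x, h, w):
    """One GRU step for a scalar input x and scalar hidden state h.

    w is a dict of six illustrative weights (biases omitted for brevity):
    wz/uz drive the update gate, wr/ur the reset gate, wh/uh the candidate.
    """
    z = sigmoid(w["wz"] * x + w["uz"] * h)                 # update gate: how much new info to take
    r = sigmoid(w["wr"] * x + w["ur"] * h)                 # reset gate: how much past state to expose
    h_tilde = math.tanh(w["wh"] * x + w["uh"] * (r * h))   # candidate state from gated history
    return (1.0 - z) * h + z * h_tilde                     # interpolate old state and candidate

# Sanity check with all-zero weights: both gates sit at 0.5 and the
# candidate is 0, so the state simply halves at every step.
w0 = {k: 0.0 for k in ("wz", "uz", "wr", "ur", "wh", "uh")}
h = 1.0
for x in (0.3, 0.7, 0.1):
    h = gru_cell(x, h, w0)
# h is now 0.125 after three halvings
```

In the actual system the inputs and states would be vectors over the CNN feature dimension rather than scalars, but the gating arithmetic is the same element-wise.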





Author information

Corresponding author

Correspondence to Ankush Manocha.



About this article


Cite this article

Manocha, A., Singh, R. Computer vision based working environment monitoring to analyze Generalized Anxiety Disorder (GAD). Multimed Tools Appl 78, 30457–30484 (2019). https://doi.org/10.1007/s11042-019-7700-7


Keywords

  • Anxiety monitoring
  • Computer vision
  • Deep learning
  • 3D CNN
  • GRU
  • Medical assistance