Bag of Deep Features for Instructor Activity Recognition in Lecture Room

Nida, Nudrat; Yousaf, Muhammad Haroon; Irtaza, Aun; Velastin, Sergio A.

doi:10.1007/978-3-030-05716-9_39

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11296)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

This research explores contextual visual information in the lecture room to help an instructor assess the effectiveness of a delivered lecture. The objective is to give instructors a self-evaluation mechanism that improves lecture productivity through an understanding of their own activities. A teacher’s effectiveness has a marked impact on students’ performance and on their academic and professional success. Lecture evaluation can therefore contribute significantly to improving academic quality and governance. In this paper, we propose a vision-based framework that recognizes the instructor’s activities for self-evaluation of delivered lectures. The proposed approach uses motion templates of instructor activities and describes them through a Bag-of-Deep-Features (BoDF) representation. Deep spatio-temporal features extracted from the motion templates are used to compile a visual vocabulary, which is quantized to optimize the learning model. A Support Vector Machine classifier is used to generate the model and predict the instructor activities. We evaluated the proposed scheme on a self-captured lecture room dataset, IAVID-1. Eight instructor activities (pointing towards the student, pointing towards the board or screen, idle, interacting, sitting, walking, using a mobile phone, and using a laptop) are recognized with 85.41% accuracy. As a result, the proposed framework enables instructor activity recognition without human intervention.
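
The vocabulary and quantization steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the vocabulary is compiled, so plain k-means is assumed here as a common choice in bag-of-features pipelines. The 2-D toy vectors stand in for deep features extracted from motion templates, and the names `nearest`, `build_vocabulary`, and `encode` are hypothetical. The resulting per-clip histograms are the kind of fixed-length vector a Support Vector Machine could then be trained on.

```python
import random


def nearest(vec, centroids):
    """Return the index of the centroid closest to vec (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, centroids[k])),
    )


def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Assumed vocabulary step: plain k-means; each centroid acts as one visual word."""
    rng = random.Random(seed)
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(d, centroids)].append(d)
        for j, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode(descriptors, centroids):
    """Quantize one clip: L1-normalised histogram of visual-word assignments."""
    hist = [0.0] * len(centroids)
    for d in descriptors:
        hist[nearest(d, centroids)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy stand-ins for deep features of two clips (hypothetical data).
clip_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1], [10.0, 10.1]]
clip_b = [[10.0, 9.9], [9.8, 10.0], [10.1, 10.0], [0.0, 0.0]]

vocab = build_vocabulary(clip_a + clip_b, k=2)
hist_a = encode(clip_a, vocab)
hist_b = encode(clip_b, vocab)
print(hist_a, hist_b)
```

Each clip is reduced to a histogram whose length equals the vocabulary size, regardless of how many descriptors the clip produced. That is what makes the representation usable as input to a standard classifier.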

Acknowledgements

Sergio A. Velastin has received funding from the Universidad Carlos III de Madrid, the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 600371, the Ministerio de Economía, Industria y Competitividad (COFUND2014-51509), the Ministerio de Educación, Cultura y Deporte (CEI-15-17), and Banco Santander.

Author information

Authors and Affiliations

Department of Computer Engineering, University of Engineering and Technology, Taxila, Pakistan
Nudrat Nida & Muhammad Haroon Yousaf
Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan
Aun Irtaza
Department of Computer Science, Applied Artificial Intelligence Research Group, University Carlos III de Madrid, 28270, Madrid, Spain
Sergio A. Velastin
Cortexica Vision Systems Ltd., London, SE1 9LQ, UK
Sergio A. Velastin
School of Electronic Engineering and Computer Science, Queen Mary University of London, London, E1 4NS, UK
Sergio A. Velastin

Corresponding author

Correspondence to Muhammad Haroon Yousaf.

Editor information

Editors and Affiliations

Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Ioannis Kompatsiaris
EURECOM, Sophia Antipolis, France
Benoit Huet
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Vasileios Mezaris
Dublin City University, Dublin, Ireland
Cathal Gurrin
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Stefanos Vrochidis

Cite this paper

Nida, N., Yousaf, M.H., Irtaza, A., Velastin, S.A. (2019). Bag of Deep Features for Instructor Activity Recognition in Lecture Room. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, WH., Vrochidis, S. (eds) MultiMedia Modeling. MMM 2019. Lecture Notes in Computer Science, vol 11296. Springer, Cham. https://doi.org/10.1007/978-3-030-05716-9_39

DOI: https://doi.org/10.1007/978-3-030-05716-9_39
Published: 11 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05715-2
Online ISBN: 978-3-030-05716-9
eBook Packages: Computer Science, Computer Science (R0)
