A Study on Convolutional Neural Networks with Active Video Tubelets for Object Detection and Classification

Rajkumar, R.; Arunnehru, J.

doi:10.1007/978-981-13-3393-4_12

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 898)

Abstract

Convolutional neural networks (CNNs) are a powerful learning model inspired by the biological concept of neurons. This deep learning model lets us replicate the complex neural structure seen in living beings, apply it to data sets, and structure convolutions consisting of several layers. Convolutional neural networks have proven to be an effective class of models for object recognition; taking those results into consideration, we intend to apply convolutional neural networks to video classification in two different ways. This study covers the generalization of the results obtained by applying convolutional neural networks to existing video data sets, namely Sports-1M and the YouTube-Objects data set (YTO), and the implementation of two distinct CNNs.
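
As a rough illustration of what a convolution layer computes, and of one simple way a frame-level network could be applied to a video clip, consider the minimal sketch below. It is not the authors' implementation: the function names, the single convolution layer with global average pooling, the linear class weights, and the rule of averaging per-frame scores over the clip are all assumptions chosen for brevity.

```python
def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def frame_features(frame, kernels):
    """One conv layer + ReLU + global average pooling: one feature per kernel."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(frame, kernel)
        activated = [max(v, 0.0) for row in fmap for v in row]  # ReLU
        feats.append(sum(activated) / len(activated))
    return feats


def video_scores(frames, kernels, weights):
    """Average linear class scores over all frames (hypothetical fusion rule).

    weights[c][k] is the assumed weight of feature k for class c.
    """
    totals = [0.0] * len(weights)
    for frame in frames:
        feats = frame_features(frame, kernels)
        for c, w in enumerate(weights):
            totals[c] += sum(wk * fk for wk, fk in zip(w, feats))
    return [t / len(frames) for t in totals]


# Toy usage: two identical 3x3 "frames", two 2x2 kernels, two classes.
frame = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
kernels = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
weights = [[1.0, 0.0], [0.5, 1.0]]
print(video_scores([frame, frame], kernels, weights))  # prints [16.0, 8.0]
```

Averaging per-frame scores ignores motion between frames; it is used here only because it is the simplest way to turn an image classifier into a clip-level one.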

Author information

Authors and Affiliations

Department of Computer Science and Engineering, SRM Institute of Science and Technology, Vadapalani Campus, Chennai, Tamil Nadu, India
R. Rajkumar & J. Arunnehru

Corresponding author

Correspondence to J. Arunnehru.

Editor information

Editors and Affiliations

Department of Computer Science and Software Engineering, Monmouth University, West Long Branch, NJ, USA
Jiacun Wang
Department of Information Technology, National Institute of Technology Karnataka, Surathkal, Mangaluru, Karnataka, India
G. Ram Mohana Reddy
Department of Computer Science and Engineering, JNTUH College of Engineering Hyderabad, Hyderabad, Telangana, India
V. Kamakshi Prasad
Department of Electronics and Communication Engineering, Malla Reddy College of Engineering & Technology, Secunderabad, Telangana, India
V. Sivakumar Reddy

About this paper

Cite this paper

Rajkumar, R., Arunnehru, J. (2019). A Study on Convolutional Neural Networks with Active Video Tubelets for Object Detection and Classification. In: Wang, J., Reddy, G., Prasad, V., Reddy, V. (eds) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, vol 898. Springer, Singapore. https://doi.org/10.1007/978-981-13-3393-4_12

DOI: https://doi.org/10.1007/978-981-13-3393-4_12
Published: 14 February 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3392-7
Online ISBN: 978-981-13-3393-4
eBook Packages: Intelligent Technologies and Robotics (R0)
