Large Scale Deep Network Architecture of CNN for Unconstraint Visual Activity Analytics

Kumar, Naresh

doi:10.1007/978-3-319-76348-4_25


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 736)

Included in the following conference series:

International Conference on Intelligent Systems Design and Applications


Abstract

Handling massive datasets for information retrieval and feature learning is considered one of the most challenging problems in machine learning and computer vision research. This work focuses on the data scalability problems that machine learning classifiers face in social media activity analysis. It highlights machine learning techniques that can give promising results on the large, unstructured, and complex data of social media activities. The work concentrates on biologically inspired processing with neural networks and introduces an extension of such networks to address the problems of complex data in human activity analysis. Various CNN architectures and several phases of visual data processing for detection and recognition problems are presented. Selected techniques are highlighted that have created interest in deep network learning across research domains dealing with complex data. Activation functions and a sequence pooling methodology are introduced for fast training of convolutional networks on massive, unstructured human activity recognition data. Overall, the work highlights that the training speed of the network on large-scale, complex data can be improved by the choice of activation function and of the pooling methodology at the fully connected layers. Moreover, sound techniques of deep learning and data analytics are highly applicable to human health, medicine, robotics, education, and industrial applications.
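
The abstract names two design choices, the activation function and the sequence pooling step, without giving formulas in this preview. The sketch below is only an illustration of what such choices commonly look like, not the chapter's own method: it assumes three widely used activations (ReLU, leaky ReLU, ELU) and assumes that "sequence pooling" means collapsing per-frame feature vectors into one clip-level vector by an element-wise max or mean. All function names and parameter defaults are hypothetical.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Like ReLU, but keeps a small slope for negative inputs (slope is an assumed default)."""
    return x if x > 0 else slope * x


def elu(x, alpha=1.0):
    """Exponential linear unit: smooth negative branch that saturates at -alpha."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def pool_sequence(frames, mode="max"):
    """Collapse a list of equal-length per-frame feature vectors into one vector.

    mode="max" keeps the strongest response per feature across time;
    mode="mean" averages each feature across time.
    """
    if not frames:
        raise ValueError("need at least one frame")
    columns = list(zip(*frames))  # one tuple per feature dimension
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


# Hypothetical per-frame features for a three-frame clip.
frames = [[-1.0, 0.5], [2.0, -0.5], [0.5, 1.5]]
activated = [[relu(v) for v in frame] for frame in frames]
clip_max = pool_sequence(activated, mode="max")
clip_mean = pool_sequence(activated, mode="mean")
```

Under these assumptions, the pooled vector has a fixed length regardless of the number of frames, so it can feed a fully connected classifier directly; swapping `relu` for `elu` or `leaky_relu` changes only how negative responses are treated before pooling.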

N. Kumar—IEEE Member.



Author information

Authors and Affiliations

Computer Society of India (CSI), Delhi, India
Naresh Kumar
Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, 247667, India
Naresh Kumar

Authors

Naresh Kumar

Corresponding author

Correspondence to Naresh Kumar.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs, Auburn, Washington, USA
Ajith Abraham
Department of Computer Science, South Asian University, Chanakyapuri, Delhi, India
Pranab Kr. Muhuri
Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, Malaysia
Azah Kamilah Muda
Machine Intelligence Research Labs, Auburn, Washington, USA
Niketa Gandhi


About this paper

Cite this paper

Kumar, N. (2018). Large Scale Deep Network Architecture of CNN for Unconstraint Visual Activity Analytics. In: Abraham, A., Muhuri, P., Muda, A., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2017. Advances in Intelligent Systems and Computing, vol 736. Springer, Cham. https://doi.org/10.1007/978-3-319-76348-4_25


DOI: https://doi.org/10.1007/978-3-319-76348-4_25
Published: 22 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76347-7
Online ISBN: 978-3-319-76348-4
eBook Packages: Engineering, Engineering (R0)
