Abstract
Handling massive datasets for information retrieval and feature learning is one of the most challenging problems in machine learning and computer vision research. This work addresses the data scalability problems that machine learning classifiers face in social media activity analysis. It highlights machine learning techniques that can deliver promising results on the large, unstructured, and complex data generated by social media activity. The work focuses on biologically inspired processing with neural networks and introduces an extension of such networks to handle the complex data involved in human activity analysis. Various CNN architectures and several phases of visual data processing for detection and recognition problems are presented. Selected techniques that have driven interest in deep network learning across research domains dealing with complex data are highlighted. Activation functions and a sequence pooling methodology are introduced for fast training of convolutional networks on massive, unstructured human activity recognition data. Overall, the work highlights that the training speed of the network on large-scale, complex data can be improved through the choice of activation function and of the pooling methodology at the fully connected layers of the neural network. Moreover, sound deep learning and data analytics techniques are highly applicable to human health, medicine, robotics, education, and industrial applications.
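The abstract does not say which activation function or pooling scheme the chapter adopts, so the following is only a minimal sketch of the general idea, under stated assumptions: ReLU and ELU stand in as two well-known activation choices, and "sequence pooling" is read here as collapsing per-frame feature vectors into a single clip-level vector by max or mean. The names `pool_sequence` and `clip_descriptor` and the toy feature values are hypothetical and are not taken from the chapter.

```python
import math
from typing import Callable, List, Sequence


def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit: x if x > 0, else alpha * (exp(x) - 1)."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def pool_sequence(frames: Sequence[Sequence[float]], mode: str = "max") -> List[float]:
    """Collapse per-frame feature vectors into one vector (hypothetical helper)."""
    if not frames:
        raise ValueError("need at least one frame")
    columns = list(zip(*frames))  # one tuple per feature dimension
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown mode: {mode}")


def clip_descriptor(
    frames: Sequence[Sequence[float]],
    activation: Callable[[float], float] = relu,
    mode: str = "max",
) -> List[float]:
    """Apply the activation element-wise, then pool over the frame axis."""
    activated = [[activation(v) for v in frame] for frame in frames]
    return pool_sequence(activated, mode)


# Toy input (assumed values): three frames, three features each.
frames = [[-1.0, 0.5, 2.0], [0.25, -0.5, 1.0], [0.75, 1.5, -2.0]]
print(clip_descriptor(frames, relu, "max"))  # [0.75, 1.5, 2.0]
print(clip_descriptor(frames, elu, "mean"))
```

Swapping the `activation` or `mode` argument changes the clip-level descriptor without touching the rest of the pipeline, which is the kind of design choice the abstract ties to training speed.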
N. Kumar—IEEE Member.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Kumar, N. (2018). Large Scale Deep Network Architecture of CNN for Unconstraint Visual Activity Analytics. In: Abraham, A., Muhuri, P., Muda, A., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2017. Advances in Intelligent Systems and Computing, vol 736. Springer, Cham. https://doi.org/10.1007/978-3-319-76348-4_25
DOI: https://doi.org/10.1007/978-3-319-76348-4_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76347-7
Online ISBN: 978-3-319-76348-4
eBook Packages: Engineering, Engineering (R0)