Content Based Video Retrieval Using Convolutional Neural Network

Iqbal, Saeed; Qureshi, Adnan N; Lodhi, Awais M.

doi:10.1007/978-3-030-01054-6_12

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 868)

Included in the following conference series:

Proceedings of SAI Intelligent Systems Conference

Abstract

We present a content-based retrieval method for unconstrained video. To achieve content-based video retrieval, we segment the videos and detect objects that match a human-defined interest. Object detection and classification are basic tasks in video analysis. For object detection, we separate the foreground object from the background and perform localization on each frame extracted from the videos. We then measure the intensity histogram of 8-oriented frames and apply Haar cascade, Gabor filter, active appearance model (AAM), and convolutional neural network (CNN) algorithms to obtain a reliable result. We used two datasets, YouTube and SegTek, which together contain more than 2000 videos, and used cluster computing to obtain state-of-the-art results for object detection and segmentation.
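
The first two steps named in the abstract, foreground separation and an intensity histogram, can be illustrated with a minimal sketch. This is not the authors' implementation: the per-pixel median background model, the difference threshold of 30, and the reading of the histogram as 8 equal-width intensity bins are all assumptions made for illustration, and the function names are hypothetical.

```python
from statistics import median


def median_background(frames):
    """Per-pixel median over a list of equally sized grayscale frames (assumed model)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(width)]
            for y in range(height)]


def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(row, bg_row)]
            for row, bg_row in zip(frame, background)]


def intensity_histogram(frame, mask, bins=8, max_value=255):
    """Count foreground pixels per equal-width intensity bin (8 bins is an assumption)."""
    hist = [0] * bins
    for row, mask_row in zip(frame, mask):
        for p, is_fg in zip(row, mask_row):
            if is_fg:
                hist[min(p * bins // (max_value + 1), bins - 1)] += 1
    return hist


# Three tiny 2x3 frames: a static dark background, with a bright object in the last frame.
frames = [
    [[10, 10, 10], [10, 10, 10]],
    [[12, 10, 11], [10, 10, 10]],
    [[10, 200, 210], [10, 10, 10]],
]
background = median_background(frames)
mask = foreground_mask(frames[2], background)
hist = intensity_histogram(frames[2], mask)
print(mask)
print(hist)
```

In a full pipeline, the masked region and its histogram would presumably be handed to the detectors and the CNN named in the abstract; that stage is omitted here because the abstract gives no detail about it.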

Author information

Authors and Affiliations

Faculty of Information Technology, University of Central Punjab, Lahore, Pakistan
Saeed Iqbal, Adnan N Qureshi & Awais M. Lodhi

Corresponding author

Correspondence to Saeed Iqbal.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, UK
Supriya Kapoor
The Science and Information (SAI) Organization, Bradford, UK
Rahul Bhatia

About this paper

Cite this paper

Iqbal, S., Qureshi, A.N., Lodhi, A.M. (2019). Content Based Video Retrieval Using Convolutional Neural Network. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2018. Advances in Intelligent Systems and Computing, vol 868. Springer, Cham. https://doi.org/10.1007/978-3-030-01054-6_12

DOI: https://doi.org/10.1007/978-3-030-01054-6_12
Published: 09 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01053-9
Online ISBN: 978-3-030-01054-6
eBook Packages: Intelligent Technologies and Robotics (R0)
