A New Large Scale Dynamic Texture Dataset with Application to ConvNet Understanding

  • Isma Hadji
  • Richard P. Wildes
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11218)

Abstract

We introduce a new large-scale dynamic texture dataset. With over 10,000 videos, our Dynamic Texture DataBase (DTDB) is two orders of magnitude larger than any previously available dynamic texture dataset. DTDB comes with two complementary organizations, one based on dynamics independent of spatial appearance and one based on spatial appearance independent of dynamics. These complementary organizations allow for uniquely insightful experiments on how well the major classes of spatiotemporal ConvNet architectures exploit appearance vs. dynamic information. We also present a new two-stream ConvNet that provides an alternative to the standard optical-flow-based motion stream, broadening the range of dynamic patterns that can be encompassed. The resulting motion stream is shown to outperform the traditional optical flow stream by considerable margins. Finally, the utility of DTDB as a pretraining substrate is demonstrated via transfer learning on a different dynamic texture dataset, as well as on the companion task of dynamic scene recognition, resulting in a new state of the art.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. York University, Toronto, Canada
