Convolutional Neural Networks for Detecting and Mapping Crowds in First Person Vision Applications

  • Juan Sebastian Olier
  • Carlo Regazzoni
  • Lucio Marcenaro
  • Matthias Rauterberg
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9094)

Abstract

Interest in the analysis of first-person videos has grown in the last few years with the spread of low-cost wearable devices. Nevertheless, understanding the environment surrounding the wearer remains a difficult task that involves many elements. This work presents a method for detecting and mapping the presence of people and crowds around the wearer. Features extracted at the crowd level are used to build a robust representation that can handle the variation and occlusion of people’s visual characteristics inside a crowd. To this end, convolutional neural networks are used. Results show that the approach achieves high accuracy in recognizing crowds, and that it also allows a general interpretation of the context through the classification of characteristics of the segmented background.

Keywords

  • Convolutional neural networks
  • Crowds detection
  • First-person vision
  • Egocentric videos

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Juan Sebastian Olier (1, 2; email author)
  • Carlo Regazzoni (1)
  • Lucio Marcenaro (1)
  • Matthias Rauterberg (2)

  1. Department of Electrical, Electronic, Telecommunications Engineering and Naval Architecture (DITEN), University of Genoa, Genoa, Italy
  2. Department of Industrial Design, Eindhoven University of Technology, Eindhoven, Netherlands
