Ego2Top: Matching Viewers in Egocentric and Top-View Videos

  • Shervin Ardeshir
  • Ali Borji
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9909)


Egocentric cameras are becoming increasingly popular and provide large amounts of video captured from the first-person perspective. At the same time, surveillance cameras and drones offer an abundance of visual information, often captured from a top view. Although these two sources of information have been studied separately in the past, they have not been studied jointly and related to each other. Given a set of egocentric cameras and a top-view camera capturing the same area, we propose a framework to identify the egocentric viewers in the top-view video. We use two types of features in our assignment procedure. Unary features encode what a viewer (seen from the top view, or recording an egocentric video) visually experiences over time. Pairwise features encode the relationship between the visual content of a pair of viewers. Modeling each view (egocentric or top) as a graph, we formulate the assignment process as spectral graph matching. Evaluating our method on a dataset of 50 top-view and 188 egocentric videos taken in different scenarios demonstrates the effectiveness of the proposed approach in assigning egocentric viewers to the identities present in the top-view camera. We also study the effect of different parameters, such as the number of egocentric viewers and the choice of visual features.
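As a rough illustration of the assignment step described above, the following is a minimal sketch of spectral graph matching on synthetic similarity matrices. The function name `spectral_match`, the shape of the unary/pairwise inputs, and the greedy discretization are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def spectral_match(unary, pairwise_ego, pairwise_top):
    """Assign n egocentric viewers to n top-view identities.

    unary[i, a]: similarity between ego viewer i and top-view identity a.
    pairwise_ego / pairwise_top: symmetric pairwise relations within each view.
    (Hypothetical inputs; a simplified sketch of spectral graph matching.)
    """
    n = unary.shape[0]
    # Affinity between candidate assignments (i -> a) and (j -> b): how well
    # the pairwise relation of ego viewers (i, j) agrees with that of
    # top-view identities (a, b).
    M = np.zeros((n * n, n * n))
    for i in range(n):
        for a in range(n):
            for j in range(n):
                for b in range(n):
                    if i == j or a == b:
                        continue  # one-to-one: skip conflicting pairs
                    M[i * n + a, j * n + b] = np.exp(
                        -abs(pairwise_ego[i, j] - pairwise_top[a, b]))
    # Unary similarities go on the diagonal.
    M += np.diag(unary.flatten())
    # Principal eigenvector of the (symmetric) affinity matrix scores
    # each candidate assignment.
    vals, vecs = np.linalg.eigh(M)
    x = np.abs(vecs[:, np.argmax(vals)]).reshape(n, n)
    # Greedy discretization into a one-to-one assignment.
    assignment = {}
    while len(assignment) < n:
        i, a = np.unravel_index(np.argmax(x), x.shape)
        assignment[int(i)] = int(a)
        x[i, :] = -np.inf
        x[:, a] = -np.inf
    return assignment
```

When the ego-view pairwise relations are a permuted copy of the top-view ones, the leading eigenvector concentrates on the mutually consistent set of candidate assignments, and the greedy step recovers the permutation.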


Keywords: Egocentric vision · Surveillance · Spectral graph matching · Gist · Cross-domain image understanding



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Center for Research in Computer Vision, University of Central Florida, Orlando, USA
