Abstract
We address the problem of temporally aligning semantically similar videos, for example two videos of cars on different tracks. We present an alignment method that establishes frame-to-frame correspondences such that the two cars are seen from a similar viewpoint (e.g. facing right), while keeping the alignment temporally smooth and visually pleasing. Unlike previous work, we do not assume that the videos show the same scripted sequence of events. We compare against three alternative methods, including the popular dynamic time warping (DTW) algorithm, on a new dataset of realistic videos collected from the internet. We perform a comprehensive evaluation using a novel protocol that includes both quantitative measures and a user study of how visually pleasing the alignments are.
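For readers unfamiliar with the DTW baseline mentioned above, the following is a minimal sketch of classic dynamic time warping over per-frame feature vectors. It is not the alignment method proposed in this paper. The function name `dtw_align`, the use of Euclidean distance, and the toy one-dimensional "features" are all assumptions made for illustration; the abstract does not specify how frames are described.

```python
import math

def dtw_align(feats_a, feats_b):
    """Classic DTW between two sequences of per-frame feature vectors.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    matching frame i of sequence A to frame j of sequence B.
    Hypothetical helper for illustration only.
    """
    n, m = len(feats_a), len(feats_b)
    inf = float("inf")
    # acc[i][j] = cheapest cost of aligning the first i frames of A
    # with the first j frames of B.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(feats_a[i - 1], feats_b[j - 1])  # Euclidean distance (assumed)
            acc[i][j] = d + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
    # Backtrack from the end to recover the monotonic frame-to-frame path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # advance both videos
            (acc[i - 1][j], i - 1, j),          # advance only A
            (acc[i][j - 1], i, j - 1),          # advance only B
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return acc[n][m], path

# Toy example: video B lingers on the first "viewpoint" for one extra frame.
cost, path = dtw_align([[0.0], [1.0], [2.0]], [[0.0], [0.0], [1.0], [2.0]])
print(cost, path)
```

DTW forces the path to be monotonic and to cover both sequences from start to end, so it implicitly assumes the two videos show the same order of events. The abstract identifies that assumption as the one the proposed method drops.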
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Papazoglou, A., Del Pero, L., Ferrari, V. (2017). Video Temporal Alignment for Object Viewpoint. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science(), vol 10114. Springer, Cham. https://doi.org/10.1007/978-3-319-54190-7_17
DOI: https://doi.org/10.1007/978-3-319-54190-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54189-1
Online ISBN: 978-3-319-54190-7
eBook Packages: Computer Science (R0)