Event Detection by Velocity Pyramid

Liang, Zhuolin; Inoue, Nakamasa; Shinoda, Koichi

doi:10.1007/978-3-319-04114-8_30

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8325)

Abstract

In this paper, we propose a velocity pyramid for multimedia event detection. Spatial pyramid matching was recently proposed to introduce coarse geometric information into the Bag of Features framework, and it is effective for static image recognition and detection. In video, not only spatial information but also temporal information, which reflects the dynamic nature of video, is important. To exploit it fully, we propose a velocity pyramid in which video frames are divided into motional sub-regions. Our method is effective for detecting events characterized by their temporal patterns. Experiments on the MED (Multimedia Event Detection) dataset show a 10% performance improvement with the velocity pyramid over the same system without it. Furthermore, when combined with the spatial pyramid, the velocity pyramid provides an additional 3% gain in detection performance.
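The idea of pooling Bag of Features histograms over motion-defined sub-regions can be illustrated with a minimal sketch. The abstract does not say how the motional sub-regions are defined, so everything specific here is an assumption made for illustration: the three motion bins (near-static, mostly horizontal, mostly vertical), the `speed_threshold` parameter, and the function names `motion_bin`, `bof_histogram` and `velocity_pyramid` are all hypothetical and are not the authors' formulation. Each local feature is assumed to carry a visual-word index and a velocity vector, for example one estimated from optical flow.

```python
import math

N_MOTION_BINS = 3  # assumed: 0 = near-static, 1 = mostly horizontal, 2 = mostly vertical


def motion_bin(vx, vy, speed_threshold=1.0):
    """Assign a feature to a hypothetical motion sub-region from its velocity."""
    if math.hypot(vx, vy) < speed_threshold:
        return 0  # slower than the threshold: treat as near-static
    return 1 if abs(vx) >= abs(vy) else 2


def bof_histogram(words, vocab_size):
    """L1-normalized Bag of Features histogram over visual-word indices."""
    hist = [0.0] * vocab_size
    for w in words:
        hist[w] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def velocity_pyramid(words, velocities, vocab_size, speed_threshold=1.0):
    """Concatenate a global histogram (level 0) with one histogram per motion bin (level 1)."""
    descriptor = bof_histogram(words, vocab_size)  # level 0: all features
    for b in range(N_MOTION_BINS):                 # level 1: features grouped by motion
        subset = [
            w for w, (vx, vy) in zip(words, velocities)
            if motion_bin(vx, vy, speed_threshold) == b
        ]
        descriptor.extend(bof_histogram(subset, vocab_size))
    return descriptor


# Toy input: four features, a vocabulary of three visual words.
words = [0, 1, 1, 2]
velocities = [(0.0, 0.0), (3.0, 0.0), (0.0, -2.0), (0.1, 0.1)]
descriptor = velocity_pyramid(words, velocities, vocab_size=3)
print(len(descriptor))  # (1 + N_MOTION_BINS) * vocab_size entries
```

The structure mirrors spatial pyramid matching, except that features are grouped by how they move instead of where they lie in the frame; under the same assumptions, a combined descriptor could simply concatenate the two.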

Author information

Authors and Affiliations

Department of Computer Science, Tokyo Institute of Technology, Japan
Zhuolin Liang, Nakamasa Inoue & Koichi Shinoda

Editor information

Editors and Affiliations

School of Computing, Dublin City University, Dublin 9, Ireland
Cathal Gurrin
Fakultät IV für Elektrotechnik und Informatik, Technische Universität Berlin / DAI-Labor, 10587, Berlin, Germany
Frank Hopfgartner
Department of Information and Computing Sciences, Universiteit Utrecht, 3584 CC, Utrecht, The Netherlands
Wolfgang Hurst
UiT The Arctic University of Norway, 9019, Tromsø, Norway
Håvard Johansen
Singapore University of Technology and Design, Singapore
Hyowon Lee
School of Electrical Engineering, Dublin City University, Ireland
Noel O’Connor

About this paper

Cite this paper

Liang, Z., Inoue, N., Shinoda, K. (2014). Event Detection by Velocity Pyramid. In: Gurrin, C., Hopfgartner, F., Hurst, W., Johansen, H., Lee, H., O’Connor, N. (eds) MultiMedia Modeling. MMM 2014. Lecture Notes in Computer Science, vol 8325. Springer, Cham. https://doi.org/10.1007/978-3-319-04114-8_30

DOI: https://doi.org/10.1007/978-3-319-04114-8_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04113-1
Online ISBN: 978-3-319-04114-8
eBook Packages: Computer Science, Computer Science (R0)
