Local Expert Forest of Score Fusion for Video Event Classification

Liu, Jingchen; McCloskey, Scott; Liu, Yanxi

doi:10.1007/978-3-642-33715-4_29

Jingchen Liu, Scott McCloskey & Yanxi Liu

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7576)

Abstract

We address the problem of categorizing complicated events in a large dataset of videos “in the wild”, where multiple classifiers are applied independently and each assigns every video a ‘likelihood’ score. The core contribution of this paper is a local expert forest model for meta-level score fusion, targeting event detection under heavily imbalanced class distributions. Our motivation is to adapt to variations in classifier performance across different regions of the score space, using a divide-and-conquer technique. We propose a novel method for partitioning the likelihood space that is sensitive to local label distributions in imbalanced data, and we train a pair of locally optimized experts for each partition. Multiple pairs of experts based on different partitions (‘trees’) form a ‘forest’, balancing local adaptivity against over-fitting. As a result, our model disregards classifiers in regions of the score space where they perform poorly, achieving both local source selection and fusion. We experiment with the TRECVID Multimedia Event Detection (MED) dataset, detecting 15 complicated events from around 34k video clips comprising more than 1000 hours, and demonstrate superior performance compared to other score-level fusion methods.
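
The divide-and-conquer idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the partitioning criterion or the expert model, so the sketch assumes a random axis-aligned split placed at a quantile of the positive samples' scores, and class-weighted logistic regression as each local expert. All function names (`fit_expert`, `fit_tree`, `predict_forest`) and the synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_expert(X, y, steps=500, lr=0.5):
    """Assumed expert: class-weighted logistic regression, gradient descent."""
    n_pos = y.sum()
    # Up-weight positives so the rare class carries the same total weight.
    sw = np.where(y == 1, (len(y) - n_pos) / n_pos, 1.0)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        g = sw * (sigmoid(X @ w + b) - y)
        w -= lr * (X.T @ g) / sw.sum()
        b -= lr * g.sum() / sw.sum()
    return w, b

def fit_tree(X, y, rng):
    """One 'tree': a single split of the score space plus a pair of experts."""
    dim = rng.integers(X.shape[1])  # which classifier's score to split on
    # Assumed partition rule: threshold at a quantile of the positives'
    # scores, so both sides of the split tend to contain positive samples.
    thr = np.quantile(X[y == 1, dim], rng.uniform(0.3, 0.7))
    left = X[:, dim] <= thr
    experts = []
    for mask in (left, ~left):
        ys = y[mask]
        has_both_classes = 0 < ys.sum() < len(ys)
        # Fall back to a global expert if a region holds only one class.
        experts.append(fit_expert(X[mask], ys) if has_both_classes
                       else fit_expert(X, y))
    return dim, thr, experts

def predict_forest(forest, X):
    """Route each sample to its local expert in every tree, then average."""
    total = np.zeros(len(X))
    for dim, thr, experts in forest:
        left = X[:, dim] <= thr
        s = np.empty(len(X))
        for mask, (w, b) in zip((left, ~left), experts):
            s[mask] = sigmoid(X[mask] @ w + b)
        total += s
    return total / len(forest)

# Synthetic, imbalanced toy data: 5% positives, three classifier scores,
# the third of which is pure noise.
rng = np.random.default_rng(0)
y = np.r_[np.zeros(950), np.ones(50)]
X = np.clip(rng.normal(0.3 + 0.4 * y[:, None], 0.15, size=(len(y), 3)), 0, 1)
X[:, 2] = rng.uniform(size=len(y))

forest = [fit_tree(X, y, rng) for _ in range(10)]
fused = predict_forest(forest, X)
```

Averaging over several trees with different splits is what keeps any single, possibly unlucky partition from dominating the fused score; the number of trees and the quantile range used here are arbitrary choices for illustration.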

Author information

Authors and Affiliations

Penn State University, State College, PA, USA
Jingchen Liu & Yanxi Liu
Honeywell Labs, Golden Valley, MN, USA
Scott McCloskey

Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

About this paper

Cite this paper

Liu, J., McCloskey, S., Liu, Y. (2012). Local Expert Forest of Score Fusion for Video Event Classification. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7576. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33715-4_29

DOI: https://doi.org/10.1007/978-3-642-33715-4_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33714-7
Online ISBN: 978-3-642-33715-4
eBook Packages: Computer Science, Computer Science (R0)
