Abstract
Since 2017, the Video Browser Showdown (VBS) has collaborated with TRECVID and interactively evaluates Ad-Hoc Video Search (AVS) tasks in addition to Known-Item Search (KIS) tasks. In this video search competition, participants must find target scenes relevant to a given textual query within a specific time limit in a large dataset of 600 h of video content. Because the number of relevant scenes for an AVS query is usually rather high, the teams at VBS 2017 could find only a small portion of them. One way to support them during interactive search would be to automatically retrieve other instances similar to an already found target scene. However, it is unclear which content descriptors should be used for such automatic video content search with a query-by-example approach. In this paper, we therefore investigate several visual content descriptors (CNN features, CEDD, COMO, HOG, Feature Signatures, and HOF) for similarity search in the TRECVID IACC.3 dataset, which is used for the VBS. Our evaluation shows that no single descriptor works best for every AVS query; however, considering the total performance over all 30 AVS tasks of TRECVID 2016, CNN features perform best.
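The query-by-example idea described above can be illustrated with a minimal sketch. It assumes that each shot is already represented by a fixed-length descriptor vector and that similarity is measured by cosine similarity; the abstract does not state which distance measure is used for each descriptor, so this choice, the function names, and the toy vectors are illustrative assumptions rather than the evaluated setup.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero descriptor carries no information
    return dot / (norm_a * norm_b)


def rank_by_example(query_vec, index, top_k=3):
    """Rank indexed shots by similarity to the descriptor of a found scene.

    `index` maps a (hypothetical) shot id to its descriptor vector.
    """
    scored = [(shot_id, cosine_similarity(query_vec, vec))
              for shot_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy descriptors; real descriptors (e.g. CNN features) have many more dimensions.
index = {
    "shot_a": [0.9, 0.1, 0.0],
    "shot_b": [0.1, 0.8, 0.1],
    "shot_c": [0.8, 0.2, 0.1],
    "shot_d": [0.0, 0.1, 0.9],
}

# Use an already found target scene as the example and search the remaining shots.
query = index["shot_a"]
candidates = {k: v for k, v in index.items() if k != "shot_a"}
results = rank_by_example(query, candidates, top_k=2)
for shot_id, score in results:
    print(shot_id, round(score, 3))
```

In an interactive setting, the top-ranked shots would be shown to the user as candidate additional instances of the already found scene; swapping the descriptor only changes how the vectors in the index are computed.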
Notes
1. YouTube Company Statistics 2016, www.statisticbrain.com/youtube-statistics (accessed September 1, 2017).
2. TRECVID video data, http://www-nlpir.nist.gov/projects/tv2016/tv2016.html#data.
3. TRECVID extra Ad-Hoc video search judgments, www-nlpir.nist.gov/projects/tv2016/pastdata/extra.avs.qrels.tv16.xlsx.
4. Internet Archive, www.archive.org.
Acknowledgement
This work is supported by the Alpen-Adria University Klagenfurt and Lakeside Labs GmbH, Klagenfurt, Austria and funding from the European Regional Development Fund and the Carinthian Economic Promotion Fund (KWF) under grant KWF 20214 u. 3520/26336/38165.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Kletz, S., Leibetseder, A., Schoeffmann, K. (2018). Evaluation of Visual Content Descriptors for Supporting Ad-Hoc Video Search Tasks at the Video Browser Showdown. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10704. Springer, Cham. https://doi.org/10.1007/978-3-319-73603-7_17
DOI: https://doi.org/10.1007/978-3-319-73603-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73602-0
Online ISBN: 978-3-319-73603-7
eBook Packages: Computer Science (R0)