Evaluation of Visual Content Descriptors for Supporting Ad-Hoc Video Search Tasks at the Video Browser Showdown

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10704)

Abstract

Since 2017, the Video Browser Showdown (VBS) has collaborated with TRECVID and interactively evaluates Ad-Hoc Video Search (AVS) tasks in addition to Known-Item Search (KIS) tasks. In this video search competition, participants must find target scenes relevant to a given textual query within a specific time limit, in a large dataset consisting of 600 h of video content. Because the number of relevant scenes for such an AVS query is usually rather high, the teams at VBS 2017 could find only a small portion of them. One way to support them during interactive search would be to automatically retrieve other similar instances of an already found target scene. However, it is unclear which content descriptors should be used for such an automatic video content search with a query-by-example approach. Therefore, in this paper we investigate several visual content descriptors (CNN Features, CEDD, COMO, HOG, Feature Signatures and HOF) for the purpose of similarity search in the TRECVID IACC.3 dataset, which is used for the VBS. Our evaluation shows that no single descriptor works best for every AVS query; however, when considering the total performance over all 30 AVS tasks of TRECVID 2016, CNN features provide the best performance.
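
The query-by-example idea described in the abstract can be illustrated with a minimal sketch. It assumes each keyframe is already represented by a fixed-length descriptor vector and that candidates are ranked by cosine similarity to the descriptor of an already found scene. The function names, the toy vectors, and the choice of cosine similarity are illustrative assumptions, not the descriptors or distance measures evaluated in the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_by_example(query_id, descriptors, top_k=3):
    """Rank all other keyframes by similarity to the query keyframe's descriptor.

    `descriptors` maps a keyframe id to its feature vector (hypothetical data layout).
    """
    query = descriptors[query_id]
    scored = [
        (cosine_similarity(query, vec), keyframe_id)
        for keyframe_id, vec in descriptors.items()
        if keyframe_id != query_id
    ]
    # Highest similarity first; ties are broken by keyframe id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [keyframe_id for _, keyframe_id in scored[:top_k]]


# Toy descriptors standing in for real features (made-up values).
descriptors = {
    "shot_a": [1.0, 0.0, 0.0],
    "shot_b": [0.9, 0.1, 0.0],
    "shot_c": [0.0, 1.0, 0.0],
    "shot_d": [0.5, 0.5, 0.0],
}

ranking = query_by_example("shot_a", descriptors, top_k=2)
print(ranking)
```

In an interactive setting, the returned ids would be shown to the searcher as candidate additional instances of the scene they have already found.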

Acknowledgement

This work is supported by the Alpen-Adria University Klagenfurt and Lakeside Labs GmbH, Klagenfurt, Austria and funding from the European Regional Development Fund and the Carinthian Economic Promotion Fund (KWF) under grant KWF 20214 u. 3520/26336/38165.

Corresponding author

Correspondence to Sabrina Kletz.

Copyright information

© 2018 Springer International Publishing AG

Cite this paper

Kletz, S., Leibetseder, A., Schoeffmann, K. (2018). Evaluation of Visual Content Descriptors for Supporting Ad-Hoc Video Search Tasks at the Video Browser Showdown. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10704. Springer, Cham. https://doi.org/10.1007/978-3-319-73603-7_17

  • DOI: https://doi.org/10.1007/978-3-319-73603-7_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73602-0

  • Online ISBN: 978-3-319-73603-7

  • eBook Packages: Computer Science, Computer Science (R0)
