Compressed-Domain Video Object Tracking Using Markov Random Fields with Graph Cuts Optimization

  • Fernando Bombardelli
  • Serhan Gül
  • Cornelius Hellge
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11269)


We propose a method for tracking objects in H.264/AVC compressed videos using a Markov Random Field model. Given an initial segmentation of the target object in the first frame, our algorithm applies a graph-cuts-based optimization to output a binary segmentation map for the next frame. Our model uses only the motion vectors and block coding modes from the compressed bitstream. Thus, complexity and storage requirements are significantly reduced compared to pixel-domain algorithms. We evaluate our method on two datasets and compare its performance to a state-of-the-art compressed-domain algorithm. Results show that our method performs better on the more challenging sequences.
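To make the graph-cuts step concrete, the following is a minimal sketch of how a binary labelling of a block grid can be obtained from an s-t minimum cut. It is not the energy function or solver of the paper: the per-block costs `unary_fg` and `unary_bg`, the constant `smoothness` penalty between neighbouring blocks, and the function name `graph_cut_segment` are all assumptions made for illustration, and a plain breadth-first augmenting-path max-flow stands in for a dedicated graph-cut solver. In a compressed-domain setting the unary costs would presumably be derived from motion vectors and block coding modes; here they are hand-picked numbers.

```python
from collections import deque


def graph_cut_segment(unary_fg, unary_bg, smoothness=1):
    """Label each block 1 (object) or 0 (background) via an s-t min cut.

    unary_fg[r][c]: hypothetical cost of labelling block (r, c) as object.
    unary_bg[r][c]: hypothetical cost of labelling block (r, c) as background.
    smoothness:     penalty paid when two 4-connected neighbours disagree.
    """
    rows, cols = len(unary_fg), len(unary_fg[0])
    n = rows * cols
    src, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c_uv, c_vu):
        cap[u][v] = cap[u].get(v, 0) + c_uv
        cap[v][u] = cap[v].get(u, 0) + c_vu

    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # Cutting src->i leaves i on the sink side (background).
            add_edge(src, i, unary_bg[r][c], 0)
            # Cutting i->sink leaves i on the source side (object).
            add_edge(i, sink, unary_fg[r][c], 0)
            if c + 1 < cols:
                add_edge(i, i + 1, smoothness, smoothness)
            if r + 1 < rows:
                add_edge(i, i + cols, smoothness, smoothness)

    # Max flow by repeated shortest augmenting paths (Edmonds-Karp).
    while True:
        parent = {src: None}
        queue = deque([src])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Blocks still reachable from the source form the object side of the cut.
    return [[1 if r * cols + c in parent else 0 for c in range(cols)]
            for r in range(rows)]


# Toy 3x3 grid: only the centre block looks like the object.
fg_cost = [[5, 5, 5], [5, 0, 5], [5, 5, 5]]
bg_cost = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]
mask = graph_cut_segment(fg_cost, bg_cost, smoothness=1)
```

With `smoothness=1` the centre block is kept as object, because disagreeing with its four neighbours costs 4, which is less than the cost of 5 for calling it background. Raising `smoothness` to 2 makes the disagreement cost 8, so the isolated block is smoothed away. This trade-off between data terms and neighbourhood consistency is the general mechanism that a Markov Random Field energy encodes.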



The research leading to these results has received funding from the German Federal Ministry for Economic Affairs and Energy under the VIRTUOSE-DE project.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Fernando Bombardelli (1)
  • Serhan Gül (1)
  • Cornelius Hellge (1)

  1. Department of Video Coding and Analytics, Fraunhofer Heinrich Hertz Institute, Berlin, Germany
