Multimedia Tools and Applications, Volume 55, Issue 3, pp 677–723

A software for performance evaluation and comparison of people detection and tracking methods in video processing

  • Bahadir Karasulu
  • Serdar Korukoglu


Digital video content analysis is an important component of multimedia content-based indexing (MCBI), content-based video retrieval (CBVR) and visual surveillance systems. Several generic object detection and/or tracking (D&T) algorithms are frequently used in the literature, such as Background Subtraction (BS), Continuously Adaptive Mean Shift (CMS) and Optical Flow (OF). An important problem for performance evaluation is the absence of stable and flexible software for comparing different algorithms. Such software should be able to compare them with the same metrics, in real time and on the same platform. In this paper, we design and implement software for the performance comparison and evaluation of well-known video object D&T algorithms (for people D&T) on the same platform. The software works as an automatic and/or semi-automatic test environment in real time, using image and video processing essentials (e.g. morphological operations and filters), ground-truth (GT) XML data files, and charting/plotting capabilities.
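
The abstract does not name the metrics the software computes. As an illustration only of what comparing a detector against ground-truth data can look like, the sketch below scores one frame of detected bounding boxes against ground-truth boxes using intersection-over-union matching and reports precision, recall and F-measure. The box layout `(x, y, width, height)`, the function names `iou` and `frame_scores`, and the 0.5 overlap threshold are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def frame_scores(detections: List[Box], ground_truth: List[Box],
                 threshold: float = 0.5) -> Tuple[float, float, float]:
    """Greedily match detections to ground-truth boxes for one frame.

    Returns (precision, recall, f_measure). The threshold is an assumed
    value for this sketch, not one specified by the source.
    """
    unmatched = list(ground_truth)
    true_pos = 0
    for det in detections:
        # Pick the still-unmatched ground-truth box with the largest overlap.
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)
            true_pos += 1
    false_pos = len(detections) - true_pos
    false_neg = len(unmatched)
    precision = true_pos / (true_pos + false_pos) if detections else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure
```

Running every algorithm's output through one such scoring function, against the same ground-truth boxes, is what puts the resulting numbers on a comparable footing.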


Keywords: Surveillance systems; Multimedia performance evaluation; People detection; People tracking



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. Department of Computer Engineering, Ege University, Izmir, Turkey
