Experiential Sampling for Object Detection in Video

  • Paresh Anandathirtha
  • K.R. Ramakrishnan
  • S. Kumar Raja
  • Mohan S. Kankanhalli
Part of the Signals and Communication Technology book series (SCT)


Robust, supervised learning-based algorithms are available for object detection in an image. Object detection in video can be performed by running such a detector on each frame of the sequence. This approach checks for the presence of an object around every point, at different scales, and ignores temporal continuity and the availability of visual cues such as motion and color. Hence such methods lack efficiency and adaptability. We propose a generic framework, based on experiential sampling, that considers temporal continuity and various visual cues to focus on a relevant subset of each frame. We determine some key points, called attention samples, and object detection is performed, at different scales, only with these points as centers. These key points are statistical samples from a density function that is estimated from various visual cues, past experience, and a model of temporal continuity. This density estimation is formulated as a Bayesian filtering problem, and sequential Monte Carlo methods are used to solve it. The framework effectively combines bottom-up and top-down visual attention phenomena and results in a significant reduction in the overall computation required, with negligible loss in accuracy. This enables the use of robust learning-based object detectors, which are otherwise too computationally expensive, in real-time applications. We demonstrate the usefulness of this framework for frontal-face detection in video using color and motion cues.
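The sampling idea described above can be illustrated with a minimal sketch. This is not the chapter's implementation: it assumes a generic bootstrap particle filter with a random-walk motion model and multinomial resampling, and a synthetic Gaussian cue map stands in for the color and motion cues. The names `gaussian_cue_map`, `attention_step`, and `motion_std` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def gaussian_cue_map(shape, center, sigma):
    """Synthetic cue map (hypothetical stand-in for color/motion cues)."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))


def attention_step(samples, cue_map, rng, motion_std=3.0):
    """One predict / weight / resample cycle of a bootstrap particle filter.

    samples: (N, 2) float array of (row, col) attention-sample positions.
    cue_map: (H, W) non-negative array; larger values mean "more relevant".
    """
    h, w = cue_map.shape
    # Predict: random-walk model of temporal continuity (an assumption).
    moved = samples + rng.normal(0.0, motion_std, size=samples.shape)
    moved[:, 0] = np.clip(moved[:, 0], 0, h - 1)
    moved[:, 1] = np.clip(moved[:, 1], 0, w - 1)
    # Weight: evaluate the cue map at each predicted position.
    rows = moved[:, 0].astype(int)
    cols = moved[:, 1].astype(int)
    weights = cue_map[rows, cols] + 1e-12  # avoid an all-zero weight vector
    weights /= weights.sum()
    # Resample: draw positions in proportion to their weights.
    idx = rng.choice(len(moved), size=len(moved), p=weights)
    return moved[idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cue = gaussian_cue_map((64, 64), center=(40, 20), sigma=5.0)
    # Start with samples spread uniformly over the frame.
    pts = rng.uniform(0, [64, 64], size=(200, 2))
    for _ in range(10):
        pts = attention_step(pts, cue, rng)
    # The samples should now cluster around the high-cue region.
    print("mean sample position:", pts.mean(axis=0))
```

In a full system, a detector would then be run only on windows centered on the returned points, rather than at every pixel of the frame. The fixed sample count and the single cue map are simplifications made for this sketch.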


Experiential Sampling; Object Detection; Face Detection; Sensor Sample; Attention Sample
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Paresh Anandathirtha (1)
  • K.R. Ramakrishnan
  • S. Kumar Raja
  • Mohan S. Kankanhalli
  1. Indian Institute of Science, Bangalore, India
