Probabilistic and Voting Approaches to Cue Integration for Figure-Ground Segmentation

  • Eric Hayman
  • Jan-Olof Eklundh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2352)


This paper describes techniques for fusing the output of multiple cues to robustly and accurately segment foreground objects from the background in image sequences. Two different methods for cue integration are presented and tested. The first is a probabilistic approach which, at each pixel, computes the likelihood of the observations over all cues before assigning the pixel to the foreground or background layer using Bayes' rule. The second method allows each cue to make a decision independently of the other cues before fusing their outputs with a weighted sum. A further important contribution of our work is to demonstrate how models for some cues can be learnt and subsequently adapted online. In particular, regions of coherent motion are used to train distributions for colour and for a simple texture descriptor. Our framework also provides mechanisms for suppressing cues when they are believed to be unreliable, for instance during training or when they disagree with the general consensus. Results on extended video sequences are presented.
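To make the contrast between the two integration schemes concrete, here is a minimal single-pixel sketch in Python. It is an illustration under our own assumptions, not the implementation from the paper: the function names `bayes_fuse` and `vote_fuse`, the conditional independence of cues given the layer, the hard per-cue decisions, the normalisation of the weighted sum, and all numbers are hypothetical.

```python
from math import prod


def bayes_fuse(lik_fg, lik_bg, prior_fg=0.5):
    """Posterior probability of foreground at one pixel.

    lik_fg[i] and lik_bg[i] are the likelihoods of cue i's observation
    under the foreground and background layers. The cues are ASSUMED
    conditionally independent given the layer, so likelihoods multiply.
    """
    num = prior_fg * prod(lik_fg)
    den = num + (1.0 - prior_fg) * prod(lik_bg)
    # If every hypothesis has zero likelihood, fall back to the prior.
    return num / den if den > 0 else prior_fg


def vote_fuse(lik_fg, lik_bg, weights):
    """Weighted vote at one pixel.

    Each cue first makes its own hard decision (1 = foreground), then the
    decisions are combined with a weighted sum. A weight of 0 suppresses
    a cue that is believed to be unreliable.
    """
    votes = [1.0 if f > b else 0.0 for f, b in zip(lik_fg, lik_bg)]
    total = sum(weights)
    if total == 0:
        return 0.5  # no active cue: undecided
    return sum(w * v for w, v in zip(weights, votes)) / total


# Hypothetical likelihoods for three cues (e.g. motion, colour, texture).
lik_fg = [0.9, 0.6, 0.2]
lik_bg = [0.1, 0.4, 0.8]

p_bayes = bayes_fuse(lik_fg, lik_bg)                # joint likelihood, then Bayes' rule
p_vote = vote_fuse(lik_fg, lik_bg, [1, 1, 1])       # all cues vote equally
p_vote_suppressed = vote_fuse(lik_fg, lik_bg, [1, 1, 0])  # third cue suppressed
```

In this sketch the probabilistic scheme lets a very confident cue outweigh a weakly dissenting one, whereas the voting scheme only sees each cue's decision, so the influence of a cue is controlled entirely through its weight.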


Gaussian Mixture Model; Expectation Maximization Algorithm; Foreground Object; Motion Segmentation; Segmentation Mask
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Eric Hayman (1)
  • Jan-Olof Eklundh (1)
  1. Dept. of Numerical Analysis and Computer Science, KTH, Computational Vision and Active Perception Laboratory (CVAP), Stockholm, Sweden
