Parametric Distributional Clustering for Image Segmentation

  • Lothar Hermes
  • Thomas Zöller
  • Joachim M. Buhmann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2352)


Abstract

Unsupervised image segmentation is one of the central problems in computer vision. From the viewpoint of exploratory data analysis, segmentation can be formulated as a clustering problem in which pixels or small image patches are grouped together based on local feature information. In this contribution, Parametric Distributional Clustering (PDC) is presented as a novel approach to image segmentation. In contrast to noise-sensitive point measurements, local distributions of image features provide a statistically robust description of the local image properties. The segmentation technique is formulated as a generative model in the maximum likelihood framework. Moreover, there exists an insightful connection to the information-theoretic concept of the Information Bottleneck (Tishby et al. [17]), which emphasizes the compromise between efficient coding of an image and preservation of the characteristic information in the measured feature distributions.
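The core idea of grouping local feature histograms rather than noisy point measurements can be illustrated with a minimal sketch. This is a simplified hard-assignment variant in the spirit of histogram clustering (Puzicha et al. [12]), not the paper's exact PDC model; the function names and the KL-based assignment rule are illustrative assumptions.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between each row of p and a prototype distribution q."""
    p = p + eps
    q = q + eps
    return np.sum(p * np.log(p / q), axis=-1)

def histogram_clustering(H, k, n_iter=50, seed=0):
    """Hard-assignment histogram clustering (illustrative sketch).

    H: (n, m) array, each row a normalized local feature histogram.
    Returns cluster labels (n,) and prototype distributions (k, m).
    """
    rng = np.random.default_rng(seed)
    # initialize prototypes from randomly chosen data histograms
    Q = H[rng.choice(len(H), size=k, replace=False)]
    labels = np.zeros(len(H), dtype=int)
    for _ in range(n_iter):
        # assignment step: map each histogram to the closest prototype in KL sense
        D = np.stack([kl(H, q) for q in Q], axis=1)   # (n, k) divergences
        labels = np.argmin(D, axis=1)
        # update step: prototype = mean of assigned histograms (ML estimate)
        for j in range(k):
            if np.any(labels == j):
                Q[j] = H[labels == j].mean(axis=0)
    return labels, Q
```

In the full generative model the assignments are probabilistic and estimated within the maximum likelihood framework; the hard alternation above only conveys the structure of the optimization.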

The search for good grouping solutions is posed as an optimization problem, which is solved by deterministic annealing techniques. To further increase the computational efficiency of the resulting segmentation algorithm, a multi-scale optimization scheme is developed. Finally, the performance of the model is demonstrated by the segmentation of color images from the Corel database.
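Deterministic annealing (Rose et al. [14]) can be sketched as follows: soft Gibbs assignments at a temperature T are alternated with centroid updates, and T is lowered on a geometric schedule so that clusters emerge through phase transitions. The sketch below uses a squared Euclidean distortion on feature vectors rather than the paper's distributional cost, and all names and parameter values are illustrative.

```python
import numpy as np

def anneal_cluster(X, k, t_start=50.0, t_end=0.01, rate=0.9, inner=20, seed=0):
    """Deterministic annealing for k-cluster vector quantization (sketch).

    Soft assignments p(j|x) are proportional to exp(-||x - y_j||^2 / T);
    T is lowered geometrically from t_start to t_end.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # start all centroids at the data mean plus tiny symmetry-breaking noise
    Y = X.mean(0) + 1e-3 * rng.standard_normal((k, d))
    P = np.full((n, k), 1.0 / k)
    T = t_start
    while T > t_end:
        for _ in range(inner):  # fixed-point (EM-style) iterations at this T
            D = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # (n, k) distortions
            P = np.exp(-(D - D.min(1, keepdims=True)) / T)      # Gibbs weights
            P /= P.sum(1, keepdims=True)
            Y = (P.T @ X) / P.sum(0)[:, None]                   # weighted centroids
        T *= rate                                                # cool down
    return P.argmax(1), Y
```

At high temperatures all centroids coincide at the data mean; as T falls below critical values, the symmetric solution becomes unstable and the centroids split, which is what makes the annealing robust against poor local minima. The paper's multi-scale scheme additionally coarsens the image grid to speed this optimization up; that part is not reproduced here.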


Keywords: Image Segmentation · Clustering · Maximum Likelihood · Information Theory


References

  1. S. Borra and S. Sarkar. A framework for performance characterization of intermediate-level grouping modules. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(11):1306–1312, 1997.
  2. T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, 1991.
  3. I. Csiszár and G. Tusnády. Information geometry and alternating minimization procedures. In E. J. Dudewicz et al., editors, Recent Results in Estimation Theory and Related Topics, Statistics and Decisions, Supplement Issue No. 1. Oldenbourg, 1984.
  4. A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39:1–38, 1977.
  5. R. J. Hathaway. Another interpretation of the EM algorithm for mixture distributions. Statistics and Probability Letters, 4:53–56, 1986.
  6. F. Heitz, P. Perez, and P. Bouthemy. Multiscale minimization of global energy functions in some visual recovery problems. CVGIP: Image Understanding, 59(1):125–134, 1994.
  7. A. Jain and R. Dubes. Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs, NJ, 1988.
  8. D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proc. ICCV’01, 2001.
  9. R. M. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In M. I. Jordan, editor, Learning in Graphical Models. MIT Press, 1999.
  10. F. Pereira, N. Tishby, and L. Lee. Distributional clustering of English words. In 30th International Meeting of the Association of Computational Linguistics, pages 183–190, Columbus, Ohio, 1993.
  11. J. Puzicha and J. M. Buhmann. Multiscale annealing for unsupervised image segmentation. Computer Vision and Image Understanding, 76(3):213–230, 1999.
  12. J. Puzicha, T. Hofmann, and J. M. Buhmann. Histogram clustering for unsupervised segmentation and image retrieval. Pattern Recognition Letters, 20:899–909, 1999.
  13. J. Puzicha, T. Hofmann, and J. M. Buhmann. A theory of proximity based clustering: Structure detection by optimization. Pattern Recognition, 2000.
  14. K. Rose, E. Gurewitz, and G. Fox. A deterministic annealing approach to clustering. Pattern Recognition Letters, 11:589–594, 1990.
  15. K. Rose, E. Gurewitz, and G. Fox. Statistical mechanics and phase transitions in clustering. Physical Review Letters, 65(8):945–948, 1990.
  16. J. Shi and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.
  17. N. Tishby, F. Pereira, and W. Bialek. The information bottleneck method. In Proc. of the 37th Annual Allerton Conference on Communication, Control, and Computing, 1999.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Lothar Hermes (1)
  • Thomas Zöller (1)
  • Joachim M. Buhmann (1)

  1. Institut für Informatik III, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany