Image Segmentation Evaluation by Techniques of Comparing Clusterings

  • Xiaoyi Jiang
  • Cyril Marti
  • Christophe Irniger
  • Horst Bunke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3617)


The task considered in this paper is performance evaluation of region segmentation algorithms in the ground truth (GT) based paradigm. Given a machine segmentation and a GT reference, performance measures are needed to quantify how well the two agree. We propose to consider the image segmentation problem as one of data clustering and, as a consequence, to use measures for comparing clusterings developed in statistics and machine learning. By doing so, we obtain a variety of performance measures which have not been used before in computer vision. In particular, some of these measures have the highly desirable property of being a metric. Experimental results are reported on both synthetic and real data to validate the measures and to compare them with other measures.
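To make the clustering view concrete, the following is a minimal sketch in Python of how a machine segmentation could be compared with a GT reference once each pixel is treated as a data point and each region label as a cluster label. The abstract does not name the measures used, so the two chosen here (the Rand index and the variation of information, the latter being known to be a metric on partitions) are assumptions, as are the function names and the convention of passing each label image as a flattened sequence of labels.

```python
from collections import Counter
from math import log


def _pairs(count):
    """Number of unordered pixel pairs that can be drawn from `count` pixels."""
    return count * (count - 1) // 2


def _contingency(seg_a, seg_b):
    """Pixel counts per label in each segmentation and per label pair."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must cover the same pixels")
    return Counter(seg_a), Counter(seg_b), Counter(zip(seg_a, seg_b))


def rand_index(seg_a, seg_b):
    """Fraction of pixel pairs on which both segmentations agree.

    A pair counts as an agreement if it is in the same region in both
    segmentations or in different regions in both. Needs at least 2 pixels.
    """
    count_a, count_b, joint = _contingency(seg_a, seg_b)
    total = _pairs(len(seg_a))
    same_a = sum(_pairs(c) for c in count_a.values())
    same_b = sum(_pairs(c) for c in count_b.values())
    same_both = sum(_pairs(c) for c in joint.values())
    # together in both + apart in both
    return (total + 2 * same_both - same_a - same_b) / total


def variation_of_information(seg_a, seg_b):
    """VI = H(A|B) + H(B|A) in nats; 0 when the partitions are identical."""
    n = len(seg_a)
    count_a, count_b, joint = _contingency(seg_a, seg_b)
    vi = 0.0
    for (a, b), c in joint.items():
        vi -= (c / n) * (log(c / count_a[a]) + log(c / count_b[b]))
    return vi


# Hypothetical 6-pixel example: flattened GT and machine label images.
gt = [0, 0, 0, 1, 1, 1]
machine = [0, 0, 1, 1, 1, 1]
print(rand_index(gt, machine))
print(variation_of_information(gt, machine))
```

Both functions depend only on which pixels share a label, not on the label values themselves, so a machine segmentation whose regions are numbered differently from the GT gives the same scores.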


Keywords: Ground Truth, Image Segmentation, Segmentation Algorithm, Range Image, Region Segmentation



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Xiaoyi Jiang (1)
  • Cyril Marti (2)
  • Christophe Irniger (2)
  • Horst Bunke (2)

  1. Department of Computer Science, University of Münster, Münster, Germany
  2. Institute of Informatics and Applied Mathematics, University of Bern, Bern, Switzerland
