The European Physical Journal B, Volume 66, Issue 1, pp 125–135

Unsupervised and semi-supervised clustering by message passing: soft-constraint affinity propagation

Interdisciplinary Physics

Abstract

Soft-constraint affinity propagation (SCAP) is a new statistical-physics-based clustering technique [M. Leone, Sumedha, M. Weigt, Bioinformatics 23, 2708 (2007)]. First, we derive a simplified version of the algorithm and discuss time- and memory-efficient implementations. We then give a detailed analysis of the performance of SCAP on artificial data, showing that the algorithm efficiently unveils clustered and hierarchical data structures. Finally, we generalize the algorithm to the problem of semi-supervised clustering, where data are already partially labeled and clustering assigns labels to previously unlabeled points. SCAP uses both the geometrical organization of the data and the available labels assigned to a few points in a computationally efficient way, as is shown on artificial and biological benchmark data.

PACS

02.50.Tt Inference methods; 05.20.-y Classical statistical mechanics; 89.75.Fb Structures and organization in complex systems

References

  1. A.K. Jain, M.N. Murty, P.J. Flynn, ACM Computing Surveys 31, 264 (1999)
  2. R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, 2nd edn. (Wiley-Interscience, 2000)
  3. Semi-Supervised Learning, edited by O. Chapelle, B. Schölkopf, A. Zien (MIT Press, Cambridge MA, 2006)
  4. G. Getz, N. Shental, E. Domany, Proceedings of “Learning with Partially Classified Training Data”, ICML 2005, p. 37
  5. R.R. Sokal, C.D. Michener, University of Kansas Scientific Bulletin (1958)
  6. S.C. Johnson, Psychometrika 32, 241 (1967)
  7. J. MacQueen, in Proc. 5th Berkeley Symp. on Math. Stat. and Prob., edited by L. Le Cam, J. Neyman (Univ. of California Press, 1967)
  8. M. Blatt, S. Wiseman, E. Domany, Phys. Rev. Lett. 76, 3251 (1996)
  9. B.J. Frey, D. Dueck, Science 315, 972 (2007)
  10. J.S. Yedidia, W.T. Freeman, Y. Weiss, IEEE Trans. Inform. Theory 51, 2282 (2005)
  11. F.R. Kschischang, B.J. Frey, H.-A. Loeliger, IEEE Trans. Inform. Theory 47, 498 (2001)
  12. M. Mézard, G. Parisi, Eur. Phys. J. B 20, 217 (2001)
  13. A.K. Hartmann, M. Weigt, Phase Transitions in Combinatorial Optimization Problems (Wiley-VCH, Berlin, 2005)
  14. M. Leone, Sumedha, M. Weigt, Bioinformatics 23, 2708 (2007)
  15. R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis (Wiley, New York, 1973)

Copyright information

© EDP Sciences, SIF, Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  1. Institute for Scientific Interchange, Torino, Italy
