Correlation Clustering with Stochastic Labellings

  • Conference paper
Similarity-Based Pattern Recognition (SIMBAD 2013)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7953)

Abstract

Correlation clustering is the problem of finding a crisp partition of the vertices of a correlation graph that minimizes the disagreements in the cluster assignments. In this paper, we discuss a relaxation of the original problem setting that allows probabilistic assignments of vertices to labels, so that overlapping clusters can be captured. We also show that a known optimization heuristic can be applied to this formulation, while the number of classes is selected automatically. Additionally, we propose a simple way of building an ensemble of agreement functions sampled from a reproducing kernel Hilbert space, which makes it possible to apply correlation clustering without empirically estimating pairwise correlation values.
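
To make the two settings in the abstract concrete, the following is a minimal sketch, not the formulation used in the paper. It assumes a symmetric matrix `W` of signed weights (positive for "similar", negative for "dissimilar") and, for the relaxed case, a matrix `Y` whose rows are probability distributions over labels. The names `W`, `Y`, `crisp_disagreement` and `expected_disagreement`, and the choice of `sum_k Y[i][k] * Y[j][k]` as the co-clustering probability, are illustrative assumptions.

```python
def crisp_disagreement(W, labels):
    """Total weight of pairs whose sign contradicts a crisp partition."""
    n = len(labels)
    cost = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            if W[i][j] > 0 and not same:
                cost += W[i][j]       # similar pair split across clusters
            elif W[i][j] < 0 and same:
                cost += -W[i][j]      # dissimilar pair placed together
    return cost


def expected_disagreement(W, Y):
    """Relaxed cost: each row of Y is a distribution over labels (assumed)."""
    n = len(Y)
    cost = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # Probability that i and j receive the same label when their
            # labels are drawn independently from Y[i] and Y[j].
            p_same = sum(a * b for a, b in zip(Y[i], Y[j]))
            if W[i][j] > 0:
                cost += W[i][j] * (1.0 - p_same)
            elif W[i][j] < 0:
                cost += -W[i][j] * p_same
    return cost


# Hypothetical 4-vertex graph with one inconsistent edge (1, 3).
W = [
    [0, 1, -1, -1],
    [1, 0, -1, 1],
    [-1, -1, 0, 1],
    [-1, 1, 1, 0],
]
labels = [0, 0, 1, 1]
one_hot = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
soft = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 1.0]]

crisp_cost = crisp_disagreement(W, labels)
relaxed_cost_one_hot = expected_disagreement(W, one_hot)
relaxed_cost_soft = expected_disagreement(W, soft)
```

Under these assumptions, a one-hot `Y` reproduces the crisp cost, while a row such as `[0.5, 0.5]` lets a vertex belong partially to two clusters, which is one way to read the overlapping clusters mentioned above.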

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rebagliati, N., Rota Bulò, S., Pelillo, M. (2013). Correlation Clustering with Stochastic Labellings. In: Hancock, E., Pelillo, M. (eds) Similarity-Based Pattern Recognition. SIMBAD 2013. Lecture Notes in Computer Science, vol 7953. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39140-8_8

  • DOI: https://doi.org/10.1007/978-3-642-39140-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39139-2

  • Online ISBN: 978-3-642-39140-8

  • eBook Packages: Computer Science, Computer Science (R0)
