Correlation Clustering with Stochastic Labellings

Rebagliati, Nicola; Rota Bulò, Samuel; Pelillo, Marcello

doi:10.1007/978-3-642-39140-8_8

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7953)

Included in the following conference series:

International Workshop on Similarity-Based Pattern Recognition

Abstract

Correlation clustering is the problem of finding a crisp partition of the vertices of a correlation graph that minimizes the disagreements in the cluster assignments. In this paper, we discuss a relaxation of the original problem setting that allows probabilistic assignments of vertices to labels, so that overlapping clusters can be captured. We also show that a known optimization heuristic can be applied to this formulation while selecting the number of classes automatically. Additionally, we propose a simple way of building an ensemble of agreement functions sampled from a reproducing kernel Hilbert space, which makes it possible to apply correlation clustering without empirically estimating pairwise correlation values.
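
To make the idea of relaxing a crisp partition into probabilistic assignments more concrete, the following is a minimal sketch, not the formulation used in the chapter. It assumes that each vertex carries a probability distribution over labels, that two vertices draw their labels independently, and that the cost is the expected weight of disagreements. The names `co_cluster_probability`, `expected_disagreement`, `weights`, and the toy triangle graph are all hypothetical and chosen only for illustration.

```python
def co_cluster_probability(y_i, y_j):
    """Probability that two vertices receive the same label.

    Assumption: labels are drawn independently from the two
    distributions y_i and y_j (lists of probabilities, one per label).
    """
    return sum(p * q for p, q in zip(y_i, y_j))


def expected_disagreement(weights, labelling):
    """Expected disagreement of a stochastic labelling.

    weights:   dict mapping a vertex pair (i, j) to a signed weight;
               positive means "should be together", negative "apart".
    labelling: list of rows, row i being vertex i's label distribution.
    """
    total = 0.0
    for (i, j), w in weights.items():
        p_same = co_cluster_probability(labelling[i], labelling[j])
        if w > 0:
            total += w * (1.0 - p_same)  # similar pair likely separated
        else:
            total += -w * p_same         # dissimilar pair likely merged
    return total


# Toy inconsistent triangle: 0~1 and 1~2 are similar, 0 and 2 are not.
weights = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): -1.0}

# A crisp partition is the special case of one-hot rows.
crisp = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

# A soft labelling in which vertex 1 belongs to both clusters.
soft = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

crisp_cost = expected_disagreement(weights, crisp)
soft_cost = expected_disagreement(weights, soft)
```

With one-hot rows the function reduces to the usual weighted count of disagreements for a crisp partition, so the crisp problem is a special case of this sketch. The soft labelling shows how a vertex shared between two labels can express an overlap that a single partition cannot.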



Author information

Authors and Affiliations

VTT Technical Research Centre of Finland, 02044, Finland
Nicola Rebagliati
Department of Environmental Science, Computer Science and Statistics, Università Ca’ Foscari, Venezia, 30121, Italy
Samuel Rota Bulò & Marcello Pelillo

Editor information

Editors and Affiliations

Department of Computer Science, University of York, Deramore Lane, YO10 5GH, York, UK
Edwin Hancock
DAIS, Università Ca’ Foscari, Via Torino 155, 30172, Venice, Italy
Marcello Pelillo

About this paper

Cite this paper

Rebagliati, N., Rota Bulò, S., Pelillo, M. (2013). Correlation Clustering with Stochastic Labellings. In: Hancock, E., Pelillo, M. (eds) Similarity-Based Pattern Recognition. SIMBAD 2013. Lecture Notes in Computer Science, vol 7953. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39140-8_8

DOI: https://doi.org/10.1007/978-3-642-39140-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39139-2
Online ISBN: 978-3-642-39140-8
eBook Packages: Computer Science, Computer Science (R0)
