Correlation Clustering and Consensus Clustering

Bonizzoni, Paola; Della Vedova, Gianluca; Dondi, Riccardo; Jiang, Tao

doi:10.1007/11602613_24

Paola Bonizzoni¹⁸,
Gianluca Della Vedova¹⁹,
Riccardo Dondi¹⁸ &
…
Tao Jiang^20,21

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 3827))

Included in the following conference series:

International Symposium on Algorithms and Computation

1306 Accesses
11 Citations

Abstract

The Correlation Clustering problem has been introduced recently [5] as a model for clustering data when a binary relationship between data points is known. More precisely, for each pair of points we have two scores measuring respectively the similarity and dissimilarity of the two points, and we would like to compute an optimal partition where the value of a partition is obtained by summing up scores of pairs involving points from a same cluster and scores of pairs involving points from different clusters. A closely related problem is Consensus Clustering, where we are given a set of partitions and we would like to obtain a partition that best summarizes the input partitions. The latter problem is a restricted case of Correlation Clustering. In this paper we prove that Min Consensus Clustering is APX-hard even for three input partitions, answering an open question, while Max Consensus Clustering admits a PTAS on instances with a bounded number of input partitions. We exhibit a combinatorial and practical \({4}\over{5}\)-approximation algorithm based on a greedy technique for Max Consensus Clustering on three partitions. Moreover, we prove that a PTAS exists for Max Correlation Clustering when the maximum ratio between two scores is at most a constant.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 169.00; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Alimonti, P., Kann, V.: Some APX-completeness results for cubic graphs. Theoretical Computer Science 237(1–2), 123–134 (2000)
Article MATH MathSciNet Google Scholar
Arora, S., Karger, D., Karpinski, M.: Polynomial time approximation schemes for dense instances of \(\mathcal{NP}\)-hard problems. Journal of Computer and System Sciences 58, 193–210 (2000)
Article MathSciNet Google Scholar
Ailon, N., Charikar, M., Newman, A.: Aggregating Inconsistent Information: Ranking and Clustering. In: Proc. 37th Symposium on Theory of Computing (STOC 2005), pp. 684–693 (2005)
Google Scholar
Ausiello, G., Crescenzi, P., Gambosi, V., Kann, G., Marchetti-Spaccamela, A., Protasi, M.: Complexity and Approximation: Combinatorial optimization problems and their approximability properties. Springer, Heidelberg (1999)
MATH Google Scholar
Bansal, N., Blum, A., Chawla, S.: Correlation clustering. Machine Learning 56(1-3), 89–113 (2004)
Article MATH Google Scholar
Charikar, M., Guruswami, V., Wirth, A.: Clustering with qualitative information. In: Proc. 44th Symp. Foundations of Computer Science (FOCS), pp. 524–533 (2003)
Google Scholar
Demaine, E.D., Immorlica, N.: Correlation clustering with partial information. In: Arora, S., Jansen, K., Rolim, J.D.P., Sahai, A. (eds.) RANDOM 2003 and APPROX 2003. LNCS, vol. 2764, pp. 1–13. Springer, Heidelberg (2003)
Google Scholar
Emanuel, D., Fiat, A.: Correlation clustering – minimizing disagreements on arbitrary weighted graphs. In: Di Battista, G., Zwick, U. (eds.) ESA 2003. LNCS, vol. 2832, pp. 208–220. Springer, Heidelberg (2003)
Chapter Google Scholar
Filkov, V., Skiena, S.: Integrating microarray data by consensus clustering. In: Proc. 15th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 418–425 (2003)
Google Scholar
Filkov, V., Skiena, S.: Heterogeneous data integration with the consensus clustering formalism. In: Rahm, E. (ed.) DILS 2004. LNCS (LNBI), vol. 2994, pp. 110–123. Springer, Heidelberg (2004)
Chapter Google Scholar
Grötschel, M., Wakabayashi, Y.: A cutting plane algorithm for a clustering problem. Mathematical Programming 45, 52–96 (1989)
Article Google Scholar
Krivanek, M., Moravek, J.: Hard problems in hierarchical-tree clustering. Acta Informatica 23, 311–323 (1986)
Article MATH MathSciNet Google Scholar
Swamy, C.: Correlation clustering: maximizing agreements via semidefinite programming. In: Proc. 15th Symp. on Discrete Algorithms (SODA), pp. 526–527 (2004)
Google Scholar
Wakabayashi, Y.: The complexity of computing medians of relations. Resenhas 3(3), 323–349 (1998)
MATH MathSciNet Google Scholar

Download references

Author information

Authors and Affiliations

Dipartimento di Informatica, Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca Milano, Italy
Paola Bonizzoni & Riccardo Dondi
Dipartimento di Statistica, Università degli Studi di Milano-Bicocca Milano, Italy
Gianluca Della Vedova
Department of Computer Science and Engineering, University of California, Riverside, USA
Tao Jiang
Center for Advanced Study, Tsinghua University, Beijing, China
Tao Jiang

Authors

Paola Bonizzoni
View author publications
You can also search for this author in PubMed Google Scholar
Gianluca Della Vedova
View author publications
You can also search for this author in PubMed Google Scholar
Riccardo Dondi
View author publications
You can also search for this author in PubMed Google Scholar
Tao Jiang
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon , Hong Kong
Xiaotie Deng
Department of Computer Science, University of Texas at Dallas, EE/CS Building, 75083, Richardson, TX, USA
Ding-Zhu Du

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Bonizzoni, P., Della Vedova, G., Dondi, R., Jiang, T. (2005). Correlation Clustering and Consensus Clustering. In: Deng, X., Du, DZ. (eds) Algorithms and Computation. ISAAC 2005. Lecture Notes in Computer Science, vol 3827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11602613_24

Download citation

DOI: https://doi.org/10.1007/11602613_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30935-2
Online ISBN: 978-3-540-32426-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics