Skip to main content

Correlation Clustering and Consensus Clustering

  • Conference paper
Algorithms and Computation (ISAAC 2005)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 3827))

Included in the following conference series:

Abstract

The Correlation Clustering problem has been introduced recently [5] as a model for clustering data when a binary relationship between data points is known. More precisely, for each pair of points we have two scores measuring respectively the similarity and dissimilarity of the two points, and we would like to compute an optimal partition where the value of a partition is obtained by summing up scores of pairs involving points from a same cluster and scores of pairs involving points from different clusters. A closely related problem is Consensus Clustering, where we are given a set of partitions and we would like to obtain a partition that best summarizes the input partitions. The latter problem is a restricted case of Correlation Clustering. In this paper we prove that Min Consensus Clustering is APX-hard even for three input partitions, answering an open question, while Max Consensus Clustering admits a PTAS on instances with a bounded number of input partitions. We exhibit a combinatorial and practical \({4}\over{5}\)-approximation algorithm based on a greedy technique for Max Consensus Clustering on three partitions. Moreover, we prove that a PTAS exists for Max Correlation Clustering when the maximum ratio between two scores is at most a constant.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 169.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Alimonti, P., Kann, V.: Some APX-completeness results for cubic graphs. Theoretical Computer Science 237(1–2), 123–134 (2000)

    Article  MATH  MathSciNet  Google Scholar 

  2. Arora, S., Karger, D., Karpinski, M.: Polynomial time approximation schemes for dense instances of \(\mathcal{NP}\)-hard problems. Journal of Computer and System Sciences 58, 193–210 (2000)

    Article  MathSciNet  Google Scholar 

  3. Ailon, N., Charikar, M., Newman, A.: Aggregating Inconsistent Information: Ranking and Clustering. In: Proc. 37th Symposium on Theory of Computing (STOC 2005), pp. 684–693 (2005)

    Google Scholar 

  4. Ausiello, G., Crescenzi, P., Gambosi, V., Kann, G., Marchetti-Spaccamela, A., Protasi, M.: Complexity and Approximation: Combinatorial optimization problems and their approximability properties. Springer, Heidelberg (1999)

    MATH  Google Scholar 

  5. Bansal, N., Blum, A., Chawla, S.: Correlation clustering. Machine Learning 56(1-3), 89–113 (2004)

    Article  MATH  Google Scholar 

  6. Charikar, M., Guruswami, V., Wirth, A.: Clustering with qualitative information. In: Proc. 44th Symp. Foundations of Computer Science (FOCS), pp. 524–533 (2003)

    Google Scholar 

  7. Demaine, E.D., Immorlica, N.: Correlation clustering with partial information. In: Arora, S., Jansen, K., Rolim, J.D.P., Sahai, A. (eds.) RANDOM 2003 and APPROX 2003. LNCS, vol. 2764, pp. 1–13. Springer, Heidelberg (2003)

    Google Scholar 

  8. Emanuel, D., Fiat, A.: Correlation clustering – minimizing disagreements on arbitrary weighted graphs. In: Di Battista, G., Zwick, U. (eds.) ESA 2003. LNCS, vol. 2832, pp. 208–220. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  9. Filkov, V., Skiena, S.: Integrating microarray data by consensus clustering. In: Proc. 15th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 418–425 (2003)

    Google Scholar 

  10. Filkov, V., Skiena, S.: Heterogeneous data integration with the consensus clustering formalism. In: Rahm, E. (ed.) DILS 2004. LNCS (LNBI), vol. 2994, pp. 110–123. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  11. Grötschel, M., Wakabayashi, Y.: A cutting plane algorithm for a clustering problem. Mathematical Programming 45, 52–96 (1989)

    Article  Google Scholar 

  12. Krivanek, M., Moravek, J.: Hard problems in hierarchical-tree clustering. Acta Informatica 23, 311–323 (1986)

    Article  MATH  MathSciNet  Google Scholar 

  13. Swamy, C.: Correlation clustering: maximizing agreements via semidefinite programming. In: Proc. 15th Symp. on Discrete Algorithms (SODA), pp. 526–527 (2004)

    Google Scholar 

  14. Wakabayashi, Y.: The complexity of computing medians of relations. Resenhas 3(3), 323–349 (1998)

    MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bonizzoni, P., Della Vedova, G., Dondi, R., Jiang, T. (2005). Correlation Clustering and Consensus Clustering. In: Deng, X., Du, DZ. (eds) Algorithms and Computation. ISAAC 2005. Lecture Notes in Computer Science, vol 3827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11602613_24

Download citation

  • DOI: https://doi.org/10.1007/11602613_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30935-2

  • Online ISBN: 978-3-540-32426-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics