Clustering with Constraints
The area of clustering with constraints uses hints or advice, expressed as constraints, to aid or bias the clustering process. The most prevalent form of advice is a conjunction of pairwise instance-level constraints of the form must-link (ML) and cannot-link (CL), which state that a pair of instances should be in the same cluster or in different clusters, respectively. Given a set of points P to cluster and a set of constraints C, the aim of clustering with constraints is to use the constraints to improve the clustering results. Constraints have so far been used in two main ways: (i) writing algorithms that use a standard distance metric but attempt to satisfy all, or as many as possible, of the constraints; and (ii) using the constraints to learn a distance function that is then used in the clustering algorithm.
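To make the ML and CL semantics concrete, the following is a minimal sketch of approach (i): a greedy assignment step that places each point in its nearest cluster, under squared Euclidean distance, that violates no constraint. The names `violates`, `constrained_assign`, and `sq_dist`, and the representation of constraints as sets of index pairs, are assumptions made for illustration and are not taken from any particular cited work. The greedy pass depends on the order of the points and may report failure even when a feasible assignment exists.

```python
from typing import Dict, Optional, Sequence, Set, Tuple

Pair = Tuple[int, int]            # a constraint between two point indices
Point = Sequence[float]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance (the 'standard distance metric' here)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def violates(i: int, cluster: int, labels: Dict[int, int],
             must_link: Set[Pair], cannot_link: Set[Pair]) -> bool:
    """True if placing point i in `cluster` breaks a constraint with an
    already-assigned point."""
    for a, b in must_link:
        if i in (a, b):
            other = b if a == i else a
            # ML: the partner, if assigned, must be in the same cluster.
            if other in labels and labels[other] != cluster:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if a == i else a
            # CL: the partner, if assigned, must be in a different cluster.
            if other in labels and labels[other] == cluster:
                return True
    return False


def constrained_assign(points: Sequence[Point], centers: Sequence[Point],
                       must_link: Set[Pair],
                       cannot_link: Set[Pair]) -> Optional[Dict[int, int]]:
    """Greedily assign each point to its nearest non-violating center.
    Returns None if some point has no admissible cluster."""
    labels: Dict[int, int] = {}
    for i, p in enumerate(points):
        by_distance = sorted(range(len(centers)),
                             key=lambda c: sq_dist(p, centers[c]))
        for c in by_distance:
            if not violates(i, c, labels, must_link, cannot_link):
                labels[i] = c
                break
        else:
            return None  # every cluster violates some constraint
    return labels


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]
unconstrained = constrained_assign(points, centers, set(), set())
with_cl = constrained_assign(points, centers, set(), {(0, 1)})
infeasible = constrained_assign(points, centers, {(0, 1)}, {(0, 1)})
```

In this toy example the cannot-link constraint on points 0 and 1 forces point 1 away from its nearest center, while contradictory ML and CL constraints on the same pair leave no admissible cluster for point 1.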
The idea of using constraints to guide clustering was first introduced by Wagstaff and Cardie in their seminal paper...
- 1. Basu S, Banerjee A, Mooney R. Semi-supervised clustering by seeding. In: Proceedings of the 19th International Conference on Machine Learning; 2002. p. 27–34.
- 2. Basu S, Banerjee A, Mooney RJ. Active semi-supervision for pairwise constrained clustering. In: Proceedings of the SIAM International Conference on Data Mining; 2004.
- 3. Basu S, Davidson I, Wagstaff K, editors. Constrained clustering: advances in algorithms, theory and applications. New York: Chapman & Hall/CRC Press; 2008.
- 4. Cohn D, Caruana R, McCallum A. Semi-supervised clustering with user feedback. Technical Report 2003–1892. Cornell University; 2003.
- 6. Davidson I, Ravi SS. Clustering with constraints: feasibility issues and the k-means algorithm. In: Proceedings of the SIAM International Conference on Data Mining; 2005.
- 7. Davidson I, Ravi SS. Identifying and generating easy sets of constraints for clustering. In: Proceedings of the 15th National Conference on AI; 2006.
- 8. Davidson I, Ester M, Ravi SS. Efficient incremental clustering with constraints. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2007. p. 204–49.
- 9. Davidson I, Ravi SS. Intractability and clustering with constraints. In: Proceedings of the 24th International Conference on Machine Learning; 2007. p. 201–8.
- 11. Gondek D, Hofmann T. Non-redundant data clustering. In: Proceedings of the 2004 IEEE International Conference on Data Mining; 2004. p. 75–82.
- 12. Klein D, Kamvar SD, Manning CD. From instance-level constraints to space-level constraints: making the most of prior knowledge in data clustering. In: Proceedings of the 19th International Conference on Machine Learning; 2002. p. 307–14.
- 13. Wagstaff K, Cardie C. Clustering with instance-level constraints. In: Proceedings of the 17th International Conference on Machine Learning; 2000. p. 1103–10.
- 14. Wagstaff K, Cardie C, Rogers S, Schroedl S. Constrained K-means clustering with background knowledge. In: Proceedings of the 18th International Conference on Machine Learning; 2001. p. 577–84.
- 15. Xing E, Ng A, Jordan M, Russell S. Distance metric learning, with application to clustering with side-information. Adv Neural Inf Process Syst. 2002;15:505.