Synonyms
Semi-supervised clustering
Definition
The area of clustering with constraints makes use of hints or advice in the form of constraints to aid or bias the clustering process. The most prevalent form of advice is a conjunction of pairwise instance-level constraints of the form must-link (ML) and cannot-link (CL), which state that pairs of instances should be in the same or different clusters, respectively. Given a set of points P to cluster and a set of constraints C, the aim of clustering with constraints is to use the constraints to improve the clustering results. Constraints have so far been used in two main ways: (i) writing algorithms that use a standard distance metric but attempt to satisfy all or as many constraints as possible, and (ii) using the constraints to learn a distance function that is then used in the clustering algorithm.
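To make the ML and CL semantics concrete, the following minimal Python sketch checks which constraints a given cluster assignment violates. It is only an illustration: the function name `violated_constraints`, the point identifiers, and the toy labels are assumptions introduced here, not part of any algorithm described in this entry.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Pair = Tuple[Hashable, Hashable]


def violated_constraints(
    labels: Dict[Hashable, int],
    must_link: Sequence[Pair],
    cannot_link: Sequence[Pair],
) -> Tuple[List[Pair], List[Pair]]:
    """Return the ML pairs and CL pairs that a cluster assignment violates.

    labels maps each point to its cluster index (hypothetical representation).
    """
    # A must-link pair is violated when its two points are in different clusters.
    broken_ml = [(a, b) for a, b in must_link if labels[a] != labels[b]]
    # A cannot-link pair is violated when its two points share a cluster.
    broken_cl = [(a, b) for a, b in cannot_link if labels[a] == labels[b]]
    return broken_ml, broken_cl


# Hypothetical toy data: four points assigned to two clusters.
labels = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}
must_link = [("p1", "p2"), ("p2", "p3")]
cannot_link = [("p1", "p4"), ("p3", "p4")]

broken_ml, broken_cl = violated_constraints(labels, must_link, cannot_link)
print("violated ML:", broken_ml)
print("violated CL:", broken_cl)
```

An algorithm of type (i) could use such a count either as a hard feasibility test (reject any assignment with a violation) or as a penalty to minimize. Which of the two applies depends on the specific method.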
Historical Background
The idea of using constraints to guide clustering was first introduced by Wagstaff and Cardie in their seminal paper...
Recommended Reading
Basu S, Banerjee A, Mooney R. Semi-supervised clustering by seeding. In: Proceedings of the 19th International Conference on Machine Learning; 2002. p. 27–34.
Basu S, Banerjee A, Mooney RJ. Active semi-supervision for pairwise constrained clustering. In: Proceedings of the SIAM International Conference on Data Mining; 2004.
Basu S, Davidson I, Wagstaff K, editors. Constrained clustering: advances in algorithms, theory and applications. New York: Chapman & Hall/CRC Press; 2008.
Cohn D, Caruana R, McCallum A. Semi-supervised clustering with user feedback. Technical Report 2003–1892. Cornell University; 2003.
Davidson I, Ravi SS. Agglomerative hierarchical clustering with constraints: theoretical and empirical results. In: Principles of Data Mining and Knowledge Discovery, 9th European Conference; 2005. p. 59–70.
Davidson I, Ravi SS. Clustering with constraints: feasibility issues and the k-means algorithm. In: Proceedings of the SIAM International Conference on Data Mining; 2005.
Davidson I, Ravi SS. Identifying and generating easy sets of constraints for clustering. In: Proceedings of the 15th National Conference on AI; 2006.
Davidson I, Ester M, Ravi SS. Efficient incremental clustering with constraints. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2007. p. 204–49.
Davidson I, Ravi SS. Intractability and clustering with constraints. In: Proceedings of the 24th International Conference on Machine Learning; 2007. p. 201–8.
Davidson I, Ravi SS. The complexity of non-hierarchical clustering with instance and cluster level constraints. Data Mining Knowl Discov. 2007;14(1):25–61.
Gondek D, Hofmann T. Non-redundant data clustering. In: Proceedings of the 2004 IEEE International Conference on Data Mining; 2004. p. 75–82.
Klein D, Kamvar SD, Manning CD. From instance-level constraints to space-level constraints: making the most of prior knowledge in data clustering. In: Proceedings of the 19th International Conference on Machine Learning; 2002. p. 307–14.
Wagstaff K, Cardie C. Clustering with instance-level constraints. In: Proceedings of the 17th International Conference on Machine Learning; 2000. p. 1103–10.
Wagstaff K, Cardie C, Rogers S, Schroedl S. Constrained K-means clustering with background knowledge. In: Proceedings of the 18th International Conference on Machine Learning; 2001. p. 577–84.
Xing E, Ng A, Jordan M, Russell S. Distance metric learning, with application to clustering with side-information. Adv Neural Inf Process Syst. 2002;15:505.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Davidson, I. (2018). Clustering with Constraints. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_610
DOI: https://doi.org/10.1007/978-1-4614-8265-9_610
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering