Abstract
Partition-based clustering is the task of partitioning a dataset in a number of groups of examples, such that examples in each group are similar to each other. Many criteria for what constitutes a good clustering have been identified in the literature; furthermore, the use of additional constraints to find more useful clusterings has been proposed. In this chapter, it will be shown that most of these clustering tasks can be formalized using optimization criteria and constraints. We demonstrate how a range of clustering tasks can be modelled in generic constraint programming languages with these constraints and optimization criteria. Using the constraint-based modeling approach we also relate the DBSCAN method for density-based clustering to the label propagation technique for community discovery.
Siegfried Nijssen can currently be reached at the Institute of Information and Communication Technologies, Electronics and Applied Mathematics, UC Louvain, Belgium.
Tias Guns can currently be reached at the Vrije Universiteit Brussel.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsReferences
Aggarwal, C.C., Reddy, C.K.: Data Clustering: Algorithms and Applications, 1st edn. Chapman & Hall/CRC, Boca Raton (2013)
Babaki, B., Guns, T., Nijssen, S.: Constrained clustering using column generation. In: Simonis, H. (ed.) CPAIOR 2014. LNCS, vol. 8451, pp. 438–454. Springer, Heidelberg (2014). doi:10.1007/978-3-319-07046-9_31
Basu, S., Davidson, I., Wagstaff, K., Clustering, C.: Advances in Algorithms, Theory, and Applications, 1st edn. Chapman & Hall/CRC, Boca Raton (2008)
Berthold, M.R., Borgelt, C., Hppner, F., Klawonn, F.: Guide to Intelligent Data Analysis: How to Intelligently Make Sense of Real Data, 1st edn. Springer, Heidelberg (2010)
Coscia, M., Giannotti, F., Pedreschi, D.: A classification for community discovery methods in complex networks. Stat. Anal. Data Min. 4(5), 512–546 (2011)
Dao, T., Duong, K., Vrain, C.: A filtering algorithm for constrained clustering with within-cluster sum of dissimilarities criterion. In: 2013 IEEE 25th International Conference on Tools with Artificial Intelligence, Herndon, VA, USA, 4–6 November 2013, pp. 1060–1067 (2013)
Dao, T.-B.-H., Duong, K.-C., Vrain, C.: A declarative framework for constrained clustering. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 419–434. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40994-3_27
Dao, T.-B.-H., Duong, K.-C., Vrain, C.: Constrained clustering by constraint programming. Artif. Intell. (2015)
Davidson, I., Ravi, S.: The complexity of non-hierarchical clustering with instance and cluster level constraints. Data Min. Knowl. Disc. 14(1), 25–61 (2007)
Davidson, I., Ravi, S.S.: Clustering with constraints: feasibility issues and the k-means algorithm. In: Proceedings of the 2005 SIAM International Conference on Data Mining, SDM 2005, Newport Beach, CA, USA, 21–23 April 2005, pp. 138–149 (2005)
du Merle, O., Hansen, P., Jaumard, B., Mladenovic, N.: An interior point algorithm for minimum sum-of-squares clustering. SIAM J. Sci. Comput. 21(4), 1485–1505 (1999)
Ester, M., Kriegel, H.-P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Simoudis, E., Han, J., Fayyad, U.M. (eds.) KDD, pp. 226–231. AAAI Press, Menlo Park (1996)
Gonzalez, T.F.: Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci. 38, 293–306 (1985)
Hansen,P., Aloise, D.: A survey on exact methods for minimum sum-of-squares clustering, pp. 1–2, January 2009. http://www.math.iit.edu/Buck65files/msscStLouis.pdf
Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput. Surv. 31(3), 264–323 (1999)
Mueller, M., Kramer, S.: Integer linear programming models for constrained clustering. In: Pfahringer, B., Holmes, G., Hoffmann, A. (eds.) DS 2010. LNCS (LNAI), vol. 6332, pp. 159–173. Springer, Heidelberg (2010). doi:10.1007/978-3-642-16184-1_12
Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76(2), 036106+ (2007)
Wagstaff, K., Cardie,C.: Clustering with instance-level constraints. In: Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29–July 2 2000, pp. 1103–1110 (2000)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this chapter
Cite this chapter
Grossi, V., Guns, T., Monreale, A., Nanni, M., Nijssen, S. (2016). Partition-Based Clustering Using Constraint Optimization. In: Bessiere, C., De Raedt, L., Kotthoff, L., Nijssen, S., O'Sullivan, B., Pedreschi, D. (eds) Data Mining and Constraint Programming. Lecture Notes in Computer Science(), vol 10101. Springer, Cham. https://doi.org/10.1007/978-3-319-50137-6_11
Download citation
DOI: https://doi.org/10.1007/978-3-319-50137-6_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50136-9
Online ISBN: 978-3-319-50137-6
eBook Packages: Computer ScienceComputer Science (R0)