Abstract
Condorcet clustering methods have two attractive features: they produce clusterings that place similar points in the same cluster and dissimilar points in different clusters, and they do not require the number of clusters to be specified a priori. Their disadvantages are that they are combinatorially hard and produce only convex clusters. We propose a novel modification to this method that improves it significantly on both counts and works particularly well when applied to social-network-type datasets. Specifically, we restrict the domain of the clustering to a Delaunay triangulation, whose size scales as \(O(n^{\lfloor m/2 \rfloor })\), where n is the number of records and m is the number of attributes used for the clustering. The triangulation also limits focus to local structure, which allows for non-convex clusterings. We demonstrate its use in comparison to other well-known heuristic methods on several constructed datasets, then use it to cluster real-world datasets.
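To make the domain reduction concrete, the following is a minimal sketch (not the authors' implementation) of how a Delaunay triangulation keeps only local pairs of points. It assumes two attributes (m = 2) and uses a brute-force empty-circumcircle test, which is practical only for tiny inputs; a real pipeline would presumably rely on a computational geometry library. The function names `circumcircle` and `delaunay_edges` and the sample points are hypothetical.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if a, b, c are collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_edges(points):
    """Brute-force 2-D Delaunay edges: keep triangles whose circumcircle is empty."""
    n = len(points)
    edges = set()
    for i, j, k in combinations(range(n), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue  # collinear triple, no triangle
        ux, uy, r2 = circle
        # A triangle is Delaunay if no other point lies strictly inside its circumcircle.
        if all(
            (points[p][0] - ux) ** 2 + (points[p][1] - uy) ** 2 >= r2 - 1e-9
            for p in range(n)
            if p not in (i, j, k)
        ):
            edges.update({(i, j), (i, k), (j, k)})
    return edges


# Hypothetical records with two attributes each: a wide diamond.
points = [(0.0, 0.0), (4.0, 0.0), (2.0, 1.0), (2.0, -1.0)]
edges = delaunay_edges(points)
all_pairs = len(points) * (len(points) - 1) // 2
print(sorted(edges))          # [(0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(len(edges), all_pairs)  # 5 6
```

In this toy case the long pair (0, 1) is dropped because the two far-apart points are separated by closer neighbors. A clustering objective evaluated only over the retained edges would therefore consider local relationships, which is the intuition behind allowing non-convex clusters.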
This research was completed in partial fulfillment of the United States Military Academy’s Network Science Minor program and sponsored by the West Point Network Science Center.
Copyright information
© 2019 This is a U.S. government work and not under copyright protection in the United States; foreign copyright protection may apply
About this paper
Cite this paper
Bassett, M. et al. (2019). Condorcet Optimal Clustering with Delaunay Triangulation: Climate Zones and World Happiness Insights. In: Thomson, R., Bisgin, H., Dancy, C., Hyder, A. (eds) Social, Cultural, and Behavioral Modeling. SBP-BRiMS 2019. Lecture Notes in Computer Science, vol 11549. Springer, Cham. https://doi.org/10.1007/978-3-030-21741-9_9
DOI: https://doi.org/10.1007/978-3-030-21741-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21740-2
Online ISBN: 978-3-030-21741-9
eBook Packages: Computer Science, Computer Science (R0)