Abstract
Despite several attempts to improve clustering ensemble methods, little research has addressed the selection of base-clusterings within fuzzy clustering ensembles. Quality and local diversity of base-clusterings are two important factors in this selection, yet very few studies have considered them together when choosing the best fuzzy base-clusterings for an ensemble. We propose a novel fuzzy clustering ensemble framework that uses a new fuzzy diversity measure and a fuzzy quality measure to find the best-performing base-clusterings. Both measures are defined in terms of the fuzzy normalized mutual information between fuzzy base-clusterings. In our framework, the final clustering is obtained from the selected base-clusterings by one of two consensus functions: (1) a fuzzy co-association matrix is constructed from the selected base-clusterings, and a single traditional clustering algorithm, such as hierarchical agglomerative clustering, is then applied to this matrix to produce the final clustering; (2) a new graph-based fuzzy consensus function. The time complexity of the proposed consensus function is linear in the number of data objects. Experimental results on various standard datasets show that the proposed approach is effective compared with state-of-the-art methods in terms of the evaluation criteria.
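As an illustration of the first consensus function, the following minimal sketch builds a fuzzy co-association matrix and applies agglomerative clustering to it. It rests on assumptions, since the abstract does not give the exact definitions: the co-association of two objects is taken to be the inner product of their membership vectors, averaged over the base-clusterings, and average linkage stands in for the hierarchical agglomerative step. The function names `fuzzy_co_association` and `average_linkage` and the toy membership values are hypothetical.

```python
from itertools import combinations


def fuzzy_co_association(memberships):
    """Build an n x n similarity matrix from fuzzy base-clusterings.

    memberships: list of base-clusterings; each is a list of n rows, and each
    row holds one object's membership degrees over that clustering's clusters.
    Assumed definition (not taken from the paper):
        S[i][j] = mean over base-clusterings of sum_c U[i][c] * U[j][c]
    """
    n = len(memberships[0])
    S = [[0.0] * n for _ in range(n)]
    for U in memberships:
        for i in range(n):
            for j in range(n):
                S[i][j] += sum(a * b for a, b in zip(U[i], U[j]))
    m = len(memberships)
    return [[v / m for v in row] for row in S]


def average_linkage(S, k):
    """Agglomerative clustering on a similarity matrix; stops at k groups."""
    clusters = [[i] for i in range(len(S))]
    while len(clusters) > k:
        def sim(pair):
            # Mean pairwise similarity between the two candidate groups.
            a, b = pair
            total = sum(S[i][j] for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))

        # Merge the most similar pair (a < b, so deleting b keeps a valid).
        a, b = max(combinations(range(len(clusters)), 2), key=sim)
        clusters[a] += clusters[b]
        del clusters[b]
    labels = [0] * len(S)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy data: two fuzzy base-clusterings of four objects (illustrative values).
U1 = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
U2 = [[0.7, 0.3], [0.9, 0.1], [0.3, 0.7], [0.2, 0.8]]

S = fuzzy_co_association([U1, U2])
labels = average_linkage(S, k=2)
print(labels)
```

Because each inner product is computed within a single base-clustering, the base-clusterings may have different numbers of clusters. This sketch costs at least quadratic time in the number of objects, so it does not reflect the linear complexity stated above for the proposed consensus function.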
About this article
Cite this article
Bagherinia, A., Minaei-Bidgoli, B., Hossinzadeh, M. et al. Elite fuzzy clustering ensemble based on clustering diversity and quality measures. Appl Intell 49, 1724–1747 (2019). https://doi.org/10.1007/s10489-018-1332-x
DOI: https://doi.org/10.1007/s10489-018-1332-x