Elite fuzzy clustering ensemble based on clustering diversity and quality measures

Applied Intelligence

Abstract

Despite several attempts to improve the quality of clustering ensemble methods, little research has addressed the selection procedure within fuzzy clustering ensembles. Quality and local diversity are two important factors in selecting base-clusterings, yet very few studies have considered them together when choosing the best fuzzy base-clusterings for an ensemble. We propose a novel fuzzy clustering ensemble framework that uses a new fuzzy diversity measure and a fuzzy quality measure to find the best-performing base-clusterings. Both measures are defined in terms of the fuzzy normalized mutual information between fuzzy base-clusterings. In our framework, the final clustering is obtained from the selected base-clusterings by one of two consensus functions: (1) a fuzzy co-association matrix is constructed from the selected base-clusterings, and a traditional clustering algorithm such as hierarchical agglomerative clustering is then applied to that matrix as the consensus function; (2) a new graph-based fuzzy consensus function. The time complexity of the proposed consensus function is linear in the number of data objects. Experimental results on various standard datasets show that the proposed approach is effective compared with state-of-the-art methods in terms of the evaluation criteria.
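The first consensus function can be illustrated with a minimal sketch. The abstract does not give the exact definition of the fuzzy co-association matrix, so the sketch assumes a common one: the co-association of two objects is the inner product of their membership vectors, averaged over the base-clusterings. The function names `fuzzy_co_association` and `average_linkage`, the toy membership values, and the choice of average linkage are all illustrative assumptions, not the authors' implementation. The naive agglomerative step below is also far from linear time; it is not the graph-based consensus function the abstract mentions.

```python
from itertools import combinations


def fuzzy_co_association(memberships):
    """Assumed definition: mean over base-clusterings of the inner
    product of two objects' membership vectors.

    memberships: list of base-clusterings; each base-clustering is a list
    of per-object membership rows (each row sums to 1). The number of
    clusters may differ between base-clusterings.
    """
    n = len(memberships[0])
    co = [[0.0] * n for _ in range(n)]
    for u in memberships:
        for i in range(n):
            for j in range(n):
                co[i][j] += sum(a * b for a, b in zip(u[i], u[j]))
    m = len(memberships)
    return [[value / m for value in row] for row in co]


def average_linkage(co, k):
    """Naive agglomerative clustering on a similarity matrix: repeatedly
    merge the two groups with the highest mean co-association until k
    groups remain, then return one label per object."""
    groups = [[i] for i in range(len(co))]

    def similarity(a, b):
        return sum(co[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(groups) > k:
        a, b = max(
            combinations(range(len(groups)), 2),
            key=lambda pair: similarity(groups[pair[0]], groups[pair[1]]),
        )
        groups[a] += groups[b]  # a < b, so deleting b keeps index a valid
        del groups[b]

    labels = [0] * len(co)
    for label, group in enumerate(groups):
        for i in group:
            labels[i] = label
    return labels


# Two hypothetical fuzzy base-clusterings of four objects (toy values).
u1 = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
u2 = [[0.7, 0.3], [0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]

co = fuzzy_co_association([u1, u2])
labels = average_linkage(co, k=2)
print(labels)
```

In this toy case objects 0 and 1 have high co-association with each other, as do objects 2 and 3, so the two pairs end up in separate final clusters. In the framework described above, only the base-clusterings that pass the diversity and quality selection would be passed to such a function.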

Corresponding author

Correspondence to Behrooz Minaei-Bidgoli.

Cite this article

Bagherinia, A., Minaei-Bidgoli, B., Hossinzadeh, M. et al. Elite fuzzy clustering ensemble based on clustering diversity and quality measures. Appl Intell 49, 1724–1747 (2019). https://doi.org/10.1007/s10489-018-1332-x

  • DOI: https://doi.org/10.1007/s10489-018-1332-x
