
Applied Intelligence, Volume 49, Issue 5, pp 1724–1747

Elite fuzzy clustering ensemble based on clustering diversity and quality measures

  • Ali Bagherinia
  • Behrooz Minaei-Bidgoli (corresponding author)
  • Mehdi Hossinzadeh
  • Hamid Parvin

Abstract

Despite several attempts to improve the quality of clustering ensemble methods, little research has been devoted to the selection procedure within fuzzy clustering ensembles. Quality and local diversity are two important factors in selecting base-clusterings, yet very few studies have considered the two together when choosing the best fuzzy base-clusterings in an ensemble. We propose a novel fuzzy clustering ensemble framework that uses a new fuzzy diversity measure and a fuzzy quality measure to find the base-clusterings with the best performance. Both diversity and quality are defined in terms of the fuzzy normalized mutual information between fuzzy base-clusterings. In our framework, the final clustering is obtained from the selected base-clusterings by one of two types of consensus function: (1) a fuzzy co-association matrix is constructed from the selected base-clusterings, and a single traditional clustering algorithm, such as hierarchical agglomerative clustering, is then applied to this matrix to produce the final clustering; (2) a new graph-based fuzzy consensus function is applied. The time complexity of the proposed consensus function is linear in the number of data objects. Experimental results on various standard datasets show that the proposed approach is effective compared with state-of-the-art methods in terms of the evaluation criteria used.
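
The first consensus function described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of the fuzzy co-association matrix, so the code below assumes one common choice: the similarity of two data objects is the dot product of their membership vectors, averaged over the base-clusterings. The agglomerative step assumes average linkage. The function names `fuzzy_co_association` and `average_link_clusters` and the toy membership values are hypothetical and are not taken from the paper.

```python
from typing import List

# One fuzzy base-clustering: rows are data objects, columns are cluster memberships.
Membership = List[List[float]]


def fuzzy_co_association(base_clusterings: List[Membership]) -> List[List[float]]:
    """Assumed definition: average over base-clusterings of the dot product
    of two objects' membership rows (not necessarily the paper's formula)."""
    n = len(base_clusterings[0])
    co = [[0.0] * n for _ in range(n)]
    for u in base_clusterings:
        for i in range(n):
            for j in range(n):
                co[i][j] += sum(a * b for a, b in zip(u[i], u[j]))
    m = len(base_clusterings)
    return [[value / m for value in row] for row in co]


def average_link_clusters(sim: List[List[float]], k: int) -> List[int]:
    """Agglomerative clustering on a similarity matrix: repeatedly merge the
    two groups with the highest average pairwise similarity until k remain."""
    groups = [[i] for i in range(len(sim))]
    while len(groups) > k:
        best, best_pair = -1.0, (0, 1)
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                total = sum(sim[i][j] for i in groups[a] for j in groups[b])
                avg = total / (len(groups[a]) * len(groups[b]))
                if avg > best:
                    best, best_pair = avg, (a, b)
        a, b = best_pair
        groups[a].extend(groups[b])
        del groups[b]
    labels = [0] * len(sim)
    for label, members in enumerate(groups):
        for i in members:
            labels[i] = label
    return labels


# Toy ensemble: two fuzzy base-clusterings of four objects. The cluster
# columns are swapped in the second one; the co-association matrix does
# not depend on how clusters are labelled within a base-clustering.
u1 = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
u2 = [[0.2, 0.8], [0.3, 0.7], [0.9, 0.1], [0.7, 0.3]]

co = fuzzy_co_association([u1, u2])
labels = average_link_clusters(co, k=2)
print(labels)
```

Because each entry compares two objects within the same base-clustering, no cluster-label alignment across base-clusterings is needed before building the matrix. Note that this pairwise construction is quadratic in the number of data objects, so it is only an illustration of the co-association route.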

Keywords

Consensus function; Diversity; Fuzzy clustering ensemble; Selective fuzzy clustering ensemble

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
  2. Computer Engineering Department, Iran University of Science and Technology, Tehran, Iran
  3. Iran University of Medical Sciences, Tehran, Iran
  4. Computer Science, University of Human Development, Sulaimaniyah, Iraq
