Applied Intelligence, Volume 49, Issue 7, pp 2567–2581

A fuzzy clustering ensemble based on cluster clustering and iterative fusion of base clusters

  • Musa Mojarad
  • Samad Nejatian (Email author)
  • Hamid Parvin
  • Majid Mohammadpoor

Abstract

Clustering ensembles have emerged as a way to obtain more robust, novel, stable, and consistent clustering results. Clustering ensemble frameworks follow two kinds of approaches: (a) ensemble creation approaches, which focus on creating or preparing a suitable ensemble, and (b) ensemble aggregation approaches, which try to find a suitable final clustering (also called a consensus clustering) from a given ensemble. The first kind addresses the ensemble creation problem; the second addresses the aggregation problem. This paper proposes an ensemble aggregator, or consensus function, called Robust Clustering Ensemble based on Sampling and Cluster Clustering (RCESCC). The RCESCC algorithm first generates an ensemble of fuzzy clusterings by running the fuzzy c-means algorithm on subsampled data. It then derives a cluster-cluster similarity matrix from the fuzzy clusters. Next, it partitions the fuzzy clusters by applying a hierarchical clustering algorithm to the cluster-cluster similarity matrix. Finally, it assigns the data points to the merged clusters. Experimental comparisons with state-of-the-art clustering algorithms indicate the effectiveness of the RCESCC algorithm in terms of performance, speed, and robustness.
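
The four phases named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the linkage criterion, or the assignment rule, so the cosine similarity between membership vectors, the average linkage, the mean-membership assignment, and every function and parameter name (`ensemble_consensus`, `n_base`, `frac`, and so on) are assumptions made here for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def memberships(X, centers, m=2.0):
    """Standard fuzzy c-means membership of every row of X in every center."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # guard against division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuzzy_c_means(X, c, rng, m=2.0, n_iter=100):
    """Plain fuzzy c-means with a fixed iteration count; returns c centers."""
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_iter):
        w = memberships(X, centers, m) ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    return centers


def ensemble_consensus(X, k, n_base=10, frac=0.7, seed=0):
    """Hypothetical consensus pipeline following the abstract's four phases."""
    rng = np.random.default_rng(seed)
    n = len(X)

    # Phase 1: fuzzy c-means on subsamples; memberships are then evaluated
    # on the full data set so all fuzzy clusters are comparable.
    cols = []
    for _ in range(n_base):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        centers = fuzzy_c_means(X[idx], k, rng)
        cols.append(memberships(X, centers))
    U = np.hstack(cols)  # shape (n, n_base * k), one column per fuzzy cluster

    # Phase 2: cluster-cluster similarity (assumed: cosine similarity
    # between membership vectors).
    Un = U / np.linalg.norm(U, axis=0, keepdims=True)
    sim = Un.T @ Un

    # Phase 3: hierarchical clustering of the fuzzy clusters (assumed:
    # average linkage on 1 - similarity, cut into k groups).
    dist = np.clip(1.0 - sim, 0.0, None)
    dist = (dist + dist.T) / 2.0
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    groups = fcluster(Z, t=k, criterion="maxclust")

    # Phase 4: assign each point to the merged cluster with the highest
    # mean membership (assumed rule).
    agg = np.column_stack(
        [U[:, groups == g].mean(axis=1) for g in np.unique(groups)]
    )
    return agg.argmax(axis=1)
```

Calling `ensemble_consensus(X, k=2)` on an `(n, d)` array returns one integer label per row. Evaluating memberships on the full data set, rather than only on each subsample, is one simple way to make clusters from different subsamples directly comparable; other choices are possible.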

Keywords

  • Clustering ensemble
  • Fuzzy c-means
  • Between-cluster similarity
  • Subsampling

Notes

Acknowledgments

This paper is extracted from a PhD thesis written by Musa Mojarad.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Musa Mojarad (1)
  • Samad Nejatian (2, 3), Email author
  • Hamid Parvin (4, 5)
  • Majid Mohammadpoor (3, 4)

  1. Department of Computer Engineering, Yasooj Branch, Islamic Azad University, Yasooj, Iran
  2. Department of Electrical Engineering, Yasooj Branch, Islamic Azad University, Yasooj, Iran
  3. Young Researchers and Elite Club, Yasooj Branch, Islamic Azad University, Yasooj, Iran
  4. Department of Computer Engineering, Nourabad Mamasani Branch, Islamic Azad University, Nourabad Mamasani, Iran
  5. Young Researchers and Elite Club, Nourabad Mamasani Branch, Islamic Azad University, Nourabad Mamasani, Iran
