Multi-objective clustering refers to partitioning a given collection of objects into K groups based on some similarity/dissimilarity criterion while simultaneously optimizing several partition quality measures. The current paper proposes an automated decomposition-based multi-objective clustering technique, SOMDEA_clust, which fuses a self-organizing map (SOM) with multi-objective differential evolution. A novel reproduction operator is designed in which an ensemble of multiple neighborhoods extracted using the self-organizing map is used to construct a mating pool of variable size. The probabilities of selecting the different neighborhood sizes are updated based on how well each size has performed in generating improved solutions over the last few generations. A decomposition-based selection scheme is also used; it divides the multi-objective optimization (MOO) problem into a number of single-objective subproblems. The objective functions corresponding to these subproblems are optimized in a collaborative manner by the use of MOO. The potential of the proposed framework is demonstrated by clustering four real-life data sets and five artificial data sets, in comparison with existing multi-objective clustering techniques (MOCK, SMEA_clust, and MEA_clust), a single-objective genetic clustering technique (SOGA), and a traditional clustering technique (K-means). To show the utility of the SOM-based reproduction operators, another decomposition-based multi-objective clustering technique (MDEA_clust), which does not use SOM-based operators, is also developed in this paper. To show the efficacy of the proposed clustering technique in handling large data sets, two large-scale data sets having more than 5000 data points are also used.
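The adaptive choice of neighborhood size described above can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything here is an assumption made for illustration: the function names, the candidate sizes, the success counts, the probability floor, and the choice to make each probability proportional to the recent success rate.

```python
import random


def update_probabilities(successes, attempts, floor=0.05):
    """Return selection probabilities proportional to recent success rates.

    successes[i] / attempts[i] is the fraction of offspring produced with
    neighborhood size i that improved on an existing solution during the
    last few generations. The rule is an illustrative assumption, not the
    paper's stated formula.
    """
    rates = [s / a if a > 0 else 0.0 for s, a in zip(successes, attempts)]
    # The floor keeps every size selectable, even after a poor run.
    scores = [r + floor for r in rates]
    total = sum(scores)
    return [score / total for score in scores]


def pick_neighborhood_size(sizes, probabilities, rng=random):
    """Roulette-wheel choice of a mating pool (neighborhood) size."""
    return rng.choices(sizes, weights=probabilities, k=1)[0]


sizes = [5, 10, 20]       # hypothetical candidate neighborhood sizes
successes = [2, 6, 2]     # made-up counts of improved offspring per size
attempts = [10, 10, 10]   # made-up counts of offspring generated per size

probs = update_probabilities(successes, attempts)
size = pick_neighborhood_size(sizes, probs)
```

In this sketch the middle size receives the largest probability because it produced the most improvements, while the floor prevents the other sizes from being excluded permanently.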
As a real-life application, the proposed clustering technique is applied to scientific/web document clustering, where a set of scientific/web documents is partitioned based on content similarity. A semantic representation is used to convert each text document into a real-valued vector. Experimental results clearly illustrate the effectiveness of fusing SOM and DE in developing an effective clustering technique.
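One common way to turn a text document into a real-valued vector is to average the word vectors of its tokens. The abstract does not say which semantic representation is used, so the sketch below is only an assumed example: the function name, the toy two-dimensional embedding table, and the averaging rule are all hypothetical.

```python
def document_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; return a zero vector if none are known.

    Averaging is an assumed, commonly used scheme, not necessarily the
    representation used in the paper.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Toy embedding table standing in for pre-trained word vectors.
toy_embeddings = {"cluster": [1.0, 0.0], "data": [0.0, 1.0]}

vec = document_vector("cluster the data".split(), toy_embeddings, dim=2)
```

Tokens missing from the table (here, "the") are skipped, so every document maps to a fixed-length vector that a clustering algorithm can consume.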
Dr. Sriparna Saha would like to acknowledge the support of SERB Women in Excellence Award-SB/WEA/08/2017 for carrying out this work.