Sophisticated SOM based genetic operators in multi-objective clustering framework

Applied Intelligence

Abstract

Multi-objective clustering refers to partitioning a given collection of objects into K groups based on some similarity/dissimilarity criterion while simultaneously optimizing several partition quality measures. This paper proposes an automated, decomposition-based multi-objective clustering technique, SOMDEA_clust, which fuses a self-organizing map (SOM) with multi-objective differential evolution (DE). A novel reproduction operator is designed in which an ensemble of multiple neighborhoods extracted using the SOM is used to construct a mating pool of variable size. The probabilities of selecting the different neighborhood sizes are updated based on how well each size has performed in generating improved solutions over the last few generations. A decomposition-based selection scheme is also used, which divides the multi-objective optimization (MOO) problem into a number of single-objective subproblems. The objective functions corresponding to these subproblems are optimized collaboratively using MOO. The potential of the proposed framework is demonstrated by clustering four real-life and five artificial data sets, in comparison with several existing multi-objective clustering techniques (MOCK, SMEA_clust, and MEA_clust), a single-objective genetic clustering technique (SOGA), and a traditional clustering technique (K-means). To show the utility of the SOM-based reproduction operators, another decomposition-based multi-objective clustering technique without SOM-based operators (MDEA_clust) is also developed. To show the efficacy of the proposed technique on large data sets, two large-scale data sets with more than 5000 data points each are also used. As a real-life application, the proposed technique is applied to scientific/web document clustering, where a set of scientific/web documents is partitioned based on content similarity. A semantic representation is used to convert each text document into a real-valued vector. Experimental results illustrate the effectiveness of fusing SOM and DE in developing an effective clustering technique.
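The adaptive choice of neighborhood size described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything specific here is an assumption made for illustration: the function names `update_size_probabilities` and `pick_mating_pool`, the candidate sizes, the success-rate formula, and the `floor` term that keeps every size selectable. It is not the authors' implementation.

```python
import random


def update_size_probabilities(successes, trials, floor=0.05):
    """Turn recent per-size success counts into selection probabilities.

    successes[i] / trials[i] is the fraction of offspring produced with
    neighbourhood size i that improved on an existing solution during the
    last few generations. The floor (an assumption) stops any size from
    dropping to zero probability.
    """
    rates = [(s / t if t > 0 else 0.0) + floor
             for s, t in zip(successes, trials)]
    total = sum(rates)
    return [r / total for r in rates]


def pick_mating_pool(neighbors_by_closeness, sizes, probabilities, rng=random):
    """Sample a neighbourhood size, then keep that many nearest neighbours.

    neighbors_by_closeness is assumed to be ordered by distance on the
    SOM grid, closest first.
    """
    size = rng.choices(sizes, weights=probabilities, k=1)[0]
    return neighbors_by_closeness[:size]
```

In this sketch, sizes that recently produced more improved solutions are drawn more often, while the floor preserves some exploration of the other sizes.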

Notes

  1. https://www.kaggle.com/benhamner/exploring-the-nips-2015-papers/data

  2. https://github.com/stanfordnlp/GloVe

Acknowledgments

Dr. Sriparna Saha would like to acknowledge the support of SERB Women in Excellence Award-SB/WEA/08/2017 for carrying out this work.

Author information

Corresponding author

Correspondence to Naveen Saini.

About this article

Cite this article

Saini, N., Saha, S., Harsh, A. et al. Sophisticated SOM based genetic operators in multi-objective clustering framework. Appl Intell 49, 1803–1822 (2019). https://doi.org/10.1007/s10489-018-1350-8

  • DOI: https://doi.org/10.1007/s10489-018-1350-8
