

Semi-supervised data clustering using particle swarm optimisation


Abstract

In this study, we propose the semi-supervised particle swarm optimisation (ssPSO) algorithm for data clustering. The algorithm combines the strengths of semi-supervised fuzzy c-means (ssFCM) and particle swarm optimisation (PSO) to allow a more informed search, guided by labelled data, across a small number of iterations while maintaining diversity in the search process. ssFCM algorithms can find meaningful clusters by using available labelled data to guide the learning process. PSOs are often chosen to solve clustering problems because of their versatility in problem representation and their exploration capabilities. To assess the quality of ssPSOs and provide practical insights to researchers, the clustering performance and clustering behaviour of ssPSOs are investigated and compared with PSO variants and ssFCMs. Two ssPSO approaches were studied: one applied at initialisation only and the other applied throughout the learning process. The ssPSO, PSO and ssFCM algorithms were evaluated on accuracy and quantisation error (QE) using 13 UCI datasets with different sizes, dimensions, numbers of classes and distributions, exploring several swarm size and maximum iteration settings over 100 runs. Biplots and convergence graphs were also examined visually. With swarm size 20 and maximum iteration 20, ssPSOs were found to perform competitively with ssFCM on most datasets in terms of accuracy and to outperform ssFCM in terms of QE. The results demonstrate that ssPSOs perform particularly well on sparsely distributed datasets with overlapping clusters and produce clusters with better structure in terms of QE. Furthermore, ssPSOs performed competitively with ssFCM on datasets with more than three clusters, while quantum-behaved PSO (QPSO) performed poorly on such datasets.
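To make the ideas in the abstract concrete, the following is a minimal sketch of PSO-based clustering in which labelled data are used at initialisation only. It is not the authors' implementation: the abstract does not give the update rules or the fitness definition, so the sketch assumes a standard global-best PSO, a QE defined as the mean over non-empty clusters of the mean point-to-centroid distance, and particles seeded at the class means of the labelled points. The function names (`quantisation_error`, `class_means`, `seeded_pso`) and the constants `w`, `c1` and `c2` are illustrative assumptions; only the swarm size and maximum iteration defaults of 20 come from the abstract.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def quantisation_error(centroids, data):
    """Assumed QE: mean over non-empty clusters of mean point-to-centroid distance."""
    members = [[] for _ in centroids]
    for p in data:
        nearest = min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
        members[nearest].append(dist(p, centroids[nearest]))
    used = [m for m in members if m]  # skip empty clusters to avoid dividing by zero
    return sum(sum(m) / len(m) for m in used) / len(used)

def class_means(labelled):
    """Mean point of each class, computed from (point, label) pairs."""
    groups = {}
    for p, y in labelled:
        groups.setdefault(y, []).append(p)
    return [[sum(c) / len(c) for c in zip(*pts)] for _, pts in sorted(groups.items())]

def seeded_pso(data, labelled, swarm_size=20, max_iter=20,
               w=0.72, c1=1.49, c2=1.49, seed=0):
    """Global-best PSO over centroid sets, seeded from labelled class means."""
    rng = random.Random(seed)
    seeds = class_means(labelled)
    # Assumption: every particle starts near the labelled class means (small noise).
    swarm = [[[x + rng.gauss(0, 0.1) for x in c] for c in seeds]
             for _ in range(swarm_size)]
    swarm[0] = [list(c) for c in seeds]  # keep one unperturbed particle
    vel = [[[0.0] * len(c) for c in seeds] for _ in range(swarm_size)]
    pbest = [[list(c) for c in p] for p in swarm]
    pbest_fit = [quantisation_error(p, data) for p in swarm]
    g = min(range(swarm_size), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [list(c) for c in pbest[g]], pbest_fit[g]
    for _ in range(max_iter):
        for i, particle in enumerate(swarm):
            for k, centroid in enumerate(particle):
                for d in range(len(centroid)):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + cognitive + social terms.
                    vel[i][k][d] = (w * vel[i][k][d]
                                    + c1 * r1 * (pbest[i][k][d] - centroid[d])
                                    + c2 * r2 * (gbest[k][d] - centroid[d]))
                    centroid[d] += vel[i][k][d]
            fit = quantisation_error(particle, data)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [list(c) for c in particle], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [list(c) for c in particle], fit
    return gbest, gbest_fit
```

Calling `seeded_pso(data, labelled)` with `data` as a list of points and `labelled` as a list of `(point, label)` pairs returns the best centroid set found and its QE. Because one particle starts exactly at the labelled class means, the returned QE can never be worse than the QE of those means. A variant that uses labels throughout the learning process would need an additional supervised term in the fitness or the update, which the abstract does not specify.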




Acknowledgements

The authors would like to thank Dr. Mikiko Sato from Tokai University for her feedback. This work is partially supported by Universiti Brunei Darussalam under Grant UBD/PNC2/2/RG/1(311). D. T. C. Lai is funded by Hosei University under the Hosei International Fund Foreign Scholars Fellowship.

Author information

Correspondence to Daphne T. C. Lai.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by V. Loia.


About this article


Cite this article

Lai, D.T.C., Miyakawa, M. & Sato, Y. Semi-supervised data clustering using particle swarm optimisation. Soft Comput 24, 3499–3510 (2020). https://doi.org/10.1007/s00500-019-04114-z


Keywords

  • Semi-supervised clustering
  • Particle swarm optimisation
  • Bare bones