Unsupervised Learning on Healthcare Survey Data with Particle Swarm Optimization

  • Hina Firdaus
  • Syed Imtiyaz Hassan
Part of the Learning and Analytics in Intelligent Systems book series (LAIS, volume 13)


The behavior of a machine learning algorithm depends heavily on the structure of the data it is given. Large pools of data contain ambiguity, noise, varying granularity, and unlabeled records, which make it difficult to find patterns and visualize an outcome. Machine learning addresses this problem with clustering, a core family of unsupervised learning algorithms. In the proposed model we use unsupervised learning, specifically clustering algorithms such as Expectation Maximization (EM) and Make Density Based Analysis (MDBC) wrapped around clustering algorithms such as Kmeans, EM, and GenClust++. The dataset chosen is survey data on healthcare issues such as chronic diseases and regular treatment across various states and districts of India. We used the Weka tool to formulate and train on our dataset. The dataset was obtained from the Open Government Data Platform India portal under the Government Open Data License - India, from the Department of Health and Family Welfare. The dataset contains ordinal values. Datasets of this kind are excellent candidates for exploratory data analysis. Particle swarm optimization is applied to the dataset to optimize the attributes, which are then used to train the clustering algorithms; this gave a better log-likelihood value compared with the results of the simple clustering algorithms. We also compare the running speed of the algorithms to check how quickly they perform across the variety of attributes.
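
The attribute-optimization step can be illustrated with a minimal sketch of binary particle swarm optimization over attribute masks. This is not the Weka implementation used in the chapter. The fitness function (total variance of the selected attributes minus a size penalty), the toy ordinal rows, and every name and parameter value below are assumptions made for illustration only. In the chapter's pipeline the selected attributes would then be passed to the clustering algorithms and judged by log likelihood.

```python
import math
import random


def variance(column):
    """Population variance of one attribute column."""
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / len(column)


def make_fitness(rows, penalty=0.5):
    """Toy fitness (an assumption, not the chapter's criterion):
    total variance of the selected attributes minus a per-attribute penalty."""
    scores = [variance(col) for col in zip(*rows)]

    def fitness(mask):
        kept = sum(s for s, bit in zip(scores, mask) if bit)
        return kept - penalty * sum(mask)

    return fitness


def binary_pso(fitness, n_attrs, n_particles=20, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over the attributes."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_attrs)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_attrs for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_attrs):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))  # clamp velocity
                # Sigmoid transfer: velocity becomes the probability of a 1 bit.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical ordinal survey rows; columns 0 and 2 are constant.
rows = [
    [1, 3, 2, 1, 5],
    [1, 1, 2, 4, 1],
    [1, 2, 2, 1, 5],
    [1, 3, 2, 4, 1],
    [1, 1, 2, 2, 3],
    [1, 2, 2, 3, 3],
]
fitness = make_fitness(rows)
best_mask, best_fit = binary_pso(fitness, n_attrs=len(rows[0]))
print(best_mask, round(best_fit, 3))
```

A mask entry of 1 means the attribute is kept. With this placeholder fitness the constant columns carry no variance, so keeping them only costs the penalty. Any other subset-quality measure could be substituted for `fitness` without changing the search loop.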


Unsupervised learning, Particle swarm optimization, Kmeans, Genclust++, UN SDG, Healthcare analytics, Machine learning, Big data



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi, India
