Unsupervised Learning on Healthcare Survey Data with Particle Swarm Optimization

  • Hina Firdaus
  • Syed Imtiyaz Hassan
Chapter
Part of the Learning and Analytics in Intelligent Systems book series (LAIS, volume 13)

Abstract

The behavior of a machine learning algorithm depends heavily on the structure of its input data. Large pools of big data are often ambiguous, noisy, granular, and unlabeled, which makes it hard to find patterns and visualize outcomes. Machine learning addresses this problem with techniques such as clustering, a core part of unsupervised learning. In the proposed model we use unsupervised learning, specifically the clustering algorithms Expectation Maximization (EM) and Make Density Based Clusterer (MDBC), the latter wrapped around a base clustering algorithm such as Kmeans, EM, or GenClust++. The chosen dataset is survey data on healthcare issues, such as chronic diseases and regular treatment, across various states and districts of India. We used the Weka tool to prepare the dataset and train the models on it. The dataset was obtained from the Open Government Data (OGD) Platform India portal under the Government Open Data License - India and is published by the Department of Health and Family Welfare. The dataset contains ordinal values; datasets of this kind are well suited to exploratory data analysis. Particle swarm optimization (PSO) is applied to the dataset to optimize the attributes, which are then used to train the clustering algorithms; this gave better log-likelihood values than the clustering algorithms alone. We also compare the running speed of the algorithms to see how quickly they perform across varying sets of attributes.
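
The attribute-optimization step can be illustrated with a minimal binary PSO sketch in Python. This is not the authors' Weka workflow: the function name binary_pso, the toy ordinal rows, and the variance-minus-penalty fitness are all hypothetical stand-ins chosen for illustration. In a pipeline like the one described above, the fitness would more plausibly be the log likelihood reported by the clustering algorithm on the selected attributes.

```python
import math
import random


def binary_pso(fitness, n_attrs, n_particles=12, n_iters=40,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for a 0/1 attribute mask that maximizes fitness(mask).

    Illustrative sketch only; parameter values are assumptions.
    """
    rng = random.Random(seed)
    # Each particle has a bit mask (position) and real-valued velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_attrs)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_attrs)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_attrs):
                r1, r2 = rng.random(), rng.random()
                # Standard update: inertia + pull to personal and global best.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))  # clamp the velocity
                # The sigmoid maps velocity to the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical ordinal survey rows (values 1..5); columns 1 and 2 are constant.
rows = [
    [1, 3, 2, 1],
    [2, 3, 2, 5],
    [3, 3, 2, 1],
    [4, 3, 2, 5],
    [5, 3, 2, 1],
]


def variance(values):
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


variances = [variance([r[j] for r in rows]) for j in range(len(rows[0]))]


def toy_fitness(mask):
    # Assumed stand-in score: reward informative (high-variance) attributes
    # and charge a fixed cost of 0.5 per selected attribute.
    return sum(v for v, m in zip(variances, mask) if m) - 0.5 * sum(mask)


best_mask, best_fit = binary_pso(toy_fitness, n_attrs=len(variances))
print("selected attribute mask:", best_mask, "fitness:", best_fit)
```

The selected mask would then decide which attributes are passed to the clustering step. Replacing toy_fitness with a function that trains a clusterer on the masked attributes and returns its log likelihood keeps the search loop unchanged.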

Keywords

Unsupervised learning, Particle swarm optimization, Kmeans, Genclust++, UN SDG, Healthcare analytics, Machine learning, Big data

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi, India
