A decision support system to improve medical diagnosis using a combination of k-medoids clustering based attribute weighting and SVM

  • Musa Peker
Part of the topical collection: Systems-Level Quality Improvement


Machine learning tools are now widely used in medical diagnosis, mainly because classification and diagnosis systems built to assist medical professionals during the diagnosis phase have produced effective results. The primary objective of this study is to improve classification accuracy in medical diagnosis problems. To this end, experiments were carried out on three datasets: heart disease, Parkinson's disease (PD) and BUPA liver disorders. A key feature of these datasets is that their classes are not linearly separable. A new data preprocessing method, k-medoids clustering-based attribute weighting (kmAW), is proposed, and the support vector machine (SVM) is used in the classification phase. Performance was evaluated using classification accuracy, sensitivity, specificity, F-measure, the kappa statistic and ROC analysis. Experimental results showed that the resulting hybrid system, kmAW + SVM, gave better results than other methods reported in the literature. Consequently, this hybrid intelligent system can serve as a useful medical decision support tool.
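Most of the evaluation measures named in the abstract can be derived from a two-class confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics`, the `BinaryMetrics` container and the example counts are hypothetical and are not taken from the study. ROC analysis is left out because it needs continuous classifier scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f_measure: float
    kappa: float


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> BinaryMetrics:
    """Compute common diagnostic metrics from 2x2 confusion-matrix counts.

    Assumes every denominator below is non-zero (both classes are present
    and at least one positive prediction was made).
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    kappa = (accuracy - expected) / (1 - expected)
    return BinaryMetrics(accuracy, sensitivity, specificity, f_measure, kappa)


# Hypothetical counts, for illustration only (not results from the study).
example = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(example)
```

Reporting kappa alongside accuracy is useful when the classes are imbalanced, since kappa discounts the agreement that a classifier would reach by chance.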


Keywords: Medical diagnosis; k-medoids clustering-based attribute weighting; Support vector machine; Hybrid classification method; Decision support system



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Information Systems Engineering, Faculty of Technology, Mugla Sitki Kocman University, Mugla, Turkey
