Majority Voting Algorithm for Diagnosing of Imbalanced Malaria Disease

  • T. Sajana
  • M. R. Narasingarao
Conference paper
Part of the Lecture Notes in Computational Vision and Biomechanics book series (LNCVB, volume 30)

Abstract

Vector-borne diseases such as malaria are among the most pressing issues in the medical domain. Accurately identifying and classifying patients from a given set of samples becomes a challenging task when the dataset is imbalanced. Many conventional machine learning and data mining algorithms perform poorly on skewed class distributions because they learn mainly from the majority-class samples. We propose an ensemble method, majority voting, built from a set of machine learning classifiers: the C4.5 decision tree, Naive Bayes, and K-Nearest Neighbor (KNN). Each sample is assigned the class chosen by the majority of the classifiers. Experimental results show that the voting ensemble achieves a classification accuracy of 95.2% on imbalanced malaria disease data and 92.1% on balanced malaria disease data. The voting ensemble also reports 100% precision, recall, and F1-score on the imbalanced malaria disease data sets, compared with 96% precision, recall, and F1-score on the balanced data.
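As an illustration of the voting step and of the reported metrics, the following is a minimal sketch in plain Python, not the authors' implementation. The function names, the tie-breaking rule (first label in classifier order wins), the class labels, and the prediction lists are all assumptions made for the example; the real C4.5, Naive Bayes, and KNN models are replaced by hard-coded hypothetical outputs.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers for a single sample.

    Assumption: ties go to the label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def ensemble_predict(per_classifier_predictions):
    """Combine one prediction list per classifier, sample by sample."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]


def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1-score for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical outputs of three already-trained classifiers on five samples.
tree_preds = ["malaria", "healthy", "healthy", "malaria", "healthy"]
bayes_preds = ["malaria", "healthy", "malaria", "healthy", "healthy"]
knn_preds = ["healthy", "healthy", "malaria", "malaria", "healthy"]

# Hypothetical ground-truth labels for the same five samples.
y_true = ["malaria", "healthy", "malaria", "healthy", "healthy"]

voted = ensemble_predict([tree_preds, bayes_preds, knn_preds])
precision, recall, f1 = precision_recall_f1(y_true, voted, positive="malaria")
print(voted)
print(precision, recall, f1)
```

With three classifiers and two classes a tie cannot occur, so the tie-breaking rule only matters if the sketch is reused with an even number of voters or with more classes. Reporting precision, recall, and F1-score alongside accuracy matters on skewed data, where a model that favors the majority class can score high accuracy while missing most minority-class cases.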

Keywords

Malaria disease; Balanced data; Imbalanced data; Voting ensembler

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science & Engineering, K L E F, Vaddeswaram, Guntur, India