Comparative Study of Classification Algorithm for Diabetics Data
Data mining techniques play a major role in healthcare centers in analyzing large volumes of data. For diabetes patients, a blood glucose level that diverges from the typical range can lead to serious complications, so patients must be monitored regularly to detect any critical variations. A predictive model for monitoring the glucose level would enable patients to take preventive measures. This paper describes a solution for early detection of diabetes that applies various data mining techniques to build informative structures trained on specific data. The main goal of the research is to generate clear and understandable pattern descriptions in order to extract the knowledge and information stored in the dataset. We investigate the relative performance of several classifiers: Naive Bayes, Support Vector Machine (SVM) trained with SMO, Decision Tree, and Neural Network (multilayer perceptron). Ensemble data mining approaches are improved by means of the classification algorithms. The experimental results show that, with the splitting technique (ST), Naive Bayes achieves the best accuracy, 83.5%, when the dataset is split in a 70–30 ratio. With cross-validation (CV), the decision tree achieves the best result, 78.3%, compared with the other classifiers. The experiments are performed in the Weka tool on the diabetes dataset from the UCI repository. The study shows the potential of an ensemble predictive model for predicting instances of diabetes from the UCI repository diabetes data. The results are compared across the classifiers, and the accuracy on the test data is measured.
Keywords: Data mining, Diabetes, Classification algorithm, Naive Bayes, Support vector machine, Decision tree, Neural network
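The two evaluation protocols named in the abstract, the 70–30 splitting technique and cross-validation, can be illustrated with a minimal sketch. It is not the authors' Weka workflow: it uses plain Python, a hand-written Gaussian Naive Bayes classifier, and synthetic two-feature data in place of the UCI diabetes dataset. The function names, the random seeds, and the choice of k = 10 folds are assumptions made for illustration, since the abstract does not state the number of folds.

```python
import math
import random
from statistics import mean, pvariance


def fit_gaussian_nb(rows, labels):
    """Estimate per-class priors plus per-feature mean and variance."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        columns = list(zip(*members))
        model[cls] = {
            "prior": len(members) / len(rows),
            "means": [mean(c) for c in columns],
            # A small floor keeps every variance strictly positive.
            "vars": [pvariance(c) + 1e-9 for c in columns],
        }
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one row."""
    def log_posterior(stats):
        total = math.log(stats["prior"])
        for x, mu, var in zip(row, stats["means"], stats["vars"]):
            total += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return total
    return max(model, key=lambda cls: log_posterior(model[cls]))


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the true label."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def holdout_accuracy(rows, labels, train_fraction=0.7, seed=0):
    """Splitting technique: shuffle once, train on 70%, test on 30%."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train, test = idx[:cut], idx[cut:]
    model = fit_gaussian_nb([rows[i] for i in train], [labels[i] for i in train])
    return accuracy(model, [rows[i] for i in test], [labels[i] for i in test])


def cross_val_accuracy(rows, labels, k=10, seed=0):
    """k-fold cross-validation: each fold is the test set exactly once."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_gaussian_nb([rows[i] for i in train], [labels[i] for i in train])
        scores.append(accuracy(model, [rows[i] for i in fold], [labels[i] for i in fold]))
    return mean(scores)


def make_synthetic(n=200, seed=1):
    """Synthetic two-feature, two-class data; NOT the UCI diabetes data."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        rows.append([rng.gauss(100 + 40 * y, 15), rng.gauss(25 + 8 * y, 5)])
        labels.append(y)
    return rows, labels


if __name__ == "__main__":
    X, y = make_synthetic()
    print("holdout 70-30:", holdout_accuracy(X, y))
    print("10-fold CV:   ", cross_val_accuracy(X, y))
```

The holdout estimate depends on a single random partition, whereas cross-validation averages over k partitions, which is one reason the two protocols can rank classifiers differently. The accuracies printed by this sketch describe the synthetic data only and are not comparable to the figures reported above.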
The authors thank VIT University for providing “VIT SEED GRANT” for carrying out this research work.