Abstract
Imbalanced student data is a recognized problem in the data mining community. To address the student dropout problem, an ensemble method with an under-sampling technique is applied to improve classification performance on an imbalanced student dataset. Mutual information is used as the feature selection method to find significant features. Voting, bagging, and AdaBoost ensemble techniques are used with decision tree (C4.5) and artificial neural network (ANN) classifiers to classify students according to the research objective. The results are evaluated by overall accuracy, precision, and recall. Bagging with random forest gave the best result, with an overall accuracy of 74.57% and a recall of 95.61% for the class of interest (low). The experiment is useful not only for finding knowledge that supports student and academic planning and management, but also for improving the classification of imbalanced data when predicting student performance.
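The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy records, and the choice to shrink every class to the size of the smallest one are assumptions made for illustration, and the ensemble classifiers themselves are omitted. It shows random under-sampling, mutual information between a discrete feature and the class label, and precision and recall for one class.

```python
import math
import random
from collections import Counter


def undersample(rows, labels, seed=0):
    """Randomly keep as many samples per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def mutual_information(feature, labels):
    """Mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(labels)
    px, py = Counter(feature), Counter(labels)
    pxy = Counter(zip(feature, labels))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy records: (first-year result, study track), label = class.
rows = [("fail", "a"), ("fail", "b"), ("pass", "a"), ("pass", "b"),
        ("pass", "a"), ("pass", "b"), ("pass", "a"), ("pass", "b")]
labels = ["low", "low", "high", "high", "high", "high", "high", "high"]

bal_rows, bal_labels = undersample(rows, labels)
class_counts = Counter(bal_labels)

# Score the first feature against the label on the balanced sample.
mi_first = mutual_information([r[0] for r in bal_rows], bal_labels)

# Hypothetical predictions, used only to exercise the metric function.
precision, recall = precision_recall(
    ["low", "low", "high", "high"], ["low", "low", "low", "high"], "low"
)
print(class_counts, mi_first, precision, recall)
```

In a fuller pipeline, features would be ranked by their mutual information score and the top-ranked ones passed to the voting, bagging, or AdaBoost ensembles; recall on the "low" class is the quantity the abstract highlights.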
Acknowledgements
We would like to thank Rajamangala University of Technology Thanyaburi, Pathumthani, Thailand, for providing the student data used to conduct this research.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Punlumjeak, W., Rugtanom, S., Jantarat, S., Rachburee, N. (2018). Improving Classification of Imbalanced Student Dataset Using Ensemble Method of Voting, Bagging, and Adaboost with Under-Sampling Technique. In: Kim, K., Kim, H., Baek, N. (eds) IT Convergence and Security 2017. Lecture Notes in Electrical Engineering, vol 449. Springer, Singapore. https://doi.org/10.1007/978-981-10-6451-7_4
DOI: https://doi.org/10.1007/978-981-10-6451-7_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6450-0
Online ISBN: 978-981-10-6451-7
eBook Packages: Engineering, Engineering (R0)