Technology, Knowledge and Learning, Volume 24, Issue 4, pp 567–598

Factors Affecting Students’ Performance in Higher Education: A Systematic Review of Predictive Data Mining Techniques

  • Amjed Abu Saa
  • Mostafa Al-Emran
  • Khaled Shaalan
Original research


Predicting students’ performance has become a challenging task due to the increasing amount of data in educational systems. Accordingly, research that identifies the factors affecting students’ performance in higher education, especially through predictive data mining techniques, remains scarce. This field of research is usually referred to as educational data mining. Hence, the main aim of this study is to identify the most commonly studied factors that affect students’ performance, as well as the most common data mining techniques applied to identify these factors. In this study, 36 research articles out of a total of 420 from 2009 to 2018 were critically reviewed and analyzed using a systematic literature review approach. The results showed that the most common factors fall under four main categories, namely students’ previous grades and class performance, students’ e-Learning activity, students’ demographics, and students’ social information. The results also indicated that the most common data mining techniques used to predict and classify students’ factors are decision trees, Naïve Bayes classifiers, and artificial neural networks.


Keywords: Educational data mining; Students’ performance; Data mining techniques; Systematic review




Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. Faculty of Engineering and IT, The British University in Dubai, Dubai, UAE
  2. Applied Computational Civil and Structural Engineering Research Group, Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Vietnam
