A Data Mining Approach for Predicting Academic Success – A Case Study

  • Maria P. G. Martins
  • Vera L. Miguéis
  • D. S. B. Fonseca
  • Albano Alves
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 918)


This study proposes a regression model based on the random forest algorithm to predict, at an early stage, the overall academic performance of undergraduates at a polytechnic higher education institution. Rather than restricting the prediction to a single degree course, as is usual, the study covers the whole institution, which comprises 5 schools. The aim is to give the institution a single tool that accommodates the heterogeneity of its student population and its educational dynamics. A different approach to feature selection is proposed, which makes it possible to exclude entire categories of predictive variables, so the model remains useful in scenarios where not all of the categories of data considered are collected. The model can be used at a central level by the decision-makers responsible for designing actions to mitigate academic failure.
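The idea of excluding whole categories of predictive variables can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the abstract does not describe the actual selection procedure, so the code uses a plain category-level ablation (refit without one category, compare the test error); a toy bagged-tree ensemble in pure Python stands in for a random forest; and the category names, variable names, and data are hypothetical and synthetic, not taken from the study.

```python
import random

def avg(values):
    return sum(values) / len(values)

def build_tree(rows, targets, features, depth, rng):
    """Grow a small regression tree; each split tries a random subset of features."""
    if depth == 0 or len(rows) < 4:
        return avg(targets)  # leaf: predict the mean target
    best = None
    for f in rng.sample(features, (len(features) + 1) // 2):
        for thr in sorted({r[f] for r in rows})[1:]:
            left = [t for r, t in zip(rows, targets) if r[f] < thr]
            right = [t for r, t in zip(rows, targets) if r[f] >= thr]
            ml, mr = avg(left), avg(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr)
    if best is None:  # no candidate feature had two distinct values
        return avg(targets)
    _, f, thr = best
    lo = [i for i, r in enumerate(rows) if r[f] < thr]
    hi = [i for i, r in enumerate(rows) if r[f] >= thr]
    return (f, thr,
            build_tree([rows[i] for i in lo], [targets[i] for i in lo], features, depth - 1, rng),
            build_tree([rows[i] for i in hi], [targets[i] for i in hi], features, depth - 1, rng))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves are floats
        f, thr, left, right = node
        node = left if row[f] < thr else right
    return node

def fit_forest(rows, targets, features, n_trees=10, depth=4, seed=0):
    """Bagging: every tree is grown on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                 features, depth, rng))
    return forest

def rmse(forest, rows, targets):
    errors = [(avg([predict_tree(t, r) for t in forest]) - y) ** 2
              for r, y in zip(rows, targets)]
    return avg(errors) ** 0.5

def category_ablation(train, test, categories):
    """Test RMSE with all categories, then with each category left out entirely."""
    (x_train, y_train), (x_test, y_test) = train, test
    all_features = [f for fs in categories.values() for f in fs]
    results = {"all categories": rmse(fit_forest(x_train, y_train, all_features), x_test, y_test)}
    for name, fs in categories.items():
        kept = [f for f in all_features if f not in fs]
        results["without " + name] = rmse(fit_forest(x_train, y_train, kept), x_test, y_test)
    return results

# Hypothetical categories and variables (assumptions, not the study's data).
CATEGORIES = {
    "prior_academic": ["entry_grade"],
    "first_semester": ["sem1_avg", "ects_passed"],
    "demographic": ["age"],
}

def make_data(n, seed):
    """Synthetic students whose final grade depends mostly on sem1_avg."""
    rng = random.Random(seed)
    rows, targets = [], []
    for _ in range(n):
        row = {"entry_grade": rng.uniform(10, 20), "sem1_avg": rng.uniform(8, 18),
               "ects_passed": rng.randint(0, 30), "age": rng.randint(18, 40)}
        rows.append(row)
        targets.append(0.8 * row["sem1_avg"] + 0.2 * row["entry_grade"] + rng.gauss(0, 0.3))
    return rows, targets

results = category_ablation(make_data(60, 1), make_data(40, 2), CATEGORIES)
for name, err in results.items():
    print(f"{name}: RMSE = {err:.2f}")
```

A category whose removal barely changes the error is a candidate for exclusion, which is what would let a model of this kind be deployed where that category of data is not collected. In practice a library implementation such as scikit-learn's RandomForestRegressor would replace the toy ensemble.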


Keywords: Data mining, Educational data mining, Prediction, Academic success, Random forest, Regression



This work was supported by the Portuguese Foundation for Science and Technology (FCT) under Project UID/EEA/04131/2013. The authors would also like to thank the Polytechnic Institute of Bragança for making available the data analysed in this study.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Maria P. G. Martins (1, 3)
  • Vera L. Miguéis (2)
  • D. S. B. Fonseca (3)
  • Albano Alves (1)

  1. School of Technology and Management, Polytechnic Institute of Bragança, Bragança, Portugal
  2. Faculty of Engineering, University of Porto, Porto, Portugal
  3. CISE - Electromechatronic Systems Research Centre, University of Beira Interior, Covilhã, Portugal
