Predicting Academic Performance of International Students Using Machine Learning Techniques and Human Interpretable Explanations Using LIME—Case Study of an Indian University

  • Pawan Kumar
  • Manmohan Sharma
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1087)


With increasing globalization in higher education, universities are placing importance on attracting international students to achieve diversity and good ratings from accreditation bodies. Machine learning techniques can predict the class of a dependent variable and can thus enable educational institutes to predict the academic performance of students and improve related learning processes. The purpose of this study is to predict the academic performance of international students studying at a university in North India. The study explores the predictive potential of attributes such as attendance percentage, pending reappears, economy level and geographical region in developing a statistical model that predicts the likely performance of a student as satisfactory or poor. Machine learning algorithms such as logistic regression, naïve Bayes, CART and random forests were used. Classification accuracy, sensitivity, specificity and area under the ROC curve were used for evaluation. Interpretable explanations for model outcomes were also obtained. Classification accuracy above 90% was observed in the experiments. Attendance percentage and pending reappears were observed to contribute most to the prediction outcomes.
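As an illustration of the four evaluation measures named above, the following is a minimal sketch in plain Python for a binary satisfactory/poor classifier. It is not the authors' code: the function names, the choice of coding "poor" as the positive class (1), the 0.5 decision threshold and the toy labels and scores are all assumptions made for the example. It also assumes both classes are present, so no denominator is zero.

```python
from typing import Dict, Sequence


def evaluate(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity from hard class predictions."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # positive case correctly flagged
            else:
                fp += 1  # negative case wrongly flagged
        else:
            if truth == positive:
                fn += 1  # positive case missed
            else:
                tn += 1  # negative case correctly passed
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive case gets the higher score; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = poor performance, 0 = satisfactory.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]  # made-up predicted P(poor)
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

metrics = evaluate(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the ROC curve summarises ranking quality across all thresholds, which is one reason to report the measures together.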


Academic performance; Binary classification; Educational data mining; Human interpretability; Machine learning; Predictive models



The data set covers international students studying at Lovely Professional University (LPU), India, during 2015–16. The authors are thankful to the Division of Research and Development at LPU for granting permission to use this data set for this research study.



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Lovely Professional University, Phagwara, India
