An Enhanced CNN Model on Temporal Educational Data for Program-Level Student Classification

  • Chau Vo
  • Hua Phung Nguyen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12033)


In educational data mining, study performance prediction is one of the most popular tasks: forecasting the final study status of students. Via these predictions, at-risk students can be identified and supported appropriately. In existing works, this task has been considered in various contexts at both the course and program levels with different learning approaches. However, real-world characteristics of the task’s inputs and outputs, such as temporal aspects, data imbalance, and data shortage with sparseness, have not yet been fully investigated. Making the most of deep learning, our work is the first to handle these challenges for the program-level student classification task on temporal educational data. In a simple but effective manner, a novel solution is proposed with convolutional neural networks (CNNs) to exploit their well-known advantages on images for temporal educational data. Moreover, image augmentation is performed in different ways so that data shortage with sparseness can be overcome. In addition, we adapt new loss functions (Mean False Error and Mean Squared False Error) so that CNN models tackle data imbalance better. As a result, the task is resolved by our enhanced CNN models with greater effectiveness and practicability. Indeed, in an empirical study on three real temporal educational datasets, our models consistently outperform other traditional models and original CNN variants, with accuracy of about 85%–95%.


Keywords: Program-level student classification · Deep learning · Convolutional neural network · Data imbalance · Data sparseness
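The two adapted loss functions named in the abstract, Mean False Error (MFE) and Mean Squared False Error (MSFE), were originally proposed by Wang et al. for imbalanced data: the squared error is averaged separately within each class and the per-class averages are combined, so the minority class is not swamped by the majority class. A minimal NumPy sketch for binary labels (an illustration based on those definitions, not the paper's actual implementation):

```python
import numpy as np

def mfe_loss(y_true, y_pred):
    """Mean False Error: mean squared error computed per class, then summed,
    so each class contributes to the loss regardless of its size."""
    neg = y_true == 0
    pos = y_true == 1
    fpe = np.mean((y_true[neg] - y_pred[neg]) ** 2)  # false positive error (negative class)
    fne = np.mean((y_true[pos] - y_pred[pos]) ** 2)  # false negative error (positive class)
    return fpe + fne

def msfe_loss(y_true, y_pred):
    """Mean Squared False Error: squares each per-class error before summing,
    penalizing the larger (typically minority-class) error more heavily."""
    neg = y_true == 0
    pos = y_true == 1
    fpe = np.mean((y_true[neg] - y_pred[neg]) ** 2)
    fne = np.mean((y_true[pos] - y_pred[pos]) ** 2)
    return fpe ** 2 + fne ** 2
```

In a Keras model, the same per-class split would be expressed with backend tensor operations and passed as a custom loss; the NumPy form above is just the easiest way to see the arithmetic.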



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Ho Chi Minh City University of Technology, Vietnam National University – HCMC, Ho Chi Minh City, Vietnam
