A data mining approach to guide students through the enrollment process based on academic performance

  • César Vialardi
  • Jorge Chue
  • Juan Pablo Peche
  • Gustavo Alvarado
  • Bruno Vinatea
  • Jhonny Estrella
  • Álvaro Ortigosa
Original Paper

Abstract

Student academic performance at universities is crucial for education management systems. Many actions and decisions are based on it, in particular during the enrollment process, when students have to decide which courses to sign up for. This research presents the rationale behind the design of a recommender system that supports the enrollment process using the students’ academic performance record. To build this system, the CRISP-DM methodology was applied to data from students of the Computer Science Department at University of Lima, Perú. One of the main contributions of this work is the use of two synthetic attributes to improve the relevance of the recommendations made. The first attribute estimates the inherent difficulty of a given course. The second attribute, named potential, is a measure of the competence of a student for a given course based on the grades obtained in related courses. Data was mined using C4.5, KNN (K-nearest neighbor), Naïve Bayes, Bagging and Boosting, and a set of experiments was conducted to determine the best algorithm for this application domain. Results indicate that Bagging is the best method regarding predictive accuracy. Based on these results, the “Student Performance Recommender System” (SPRS) was developed, including a learning engine. SPRS was tested with a sample group of 39 students during the enrollment process. Results showed that the system performed very well under real-life conditions.
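As an illustration of the two synthetic attributes described in the abstract, the following is a minimal sketch only. The abstract does not give the formulas, so both definitions here are assumptions: difficulty is approximated as one minus the mean normalized grade of past students in a course, and potential as a student's mean normalized grade in related courses. The 0 to 20 grade scale, the sample records, the `RELATED` map, and the function names are all hypothetical.

```python
from statistics import mean

# Hypothetical grade records: (student, course, grade).
# The 0-20 scale and all values are illustrative assumptions.
GRADES = [
    ("ana", "calculus_1", 14), ("ana", "programming_1", 17),
    ("luis", "calculus_1", 9), ("luis", "programming_1", 12),
    ("rosa", "calculus_1", 11), ("rosa", "programming_1", 16),
]

# Hypothetical map from a target course to earlier, related courses.
RELATED = {"data_structures": ["programming_1", "calculus_1"]}

MAX_GRADE = 20


def course_difficulty(course, grades=GRADES):
    """Assumed proxy: 1 minus the mean normalized grade of past students."""
    past = [g for _, c, g in grades if c == course]
    if not past:
        return None  # no history, so no estimate
    return 1 - mean(past) / MAX_GRADE


def student_potential(student, target, grades=GRADES, related=RELATED):
    """Assumed proxy: mean normalized grade in courses related to the target."""
    wanted = set(related.get(target, []))
    own = [g for s, c, g in grades if s == student and c in wanted]
    if not own:
        return None  # student has no grades in related courses
    return mean(own) / MAX_GRADE


if __name__ == "__main__":
    print(course_difficulty("calculus_1"))
    print(student_potential("ana", "data_structures"))
```

In a pipeline like the one the abstract outlines, values such as these would be appended to each student and course record before training the classifiers. How the authors actually compute and combine them is not stated here.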

Keywords

Data mining · Enrollment process · Supervised classification · Machine learning · Recommender systems · Predictive accuracy

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  • César Vialardi (1)
  • Jorge Chue (1)
  • Juan Pablo Peche (1)
  • Gustavo Alvarado (1)
  • Bruno Vinatea (1)
  • Jhonny Estrella (1)
  • Álvaro Ortigosa (2)

  1. Facultad de Ingeniería de Sistemas, Universidad de Lima, Lima, Perú
  2. Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain
