Abstract
Educational data mining has gained popularity because it can extract useful knowledge hidden in students’ records to support better educational decision-making. In recent years, a variety of methods have been applied to develop accurate models for monitoring students’ behavior and performance, and most of these studies examine the efficiency of supervised classification methods. In this work, we propose a new ensemble-based semi-supervised method for predicting students’ performance in the final examinations at the end of the academic year. Our experimental results show that the proposed method is effective and practical for early prediction of student progress compared with some existing semi-supervised learning methods.
Acknowledgments
The authors are grateful to the private high school “Avgoulea-Linardatou” for the collection of the data used in our study and for the valuable comments, which substantially improved our work.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Livieris, I.E., Drakopoulou, K., Mikropoulos, T.A., Tampakas, V., Pintelas, P. (2018). An Ensemble-Based Semi-Supervised Approach for Predicting Students’ Performance. In: Mikropoulos, T. (eds) Research on e-Learning and ICT in Education. Springer, Cham. https://doi.org/10.1007/978-3-319-95059-4_2
DOI: https://doi.org/10.1007/978-3-319-95059-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95058-7
Online ISBN: 978-3-319-95059-4
eBook Packages: Education, Education (R0)