An Ensemble-Based Semi-Supervised Approach for Predicting Students’ Performance

  • Chapter
Research on e-Learning and ICT in Education

Abstract

Educational data mining has gained popularity because it can extract useful knowledge hidden in students’ records to support better educational decision-making. In recent years, a variety of methods have been applied to develop accurate models for monitoring students’ behavior and performance, although most of these studies examine the efficiency of supervised classification methods. In this work, we propose a new ensemble-based semi-supervised method for predicting students’ performance in the final examinations at the end of the academic year. Our experimental results show that the proposed method is effective and practical for early prediction of student progress compared with some existing semi-supervised learning methods.

Acknowledgments

The authors are grateful to the private high school “Avgoulea-Linardatou” for collecting the data used in our study and for valuable comments that substantially improved our work.

Author information

Corresponding author

Correspondence to Ioannis E. Livieris.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Livieris, I.E., Drakopoulou, K., Mikropoulos, T.A., Tampakas, V., Pintelas, P. (2018). An Ensemble-Based Semi-Supervised Approach for Predicting Students’ Performance. In: Mikropoulos, T. (eds) Research on e-Learning and ICT in Education. Springer, Cham. https://doi.org/10.1007/978-3-319-95059-4_2

  • DOI: https://doi.org/10.1007/978-3-319-95059-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-95058-7

  • Online ISBN: 978-3-319-95059-4

  • eBook Packages: Education, Education (R0)
