Predicting MOOCs Dropout Using Only Two Easily Obtainable Features from the First Week’s Activities

Alamri, Ahmed; Alshehri, Mohammad; Cristea, Alexandra; Pereira, Filipe D.; Oliveira, Elaine; Shi, Lei; Stewart, Craig

doi:10.1007/978-3-030-22244-4_20

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11528)

Included in the following conference series:

International Conference on Intelligent Tutoring Systems

Abstract

While Massive Open Online Course (MOOC) platforms provide knowledge in a new and unique way, the very high number of dropouts is a significant drawback. Several features are considered to contribute towards learner attrition or lack of interest, which may lead to disengagement or total dropout. The jury is still out on which factors are the most appropriate predictors. However, the literature agrees that early prediction is vital to allow for a timely intervention. Whilst feature-rich predictors may have the best chance of high accuracy, they may be unwieldy. This study aims to predict learner dropout early on, from the first week, by comparing several machine-learning approaches, including Random Forest, Adaptive Boost, XGBoost and GradientBoost classifiers. The results show promising accuracies (82%–94%) using as few as two features. We show that the accuracies obtained outperform state-of-the-art approaches, even when the latter deploy several features.
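The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the two features, so the data below is synthetic and both features (`feature_a`, `feature_b`) are hypothetical stand-ins, as is the assumed relationship between activity and dropout. The sketch assumes scikit-learn and NumPy are installed, and it leaves out XGBoost to avoid a further dependency.

```python
import numpy as np
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_score


def make_synthetic_learners(n=600, seed=0):
    """Build synthetic data: two hypothetical first-week features per learner."""
    rng = np.random.default_rng(seed)
    feature_a = rng.poisson(5, size=n)        # hypothetical activity count
    feature_b = rng.exponential(2.0, size=n)  # hypothetical time measure
    # Assumption for illustration only: lower activity raises dropout probability.
    logit = 2.0 - 0.5 * feature_a - 0.3 * feature_b
    p_dropout = 1.0 / (1.0 + np.exp(-logit))
    y = (rng.random(n) < p_dropout).astype(int)  # 1 = dropped out
    X = np.column_stack([feature_a, feature_b])
    return X, y


def compare_classifiers(X, y, folds=5):
    """Return the mean cross-validated accuracy of each ensemble classifier."""
    models = {
        "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
        "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
        "GradientBoost": GradientBoostingClassifier(random_state=0),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean())
        for name, model in models.items()
    }
```

Calling `compare_classifiers(*make_synthetic_learners())` returns a dictionary mapping each classifier name to its mean accuracy. Because the data is synthetic, those values say nothing about the 82%–94% range reported for the real courses.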

Notes

1. https://www.mooclab.club/resources/mooclab-report-the-global-mooc-landscape-2017.214/

Acknowledgment

We would like to thank FAPEAM (Foundation for the State of Amazonas Research), through Edital 009/2017, for partially funding this research.

Author information

Authors and Affiliations

Department of Computer Science, Durham University, Durham, UK
Ahmed Alamri, Mohammad Alshehri & Alexandra Cristea
Institute of Computing, Federal University of Roraima, Boa Vista, Brazil
Filipe D. Pereira & Elaine Oliveira
Centre for Educational Development, University of Liverpool, Liverpool, UK
Lei Shi
School of Computing Electronics and Mathematics, Coventry University, Coventry, UK
Craig Stewart

Corresponding author

Correspondence to Alexandra Cristea.

Editor information

Editors and Affiliations

University of the West Indies, Kingston, Jamaica
Andre Coy
Ritsumeikan University, Osaka, Japan
Yugo Hayashi
Athabasca University, Edmonton, AB, Canada
Maiga Chang

About this paper

Cite this paper

Alamri, A. et al. (2019). Predicting MOOCs Dropout Using Only Two Easily Obtainable Features from the First Week’s Activities. In: Coy, A., Hayashi, Y., Chang, M. (eds) Intelligent Tutoring Systems. ITS 2019. Lecture Notes in Computer Science, vol 11528. Springer, Cham. https://doi.org/10.1007/978-3-030-22244-4_20

DOI: https://doi.org/10.1007/978-3-030-22244-4_20
Published: 30 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22243-7
Online ISBN: 978-3-030-22244-4
eBook Packages: Computer Science, Computer Science (R0)
