Educational Data Mining

Volume 524 of the series Studies in Computational Intelligence pp 175-202


Predicting Student Performance from Combined Data Sources

  • Annika Wolff, Knowledge Media Institute, The Open University
  • Zdenek Zdrahal, Knowledge Media Institute, The Open University
  • Drahomira Herrmannova, Knowledge Media Institute, The Open University
  • Petr Knoth, Knowledge Media Institute, The Open University



This chapter explores the use of predictive modeling methods for identifying students who will benefit most from tutor interventions. This is a growing area of research and is especially useful in distance learning, where tutors and students do not meet face to face. The methods discussed include decision-tree classification, support vector machines (SVM), general unary hypotheses automaton (GUHA), Bayesian networks, and linear and logistic regression. These methods have been trialed by building and testing predictive models using data from several Open University (OU) modules. The Open University offers a good test-bed for this work, as it is one of the largest distance learning institutions in Europe. The chapter discusses how the predictive capacity of the different sources of data changes as the course progresses. It also highlights the importance of understanding how a student’s pattern of behavior changes during the course.


Keywords: Predictive modeling; Education; Virtual learning environment; Student outcome