Human Expert Labeling Process: Valence-Arousal Labeling for Students’ Affective States

  • Sinem Aslan
  • Eda Okur
  • Nese Alyuz
  • Asli Arslan Esme
  • Ryan S. Baker
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 804)


Affect has emerged as an important part of the interaction between learners and computers, with significant implications for learning outcomes. As a result, it has become a key area of research within learning analytics. Reliable and valid data labeling is essential for training the machine learning models that provide such analytics. In this study, using the Human Expert Labeling Process (HELP) as a baseline labeling protocol, we conducted several experiments to derive an optimized method for labeling student affect based on the Circumplex Model of Emotion (Valence-Arousal). We then had human experts apply the optimized method to a larger and different dataset in order to test and validate it. The results showed that, using the optimized method, the experts achieved an acceptable level of consensus in their labeling outcomes, in line with the affect labeling literature.
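As a rough illustration of what "consensus in labeling outcomes" can mean for Valence-Arousal labels, the sketch below computes a majority label per instance and a simple pairwise agreement rate across experts. The label encoding (a valence/arousal pair per expert), the function names, and the choice of agreement measure are all assumptions made for illustration; the abstract does not state which label set or reliability statistic the study actually used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical encoding: each expert label is a (valence, arousal) pair,
# e.g. ("pos", "high"). This is an assumption, not the paper's scheme.

def majority_label(labels):
    """Return the label chosen by a strict majority of experts, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None

def pairwise_agreement(ratings):
    """Fraction of expert pairs that agree, pooled over all instances.

    `ratings` is a list of instances; each instance is the list of
    labels given by the experts for that instance.
    """
    agree = 0
    total = 0
    for labels in ratings:
        for a, b in combinations(labels, 2):
            total += 1
            agree += (a == b)
    return agree / total if total else 0.0
```

Raw pairwise agreement does not correct for chance agreement, so a chance-corrected statistic would usually be preferred when reporting reliability.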


Affective state labeling; Circumplex Model of Emotion; Inter-rater agreement; Intelligent tutoring systems; Affective computing



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Sinem Aslan (1)
  • Eda Okur (1)
  • Nese Alyuz (1, corresponding author)
  • Asli Arslan Esme (1)
  • Ryan S. Baker (2)
  1. Intel Corporation, Hillsboro, USA
  2. University of Pennsylvania, Philadelphia, USA
