Abstract
Identifying users’ demographic characteristics is known as the Author Profiling (AP) task. It is useful for providing robust automatic predictions of different social aspects of users, and subsequently for supporting decision making in massive information systems. For example, in MOOCs, it can be used to provide personalised recommendation systems for learners. In this paper, we explore intelligent techniques and strategies for solving the task, focusing mainly on predicting the employment status of users on a MOOC platform. For this, we compare sequential with parallel ensemble deep learning (DL) architectures. Importantly, we show that our prediction model can achieve high accuracy even though few of the stylistic text features usually used for the AP task are employed (only word tokens are used). To address our highly unbalanced data, we compare a widely used oversampling method with a generative paraphrasing method. We obtained a high average accuracy of 96.4% for our best method, sequential DL with paraphrasing, both overall and per individual class (the employment statuses of users).
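As a rough illustration of the class-balancing step mentioned above, the following is a minimal sketch of random oversampling, in which minority-class texts are duplicated until every class matches the largest one. The abstract does not name the oversampling method the authors used, so the technique, the function name `random_oversample`, and the toy texts and labels are all assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class (illustrative sketch)."""
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data; these labels are not the paper's actual classes.
texts = [
    "i work full time",
    "busy at the office",
    "long shift today",
    "looking for a job",
    "studying for exams",
]
labels = ["employed", "employed", "employed", "unemployed", "student"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Duplication of this kind adds no new wording to the minority classes, which is one plausible reason to compare it against a paraphrasing approach that generates varied text instead.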
Acknowledgement
This work was funded by the Ministry of Education of Saudi Arabia.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Aljohani, T., Pereira, F.D., Cristea, A.I., Oliveira, E. (2020). Prediction of Users’ Professional Profile in MOOCs Only by Utilising Learners’ Written Texts. In: Kumar, V., Troussas, C. (eds) Intelligent Tutoring Systems. ITS 2020. Lecture Notes in Computer Science, vol 12149. Springer, Cham. https://doi.org/10.1007/978-3-030-49663-0_20
DOI: https://doi.org/10.1007/978-3-030-49663-0_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49662-3
Online ISBN: 978-3-030-49663-0
eBook Packages: Computer Science (R0)