Prediction of Users’ Professional Profile in MOOCs Only by Utilising Learners’ Written Texts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12149)

Abstract

Identifying users’ demographic characteristics is known as the Author Profiling (AP) task. It is useful for providing robust automatic prediction of different social user aspects, and subsequently for supporting decision making in massive information systems. For example, in MOOCs, it is used to provide personalised recommendation systems for learners. In this paper, we explore intelligent techniques and strategies for solving the task, focusing mainly on predicting the employment status of users on a MOOC platform. To this end, we compare sequential with parallel ensemble deep learning (DL) architectures. Importantly, we show that our prediction model can achieve high accuracy even though few of the stylistic text features usually employed for the AP task are used (only word tokens are used). To address our highly unbalanced data, we compare a widely used oversampling method with a generative paraphrasing method. Our best method, sequential DL with paraphrasing, obtained an average accuracy of 96.4%, both overall and per individual class (users’ employment statuses).
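To make the class-imbalance step concrete, the following is a minimal sketch of oversampling on labelled texts. The abstract does not name the oversampling method used, so simple random oversampling (duplicating minority-class samples) is assumed here; the function name `random_oversample`, the toy comments, and the label names are all hypothetical and are not taken from the paper's data.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest one.

    Assumption: plain random oversampling, not necessarily the paper's method.
    """
    rng = random.Random(seed)

    # Group the texts by their class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is padded up to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up comments and employment-status labels (illustrative only).
texts = ["I work full time", "busy at the office", "my shift ends late",
         "studying for exams", "retired last year"]
labels = ["working", "working", "working", "student", "retired"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Random oversampling only repeats existing texts, so the model sees no new wording for the minority classes. A paraphrasing approach, as contrasted in the abstract, would instead add reworded variants of minority-class texts.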



Acknowledgement

This work was funded by the Ministry of Education of Saudi Arabia.

Author information

Corresponding author

Correspondence to Tahani Aljohani.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Aljohani, T., Pereira, F.D., Cristea, A.I., Oliveira, E. (2020). Prediction of Users’ Professional Profile in MOOCs Only by Utilising Learners’ Written Texts. In: Kumar, V., Troussas, C. (eds) Intelligent Tutoring Systems. ITS 2020. Lecture Notes in Computer Science, vol 12149. Springer, Cham. https://doi.org/10.1007/978-3-030-49663-0_20

  • DOI: https://doi.org/10.1007/978-3-030-49663-0_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-49662-3

  • Online ISBN: 978-3-030-49663-0

  • eBook Packages: Computer Science, Computer Science (R0)
