Towards the Semantic MOOC: Extracting, Enriching and Interlinking E-Learning Data in Open edX Platform

  • Dmitry Volchek (email author)
  • Aleksei Romanov
  • Dmitry Mouromtsev
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 786)


In recent years, the educational technology market has been growing rapidly. This growth is explained by the increasing number of Massive Open Online Courses (MOOCs), which give learners an opportunity to study around the clock at the world's top universities. Information contained in such courses can be better structured, linked, and enriched by means of semantic technologies and linked data principles. Given semantic annotations, discovery and matching among learners, teachers, and learning resources can be made much more efficient. In this paper, we describe a method of extracting metadata from Open edX online courses for subsequent processing. We address the problem of representing a course at the formal and semantic levels by developing an ontology, so that both computers and humans can process and use the course. We also use natural language processing (NLP) and the Rapid Automatic Keyword Extraction (RAKE) algorithm to extract concepts automatically from course lectures. The resulting triples are imported into an RDF storage system, which allows users to execute SPARQL queries through a SPARQL endpoint. Moreover, the developed plugin supports enriching and interlinking courses, allowing users to study the educational content of the courses along an individual trajectory. We conclude that the considered data set is mapped at a satisfactorily high level. The collected data can be useful for analyzing the relevance and quality of the course structure.
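The abstract does not give implementation details, so the following is only a minimal sketch of how RAKE-style concept extraction from lecture text and its serialization as RDF triples might look. The stopword list, the function names, and the `example.org` IRIs are assumptions made for illustration; they are not the authors' plugin, ontology, or vocabulary.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; practical RAKE
# implementations use a much larger list).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "in", "on", "to",
             "is", "are", "for", "with", "by", "as", "that", "this"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r'[.,;:!?()\[\]"\n]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9][a-z0-9\-]*", fragment):
            if word in STOPWORDS:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text):
    """Score each candidate phrase as the sum of degree/frequency per word."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, the word included.
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    return {" ".join(p): sum(word_score[w] for w in p) for p in phrases}


def to_ntriples(lecture_iri, scores, top_n=3):
    """Serialize the top-scoring phrases as N-Triples lines.

    The predicate IRI is hypothetical, chosen only for this example.
    """
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    lines = []
    for phrase, _ in ranked:
        literal = phrase.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(
            f'<{lecture_iri}> <http://example.org/vocab#hasConcept> "{literal}" .'
        )
    return lines


if __name__ == "__main__":
    lecture = "Linked data and the semantic web. Linked data for education."
    for line in to_ntriples("http://example.org/lecture/1", rake_scores(lecture)):
        print(line)
```

Triples of this shape could then be loaded into an RDF store and retrieved with a SPARQL query that selects the objects of the (hypothetical) `hasConcept` predicate for a given lecture.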


Keywords: Semantic web; Linked learning; EdX; Education; Metadata; Linked data in education; Educational ontology population; eLearning system; Semantic web technologies in education



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Dmitry Volchek (1), email author
  • Aleksei Romanov (1)
  • Dmitry Mouromtsev (1)
  1. Laboratory of Information Science and Semantic Technologies, ITMO University, St. Petersburg, Russia
