Using Topic Modelling to Correlate a Research Institution’s Outputs with Its Goals

  • Nicholas Chamansingh
  • Patrick Hosein
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1130)


Private research organizations and universities face increasing pressure to convert research output into innovative products and services that can generate revenue, so it has become even more important to ensure that the research performed matches the research goals set. If an institution could quantitatively compare its research output with that of highly successful institutions that have similar goals (e.g., because their respective countries have similar needs), it could make appropriate organizational and personnel changes. We address this problem and demonstrate our approach by comparing the research outputs of universities from countries with similar characteristics. We form topic clusters using Latent Dirichlet Allocation and then apply a proposed metric that compares abstracts with topic clusters to quantify closeness. We determine an upper bound on this metric by comparing the abstracts that were used to generate the topic clusters, and a lower bound by using a dataset of randomly chosen abstracts. We also investigate how this comparison trends over time by splitting the datasets based on temporal information.
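The pipeline described above (topic clusters from Latent Dirichlet Allocation, then a closeness score bracketed by an upper and a lower bound) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the proposed metric, so `closeness` is a hypothetical stand-in (the best average word probability of an abstract under any topic), `fit_lda` is a small collapsed Gibbs sampler written for this example, and the toy corpus, hyperparameters, and all names are invented for illustration.

```python
import random


def fit_lda(docs, num_topics, iters=50, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return one word->probability dict per topic."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][index[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z, v = assign[d][i], index[w]
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    # Smoothed topic-word distributions; each one sums to 1 over the vocabulary.
    return [
        {
            vocab[v]: (topic_word[k][v] + beta) / (topic_total[k] + vocab_size * beta)
            for v in range(vocab_size)
        }
        for k in range(num_topics)
    ]


def closeness(tokens, topics):
    """Hypothetical stand-in metric: best average word probability under any topic."""
    if not tokens:
        return 0.0
    return max(sum(t.get(w, 0.0) for w in tokens) / len(tokens) for t in topics)


# Toy "abstracts" (assumed, already tokenized) used to generate the topic clusters.
corpus = [
    "solar energy grid storage".split(),
    "wind energy turbine grid".split(),
    "crop soil yield irrigation".split(),
    "soil crop pest yield".split(),
]
topics = fit_lda(corpus, num_topics=2)

# Upper bound: score the same abstracts that generated the topics.
upper = sum(closeness(d, topics) for d in corpus) / len(corpus)

# Lower bound: score unrelated abstracts (here, words outside the vocabulary).
unrelated = ["zebra piano".split(), "opera violin".split()]
lower = sum(closeness(d, topics) for d in unrelated) / len(unrelated)
```

With both bounds in hand, a score for another institution's abstracts could be rescaled as `(score - lower) / (upper - lower)` to place it between the two reference points. Splitting the corpus by publication year before fitting would give the temporal comparison.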


Topic Modelling · Latent Dirichlet Allocation · Topic Coherence · Document Context Distance



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. The University of the West Indies, St. Augustine, Trinidad
