Using Topic Modelling to Correlate a Research Institution’s Outputs with Its Goals

  • Conference paper
Advances in Information and Communication (FICC 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1130)

Abstract

With increasing pressure on private research organizations and universities to convert research output into innovative products and services that can generate revenue, it has become more important to ensure that the research performed matches the research goals set. If an institution could quantitatively compare its research output with that of highly successful institutions that have similar goals (e.g., because their respective countries have similar needs), it could make appropriate organizational and personnel changes. We address this problem and demonstrate our approach by comparing the research outputs of universities from countries with similar characteristics. We form topic clusters using Latent Dirichlet Allocation and then use a proposed metric that compares abstracts with topic clusters to quantify closeness. We determine an upper bound on this metric by comparing the abstracts that were used to generate the topic clusters, and a lower bound by using a dataset of randomly chosen abstracts. We also investigate how this comparison trends over time by splitting the datasets based on temporal information.
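The abstract does not define the proposed metric, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the topics have already been fitted (for example by LDA) and are available as word-probability dictionaries, and it substitutes a simple stand-in score: the best cosine similarity between an abstract's word counts and any topic. The topic contents, the sample abstracts, and the names `closeness` and `mean_closeness` are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closeness(abstract, topics):
    """Stand-in score (assumption): best cosine match between the
    abstract's word counts and any topic's word distribution."""
    counts = Counter(abstract.lower().split())
    return max(cosine(counts, topic) for topic in topics)


def mean_closeness(abstracts, topics):
    """Average closeness of a set of abstracts to the topic clusters."""
    return sum(closeness(a, topics) for a in abstracts) / len(abstracts)


# Hypothetical topic-word distributions, standing in for fitted LDA topics.
topics = [
    {"energy": 0.5, "solar": 0.3, "grid": 0.2},
    {"health": 0.5, "disease": 0.3, "clinical": 0.2},
]

# Abstracts assumed to have generated the topics (used for the upper bound).
reference = ["solar energy grid storage", "clinical health disease study"]
# Randomly chosen, unrelated abstracts (used for the lower bound).
unrelated = ["medieval poetry manuscript", "baroque music history"]

upper = mean_closeness(reference, topics)
lower = mean_closeness(unrelated, topics)
candidate = closeness("solar panels for rural health clinics", topics)

print(f"lower={lower:.3f} candidate={candidate:.3f} upper={upper:.3f}")
```

In this toy setup the candidate abstract scores between the two bounds, which is how the bounds are meant to be read: they give a scale against which an institution's abstracts can be placed. Splitting the abstracts by year and repeating the calculation would give a rough analogue of the trend analysis mentioned above.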

Corresponding author

Correspondence to Nicholas Chamansingh.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Chamansingh, N., Hosein, P. (2020). Using Topic Modelling to Correlate a Research Institution’s Outputs with Its Goals. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Advances in Information and Communication. FICC 2020. Advances in Intelligent Systems and Computing, vol 1130. Springer, Cham. https://doi.org/10.1007/978-3-030-39442-4_13