Hierarchical Expert Profiling Using Heterogeneous Information Networks

  • Conference paper
Discovery Science (DS 2018)

Abstract

Linking an expert to their knowledge areas remains a challenging research problem. The task is usually divided into two steps: identifying the knowledge areas/topics in the text corpus and assigning them to the experts. Common approaches to the expert profiling task are based on the Latent Dirichlet Allocation (LDA) algorithm. As a result, they require the number of topics to be defined in advance, which is not ideal in most cases. Furthermore, LDA generates a list of independent topics with no relationships between them. Expert profiles created from such flat topic lists have been reported to be highly redundant and often either too specific or too general.

In this paper we propose a methodology that addresses these limitations by creating hierarchical expert profiles, in which the knowledge areas of a researcher are mapped along different granularity levels, from broad areas to more specific ones. For this purpose, we explore the rich structure and semantics of Heterogeneous Information Networks (HINs). Our strategy is divided into two parts. First, we introduce a novel algorithm that fully exploits the rich content of an HIN to create a topical hierarchy by discovering overlapping communities and ranking the nodes inside each community. We then present a strategy to map the knowledge areas of an expert along all the levels of the hierarchy, exploiting the information available about the expert to obtain a hierarchical profile of topics.
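As a rough illustration of the second part, the sketch below maps an expert onto a small topical hierarchy by descending only into branches the expert matches. It is not the mapping strategy of the paper, which the abstract does not detail: the `Topic` class, the keyword weights, the scoring rule (a weighted sum of the expert's keyword counts) and the toy hierarchy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Topic:
    """Hypothetical hierarchy node: a topic with ranked keywords and sub-topics."""
    name: str
    keywords: Dict[str, float]  # keyword -> assumed rank score inside the topic
    children: List["Topic"] = field(default_factory=list)


def profile(topic, expert_keywords, threshold=0.0, level=0):
    """Return (level, topic name, score) for every topic the expert matches.

    The score is an assumed rule: the sum of keyword weights multiplied by
    how often the expert used each keyword.
    """
    score = sum(weight * expert_keywords.get(kw, 0)
                for kw, weight in topic.keywords.items())
    if score <= threshold:
        return []  # prune: do not descend into branches the expert does not match
    rows = [(level, topic.name, score)]
    for child in topic.children:
        rows.extend(profile(child, expert_keywords, threshold, level + 1))
    return rows


# Invented toy hierarchy, from a broad area down to more specific ones.
root = Topic("computer science",
             {"graph mining": 1.0, "lda": 1.0, "compilers": 1.0},
             [Topic("data mining",
                    {"graph mining": 0.8, "lda": 0.6},
                    [Topic("network analysis", {"graph mining": 0.9}),
                     Topic("topic modelling", {"lda": 0.9})]),
              Topic("programming languages", {"compilers": 0.9})])

# Invented expert: keyword -> number of publications using it.
expert = {"graph mining": 3}

for level, name, score in profile(root, expert):
    print("  " * level + f"{name} ({score:.1f})")
```

With this toy input the expert is placed on one path of the hierarchy (computer science, data mining, network analysis), while unrelated branches are pruned at the level where the match disappears.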

To test our methodology, we used a computer science bibliographic dataset to create a star-schema HIN containing publications as star-nodes and authors, keywords and ISI fields as attribute-nodes. We use heterogeneous pointwise mutual information to demonstrate the quality and coherence of the hierarchies we create. Furthermore, we use manually labelled data as ground truth to evaluate our hierarchical expert profiles, showing that our strategy is capable of building accurate profiles.
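The star-schema structure and a pointwise mutual information (PMI) score can be sketched as follows. This is a minimal sketch under stated assumptions, not the evaluation code of the paper: the toy publications, the function names and the estimation of probabilities from shared publications are all invented for illustration, and the averaging over cross-type pairs is only an approximation of the heterogeneous PMI measure referred to above.

```python
import math
from collections import defaultdict
from itertools import product

# Toy star-schema HIN: each publication (star node) links to attribute nodes.
# All records below are invented for illustration.
publications = [
    {"authors": ["a1", "a2"], "keywords": ["graph mining", "communities"]},
    {"authors": ["a1"], "keywords": ["graph mining", "topic models"]},
    {"authors": ["a3"], "keywords": ["topic models", "lda"]},
    {"authors": ["a2", "a3"], "keywords": ["communities", "graph mining"]},
]


def build_index(pubs):
    """Map each (type, node) pair to the set of publication ids linked to it."""
    index = defaultdict(set)
    for pub_id, pub in enumerate(pubs):
        for node_type, nodes in pub.items():
            for node in nodes:
                index[(node_type, node)].add(pub_id)
    return index


def pmi(index, n_pubs, x, y):
    """PMI of two attribute nodes, estimated from the publications they share."""
    joint = len(index[x] & index[y]) / n_pubs
    if joint == 0:
        return float("-inf")  # the two nodes never co-occur
    p_x = len(index[x]) / n_pubs
    p_y = len(index[y]) / n_pubs
    return math.log(joint / (p_x * p_y))


def mean_cross_type_pmi(index, n_pubs, xs, ys):
    """Average PMI over all cross-type pairs, normalised by 1 / (|xs| * |ys|)."""
    scores = [pmi(index, n_pubs, x, y) for x, y in product(xs, ys)]
    return sum(scores) / (len(xs) * len(ys))


index = build_index(publications)
n = len(publications)
author = ("authors", "a1")
top_keywords = [("keywords", "graph mining"), ("keywords", "topic models")]
print(pmi(index, n, author, top_keywords[0]))
print(mean_cross_type_pmi(index, n, [author], top_keywords))
```

A positive score means two nodes share publications more often than their individual frequencies would suggest, which is the intuition behind using such a measure to judge whether the nodes grouped into a topic are coherent.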

Notes

  1. https://www.acm.org/publications/class-2012.

  2. https://www.authenticus.pt.

  3. Research areas created by the Institute for Scientific Information.

  4. For simplicity, consider that the links have the same weight.

  5. As illustrated by Fig. 2.

  6. For clarification, a '-' symbol refers to a different level in the hierarchy.

  7. Through experimentation we determined that 4 was the number of levels that achieved the most comprehensible topical hierarchy.

  8. Following the idea of [21], we set \(k=5\) for ISI fields since there are only 120 of them in the HIN. In these cases, the \(\frac{1}{k^2}\) term of the formula changes to \(\frac{1}{5k}\).

  9. https://scholar.google.com/.

References

  1. Balog, K., Fang, Y., de Rijke, M., Serdyukov, P., Si, L.: Expertise retrieval. Found. Trends® Inf. Retriev. 6(2–3), 127–256 (2012)

  2. Berendsen, R., de Rijke, M., Balog, K., Bogers, T., van den Bosch, A.: On the assessment of expertise profiles. J. Assoc. Inf. Sci. Technol. 64(10), 2024–2044 (2013)

  3. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(Jan), 993–1022 (2003)

  4. Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech.: Theory Exp. 2008(10), P10008 (2008)

  5. Daud, A.: Using time topic modeling for semantics-based dynamic research interest finding. Knowl.-Based Syst. 26, 154–163 (2012)

  6. De Campos, L.M., Fernández-Luna, J.M., Huete, J.F.: Committee-based profiles for politician finding. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 25(Suppl. 2), 21–36 (2017)

  7. Duan, D., Li, Y., Li, R., Lu, Z., Wen, A.: MEI: mutual enhanced infinite community-topic model for analyzing text-augmented social networks. Comput. J. 56(3), 336–354 (2012)

  8. Gerlach, M., Peixoto, T.P., Altmann, E.G.: A network approach to topic models. arXiv preprint arXiv:1708.01677 (2017)

  9. bin Jamaludin, N.A., Annamalai, M., Jamil, N., Bakar, Z.A.: A model for keyword profile creation using extracted keywords and terminological ontology. In: 2013 IEEE Conference on e-Learning, e-Management and e-Services (IC3e), pp. 136–141. IEEE (2013)

  10. Jeong, Y.S., Lee, S.H., Gweon, G.: Discovery of research interests of authors over time using a topic model. In: 2016 International Conference on Big Data and Smart Computing (BigComp), pp. 24–31. IEEE (2016)

  11. Karimzadehgan, M., White, R.W., Richardson, M.: Enhancing expert finding using organizational hierarchies. In: European Conference on Information Retrieval, pp. 177–188. Springer (2009)

  12. Li, C., Cheung, W.K., Ye, Y., Zhang, X., Chu, D., Li, X.: The author-topic-community model for author interest profiling and community discovery. Knowl. Inf. Syst. 44(2), 359–383 (2015)

  13. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)

  14. Newman, M.E.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103(23), 8577–8582 (2006)

  15. Rosen-Zvi, M., Griffiths, T., Steyvers, M., Smyth, P.: The author-topic model for authors and documents. In: Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pp. 487–494. AUAI Press (2004)

  16. Rybak, J., Balog, K., Nørvåg, K.: Temporal expertise profiling. In: de Rijke, M., Kenter, T., de Vries, A.P., Zhai, C., de Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 540–546. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-06028-6_54

  17. Shi, C., Li, Y., Zhang, J., Sun, Y., Yu, P.S.: A survey of heterogeneous information network analysis. IEEE Trans. Knowl. Data Eng. 29(1), 17–37 (2017)

  18. Sun, Y., Han, J., Zhao, P., Yin, Z., Cheng, H., Wu, T.: RankClus: integrating clustering with ranking for heterogeneous information network analysis. In: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, pp. 565–576. ACM (2009)

  19. Sun, Y., Yu, Y., Han, J.: Ranking-based clustering of heterogeneous information networks with star network schema. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 797–806. ACM (2009)

  20. Tang, J., Jin, R., Zhang, J.: A topic modeling approach and its integration into the random walk framework for academic search. In: Eighth IEEE International Conference on Data Mining, 2008 ICDM 2008, pp. 1055–1060. IEEE (2008)

  21. Wang, C., Liu, J., Desai, N., Danilevsky, M., Han, J.: Constructing topical hierarchies in heterogeneous information networks. Knowl. Inf. Syst. 44(3), 529–558 (2015)

  22. Wang, J., Hu, X., Tu, X., He, T.: Author-conference topic-connection model for academic network search. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pp. 2179–2183. ACM (2012)

Acknowledgements

This work is funded by the ERDF through the COMPETE 2020 Programme within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT as part of project UID/EEA/50014/2013. Jorge Silva is also supported by an FCT/MAP-i PhD research grant (PD/BD/128157/2016).

Author information

Corresponding author

Correspondence to Jorge Silva.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Silva, J., Ribeiro, P., Silva, F. (2018). Hierarchical Expert Profiling Using Heterogeneous Information Networks. In: Soldatova, L., Vanschoren, J., Papadopoulos, G., Ceci, M. (eds) Discovery Science. DS 2018. Lecture Notes in Computer Science, vol 11198. Springer, Cham. https://doi.org/10.1007/978-3-030-01771-2_22

  • DOI: https://doi.org/10.1007/978-3-030-01771-2_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01770-5

  • Online ISBN: 978-3-030-01771-2

  • eBook Packages: Computer Science, Computer Science (R0)
