Abstract
This paper investigates the problem of finding author interest in a co-author network through topic modeling, and provides several performance evaluation measures. Intuitively, two types of explicit grouping exist in research papers: (1) authors who have co-authored with author A in one document (subgroup) and (2) authors who have co-authored with author A in all documents (group). Traditional methods use the graph-link structure with keyword-based matching and ignore semantics-based information, while topic modeling considers semantics-based information but has not exploited both types of explicit grouping together; for example, the state-of-the-art Author-Topic model uses only one kind of explicit grouping, the single document (subgroup), for finding author interest. In this paper, we introduce Group-Author-Topic (GAT) modeling, which exploits both types of grouping simultaneously. We compare four different topic modeling methods on the same task using the large DBLP dataset. We provide three performance measures from different domains for method evaluation: perplexity, entropy, and prediction ranking accuracy. We show the trade-off between these performance evaluation measures. Experimental results demonstrate that our proposed method significantly outperforms the baselines in finding author interest. The trade-off between the evaluation measures used shows that they are equally useful for evaluating topic modeling methods.
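The two explicit groupings defined in the abstract can be illustrated with a minimal sketch. It assumes a toy corpus that maps document ids to author lists; the corpus, the function names `subgroup` and `group`, and the author labels are hypothetical and chosen only for illustration. This shows the grouping definitions alone, not the GAT model or its inference procedure.

```python
# Hypothetical toy corpus (assumption): document id -> list of authors.
docs = {
    "d1": ["A", "B", "C"],
    "d2": ["A", "D"],
    "d3": ["B", "E"],
}

def subgroup(author, doc_id, docs):
    """Co-authors of `author` in one document (the 'subgroup')."""
    authors = docs[doc_id]
    if author not in authors:
        return set()  # the author did not write this document
    return set(authors) - {author}

def group(author, docs):
    """Co-authors of `author` across all documents (the 'group')."""
    result = set()
    for doc_id in docs:
        # The group is the union of the author's per-document subgroups.
        result |= subgroup(author, doc_id, docs)
    return result

print(sorted(subgroup("A", "d1", docs)))
print(sorted(group("A", docs)))
```

Under these assumptions, the group of an author is simply the union of that author's subgroups over every document they appear in.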
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Daud, A. (2011). Exploiting Explicit Semantics-Based Grouping for Author Interest Finding. In: Du, X., Fan, W., Wang, J., Peng, Z., Sharaf, M.A. (eds) Web Technologies and Applications. APWeb 2011. Lecture Notes in Computer Science, vol 6612. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20291-9_7
DOI: https://doi.org/10.1007/978-3-642-20291-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20290-2
Online ISBN: 978-3-642-20291-9
eBook Packages: Computer Science (R0)