Conference Mining via Generalized Topic Modeling

  • Ali Daud
  • Juanzi Li
  • Lizhu Zhou
  • Faqir Muhammad
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5781)


Conference mining has become an important problem for academic recommendation. Previous approaches mined conferences either through network connectivity or through the semantics-based intrinsic structure of the words shared between documents (modeling at the document level, DL), while ignoring the semantics-based intrinsic structure of the words shared between conferences. In this paper, we address this problem by considering the semantics-based intrinsic structure of the words present in conferences (richer semantics), modeling at the conference level (CL). We propose a generalized topic modeling approach based on Latent Dirichlet Allocation (LDA), named Conference Mining (ConMin). With it we can discover topically related conferences, conference correlations, and conference temporal topic trends. Experimental results show that the proposed approach significantly outperforms the baseline approach in discovering topically related conferences and finding conference correlations, because of its ability to produce less sparse topics.
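
As a minimal sketch of the document-level versus conference-level distinction, assume (the abstract does not state this) that conference-level modeling treats all papers of one conference as a single bag of words. Under that assumption the aggregation step could look like the following. The record layout, the `conference_bags` helper, and the toy data are hypothetical, and fitting the LDA-based model itself is not shown.

```python
from collections import Counter, defaultdict


def conference_bags(papers):
    """Merge the word counts of all papers that share a conference.

    `papers` is an iterable of (conference, text) pairs; this layout is
    an assumption made for illustration only.
    """
    bags = defaultdict(Counter)
    for conference, text in papers:
        # Naive whitespace tokenization; a real pipeline would also
        # remove stop words and punctuation.
        bags[conference].update(text.lower().split())
    return dict(bags)


# Toy data, invented for the example.
papers = [
    ("KDD", "topic models for text mining"),
    ("KDD", "mining frequent patterns"),
    ("SIGIR", "language models for retrieval"),
]

bags = conference_bags(papers)
```

Each conference then contributes one long pseudo-document instead of many short ones, which is one plausible reading of why the abstract reports less sparse topics.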


Keywords: Richer Semantics, Conference Mining, Generalized Topic Modeling, Unsupervised Learning



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ali Daud (1)
  • Juanzi Li (1)
  • Lizhu Zhou (1)
  • Faqir Muhammad (2)
  1. Department of Computer Science & Technology, Tsinghua University, Beijing, China
  2. Department of Mathematics & Statistics, Allama Iqbal Open University, Islamabad, Pakistan
