Finding Co-occurring Topics in Wikipedia Article Segments
Wikipedia is the largest online encyclopedia, and its articles form a rich semantic resource. When the same topic appears in different articles, those articles are related to each other through that topic. Finding such co-occurring topics helps improve the accuracy of querying and clustering, and also helps contrast related articles. Existing work on topic alignment and topic relevance detection is based on term occurrence. In this work, we use Latent Dirichlet Allocation (LDA) to incorporate the latent topics present in article segments when detecting topic relevance. We also study how segment proximities, which arise from segment ordering and hyperlinks, should be incorporated into topic detection and alignment. Experimental results show that our method can find and distinguish three types of co-occurrence.
Keywords: LDA, MLE, Link, Wikipedia
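To make the idea of comparing segments by latent topics rather than by shared terms more concrete, the following minimal sketch may help. It is an illustration under stated assumptions and not the method evaluated in the paper: it assumes each segment already has a topic distribution (for example, inferred by an LDA model), it models only ordering proximity by blending each segment with its adjacent segments (hyperlink proximity is left out), and the function names, the `weight` and `threshold` parameters, and the sample vectors are all hypothetical.

```python
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def smooth_with_neighbors(segments, weight=0.25):
    """Blend each segment's topic vector with the mean of its adjacent segments.

    This is one simple, assumed way to let segment ordering influence
    the comparison; `weight` is a hypothetical mixing parameter.
    """
    smoothed = []
    for i, vec in enumerate(segments):
        neighbors = [segments[j] for j in (i - 1, i + 1) if 0 <= j < len(segments)]
        if not neighbors:
            smoothed.append(list(vec))
            continue
        avg = [sum(col) / len(neighbors) for col in zip(*neighbors)]
        smoothed.append([(1 - weight) * v + weight * a for v, a in zip(vec, avg)])
    return smoothed


def co_occurring_segments(article_a, article_b, threshold=0.8):
    """Return (i, j, similarity) for segment pairs with similar topic mixtures."""
    a = smooth_with_neighbors(article_a)
    b = smooth_with_neighbors(article_b)
    pairs = []
    for i, p in enumerate(a):
        for j, q in enumerate(b):
            s = cosine(p, q)
            if s >= threshold:
                pairs.append((i, j, round(s, 3)))
    return pairs


# Hypothetical per-segment topic distributions over three latent topics.
article_a = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10]]
article_b = [[0.05, 0.05, 0.90], [0.85, 0.10, 0.05]]

pairs = co_occurring_segments(article_a, article_b)
print(pairs)
```

With these sample vectors, only the first segment of the first article and the second segment of the second article share a dominant topic, so that is the pair reported. Because the comparison works on topic distributions, two segments can match even when they share few surface terms, which is the motivation stated above for moving beyond term occurrence.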