A Dependency Graph Isomorphism for News Sentence Searching
As the volume of published news keeps growing, an effective search tool is invaluable to many Web-based companies. Because word-based approaches ignore much of the information in texts, we propose Destiny, a linguistic approach that leverages the syntactic information in sentences by representing each sentence as a graph with disambiguated words as nodes and grammatical relations as edges. Destiny performs approximate sub-graph isomorphism between the query graph and the news sentence graphs, exploiting word synonymy as well as hypernymy. The algorithm is evaluated on a custom corpus of user-rated queries and sentences using the normalized Discounted Cumulative Gain, Spearman’s Rho, and Mean Average Precision. On this corpus, Destiny performs significantly better than a TF-IDF baseline on the considered measures.
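To make one of the evaluation measures concrete, the following is a minimal sketch of normalized Discounted Cumulative Gain for a single ranked result list. It assumes the common formulation in which each graded relevance score is divided by the base-2 logarithm of its rank plus one; the abstract does not state which variant the authors used, and the function names `dcg` and `ndcg` are illustrative only.

```python
import math


def dcg(relevances):
    """Discounted Cumulative Gain of relevance scores listed in ranked order.

    Assumed formulation: sum of rel_i / log2(i + 1) with ranks starting at 1.
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering.

    Returns 0.0 when no result is relevant, to avoid dividing by zero.
    """
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical user ratings for the sentences an algorithm returned, in order.
perfect_ranking = [3, 2, 1]
reversed_ranking = [1, 2, 3]
print(ndcg(perfect_ranking), ndcg(reversed_ranking))
```

A ranking that already lists the highest-rated sentences first scores 1.0, and any other ordering of the same ratings scores lower, which is what allows rankings from different search methods to be compared on one scale.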
Keywords: Mean Average Precision, Word Sense, Query Graph, Grammatical Relation, Discount Cumulative Gain