Document Summarization using Wikipedia
Although most of the developing world is likely to first access the Internet through mobile phones, mobile use is constrained by small screens, limited bandwidth, and short user attention spans. Single-document summarization techniques have the potential to simplify information consumption on mobile phones by presenting only the most relevant information contained in a document. In this paper we present a language-independent single-document summarization method. We map document sentences to semantic concepts in Wikipedia and select sentences for the summary based on the frequency of the mapped-to concepts. Our evaluation on English documents using the ROUGE package indicates that our summarization method is competitive with the state of the art in single-document summarization.
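The concept-frequency idea described above can be illustrated with a minimal sketch. The abstract does not specify how sentences are mapped to Wikipedia concepts or how frequencies are turned into sentence scores, so everything concrete here is an assumption: `TOY_CONCEPTS` is a hypothetical hand-made stand-in for a Wikipedia concept index, the mapping is plain keyword overlap, and a sentence is scored by summing the document-wide frequencies of its concepts.

```python
import re
from collections import Counter

# Hypothetical stand-in for a Wikipedia concept index (assumption, not the
# paper's data): concept title -> terms that trigger a mapping to it.
TOY_CONCEPTS = {
    "Mobile phone": {"mobile", "phone", "phones", "handheld"},
    "Automatic summarization": {"summary", "summarization", "summaries"},
    "Wikipedia": {"wikipedia"},
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def map_to_concepts(sentence, concepts=TOY_CONCEPTS):
    """Return the set of concepts whose trigger terms occur in the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return {name for name, terms in concepts.items() if words & terms}


def summarize(text, k=2, concepts=TOY_CONCEPTS):
    """Pick the k sentences whose concepts are most frequent in the document."""
    sentences = split_sentences(text)
    mapped = [map_to_concepts(s, concepts) for s in sentences]
    # Frequency of a concept = number of sentences that map to it.
    freq = Counter(c for cs in mapped for c in cs)
    # Assumed scoring rule: sum of the frequencies of a sentence's concepts.
    scores = [sum(freq[c] for c in cs) for cs in mapped]
    # Highest score first; ties broken by position in the document.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    # Emit the selected sentences in their original document order.
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = (
        "Mobile phones have small screens. "
        "Summarization helps on a mobile phone. "
        "The weather was nice. "
        "Wikipedia is large."
    )
    for line in summarize(doc, k=2):
        print(line)
```

Because the sketch only counts concept matches, it carries no language-specific rules beyond the toy index; a real system would replace `map_to_concepts` with a lookup against Wikipedia itself.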
Keywords: Mobile Phone, Bipartite Graph, Semantic Concept, English Document, Text Summarization