Document Summarization using Wikipedia

  • Krishnan Ramanathan
  • Yogesh Sankarasubramaniam
  • Nidhi Mathur
  • Ajay Gupta


Although most of the developing world is likely to first access the Internet through mobile phones, mobile devices are constrained by small screens and limited bandwidth, and their users by limited attention spans. Single-document summarization techniques have the potential to simplify information consumption on mobile phones by presenting only the most relevant information contained in a document. In this paper we present a language-independent single-document summarization method. We map document sentences to semantic concepts in Wikipedia and select sentences for the summary based on the frequency of the mapped-to concepts. Our evaluation on English documents using the ROUGE package indicates that our summarization method is competitive with the state of the art in single-document summarization.
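The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how sentences are matched to Wikipedia concepts or how frequencies are turned into sentence scores. The sketch assumes a tiny in-memory stand-in for a Wikipedia index (`TOY_CONCEPTS`, hypothetical), matches a sentence to a concept by simple word overlap, and scores each sentence by the summed document-wide frequency of its concepts. All function names are illustrative.

```python
import re
from collections import Counter

# Hypothetical stand-in for a Wikipedia index: concept title -> terms
# associated with that article. A real system would query Wikipedia itself.
TOY_CONCEPTS = {
    "Mobile phone": {"mobile", "phone", "phones", "handset", "screen"},
    "Internet": {"internet", "web", "online", "bandwidth"},
    "Cricket": {"cricket", "bat", "wicket"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def map_to_concepts(sentence, concepts, min_overlap=1):
    """Return the concept titles whose terms overlap the sentence's words.

    Word overlap is an assumption made for this sketch, not the paper's
    matching procedure.
    """
    words = tokenize(sentence)
    return {
        title
        for title, terms in concepts.items()
        if len(words & terms) >= min_overlap
    }


def summarize(sentences, concepts, k=2):
    """Pick the k sentences whose mapped concepts are most frequent.

    Concept frequency = number of sentences in the document that map to
    the concept. A sentence's score is the sum of the frequencies of its
    concepts (an assumed scoring rule). Ties are broken by position, and
    the selected sentences are returned in document order.
    """
    mapped = [map_to_concepts(s, concepts) for s in sentences]
    freq = Counter(c for cs in mapped for c in cs)
    scores = [sum(freq[c] for c in cs) for cs in mapped]
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "Many people first reach the internet on mobile phones.",
        "A small screen limits what a phone can show.",
        "The local cricket team won by one wicket.",
        "Low bandwidth makes web pages slow to load.",
    ]
    for sentence in summarize(doc, TOY_CONCEPTS, k=2):
        print(sentence)
```

Because the scoring depends only on which concepts a sentence maps to, the same selection logic would apply to any language for which a concept index is available, which is consistent with the language-independence claim above.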


Keywords: Mobile Phone; Bipartite Graph; Semantic Concept; English Document; Text Summarization





Copyright information

© Indian Institute of Information Technology, India 2009

Authors and Affiliations

  • Krishnan Ramanathan (1)
  • Yogesh Sankarasubramaniam (1)
  • Nidhi Mathur (1)
  • Ajay Gupta (1)

  1. HP Laboratories, Bangalore, India
