
Learning from Numerous Untailored Summaries

  • Conference paper
  • First Online:
PRICAI 2016: Trends in Artificial Intelligence (PRICAI 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9810)


Abstract

We present an attempt to use the large number of summaries contained in the New York Times Annotated Corpus (NYTAC). We introduce five methods, inspired by domain adaptation techniques from other research areas, for training our supervised summarization system, and evaluate them on three test sets. Among the five methods, the one trained on the NYTAC and then fine-tuned on the target data (i.e., the three test sets: DUC2002, RSTDTB\(_{\text {long}}\) and RSTDTB\(_{\text {short}}\)) performs best on all three test sets. We also propose an instance selection method based on the faithfulness of the extractive oracle summary to the reference summary, and empirically show that it improves summarization performance.
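To make the instance selection idea concrete, here is a minimal sketch, not the authors' implementation: it assumes faithfulness is measured as the ROUGE-1 recall of the extractive oracle summary against the reference summary, that training instances are available as (document, oracle summary, reference summary) triples of plain strings, and that a threshold thr is tuned on held-out data. All names and measures here are illustrative.

```python
from collections import Counter

def rouge_1_recall(candidate: str, reference: str) -> float:
    """Unigram recall of a candidate summary against a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], count) for w, count in ref.items())
    return overlap / sum(ref.values()) if ref else 0.0

def select_instances(instances, thr):
    """Keep only training instances whose extractive oracle summary is
    sufficiently faithful (score >= thr) to the human-written reference.

    `instances` is an iterable of (document, oracle_summary, reference_summary)
    triples; `thr` is the faithfulness threshold tuned on held-out data."""
    return [
        (doc, oracle, ref)
        for doc, oracle, ref in instances
        if rouge_1_recall(oracle, ref) >= thr
    ]
```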


Notes

  1. The current datasets for multi-document summarization are also small.

  2. In this paper, the (extractive) oracle summary is defined to be the best possible summary that can be generated by sentence extraction, and the reference summary is defined to be the original human-written summary in NYTAC.
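     The excerpt does not say how the oracle is computed, so the following is only a common greedy approximation, not necessarily the authors' procedure: repeatedly add the sentence that most improves the score against the reference, under a 100-word budget matching the evaluation setting. The score argument is any summary-versus-reference measure, for example the ROUGE-1 recall helper sketched after the abstract.

```python
def greedy_oracle(sentences, reference, score, budget=100):
    """Greedy approximation of an extractive oracle summary: grow the summary
    one sentence at a time, always taking the sentence that most improves
    `score(summary, reference)` while staying within `budget` words."""
    selected, used = [], 0
    remaining = list(sentences)
    while remaining:
        current = " ".join(selected)
        best_gain, best_sent = 0.0, None
        for sent in remaining:
            length = len(sent.split())
            if used + length > budget:
                continue
            gain = score(current + " " + sent, reference) - score(current, reference)
            if gain > best_gain:
                best_gain, best_sent = gain, sent
        if best_sent is None:  # nothing fits or improves the score
            break
        selected.append(best_sent)
        used += len(best_sent.split())
        remaining.remove(best_sent)
    return selected
```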

  3. Although only 1,365 articles were used in their work [28], there are potentially many more articles on CNN.com.

  4. © 2016 The New York Times Annotated Corpus, used with permission.

  5. When the benefit of each sentence is represented as the dot product of a weight vector and a feature vector, the benefit can be negative. The optimization problem with negative benefits cannot be regarded as a KP. However, such cases are very rare and can be ignored in practice.
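     Read this way, decoding is a 0/1 knapsack: choose the subset of sentences maximizing total benefit subject to a word budget. The sketch below is a generic dynamic-programming solver under that reading, not the authors' code; the function names are illustrative, and the rare negative benefits are simply clipped to zero, in line with the remark that they can be ignored in practice.

```python
import numpy as np

def sentence_benefits(features, weights):
    """Benefit of each sentence: dot product of the learned weight vector with
    that sentence's feature vector (features has shape [n_sentences, n_features])."""
    return features @ weights

def knapsack_extract(benefits, lengths, budget):
    """Select sentence indices maximizing total benefit under a word budget,
    via standard 0/1 knapsack dynamic programming over the budget in words."""
    benefits = np.maximum(np.asarray(benefits, dtype=float), 0.0)  # clip rare negatives
    lengths = [int(l) for l in lengths]
    # best[c] = (max total benefit, chosen indices) using at most c words
    best = [(0.0, [])] * (budget + 1)
    for i, (b, l) in enumerate(zip(benefits, lengths)):
        updated = best[:]                    # table after considering sentence i
        for c in range(l, budget + 1):
            value = best[c - l][0] + b       # benefit if sentence i is taken
            if value > updated[c][0]:
                updated[c] = (value, best[c - l][1] + [i])
        best = updated
    return best[budget][1]
```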

  6. Some reference summaries contain more than 100 words. Such summaries as well as system summaries were truncated to 100 words during the evaluation.

  7. With options “-a -x -n 1 -m -s” on version 1.5.5 of the official ROUGE script.

  8. With options “-a -x -n 2 -m” on version 1.5.5 of the official ROUGE script.

  9. For the statistical significance test, we used the Wilcoxon signed-rank test (\(p\le 0.05\)).
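     For reference, the same test is available off the shelf in SciPy; here is a minimal example on paired per-document scores of two systems (the numbers are purely illustrative):

```python
from scipy.stats import wilcoxon

# Paired per-document ROUGE scores for two systems (illustrative values only).
system_a = [0.412, 0.387, 0.455, 0.401, 0.368, 0.392]
system_b = [0.398, 0.379, 0.440, 0.382, 0.361, 0.376]

statistic, p_value = wilcoxon(system_a, system_b)
print(f"W = {statistic}, p = {p_value:.4f}")
# As in footnote 9, the difference is considered significant when p <= 0.05.
```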

  10. The selected values of thr for each fold were 0.1, 0.1, 0.1, 0.1 and 0.1, respectively.

  11. An explanation of these two types of summaries can be found in the book by Nenkova and McKeown [23]. We quote the relevant part of the book: “A summary that enables the reader to determine about-ness has often been called an indicative summary, while one that can be read in place of the document has been called an informative summary.”

  12. The selected values of thr for each fold were 0.3, 0.6, 0.3, 0.3 and 0.6, respectively.

References

  1. Almeida, M., Martins, A.: Fast and robust compressive summarization with dual decomposition and multi-task learning. In: Proceedings of ACL 2013, pp. 196–206 (2013)

  2. Axelrod, A., He, X., Gao, J.: Domain adaptation via pseudo in-domain data selection. In: Proceedings of EMNLP 2011, pp. 355–362 (2011)

  3. Biçici, E.: Domain adaptation for machine translation with instance selection. Prague Bull. Math. Linguist. 103, 5–20 (2015)

  4. Carlson, L., Marcu, D., Okurowski, M.E.: RST Discourse Treebank. Linguistic Data Consortium (2002). https://catalog.ldc.upenn.edu/LDC2002T07

  5. Linguistic Data Consortium: Hansard Corpus of Parallel English and French. Linguistic Data Consortium (1997). http://www.ldc.upenn.edu/

  6. Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., Singer, Y.: Online passive-aggressive algorithms. J. Mach. Learn. Res. 7, 551–585 (2006)

  7. Crammer, K., McDonald, R., Pereira, F.: Scalable large-margin online learning for structured classification. In: Proceedings of the NIPS 2005 Workshop on Learning with Structured Outputs (2005)

  8. Daumé III, H.: Frustratingly easy domain adaptation. In: Proceedings of ACL 2007, pp. 256–263 (2007)

  9. Daumé, H., Marcu, D.: Induction of word and phrase alignments for automatic document summarization. Comput. Linguist. 31(4), 505–530 (2005)

  10. DUC: Document Understanding Conference. In: ACL Workshop on Automatic Summarization (2002)

  11. Hirao, T., Isozaki, H., Maeda, E., Matsumoto, Y.: Extracting important sentences with support vector machines. In: Proceedings of COLING 2002, vol. 1, pp. 1–7 (2002)

  12. Hirao, T., Yoshida, Y., Nishino, M., Yasuda, N., Nagata, M.: Single-document summarization as a tree knapsack problem. In: Proceedings of EMNLP 2013, pp. 1515–1520 (2013)

  13. Hong, K., Nenkova, A.: Improving the estimation of word importance for news multi-document summarization. In: Proceedings of EACL 2014, pp. 712–721 (2014)

  14. Jing, H., McKeown, K.R.: Cut and paste based text summarization. In: Proceedings of NAACL 2000, pp. 178–185 (2000)

  15. Li, C., Liu, Y., Zhao, L.: Using external resources and joint learning for bigram weighting in ILP-based multi-document summarization. In: Proceedings of NAACL 2015, pp. 778–787 (2015)

  16. Li, C., Qian, X., Liu, Y.: Using supervised bigram-based ILP for extractive summarization. In: Proceedings of ACL 2013, pp. 1004–1013 (2013)

  17. Li, J.J., Nenkova, A.: Fast and accurate prediction of sentence specificity. In: Proceedings of AAAI 2015, pp. 2281–2287 (2015)

  18. Li, Q.: Literature survey: domain adaptation algorithms for natural language processing. Technical report, Department of Computer Science, The Graduate Center, The City University of New York (2012)

  19. Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Proceedings of the Workshop on Text Summarization Branches Out, pp. 74–81 (2004)

  20. Marcu, D.: Improving summarization through rhetorical parsing tuning. In: Proceedings of the Sixth Workshop on Very Large Corpora, pp. 206–215 (1998)

  21. Marcu, D.: The automatic construction of large-scale corpora for summarization research. In: Proceedings of SIGIR 1999, pp. 137–144 (1999)

  22. Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: Proceedings of EMNLP 2004, pp. 404–411 (2004)

  23. Nenkova, A., McKeown, K.: Automatic summarization. Found. Trends Inf. Retrieval 2–3, 103–233 (2011)

  24. Nishikawa, H., Arita, K., Tanaka, K., Hirao, T., Makino, T., Matsuo, Y.: Learning to generate coherent summary with discriminative hidden semi-Markov model. In: Proceedings of COLING 2014, pp. 1648–1659 (2014)

  25. Remus, R.: Domain adaptation using domain similarity- and domain complexity-based instance selection for cross-domain sentiment analysis. In: Proceedings of the ICDMW 2012 Workshop on SENTIRE, pp. 717–723 (2012)

  26. Sandhaus, E.: The New York Times Annotated Corpus. Linguistic Data Consortium (2008). https://catalog.ldc.upenn.edu/LDC2008T19

  27. Sipos, R., Shivaswamy, P., Joachims, T.: Large-margin learning of submodular summarization models. In: Proceedings of EACL 2012, pp. 224–233 (2012)

  28. Svore, K., Vanderwende, L., Burges, C.: Enhancing single-document summarization by combining RankNet and third-party sources. In: Proceedings of EMNLP-CoNLL 2007, pp. 448–457 (2007). http://www.aclweb.org/anthology/D/D07/D07-1047

  29. Takamura, H., Okumura, M.: Learning to generate summary as structured output. In: Proceedings of CIKM 2010, pp. 1437–1440 (2010)

  30. Xia, R., Zong, C., Hu, X., Cambria, E.: Feature ensemble plus sample selection: domain adaptation for sentiment classification. IEEE Intell. Syst. 28(3), 10–18 (2013)

  31. Yang, Y., Nenkova, A.: Detecting information-dense texts in multiple news domains. In: Proceedings of AAAI 2014, pp. 1650–1656 (2014)

  32. Yih, W.T., Goodman, J., Vanderwende, L., Suzuki, H.: Multi-document summarization by maximizing informative content-words. In: Proceedings of IJCAI 2007, pp. 1776–1782 (2007)

  33. Zhao, J., Qiu, X., Liu, Z., Huang, X.: Online distributed passive-aggressive algorithm for structured learning. In: Sun, M., Zhang, M., Lin, D., Wang, H. (eds.) CCL and NLP-NABD 2013. LNCS, vol. 8202, pp. 120–130. Springer, Heidelberg (2013)

Acknowledgement

This work was supported by JSPS KAKENHI Grant Number JP26280080.

Author information

Correspondence to Yuta Kikuchi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kikuchi, Y., Watanabe, A., Sasano, R., Takamura, H., Okumura, M. (2016). Learning from Numerous Untailored Summaries. In: Booth, R., Zhang, M.-L. (eds) PRICAI 2016: Trends in Artificial Intelligence. PRICAI 2016. Lecture Notes in Computer Science (LNAI), vol 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_17

  • DOI: https://doi.org/10.1007/978-3-319-42911-3_17

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42910-6

  • Online ISBN: 978-3-319-42911-3

  • eBook Packages: Computer Science, Computer Science (R0)
