Abstract
We present an attempt to exploit the large number of summaries contained in the New York Times Annotated Corpus (NYTAC). We introduce five methods, inspired by domain adaptation techniques from other research areas, to train our supervised summarization system, and evaluate them on three test sets: DUC2002, RSTDTB\(_{\text {long}}\) and RSTDTB\(_{\text {short}}\). Among the five methods, the one trained on the NYTAC and then fine-tuned on the target data performs best on all three test sets. We also propose an instance selection method based on the faithfulness of the extractive oracle summary to the reference summary and empirically show that it improves summarization performance.
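The instance selection idea in the abstract can be illustrated with a short sketch. Here the faithfulness score is a simplified unigram-recall stand-in for ROUGE, and `thr`, `select_instances`, and the toy data are all illustrative, not the authors' implementation:

```python
from collections import Counter

def unigram_recall(candidate, reference):
    """Fraction of reference unigrams covered by the candidate
    (a simplified stand-in for a ROUGE-style faithfulness score)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(n, ref[w]) for w, n in cand.items())
    return overlap / max(sum(ref.values()), 1)

def select_instances(pairs, thr=0.1):
    """Keep training documents whose extractive oracle summary is
    sufficiently faithful to the human-written reference summary."""
    return [(doc, oracle, ref) for doc, oracle, ref in pairs
            if unigram_recall(oracle, ref) >= thr]

pairs = [
    ("doc A ...", "the cat sat on the mat", "the cat sat on the mat today"),
    ("doc B ...", "stocks fell sharply", "the committee approved a new budget"),
]
kept = select_instances(pairs, thr=0.5)
```

With this threshold, only the first document survives: its oracle covers most of its reference, while the second pair shares no unigrams at all.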
Notes
- 1.
The current datasets for multi-document summarization are also small.
- 2.
In this paper, the (extractive) oracle summary is defined to be the best possible summary that can be generated by sentence extraction, and the reference summary is defined to be the original human-written summary in NYTAC.
- 3.
- 4.
© 2016 The New York Times Annotated Corpus, used with permission.
- 5.
When the benefit of each sentence is represented as the dot product of a weight vector and a feature vector, the benefit can be negative. An optimization problem with negative benefits cannot, strictly speaking, be regarded as a knapsack problem (KP). However, such cases are very rare and can be ignored in practice.
- 6.
Some reference summaries contain more than 100 words. Such summaries as well as system summaries were truncated to 100 words during the evaluation.
- 7.
With options “-a -x -n 1 -m -s” on version 1.5.5 of the official ROUGE script.
- 8.
With options “-a -x -n 2 -m” on version 1.5.5 of the official ROUGE script.
- 9.
For the statistical significance test, we used the Wilcoxon signed-rank test (\(p\le 0.05\)).
- 10.
The selected values of thr for each fold were 0.1, 0.1, 0.1, 0.1 and 0.1, respectively.
- 11.
An explanation of these two types of summaries can be found in the book by Nenkova and McKeown [23]. We quote the relevant part: “A summary that enables the reader to determine about-ness has often been called an indicative summary, while one that can be read in place of the document has been called an informative summary.”
- 12.
The selected values of thr for each fold are 0.3, 0.6, 0.3, 0.3 and 0.6, respectively.
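Note 2 defines the extractive oracle as the best possible summary obtainable by sentence extraction. One common way to approximate such an oracle (a greedy sketch with a simplified unigram-overlap objective, not necessarily the procedure used in the paper) is to repeatedly add the sentence that most improves overlap with the reference summary:

```python
from collections import Counter

def overlap(sentences, reference):
    """Unigram overlap between extracted sentences and the reference summary."""
    cand = Counter(" ".join(sentences).lower().split())
    ref = Counter(reference.lower().split())
    return sum(min(n, ref[w]) for w, n in cand.items())

def greedy_oracle(sentences, reference, max_sents=3):
    """Greedily add the sentence that most improves overlap with the reference."""
    chosen, remaining = [], list(sentences)
    while remaining and len(chosen) < max_sents:
        best = max(remaining, key=lambda s: overlap(chosen + [s], reference))
        if overlap(chosen + [best], reference) <= overlap(chosen, reference):
            break  # no remaining sentence improves the oracle
        chosen.append(best)
        remaining.remove(best)
    return chosen

document = ["the economy grew fast", "the weather was sunny", "growth beat forecasts"]
reference = "the economy grew and beat forecasts"
oracle = greedy_oracle(document, reference)
```

The greedy loop stops as soon as no sentence adds overlap, so off-topic sentences (here the weather one) are never selected.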
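Note 5 refers to casting sentence extraction as a 0-1 knapsack problem: maximize the summed sentence benefits under a summary-length budget. A minimal dynamic-programming sketch follows, with illustrative benefits and costs rather than learned weights; as the note observes, sentences whose dot-product benefit comes out negative can simply be skipped, since they can never improve the objective:

```python
def knapsack_summary(benefits, costs, budget):
    """0-1 knapsack over sentences: maximize total benefit subject to
    a length budget. Returns (best_benefit, chosen_indices)."""
    # dp[c] = (benefit, chosen) best achievable with total cost <= c
    dp = [(0.0, [])] * (budget + 1)
    for i in range(len(benefits)):
        if benefits[i] <= 0:
            continue  # negative-benefit sentences can never help; skip them
        new_dp = dp[:]
        for c in range(costs[i], budget + 1):
            cand = (dp[c - costs[i]][0] + benefits[i],
                    dp[c - costs[i]][1] + [i])
            if cand[0] > new_dp[c][0]:
                new_dp[c] = cand
        dp = new_dp
    return dp[budget]

# benefit = w . f(sentence); here illustrative scores and word counts
benefits = [3.0, -0.5, 2.0, 1.5]
costs = [10, 5, 8, 7]
best, chosen = knapsack_summary(benefits, costs, budget=18)
```

With a budget of 18 words, the optimum picks the first and third sentences (benefit 5.0); the negative-benefit sentence is ignored outright.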
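The Wilcoxon signed-rank test mentioned in note 9 can be sketched in a few lines. The per-document scores below are made up for illustration; for n = 8 pairs the standard two-sided critical value at p ≤ 0.05 is 3, so the difference is significant when min(W+, W−) ≤ 3:

```python
def wilcoxon_statistic(a, b):
    """min(W+, W-) for paired samples: drop zero differences,
    rank absolute differences (average ranks for ties), sum signed ranks."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_plus, w_minus)

# Illustrative paired per-document ROUGE scores for two systems
system_a = [0.42, 0.38, 0.51, 0.45, 0.40, 0.47, 0.39, 0.44]
system_b = [0.40, 0.35, 0.49, 0.44, 0.38, 0.45, 0.37, 0.41]
w = wilcoxon_statistic(system_a, system_b)
significant = w <= 3  # critical value for n = 8, two-sided p <= 0.05
```

Since system A beats system B on every document here, W− is zero and the difference is trivially significant.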
References
Almeida, M., Martins, A.: Fast and robust compressive summarization with dual decomposition and multi-task learning. In: Proceedings of ACL 2013, pp. 196–206 (2013)
Axelrod, A., He, X., Gao, J.: Domain adaptation via pseudo in-domain data selection. In: Proceedings of EMNLP 2011, pp. 355–362 (2011)
Biçici, E.: Domain adaptation for machine translation with instance selection. Prague Bull. Math. Linguist. 103, 5–20 (2015)
Carlson, L., Marcu, D., Okurowski, M.E.: RST discourse treebank. In: Linguistic Data Consortium (2002). https://catalog.ldc.upenn.edu/LDC2002T07
Linguistic Data Consortium: Hansard Corpus of Parallel English and French. Linguistic Data Consortium (1997). http://www.ldc.upenn.edu/
Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., Singer, Y.: Online passive-aggressive algorithms. J. Mach. Learn. Res. 7, 551–585 (2006)
Crammer, K., McDonald, R., Pereira, F.: Scalable large-margin online learning for structured classification. In: Proceedings of NIPS05 Workshop on Learning With Structured Outputs (2005)
Daumé III, H.: Frustratingly easy domain adaptation. In: Proceedings of ACL 2007, pp. 256–263 (2007)
Daumé, H., Marcu, D.: Induction of word and phrase alignments for automatic document summarization. Comput. Linguist. 31(4), 505–530 (2005)
DUC: Document understanding conference. In: ACL Workshop on Automatic Summarization (2002)
Hirao, T., Isozaki, H., Maeda, E., Matsumoto, Y.: Extracting important sentences with support vector machines. In: Proceedings of COLING 2002, vol. 1, pp. 1–7 (2002)
Hirao, T., Yoshida, Y., Nishino, M., Yasuda, N., Nagata, M.: Single-document summarization as a tree knapsack problem. In: Proceedings of EMNLP 2013, pp. 1515–1520 (2013)
Hong, K., Nenkova, A.: Improving the estimation of word importance for news multi-document summarization. In: Proceedings of EACL 2014, pp. 712–721 (2014)
Jing, H., McKeown, K.R.: Cut and paste based text summarization. In: Proceedings of NAACL 2000, pp. 178–185 (2000)
Li, C., Liu, Y., Zhao, L.: Using external resources and joint learning for bigram weighting in ILP-based multi-document summarization. In: Proceedings of NAACL 2015, pp. 778–787 (2015)
Li, C., Qian, X., Liu, Y.: Using supervised bigram-based ILP for extractive summarization. In: Proceedings of ACL 2013, pp. 1004–1013 (2013)
Li, J.J., Nenkova, A.: Fast and accurate prediction of sentence specificity. In: Proceedings of AAAI 2015, pp. 2281–2287 (2015)
Li, Q.: Literature survey: domain adaptation algorithms for natural language processing. Technical report, Department of Computer Science. The Graduate Center, The City University of New York (2012)
Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Proceedings of the Workshop on Text Summarization Branches Out, pp. 74–81 (2004)
Marcu, D.: Improving summarization through rhetorical parsing tuning. In: Proceedings of Sixth Workshop on Very Large Corpora, pp. 206–215 (1998)
Marcu, D.: The automatic construction of large-scale corpora for summarization research. In: Proceedings of SIGIR99, pp. 137–144 (1999)
Mihalcea, R., Tarau, P.: TextRank: bringing order into texts. In: Proceedings of EMNLP 2004, pp. 404–411 (2004)
Nenkova, A., McKeown, K.: Automatic summarization. Found. Trends Inf. Retr. 5(2–3), 103–233 (2011)
Nishikawa, H., Arita, K., Tanaka, K., Hirao, T., Makino, T., Matsuo, Y.: Learning to generate coherent summary with discriminative hidden semi-Markov model. In: Proceedings of COLING 2014, pp. 1648–1659 (2014)
Remus, R.: Domain adaptation using domain similarity- and domain complexity-based instance selection for cross-domain sentiment analysis. In: Proceedings of the ICDM 2012 Workshop on SENTIRE, pp. 717–723 (2012)
Sandhaus, E.: The New York Times annotated corpus. In: Linguistic Data Consortium (2008). https://catalog.ldc.upenn.edu/LDC2008T19
Sipos, R., Shivaswamy, P., Joachims, T.: Large-margin learning of submodular summarization models. In: Proceedings of EACL 2012, pp. 224–233 (2012)
Svore, K., Vanderwende, L., Burges, C.: Enhancing single-document summarization by combining RankNet and third-party sources. In: Proceedings of EMNLP-CoNLL 2007, pp. 448–457 (2007). http://www.aclweb.org/anthology/D/D07/D07-1047
Takamura, H., Okumura, M.: Learning to generate summary as structured output. In: Proceedings of CIKM 2010, pp. 1437–1440 (2010)
Xia, R., Zong, C., Hu, X., Cambria, E.: Feature ensemble plus sample selection: domain adaptation for sentiment classification. IEEE Intell. Syst. 28(3), 10–18 (2013)
Yang, Y., Nenkova, A.: Detecting information-dense texts in multiple news domains. In: Proceedings of AAAI 2014, pp. 1650–1656 (2014)
Yih, W.T., Goodman, J., Vanderwende, L., Suzuki, H.: Multi-document summarization by maximizing informative content-words. In: Proceedings of IJCAI 2007, pp. 1776–1782 (2007)
Zhao, J., Qiu, X., Liu, Z., Huang, X.: Online distributed passive-aggressive algorithm for structured learning. In: Sun, M., Zhang, M., Lin, D., Wang, H. (eds.) CCL and NLP-NABD 2013. LNCS, vol. 8202, pp. 120–130. Springer, Heidelberg (2013)
Acknowledgement
This work was supported by JSPS KAKENHI Grant Number JP26280080.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Kikuchi, Y., Watanabe, A., Sasano, R., Takamura, H., Okumura, M. (2016). Learning from Numerous Untailored Summaries. In: Booth, R., Zhang, ML. (eds) PRICAI 2016: Trends in Artificial Intelligence. PRICAI 2016. Lecture Notes in Computer Science, vol 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42910-6
Online ISBN: 978-3-319-42911-3