Conceptual Multi-layer Neural Network Model for Headline Generation

  • Yidi Guo
  • Heyan Huang
  • Yang Gao
  • Chi Lu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10565)


Neural attention-based models have recently been widely used for headline generation, mapping a source document to a target headline. However, traditional neural headline generation models use only the first sentence of the document as training input, ignoring the impact of the document's concept information on headline generation. In this work, we propose a new neural attention-based model, the concept-sensitive neural headline model, which incorporates the document's concept information into the input text for headline generation and achieves satisfactory results. In addition, we use a multi-layer Bi-LSTM encoder instead of a single-layer one. Experiments show that our model outperforms state-of-the-art systems on the DUC-2004 and Gigaword test sets.
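As a minimal illustration of the attention step the abstract refers to (not the authors' exact model, which additionally injects concept information and uses a multi-layer Bi-LSTM encoder), the sketch below computes a softmax-weighted context vector over encoder states using simplified dot-product scoring. The function name and pure-Python setting are illustrative assumptions.

```python
import math

def attention_context(decoder_state, encoder_states):
    """Score each encoder state against the decoder state, normalize the
    scores with a softmax, and return (weights, weighted context vector)."""
    # Dot-product alignment scores (a simplification of additive attention).
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

weights, context = attention_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(weights)  # the first encoder state receives the larger weight
```

At each decoding step, the context vector is combined with the decoder state to predict the next headline word; in the paper's setting the encoder states would come from the multi-layer Bi-LSTM over the concept-augmented input.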


Keywords: Attention-based · Concept · Multi-layer Bi-LSTM



This work was supported by the National Basic Research Program of China (973 Program, Grant No. 2013CB329303), the National Natural Science Foundation of China (Grant No. 61602036), and the Beijing Advanced Innovation Center for Imaging Technology (BAICIT-2016007).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Beijing Institute of Technology, Beijing, China
  2. Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing, China
  3. Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing, People's Republic of China
