Evaluating Image-Inspired Poetry Generation

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2019)

Abstract

Creative natural language generation, such as poetry generation, lyric writing, and storytelling, is appealing but difficult to evaluate. We take image-inspired poetry generation as a showcase and investigate two evaluation problems: (1) how to evaluate generated text when there is no ground truth, and (2) how to evaluate nondeterministic systems that output different texts for the same input image. For the first problem, we design a judgment tool that collects comparative ratings of a few poems while the inspiring image is shown to assessors. We then propose a novelty measurement that quantifies how different a generated text is from a known corpus. For the second problem, we experiment with different strategies to approximate the evaluation of multiple trials of output poems. We also use a measure that quantifies the diversity of the texts generated in response to the same input image, and we discuss the merits of these approaches.
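
The abstract does not define the novelty or diversity measures, so the following is only a minimal sketch of what such measures could look like, not the authors' method. It assumes novelty is the share of a poem's n-grams that are absent from a reference corpus, and diversity is the mean pairwise Jaccard distance between poems generated for the same image. The function names, the bigram default, and the tokenised-list input format are all hypothetical choices made for illustration.

```python
from itertools import combinations


def ngrams(tokens, n=2):
    """Return the set of n-grams (as tuples) found in a list of tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novelty(poem_tokens, corpus_poems, n=2):
    """Assumed novelty: fraction of the poem's n-grams absent from the corpus."""
    poem_grams = ngrams(poem_tokens, n)
    if not poem_grams:
        return 0.0
    corpus_grams = set()
    for reference in corpus_poems:
        corpus_grams |= ngrams(reference, n)
    return len(poem_grams - corpus_grams) / len(poem_grams)


def jaccard_distance(a, b):
    """Jaccard distance between two sets (0.0 when both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def diversity(trials):
    """Assumed diversity: mean pairwise Jaccard distance over token sets
    of poems produced in repeated trials for the same input image."""
    token_sets = [set(tokens) for tokens in trials]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, a poem that shares every bigram with the corpus scores a novelty of 0, and a system that returns the same poem on every trial scores a diversity of 0. Both scores lie in the range 0 to 1.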

Author information

Corresponding author

Correspondence to Ruihua Song.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, CC., Song, R., Sakai, T., Cheng, WF., Xie, X., Lin, SD. (2019). Evaluating Image-Inspired Poetry Generation. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11838. Springer, Cham. https://doi.org/10.1007/978-3-030-32233-5_42

  • DOI: https://doi.org/10.1007/978-3-030-32233-5_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32232-8

  • Online ISBN: 978-3-030-32233-5

  • eBook Packages: Computer Science (R0)
