A Roadmap to Realization Approaches in Natural Language Generation
Text realization is the most significant step in natural language generation (NLG). Given an abstract linguistic representation, it covers the approaches used to produce syntactically and semantically valid text. Depending on the type of input data, typical generation tasks include text-to-text, database-to-text, concept-to-text, and speech-to-text generation. Approaches to generating text, usually from non-linguistic structured data, range from canned text to methods that learn from a text corpus and generate text based on characters, content, keywords, text length, context, and similar features. Much work has also been done on learning to generate text that mimics a writing style. For applications such as tutoring systems, in which the text has to be manipulated and validated, we have to rely more on a template-based approach. Machine learning and other probabilistic statistical approaches can generate text for applications such as report generation and summarization. This paper presents a roadmap and a comparative analysis of various text realization approaches.
Keywords: Canned text; Context-free grammar; Template-based systems; Tree adjoining grammar; Lexical functional grammar
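To make the contrast between canned text and template-based realization concrete, here is a minimal Python sketch. It is an illustration only, not the method of any system surveyed in the paper: the tutoring-feedback wording, the slot names, and the function `realize_feedback` are all hypothetical. It shows why templates suit settings where text must be validated: the slot values can be checked before the sentence is produced.

```python
from string import Template

# Canned text: a fixed string returned verbatim, with no variable parts.
CANNED_GREETING = "Welcome to the tutoring system."

# Template-based realization: a fixed sentence frame with named slots.
# The wording and slot names are hypothetical, chosen for illustration.
FEEDBACK_TEMPLATE = Template(
    "$student answered $correct of $total questions correctly."
)


def realize_feedback(student: str, correct: int, total: int) -> str:
    """Validate the slot values, then fill the feedback template."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    return FEEDBACK_TEMPLATE.substitute(
        student=student, correct=correct, total=total
    )


if __name__ == "__main__":
    print(CANNED_GREETING)
    print(realize_feedback("Asha", 4, 5))
```

The canned string can only be emitted as is, whereas the template accepts different inputs and rejects inconsistent ones (for example, more correct answers than questions) before any text is generated.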