Text Generation: The Problem of Text Structure

  • Chapter
Natural Language Generation Systems

Part of the book series: Symbolic Computation ((1064))

Overview

What is text generation?1 The long-term view is that it is the process of creating a technology for building computer programs that can function as authors or speakers. I call this the long-term view because in the text generation programs in existence today there is very little that deserves the title of “author” or “speaker.” Writing and speaking are rightly regarded as complex arts capable of high refinement and great intellectual achievement. In contrast, our programs reflect only fragments of the most basic skills.

Text generation has been studied seriously in computational linguistics only in the last five or ten years, and so it is still sorting out its goals and identifying its problems.

Part of the diversity of approaches in text generation will surely come from a problem that we face now: It is by no means clear how authors do what they do. Even though we are all exposed to text nearly every day, and manipulate it successfully, there is very little explicit knowledge of how text works.

One of the central issues involves text organization. It is evident that natural text is organized, and that its organization is essential to its function. Text has parts, arranged systematically. But what is the nature of text organization or structure? What are the parts, and what are the principles of arrangement?

We must have answers to such questions if we are to create text generators. However, there are no widely accepted answers to these questions. The answers that are available from outside of computational linguistics are partial and complex. There are logicians’ answers, grammarians’ answers and so forth, often representing mostly the favorite methods and assumptions of their developers: a priori selectivity rather than comparative results.

Also, crucially for text generation, there are no accounts of text organization at a level of detail sufficient to support computer programming. Far more detail is needed.

As a result, text generation has been inventing its own answers to these questions. It has had to.2

To explore the nature of text structure, we focus in this paper on two of the energetic attempts within text generation to describe text structure in a way that is sufficiently detailed and general to serve as a basis for programming. They are:

  1. The TEXT system, and

  2. Rhetorical Structure Theory

The TEXT system was developed by Kathy McKeown at the University of Pennsylvania, as the centerpiece of her PhD dissertation, and is being followed up at Columbia University and elsewhere [Paris 86]. Rhetorical Structure Theory (RST) was initially defined by Sandra Thompson, Christian Matthiessen and the author; it is under active development at USC Information Sciences Institute.3

This paper describes each of these lines of research in its own terms; then they are compared as text structure descriptions. The comparison is extended to the related construction processes, and finally conclusions are drawn about text structures in future text generation work.

References

  1. Bienkowski, Marie A., A Computational Model for Extemporaneous Elaborations, Cognitive Science Laboratory, Princeton University, Technical Report CSL 1, September 1986.

  2. Fox, Barbara A., Discourse Structure and Anaphora in Written and Conversational English, Ph.D. thesis, UCLA, 1984. To appear through Cambridge University Press.

  3. Grimes, J. E., The Thread of Discourse, Mouton, The Hague, 1975.

  4. Mann, W. C., Discourse Structures for Text Generation, USC/Information Sciences Institute, Technical Report RR-84–127, February 1984. Also appeared in the proceedings of the 1984 Coling/ACL conference, July 1984.

  5. Mann, William C. and Sandra A. Thompson, “Assertions from Discourse Structure,” in Proceedings of the Eleventh Annual Meeting of the Berkeley Linguistics Society, Berkeley Linguistics Society, Berkeley, 1985. Also available as ISI/RS-85–155.

  6. Mann, William C. and Sandra A. Thompson, “Relational Propositions in Discourse,” Discourse Processes 9 (1), January-March 1986, 57–90. Also available as ISI/RR-83–115.

  7. Mann, William C. and Sandra A. Thompson, “Rhetorical Structure Theory: A Theory of Text Organization,” in Livia Polanyi (ed.), Discourse Structure, Ablex, Norwood, N.J., 1987. To appear.

  8. Matthiessen, Christian and Sandra A. Thompson, “The Structure of Discourse and ‘Subordination’,” in Haiman and Thompson (eds.), Clause Combining in Grammar and Discourse, Benjamins, Amsterdam, 1986. To appear.

  9. McDonald, D. D., Natural Language Production as a Process of Decision-Making Under Constraints, Ph.D. thesis, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1980. To appear as a technical report from the MIT Artificial Intelligence Laboratory.

  10. McKeown, Kathleen R., Text Generation: Using Discourse Strategies and Focus Constraints to Generate Natural Language Text, Studies in Natural Language Processing, Volume 2, Cambridge University Press, Cambridge, 1985.

  11. Noel, Dirk, Towards a Functional Characterization of the News of the BBC World News Service, Antwerp Papers in Linguistics, Number 49, Antwerp, Belgium, 1986.

  12. Paris, Cecile L. and Kathleen R. McKeown, “Discourse Strategies for Descriptions of Complex Physical Objects,” in Gerard Kempen (ed.), Proceedings of the Third International Workshop on Text Generation, Nijmegen, The Netherlands, August 1986.

  13. Thompson, Sandra A. and William C. Mann, “Antithesis: A Study in Clause Combining and Discourse Structure,” in Nominal Book Title, Nominal Press, 1987. Submitted for inclusion in a festschrift book. Publication to be announced.

© 1988 Springer-Verlag New York Inc.

About this chapter

Cite this chapter

Mann, W.C. (1988). Text Generation: The Problem of Text Structure. In: McDonald, D.D., Bolc, L. (eds) Natural Language Generation Systems. Symbolic Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-3846-1_2

  • DOI: https://doi.org/10.1007/978-1-4612-3846-1_2

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-8374-4

  • Online ISBN: 978-1-4612-3846-1

  • eBook Packages: Springer Book Archive