Abstract
Ambiguity in the output is a general concern for natural language generation (NLG). This paper considers structural ambiguity in spoken language generation. We present an algorithm that inserts pauses into spoken text in an attempt to resolve potential structural ambiguities. The algorithm is based on a simple model of the human parser and a characterisation of a subset of the places where local ambiguity can arise. A preliminary evaluation compares the success of this method with that of previously proposed pause-insertion algorithms.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mellish, C. (2004). Resolving Structural Ambiguity in Generated Speech. In: Belz, A., Evans, R., Piwek, P. (eds) Natural Language Generation. INLG 2004. Lecture Notes in Computer Science, vol 3123. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27823-8_12
DOI: https://doi.org/10.1007/978-3-540-27823-8_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22340-5
Online ISBN: 978-3-540-27823-8
eBook Packages: Springer Book Archive