Abstract
Technology and science are often perceived as polar extremes with respect to spoken language. Speech applications rarely incorporate scientific insight and, conversely, basic research is often viewed as oblivious to the practical concerns of the real world. Melding phonetic insight with speech technology can, however, yield extremely productive results for both applications and basic science if performed within the appropriate theoretical framework. Such an approach is illustrated with respect to the relation between prosodic (stress-accent) and phonetic properties of conversational telephone dialogues in American English, using the Switchboard corpus. Phonetic properties, such as vocalic identity and duration, are shown to reflect prosodic phenomena, and thus could be used to improve automatic speech recognition performance, as well as to provide detailed insight into the nature of spoken language.
Copyright information
© 2005 Springer
About this chapter
Cite this chapter
Greenberg, S. (2005). From Here to Utility. In: Barry, W.J., van Dommelen, W.A. (eds) The Integration of Phonetic Knowledge in Speech Technology. Text, Speech and Language Technology, vol 25. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2637-4_7
DOI: https://doi.org/10.1007/1-4020-2637-4_7
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-2635-5
Online ISBN: 978-1-4020-2637-9
eBook Packages: Humanities, Social Sciences and Law; Social Sciences (R0)