The Encoding of Spoken Texts

Text Encoding Initiative

Abstract

There is a great deal of variation in the encoding of spoken texts in electronic form, both with respect to the types of features represented and the way particular features are rendered. This paper surveys problems in the electronic representation of speech and presents the solutions proposed by the Text Encoding Initiative. The special tags needed for the encoding of spoken texts are discussed, including a mechanism for temporal alignment. Further work is needed on phonological aspects, parallel representation, and on the development of software which connects the systematic underlying representation with a workable format for input and display.

Stig Johansson is Professor of English Language at the Department of British and American Studies, University of Oslo. He is co-ordinating secretary of the International Computer Archive of Modern English (ICAME) and editor of the ICAME Journal. Recent publications include Frequency Analysis of English Vocabulary and Grammar (with Knut Hofland, Clarendon Press, 1989) and English Computer Corpora (with Anna-Brita Stenström, Mouton de Gruyter, 1991).

Copyright information

© 1995 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Johansson, S. (1995). The Encoding of Spoken Texts. In: Ide, N., Véronis, J. (eds) Text Encoding Initiative. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-0325-1_12

  • DOI: https://doi.org/10.1007/978-94-011-0325-1_12

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-0-7923-3704-1

  • Online ISBN: 978-94-011-0325-1

  • eBook Packages: Springer Book Archive
