Annotations and Tools for an Activity Based Spoken Language Corpus

  • Jens Allwood
  • Leif Grönqvist
  • Elisabeth Ahlsén
  • Magnus Gunnarsson
Chapter
Part of the Text, Speech and Language Technology book series (TLTB, volume 22)

Abstract

The paper describes the Spoken Language Corpus of Swedish at the Department of Linguistics, Göteborg University (GSLC), and summarizes the types of analysis and the tools that have been developed for work on this corpus. Work on the corpus started in the late 1970s. It is growing incrementally and presently consists of 1.3 million words from about 25 different social activities. The corpus was initiated to meet a growing interest in naturalistic spoken language data. It is based on the fact that spoken language varies considerably across social activities with regard to pronunciation, vocabulary, grammar and communicative functions. The goal of the corpus is to include spoken language from as many social activities as possible, in order to get a more complete understanding of the role of language and communication in human social life. This type of spoken language corpus is still fairly unique, even for English, since many spoken language corpora (certainly for Swedish) have been collected for special purposes, such as speech recognition, phonetics, dialectal variation or interaction with a computerized dialog system in a very narrow domain, e.g. MapTask (Isard and Carletta 1995), TRAINS (Heeman and Allen 1994), Waxholm (Blomberg et al. 1993).

Keywords

Code Schema; Communication Management; Dialectal Variation; Contrastive Stress; Human Social Life
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Jens Allwood (1976) Linguistic Communication as Action and Cooperation. “Gothenburg Monographs in Linguistics” 2. Göteborg University, Department of Linguistics, 257 p.
  2. Jens Allwood (1978) On the Analysis of Communicative Action. In “The Structure of Action”, M. Brenner, ed., Basil Blackwell, Oxford, pp. 168–191.
  3. Jens Allwood (1993) Feedback in Second Language Acquisition. In “Adult Language Acquisition. Cross Linguistic Perspectives”, Vol. II. C. Perdue, ed., Cambridge University Press, Cambridge, pp. 37–51.
  4. Jens Allwood (1994) Obligations and Options in Dialogue, Think, Vol 3, May, ITK, Tilburg University, 9–18.
  5. Jens Allwood, ed. (1996 and later editions) Talspråksfrekvenser, Ny och utvidgad upplaga. Gothenburg Papers in Theoretical Linguistics S21. Göteborg University, Department of Linguistics, 418 p.
  6. Jens Allwood (1998) Some Frequency based Differences between Spoken and Written Swedish. In Timo Haukioja, ed., Proceedings of the 16th Scandinavian Conference of Linguistics, Turku University, Department of Linguistics, pp. 18–29.
  7. Jens Allwood (2000) An Activity Based Approach to Pragmatics. In “Abduction, Belief and Context in Dialogue; Studies in Computational Pragmatics”, H. Bunt, & B. Black, eds., John Benjamins, Amsterdam, pp. 47–80.
  8. Jens Allwood, ed. (2001) Dialog Coding — Function and Grammar: Göteborg Coding Schemas. Gothenburg Papers in Theoretical Linguistics GPTL 85. Göteborg University, Department of Linguistics, 67 p.
  9. Jens Allwood and Johan Hagman (1994) Some Simple Measures of Spoken Interaction. In F. Gregersen, & J. Allwood, eds., “Spoken Language, Proceedings of the XIV Conference of Scandinavian Linguistics”, pp. 3–22.
  10. Jens Allwood, Elisabeth Ahlsén, Joakim Nivre and Staffan Larsson (2001) Own communication management. In J. Allwood, ed. (2001) Dialog Coding — Function and Grammar: Göteborg Coding Schemas. Gothenburg Papers in Theoretical Linguistics GPTL 85. Göteborg University, Department of Linguistics, pp. 45–52.
  11. Jens Allwood, Joakim Nivre and Elisabeth Ahlsén (1990) Speech Management: On the Non-Written Life of Speech. Nordic Journal of Linguistics, 13, 3–48.
  12. Mats Blomberg, Rolf Carlson, Kjell Elenius, Björn Granström, Jonatan Gustafson, Sheri Hunnicutt, Roger Lindell and Lennart Neovius (1993) An experimental dialogue system: WAXHOLM, “Proceedings of EUROSPEECH 93”, pp. 1867–1870.
  13. BNC British National Corpus, Oxford University Computing Services, 13 Banbury Road, Oxford OX2 6NN.
  14. Mark G. Core and James F. Allen (1997) Coding Dialogs with the DAMSL Annotation Scheme. In Working Notes of AAAI Fall Symposium on Communicative Action in Humans and Machines, Boston, MA, November 1997.
  15. Laila Dybkjær, Niels Ole Bernsen, Hans Dybkjær, David McKelvie and Andreas Mengel (1998) The MATE Markup Framework. MATE Deliverable D1.2, November 1998, 15 p.
  16. Frans Gregersen (1991) The Copenhagen Study in Urban Sociolinguistics, 1+2; Reitzel, Copenhagen.
  17. H. Paul Grice (1975) Logic and conversation. In “Syntax and Semantics” Vol. 3: Speech Acts, P. Cole and J. L. Morgan, eds., Seminar Press, New York, pp. 41–58.
  18. Leif Grönqvist (1999) Kodningsvisualisering med Framemaker. Göteborg University, Department of Linguistics, 8 p.
  19. Leif Grönqvist (2000a) The MultiTool User’s Manual. A tool for browsing and synchronizing transcribed dialogues and corresponding video recordings. Göteborg University, Department of Linguistics, 6 p.
  20. Leif Grönqvist (2000b) The TraSA v0.8 Users Manual. A user friendly graphical tool for automatic transcription statistics. Göteborg University, Department of Linguistics, 8 p.
  21. E. Hanssen, T. Hoel, E.H. Jahr, O. Rekdal and G. Wiggen (eds.) (1978) Oslomål.
  22. Peter A. Heeman and James F. Allen (1994) The TRAINS 93 Dialogues. TRAINS Technical Note 94-2.
  23. Peter Juel Henrichsen (1997) Talesprog med Ansigtsløftning, IAAS, Univ. of Copenhagen, Instrumentalis 10/97 (in Danish), 66 p.
  24. Janet Holmes, Bernadette Vine and Gary Johnson (1998) Guide to the Wellington Corpus of Spoken New Zealand English. Victoria University of Wellington, Wellington.
  25. Amy Isard and Jean Carletta (1995) Transaction and action coding in the Map Task Corpus. Research Paper HCRORP-65, 27 p.
  26. Staffan Larsson (1997) TRACTOR v1.0b1 användarmanual. Göteborg University, Department of Linguistics, 10 p.
  27. Christopher D. Manning and Hinrich Schütze (1999) Foundations of Statistical Natural Language Processing, The MIT Press, Boston, Mass., 620 p.
  28. V. Mantha, J. Hamaker, N. Deshmukh, A. Ganapathiraju and J. Picone (1999) Improved Monosyllabic Word Modeling on SWITCHBOARD. Mississippi State University, Dept. of Electrical & Computer Engineering.
  29. Joakim Nivre (1999a) Transcription Standard. Version 6.2. Göteborg University, Department of Linguistics, 38 p.
  30. Joakim Nivre (1999b) Modifierad StandardOrtografi (MSO) Version 6, Göteborg University, Department of Linguistics, 9 p.
  31. Joakim Nivre, Kristina Tullgren, Jens Allwood, Elisabeth Ahlsén, Jenny Holm, Leif Grönqvist, Dario Lopez-Kästen and Sylvana Sofkova (1998) Towards multimodal spoken language corpora: TransTool and SyncTool. Proceedings of ACL-COLING 1998, June 1998.
  32. Joakim Nivre and Leif Grönqvist (2001) Tagging a corpus of Spoken Swedish. Forthcoming in International Journal of Corpus Linguistics.
  33. Ulla Richthoff (2000) En svensk barnspråkskorpus. Uppbyggnad och analyser. Department of Linguistics, Göteborg University.
  34. Roeland van Hout and Toni Rietveld (1993) Statistical Techniques for the Study of Language and Language Behaviour. Berlin & New York: Mouton de Gruyter, 400 p.
  35. Jan Svartvik (ed.) (1990) The London Corpus of Spoken English: Description and Research. “Lund Studies in English” 82. Lund University Press, 350 p.

Copyright information

© Springer Science+Business Media Dordrecht 2003

Authors and Affiliations

  • Jens Allwood (1)
  • Leif Grönqvist (1)
  • Elisabeth Ahlsén (1)
  • Magnus Gunnarsson (1)
  1. Dep. of Linguistics, Göteborgs University, Göteborg, Sweden
