Fuzzy Limits: Researching Discourse in the Internet with Corpora

  • Living reference work entry
  • In: Second International Handbook of Internet Research

Abstract

The Internet has provided us with an unprecedented amount of linguistic data. For those who research discourse and communication, it is an unexpected gift with huge potential. However, this gift comes with important challenges. First, large corpora force us to use quantitative methods in fields where we were used to qualitative approaches. To address this, new strategies are being developed, such as Corpus-Assisted Discourse Studies (Baker et al. Discourse Soc 19(3):273–305, 2008; Partington et al. Patterns and meanings in discourse. John Benjamins Publishing Company, Amsterdam, 2013).

Second, traditional units of analysis need to be redefined. Communication through the Internet has its own characteristics, and some of them do not fit previous definitions. For discourse analysis, there are two main reasons for this. On the one hand, current interactions are multimedia. Video, image, and sound are not necessarily subordinated to text on the Internet, and researchers ‘need to look beyond language to better understand how people communicate and interact in digital environments’ (Jewitt. Multimodal analysis. In: Georgakopoulou A, Spilioti T (eds) The Routledge handbook of language and digital communication. Routledge, London, 2016). Recent approaches, such as Multimodal Critical Discourse Studies (Machin. Crit Discourse Stud 10:347, 2013), move in this direction.

On the other hand, limits have become fuzzy. Interactions on the Internet work in new ways, even when we call them conversations or chats (Alcántara-Plá. Estudios de Lingüística del Español 35(1):214–233, 2014). If we study them with our current units of analysis, these “conversations” will seem fragmentary and unstructured.

In this chapter, we describe these new challenges and the solutions that have been adopted so far, drawing attention to the major problems that still remain unsolved.

References

  • Alcántara-Plá M (2014) Las unidades discursivas en los mensajes instantáneos de wasap. Estudios de Lingüística del Español 35(1):214–233

  • Alcántara-Plá M (2017) Palabras invasoras. El español de las nuevas tecnologías. Los libros de la catarata, Madrid

  • Alcántara-Plá M, Ruiz-Sánchez A (2017) Not for Twitter: migration as a silenced topic in 2015 Spain general election. In: Schröter M, Taylor C (eds) Exploring silence and absence in discourse: empirical approaches. Palgrave Macmillan, London

  • Androutsopoulos J (2011) From variation to heteroglossia in the study of computer-mediated discourse. In: Digital discourse: language in the new media. Oxford University Press, Oxford, pp 277–298

  • Baker P, McEnery T (2015) Corpora and discourse studies: integrating discourse and corpora. Springer, Netherlands, Amsterdam

  • Baker P, Gabrielatos C, Khosravinik M, Krzyzanowski M, McEnery T, Wodak R (2008) A useful methodological synergy? Combining critical discourse analysis and corpus linguistics to examine discourses of refugees and asylum seekers in the UK press. Discourse Soc 19(3):273–305

  • Baron NS (2009) Are instant messages speech? In: International handbook of internet research. Springer, Netherlands, Amsterdam

  • Baron A, Rayson P, Archer D (2009) Word frequency and key word statistics in corpus linguistics. Anglistik 20(1):41–67

  • Bauman R, Briggs CL (1990) Poetics and performance as critical perspectives on language and social life. Annu Rev Anthropol 19:59–88

  • Beesley KR, Karttunen L (2003) Finite-state morphology: Xerox tools and techniques. CSLI, Stanford

  • Biber D, Conrad S, Reppen R (1998) Corpus linguistics: investigating language structure and use. Cambridge University Press, Cambridge

  • Bolter JD, Grusin R (2000) Remediation: understanding new media. MIT Press, Cambridge

  • Bybee J, Hopper P (eds) (2001) Frequency and the emergence of linguistic structure. John Benjamins, Amsterdam

  • Croft W, Cruse DA (2004) Cognitive linguistics. Cambridge University Press, Cambridge

  • Crystal D (2008) Txtng: the gr8 db8. Oxford University Press, Oxford

  • Elm MS (2009) Language deterioration revisited: the extent and function of English content in a Swedish chat room. In: International handbook of internet research. Springer, Netherlands, Amsterdam, pp 437–453

  • Fillmore CJ (1985) Frames and the semantics of understanding. Quaderni di semantica 6(2):222–254

  • Gee JP (2004) Situated language and learning: a critique of traditional schooling. Routledge, London

  • Gee JP (2015) Discourse analysis of games. In: Jones RH, Chik A, Hafner CA (eds) Discourse and digital practices: doing discourse analysis in the digital age. Routledge, London

  • Genosko G (2016) Critical semiotics. Theory, from information to affect. Bloomsbury, London

  • Georgakopoulou A, Spilioti T (eds) (2016) The Routledge handbook of language and digital communication. Routledge, London

  • Gibson J (1979) The ecological approach to visual perception. Houghton Mifflin, Boston

  • Givón T (2005) Context as other minds. John Benjamins Publishing Company, Amsterdam

  • Hafner CA (2015) Co-constructing identity in virtual worlds for children. In: Jones, Chik and Hafner (2015)

  • Halliday MAK (1978) Language as social semiotic: the social interpretation of language and meaning. Edward Arnold, London

  • Halliday MAK, Matthiessen CMIM (2004) An introduction to functional grammar. Arnold, London

  • Heyd T (2016) Digital genres and processes of remediation. In: The Routledge handbook of language and digital communication. Routledge, London

  • Hunston S (2010) Corpus approaches to evaluation: phraseology and evaluative language. Routledge, London

  • Jaworski A, Coupland N (2014) The discourse reader. Routledge, London

  • Jewitt C (2016) Multimodal analysis. In: Georgakopoulou A, Spilioti T (eds) The Routledge handbook of language and digital communication. Routledge, London

  • Jones RH, Chik A, Hafner CA (eds) (2015) Discourse and digital practices: doing discourse analysis in the digital age. Routledge, London

  • Koskenniemi K (1984) A general computational model for word-form recognition and production. In: Proceedings of the 10th international conference on computational linguistics. Association for Computational Linguistics, pp 178–181

  • Kress G (2010) Multimodality: a social semiotic approach to contemporary communication. Routledge, London

  • Kress G, van Leeuwen T (2006) Reading images: a visual grammar of design. Routledge, London

  • Langacker RW (1987) Foundations of cognitive grammar: theoretical prerequisites, vol 1. Stanford University Press, California

  • Machin D (2013) What is multimodal critical discourse studies? Crit Discourse Stud 10:347

  • Manning CD, Schütze H (1999) Foundations of statistical natural language processing. MIT Press, Cambridge

  • MODE (2012) Glossary of multimodal terms. https://multimodalityglossary.wordpress.com/. Retrieved 10/10/2017

  • Morley J, Bayley P (2009) Corpus-assisted discourse studies on the Iraq war: wording the war. Routledge, London

  • O’Reilly T (2005) What is Web 2.0. Design patterns and business models for the next generation of software. http://www.oreilly.com/pub/a/web2/archive/what-is-web-20.html. Retrieved 10/10/2017

  • Palmer DD (2000) Tokenisation and sentence segmentation. In: Handbook of natural language processing. Marcel Dekker, New York, pp 11–35

  • Partington A, Duguid A, Taylor C (2013) Patterns and meanings in discourse. John Benjamins Publishing Company, Amsterdam

  • Rafaeli S, Ariel Y (2007) Assessing interactivity in computer-mediated research. In: Joinson AN, McKenna KYA, Postmes T, Reips U-D (eds) The Oxford handbook of internet psychology. Oxford University Press, Oxford

  • Schank RC, Abelson RP (1977) Scripts, plans, goals, and understanding: an inquiry into human knowledge structures. Lawrence Erlbaum Associates, Hillsdale

  • Silverstein M (1992) The indeterminacy of contextualization: when is enough enough. In: Auer P, di Luzio A (eds) The contextualization of language. John Benjamins Publishing Company, Amsterdam, pp 55–75

  • Stubbs M (2007) On texts, corpora and models of language. In: Hoey M (ed) Text, discourse and corpora: theory and analysis. A&C Black, London

  • Szudarski P (2017) Corpus linguistics for vocabulary. Routledge, London

  • Widdowson HG (2008) Text, context, pretext: critical issues in discourse analysis. Blackwell Publishing Ltd, Oxford

  • Wiedemann G (2016) Text mining for qualitative data analysis in the social sciences. Springer Fachmedien, Wiesbaden

  • Yates SJ (1996) Oral and written aspects of computer conferencing. In: Herring S (ed) Computer-mediated communication. Linguistic, social and cross-cultural perspectives. John Benjamins Publishing Company, Amsterdam, pp 29–46

  • Zanchetta E, Baroni M, Bernardini S (2011) Corpora for the masses: the BootCaT front-end. In: Corpus Linguistics 2011. University of Birmingham, Birmingham

Author information

Corresponding author

Correspondence to Manuel Alcántara-Plá.

Copyright information

© 2018 Springer Science+Business Media B.V., part of Springer Nature

About this entry

Cite this entry

Alcántara-Plá, M. (2018). Fuzzy Limits: Researching Discourse in the Internet with Corpora. In: Hunsinger, J., Klastrup, L., Allen, M. (eds) Second International Handbook of Internet Research. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-1202-4_30-1

  • DOI: https://doi.org/10.1007/978-94-024-1202-4_30-1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-024-1202-4

  • Online ISBN: 978-94-024-1202-4

  • eBook Packages: Springer Reference Biomedicine and Life Sciences; Reference Module Biomedical and Life Sciences
