Autocorrelated Errors in Experimental Data in the Language Sciences: Some Solutions Offered by Generalized Additive Mixed Models

Chapter in: Mixed-Effects Regression Models in Linguistics

Abstract

A problem that tends to be ignored in the statistical analysis of experimental data in the language sciences is that responses often constitute time series, which raises the problem of autocorrelated errors. If the errors indeed show autocorrelational structure, evaluation of the significance of predictors in the model becomes problematic due to potential anti-conservatism of p-values.
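
One common way to make this concern concrete is a first-order autoregressive, AR(1), error process. The sketch below is an illustrative assumption and not notation taken from the chapter: e_t denotes the error at trial t, \rho the lag-1 autocorrelation, \varepsilon_t independent noise, and n the number of observations in the series.

```latex
% AR(1) errors: each error carries over a fraction \rho of the previous one
e_t = \rho\, e_{t-1} + \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2),
\qquad |\rho| < 1

% Stationary error variance, and large-n variance of the mean error
\sigma_e^2 = \frac{\sigma_\varepsilon^2}{1 - \rho^2},
\qquad
\operatorname{Var}(\bar{e}) \approx \frac{\sigma_e^2}{n} \cdot \frac{1 + \rho}{1 - \rho}
```

Under these assumptions, a positive \rho inflates the true sampling variance relative to the independence formula \sigma_e^2 / n. With \rho = 0.5, for example, the factor (1 + \rho)/(1 - \rho) equals 3, so standard errors computed as if the errors were independent would be too small by a factor of about 1.73, which is the sense in which p-values can become anti-conservative.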

Notes

  1. The parametric coefficients suggest that regularity is irrelevant as a predictor of naming times, that singulars are named faster than plurals, that words with voiced initial segments have longer naming times, as do words with a large number of words at Hamming distance 1 at the initial segment. Words with a greater Shannon entropy calculated over the probability distribution of their inflectional variants elicited shorter response times. A thin plate regression spline for log-transformed word frequency suggests a roughly U-shaped effect (not shown) for this predictor.

  2. For this to work properly, it is necessary to use treatment contrasts for ordinal factors, in R: options(contrasts = c("contr.treatment", "contr.treatment")).

  3. The details of the coefficients in the present model differ from those obtained in the analysis of Baayen [1]. Thanks to the factor smooths for subject and compound and the inclusion of a thin plate regression spline for word frequency, the present model provides a better fit (AIC 177077.4 versus 187308), suggesting the present reanalysis may provide a more accurate window on sex-specific realizations of compounds’ pitch.

  4. Data points with an absolute amplitude exceeding 15 μV, approximately 2.6% of the data points, were removed to obtain an approximately Gaussian response variable.

References

  1. Baayen RH (2013) Multivariate statistics. In: Podesva R, Sharma D (eds) Research methods in linguistics. Cambridge University Press, Cambridge, pp 337–372

  2. Baayen RH, Milin P (2010) Analyzing reaction times. Int J Psychol Res 3:12–28

  3. Baayen R, Vasishth S, Bates D, Kliegl R (2015) Out of the cage of shadows. arxiv.org. http://arxiv.org/abs/1511.03120

  4. Bates D, Mächler M, Bolker B, Walker S (2015) Fitting linear mixed-effects models using lme4. J Stat Softw 67(1):1–48

  5. Broadbent D (1971) Decision and stress. Academic Press, New York

  6. DeCat C, Baayen RH, Klepousniotou E (2014) Electrophysiological correlates of noun-noun compound processing by non-native speakers of English. In: Proceedings of the first workshop on computational approaches to compound analysis (ComAComA 2014). Association for Computational Linguistics and Dublin City University, Dublin, Ireland, pp 41–52

  7. DeCat C, Klepousniotou E, Baayen RH (2015) Representational deficit or processing effect? A neuro-psychological study of noun-noun compound processing by very advanced L2 speakers of English. Front Psychol (Lang Sci) 6:77

  8. De Vaan L, Schreuder R, Baayen RH (2007) Regular morphologically complex neologisms leave detectable traces in the mental lexicon. Ment Lexicon 2:1–23

  9. Koesling K, Kunter G, Baayen RH, Plag I (2012) Prominence in triconstituent compounds: pitch contours and linguistic theory. Lang Speech 56(4):529–554

  10. Lin X, Zhang D (1999) Inference in generalized additive mixed models using smoothing splines. J R Stat Soc Ser B 61:381–400

  11. Paeschke A, Kienast M, Sendlmeier W (1999) F0-contours in emotional speech. In: Proceedings of the 14th International Congress of Phonetic Sciences, vol 2, pp 929–932

  12. Sanders A (1998) Elements of human performance: reaction processes and attention in human skill. Lawrence Erlbaum, Mahwah, NJ

  13. Tabak W (2010) Semantics and (ir)regular inflection in morphological processing. PhD thesis, University of Nijmegen. Ponsen & Looijen, Ede

  14. Taylor TE, Lupker SJ (2001) Sequential effects in naming: a time-criterion account. J Exp Psychol Learn Mem Cogn 27:117–138

  15. Traunmüller H, Eriksson A (1995) The frequency range of the voice fundamental in the speech of male and female adults. Institutionen för lingvistik, Stockholms Universitet, S-106 91 Stockholm, Sweden

  16. Trouvain J, Barry WJ (2000) The prosody of excitement in horse race commentaries. In: ISCA tutorial and research workshop (ITRW) on speech and emotion

  17. Welford A (1980) Choice reaction time: basic concepts. In: Welford A (ed) Reaction times. Academic Press, New York, pp 73–128

  18. Wilkinson G, Rogers C (1973) Symbolic description of factorial models for analysis of variance. Appl Stat 22:392–399

  19. Wood SN (2006) Generalized additive models. Chapman & Hall/CRC, New York

  20. Wood SN (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc (B) 73:3–36

  21. Wood SN (2013) On p-values for smooth components of an extended generalized additive model. Biometrika 100:221–228

  22. Wood SN (2013) A simple test for random effects in regression models. Biometrika 100:1005–1010

Corresponding author

Correspondence to R. Harald Baayen.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

Cite this chapter

Baayen, R.H., van Rij, J., de Cat, C., Wood, S. (2018). Autocorrelated Errors in Experimental Data in the Language Sciences: Some Solutions Offered by Generalized Additive Mixed Models. In: Speelman, D., Heylen, K., Geeraerts, D. (eds) Mixed-Effects Regression Models in Linguistics. Quantitative Methods in the Humanities and Social Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-69830-4_4
