Spelling errors respect morphology: a corpus study of Hebrew orthography
This paper aims to account for the linguistic and processing factors responsible for the incidence of spelling errors in Hebrew. The theoretical goal is to disentangle the complex interaction between morphology, phonology, and orthography in the production of written words. We focused on a specific spelling error in Hebrew: the overt representation of the word-internal segment /i/ by the letter Y (י). This Y-insertion goes against the prescriptive spelling rules (cf. substandard MYRPST מירפסת vs. conventional MRPST מרפסת, /miʁpeset/ ‘balcony’), and yet in our data it affects 25% of nouns with an appropriate phonological environment. Corpus analyses of unedited texts further revealed that errors proliferated in lower-frequency words, but that an error was much less likely to occur if it would disrupt a morphological unit. These results point to morphology and statistical patterns of language use in Hebrew as major mechanisms driving orthographic learning. The paper discusses the repercussions of our findings for theories of reading.
Keywords: Spelling, Morphology, Corpus study, Hebrew, Orthography
Victor Kuperman’s contribution was partially supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant RGPIN/402395-2012 415 (Kuperman, PI), the Ontario Early Researcher Award (Kuperman, PI), the Canada Research Chair (Tier 2; Kuperman, PI), the Social Sciences and Humanities Research Council of Canada Partnership Training Grant 895-2016-1008 (Libben, PI), the Canada Foundation for Innovation Leaders Opportunity Fund (Kuperman, PI), and the Lady Davis Visiting Professorship at the Hebrew University of Jerusalem.