Subject Realization in Japanese Conversation by Native and Non-native Speakers: Exemplifying a New Paradigm for Learner Corpus Research

  • Stefan Th. Gries
  • Allison S. Adelman
Part of the Yearbook of Corpus Linguistics and Pragmatics book series (YCLP, volume 2)


In the field of Learner Corpus Research, Gries and Deshors (Corpora 9(1):109–136, 2014) developed a two-step regression procedure (MuPDAR) to determine how and why choices made by non-native speakers differ from those made by native speakers more comprehensively than traditional learner corpus research allows for. In this chapter, we will extend and test their proposal to determine whether it can also be applied to pragmatic and grammatical phenomena (subject realization/omission in Japanese), and whether it can help study categorical differences between learner and native-speaker choices; we do so by also showing that the more advanced method of mixed-effects modeling can be very fruitfully integrated into the proposed MuPDAR method. The results of our study show that Japanese native speakers’ choices of subject realization are affected by discourse-functional factors such as givenness and contrast of referents and that, while learners are able to handle extreme values of givenness and marked cases of contrast, they still struggle (more) with intermediate degrees of givenness and unmarked/non-contrastive referents. We conclude by discussing the role of MuPDAR in Learner Corpus Research in general and its advantages over traditional corpus analysis in that field and error analysis in particular.


Keywords: Learner corpora; Regression modeling; Subject realization; Japanese; Givenness and contrast


  1. Aijmer, K. (2005). Modality in advanced Swedish learners’ written interlanguage. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition, and foreign language teaching (pp. 55–76). Amsterdam/Philadelphia: John Benjamins.
  2. Altenberg, B. (2005). Using bilingual corpus evidence in learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition, and foreign language teaching (pp. 37–54). Amsterdam/Philadelphia: John Benjamins.
  3. Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
  4. Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412.
  5. Bates, D., Maechler, M., & Bolker, B. (2013). lme4. Linear mixed-effects models using S4 classes.
  6. Clancy, P. M. (1980). Referential choice in English and Japanese narrative discourse. In W. Chafe (Ed.), The pear stories: Cognitive and linguistics aspects of narrative production (pp. 127–202). Norwood: Ablex.
  7. Collentine, J., & Asención-Delaney, Y. (2010). A corpus-based analysis of the discourse functions of ser/estar + adjective in three levels of Spanish as FL learners. Language Learning, 60(2), 409–445.
  8. Cosme, C. (2008). Participle clauses in learner English: The role of transfer. In G. Gilquin, S. Papp, & M. B. Díez-Bedmar (Eds.), Linking up contrastive and learner corpus research (pp. 177–200). Amsterdam/Atlanta: Rodopi.
  9. Crawley, M. J. (2013). The R book (2nd ed.). Chichester: Wiley.
  10. Doğruöz, A. S., & Gries, S. T. (2012). Spread of on-going changes in an immigrant language: Turkish in the Netherlands. Review of Cognitive Linguistics, 10(2), 401–426.
  11. Du Bois, J. W. (2006). Representing discourse. Ms., University of California, Santa Barbara.
  12. Faraway, J. J. (2006). Extending the linear model with R: Generalized linear, mixed-effects and non-parametric regression models. Boca Raton: Chapman & Hall/CRC.
  13. Granger, S. (1996). From CA to CIA and back: An integrated approach to computerized bilingual and learner corpora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages in contrast. Text-based cross-linguistic studies (pp. 37–51). Lund: Lund University Press.
  14. Granger, S. (2002). A bird’s eye view of learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition and foreign language teaching (pp. 3–33). Amsterdam/Philadelphia: John Benjamins.
  15. Granger, S. (2004). Computer learner corpus research: Current status and future prospects. In U. Connor & T. Upton (Eds.), Applied corpus linguistics: A multidimensional perspective (pp. 123–145). Amsterdam: Rodopi.
  16. Gries, S. T., & Deshors, S. C. (2014). Using regressions to explore deviations between corpus data and a standard/target: Two suggestions. Corpora, 9(1), 109–136.
  17. Gries, S. T., & Wulff, S. (2009). Psycholinguistic and corpus linguistic evidence for L2 constructions. Annual Review of Cognitive Linguistics, 7, 163–186.
  18. Gries, S. T., & Wulff, S. (2013). The genitive alternation in Chinese and German ESL learners: Towards a multifactorial notion of context in learner corpus research. International Journal of Corpus Linguistics, 18(3), 327–356.
  19. Harrell, F. E., Jr. (2001). Regression modeling strategies with applications to linear models, logistic regression, and survival analysis. Berlin/New York: Springer.
  20. Hasselgård, H., & Johansson, S. (2012). Learner corpora and contrastive interlanguage analysis. In F. Meunier, S. De Cock, G. Gilquin, & M. Paquot (Eds.), A taste for corpora: In honour of Sylviane Granger (pp. 33–61). Amsterdam/Philadelphia: John Benjamins.
  21. Hinds, J. (1982). Ellipsis in Japanese. Edmonton: Linguistic Research, Inc.
  22. Hundt, M., & Vogel, K. (2011). Overuse of the progressive in ESL and learner Englishes – Fact or fiction? In J. Mukherjee & M. Hundt (Eds.), Exploring second-language varieties of English and learner Englishes: Bridging a paradigm gap (pp. 145–165). Amsterdam/Philadelphia: John Benjamins.
  23. Hypermedia Corpus of Spoken Japanese. (2010). Accessed Fall, 2010.
  24. Iwasaki, S. (2002). Japanese. Amsterdam/Philadelphia: John Benjamins.
  25. Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(4), 434–446.
  26. Jarvis, S., & Crossley, S. A. (Eds.). (2012). Approaching language transfer through text classification explorations in the detection-based approach. Bristol: Multilingual Matters.
  27. Krzeszowski, T. (1990). Contrasting languages: The scope of contrastive linguistics. Berlin/New York: Mouton de Gruyter.
  28. Kuno, S. (1973). The structure of the Japanese language. Cambridge, MA: MIT Press.
  29. Learner’s Language Corpus of Japanese. (2013). Accessed Spring, 2013.
  30. Miglio, V. G., Gries, S. T., Harris, M. J., Wheeler, E. M., & Santana-Paixão, R. (2013). Spanish lo(s)-le(s) clitic alternations in psych verbs: A multifactorial corpus-based analysis. Somerville: Cascadilla Press.
  31. Neff van Aertselaer, J. A., & Bunce, C. (2012). The use of small corpora for tracing the development of academic literacies. In F. Meunier, S. De Cock, G. Gilquin, & M. Paquot (Eds.), A taste for corpora: In honour of Sylviane Granger (pp. 63–83). Amsterdam/Philadelphia: John Benjamins.
  32. Ono, T., & Thompson, S. A. (1997). Deconstructing ‘Zero Anaphora’ in Japanese. Proceedings of the Annual Meeting of the Berkeley Linguistics Society, 23, 481–491.
  33. Pery-Woodley, M.-P. (1990). Contrasting discourses: Contrastive analysis and a discourse approach to writing. Language Teaching, 23(3), 143–151.
  34. Rogatcheva, S. (2012). Perfect problems: A corpus-based comparison of the perfect in Bulgarian and German EFL writing. In S. Hoffmann, P. Rayson, & G. Leech (Eds.), English corpus linguistics: Looking back, moving forward (pp. 149–163). Amsterdam: Rodopi.
  35. Schütze, C. T. (1996). The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Chicago: The University of Chicago Press.
  36. Shibatani, M. (1985). Passives and related constructions: A prototype analysis. Language, 61(4), 821–848.
  37. Takagi, T. (2002). Contextual resources for inferring unexpressed referents in Japanese conversations. Pragmatics, 12(2), 153–182.
  38. Tono, Y. (2004). Multiple comparisons of IL, L1 and TL corpora: The case of L2 acquisition of verb subcategorization patterns by Japanese learners of English. In G. Aston, S. Bernardini, & D. Stewart (Eds.), Corpora and language learners (pp. 45–66). Amsterdam/Philadelphia: John Benjamins.
  39. Zuur, A. F., Ieno, E. N., Walker, N., & Saveliev, A. A. (2009). Mixed effects models and extensions in ecology with R. Berlin/New York: Springer.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Linguistics, University of California, Santa Barbara, Santa Barbara, USA