Constraints on learning disjunctive, unidimensional auditory and phonetic categories
Abstract
Phonetic categories must be learned, but the processes that allow that learning to unfold are still under debate. The current study investigates constraints on the structure of categories that can be learned and whether these constraints are speech-specific. Category structure constraints are a key difference between theories of category learning, which can roughly be divided into instance-based learning (i.e., exemplar only) and abstractionist learning (i.e., at least partly rule-based or prototype-based) theories. Abstractionist theories can relatively easily accommodate constraints on the structure of categories that can be learned, whereas instance-based theories cannot easily include such constraints. The current study included three groups to investigate these possible constraints as well as their speech specificity: English speakers learning German speech categories, German speakers learning German speech categories, and English speakers learning musical instrument categories, with each group including participants who learned different sets of categories. Both speech groups had greater difficulty learning disjunctive categories (ones that require an “or” statement) than nondisjunctive categories, which suggests that instance-based learning alone is insufficient to explain how participants learned the phonetic categories. This was true for both novices (English speakers) and experts (German speakers), which implies that expertise with the materials used cannot explain the patterns observed. However, the same was not true for the musical instrument categories, suggesting a degree of domain specificity in these constraints that cannot be explained through recourse to expertise alone.
Keywords
Category learning, Categorization, Speech perception, Phonetics
Notes
Acknowledgments
This work was supported by a National Science Foundation Graduate Research Fellowship award and a University of Maryland, College Park Graduate School Flagship Fellowship, to C.C.H., as well as NSF IGERT Grant 0801465 (PI: C. Phillips), an NSF Linguistics Doctoral Dissertation Research Improvement grant awarded to W.J.I. (co-PI: C.C.H.), BCS 1650791, the Maryland Language Science Center, and the University of Maryland–University of Tübingen International Interdisciplinary Research and Teaching Collaboration. Data from this experiment will be available online, in line with the Open Science Framework, for participants who consented to sharing data. We thank Peter Deaville, Stephen DeVilbiss, Scott Kaplowitz, Priyanka Konanur, Zoe Schlueter, and the members of the UMD Language Development Lab for their help in running English-speaking participants for this project. We thank Andrea Weber’s lab at Eberhard Karls University, Tübingen, for their space and support for the German-speaking participants, particularly Sara Beck, Ann-Kathrin Grohe, Lisa Kienzle, and Sarah Schwarz. Michael Key generously allowed us to use his German fricative stimuli for this study.
Many people have provided discussion and insightful comments during the development and publication of this study. Among others, we would like to thank Ann Bradlow, Al Braun, Catherine Carr, Bharath Chandrasekaran, Jeff Chrabaszcz, Karthik Durvasula, Naomi Feldman, Drew Hendrickson, Lori Holt, Ellen Lau, Todd Maddox, Holger Mitterer, Emily Myers, Chris Neufeld, Rob Nosofsky, Janet Pierrehumbert, Eva Reinisch, Kirsten Smayda, and Joe Toscano. Jared Linck gave immensely useful statistical support. Portions of this research were presented at three venues: the eighty-eighth Annual Meeting of the Linguistic Society of America (LSA) in Minneapolis, Minnesota, travel to which was financed by the Department of Linguistics at the University of Maryland, College Park; the sixth annual Michigan State Undergraduate Linguistics Conference (MSULC) in East Lansing, Michigan, travel to which was funded by the Department of Linguistics and Germanic, Slavic, Asian, and African Languages of Michigan State University; and the Mini-Workshop on Phonetic Processing and Learning at Eberhard Karls University, Tübingen, in Tübingen, Baden-Württemberg, Germany.