Constraints on learning disjunctive, unidimensional auditory and phonetic categories

  • Christopher C. Heffner
  • William J. Idsardi
  • Rochelle S. Newman
Perceptual/Cognitive Constraints on the Structure of Speech Communication: In Honor of Randy Diehl


Phonetic categories must be learned, but the processes that allow that learning to unfold are still under debate. The current study investigates constraints on the structure of categories that can be learned and whether these constraints are speech-specific. Category structure constraints are a key difference between theories of category learning, which can roughly be divided into instance-based learning (i.e., exemplar only) and abstractionist learning (i.e., at least partly rule-based or prototype-based) theories. Abstractionist theories can relatively easily accommodate constraints on the structure of categories that can be learned, whereas instance-based theories cannot easily include such constraints. The current study included three groups to investigate these possible constraints as well as their speech specificity: English speakers learning German speech categories, German speakers learning German speech categories, and English speakers learning musical instrument categories, with each group including participants who learned different sets of categories. Both speech groups had greater difficulty learning disjunctive categories (ones that require an “or” statement) than nondisjunctive categories, which suggests that instance-based learning alone cannot explain how participants learned the phonetic categories. This held for both novices (English speakers) and experts (German speakers), which implies that expertise with the materials used cannot explain the patterns observed. However, the same was not true for the musical instrument categories, suggesting a degree of domain-specificity in these constraints that cannot be explained through recourse to expertise alone.


Keywords: Category learning, Categorization, Speech perception, Phonetics



This work was supported by a National Science Foundation Graduate Research Fellowship award and a University of Maryland, College Park Graduate School Flagship Fellowship, to C.C.H., as well as NSF IGERT Grant 0801465 (PI: C. Phillips), an NSF Linguistics Doctoral Dissertation Research Improvement grant awarded to W.J.I. (co-PI: C.C.H.), BCS 1650791, the Maryland Language Science Center, and the University of Maryland–University of Tübingen International Interdisciplinary Research and Teaching Collaboration. Data from this experiment will be available online, in line with the Open Science Framework, for participants who consented to sharing data. We thank Peter Deaville, Stephen DeVilbiss, Scott Kaplowitz, Priyanka Konanur, Zoe Schlueter, and the members of the UMD Language Development Lab for their help in running English-speaking participants for this project. We thank Andrea Weber’s lab at Eberhard Karls University, Tübingen, for their space and support for the German-speaking participants, particularly Sara Beck, Ann-Kathrin Grohe, Lisa Kienzle, and Sarah Schwarz. Michael Key generously allowed us to use his German fricative stimuli for this study.

Many people have provided discussion and insightful comments during the development and publication of this study. Among others, we would like to thank Ann Bradlow, Al Braun, Catherine Carr, Bharath Chandrasekaran, Jeff Chrabaszcz, Karthik Durvasula, Naomi Feldman, Drew Hendrickson, Lori Holt, Ellen Lau, Todd Maddox, Holger Mitterer, Emily Myers, Chris Neufeld, Rob Nosofsky, Janet Pierrehumbert, Eva Reinisch, Kirsten Smayda, and Joe Toscano. Jared Linck gave immensely useful statistical support. Portions of this research were presented at the eighty-eighth Annual Meeting of the Linguistic Society of America (LSA) in Minneapolis, Minnesota, travel to which was financed by the Department of Linguistics at the University of Maryland, College Park; at the sixth annual Michigan State Undergraduate Linguistics Conference (MSULC) in East Lansing, Michigan, travel to which was funded by the Department of Linguistics and Germanic, Slavic, Asian, and African Languages of Michigan State University; and at the Mini-Workshop on Phonetic Processing and Learning at Eberhard Karls University, Tübingen, in Tübingen, Baden-Württemberg, Germany.



Copyright information

© The Psychonomic Society, Inc. 2019

Authors and Affiliations

  1. Program in Neuroscience and Cognitive Science, University of Maryland, College Park, USA
  2. Department of Linguistics, University of Maryland, College Park, USA
  3. Department of Speech and Hearing Sciences, University of Maryland, College Park, USA
  4. Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, USA
