Abstract
Technology has been used for many decades for phonological research as well as for teaching phonetics, phonology, and pronunciation. However, it is only in the last 15 years that the incorporation of speech technology into linguistic and applied linguistic inquiry has begun to yield major results in research and practice. The purpose of this chapter is to examine advances and new directions in acoustic analysis and speech recognition as they relate to issues of phonology, both from a research perspective of quantifying and measuring segmental phonemes and prosody, and from the practical perspective of using technology to teach.
Copyright information
© 2007 Dorothy M. Chun
About this chapter
Cite this chapter
Chun, D.M. (2007). Technological advances in researching and teaching phonology. In: Pennington, M.C. (ed.) Phonology in Context. Palgrave Advances in Linguistics. Palgrave Macmillan, London. https://doi.org/10.1057/9780230625396_11
DOI: https://doi.org/10.1057/9780230625396_11
Publisher Name: Palgrave Macmillan, London
Print ISBN: 978-1-4039-3537-3
Online ISBN: 978-0-230-62539-6
eBook Packages: Palgrave Language & Linguistics Collection; Education (R0)