Acoustic-Phonetic Knowledge Representation: Implications from Spectrogram Reading Experiments

  • Victor W. Zue
Conference paper
Part of the NATO Advanced Study Institutes Series book series (ASIC, volume 88)


This paper presents a summary of several spectrogram reading experiments designed mainly to uncover the amount of phonetic information that is contained in the speech signal. The task involved identifying the phonetic contents of an utterance only from a visual examination of the spectrogram. The results generally support the notion that there is a great deal of phonetic information in the speech signal that can be extracted by the proper application of phonetic rules. From these results, it is argued that phonetic recognition in speech recognition systems can be improved substantially, and that improved phonetic recognition will lead to speech recognition systems of greatly increased complexity and sophistication.


Keywords (added by machine, not by the author): Speech Recognition; Speech Signal; Speech Sound; Speech Recognition System; Stop Consonant





Copyright information

© D. Reidel Publishing Company, Dordrecht, Holland 1982

Authors and Affiliations

  • Victor W. Zue, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, USA
