Machine/Computer Approaches

  • Harry Hollien
Part of the Applied Psycholinguistics and Communication Disorders book series (APCD)

Abstract

The speaker recognition scene changes radically when attempts are made to apply modern technology to the problem. Indeed, with the seemingly limitless power of electronic hardware and computers, it appears that solutions are but a step away. Yet such may not be the case. For example, many years have passed since the earliest efforts were made to develop machines that would (1) type letters dictated by voice, (2) automatically translate the speech of one language into another, (3) understand spoken language, and (4) identify a person from voice analysis alone. Authors such as Hecker (40) insist that there are no machines which are both as sensitive and as powerful (for these purposes) as the human ear. What Hecker means by “ear” is, of course, the entire auditory sensory system coupled to the brain, with all its sophisticated memory and cognitive functions. He may be correct in his assumptions, but I do not think so. Hence, the issue I will address in this chapter is: can machines/computers be made to operate at least as efficiently as the auditory system for speaker identification purposes? That is, can they be made to mimic these processes or, if not mimic them, at least parallel the recognition task by some other method? Probably so, but the task is not an easy one.

Keywords

Speech Recognition; Speech Signal; Speaker Recognition; Speech Sample; Speaker Verification
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Atal, B. S. (1972) Automatic Speaker Recognition Based on Pitch Contours, J. Acoust. Soc. Amer. 52:1687–1697.
  2. Atal, B. S. (1974) Effectiveness of Linear Prediction Characteristics of the Speech Wave for Automatic Speaker Identification and Verification, J. Acoust. Soc. Amer. 55:1304–1312.
  3. Atal, B. S. (1976) Automatic Recognition of Speakers from Their Voices, Proceed. IEEE 64:460–475.
  4. Bakis, R. and Dixon, N. R. (1982) Toward Speaker-Independent Recognition-by-Synthesis, IEEE Proceed. ICASSP, 566–569.
  5. Basztura, C. S. and Majewski, W. (1978) The Application of Long-Term Analysis of the Zero-Crossing of a Speech Signal in Automatic Speaker Identification, Arch. Acoust. 3:3–15.
  6. Becker, R. W., Clarke, F. R., Poza, F. and Young, J. R. (1973) A Semi-Automatic Speaker Recognition System, Research, LEAA, U.S. Dept of Justice, Washington, DC, 1–37.
  7. Bobrow, D. C. and Klatt, D. H. (1968) A Limited Speech Recognition System, AFIPS Conf. Proceed. Thompson Book Co., Washington, DC, 33:305–318.
  8. Bogner, R. E. (1981) On Talker Verification via Orthogonal Parameters, IEEE Trans. Acoust. Speech Signal Process. ASSP-29:1–12.
  9. Bricker, P. D., Gnanadesikan, R., Mathews, M. V., Pruzansky, S., Tukey, P. A., Wachter, K. W. and Warner, J. L. (1971) Statistical Techniques for Talker Identification, Bell System Tech. J. 50:1427–1450.
  10. Bricker, P. D. and Pruzansky, S. (1966) Effects of Stimulus Content and Duration on Talker Identification, J. Acoust. Soc. Amer. 40:1441–1450.
  11. Bunge, E. (1975) Automatic Speaker Recognition by Computers, Proceed., 8th Internat. Cong. Phonetic Sci., Leeds, UK.
  12. Bunge, E. (1977) Automatic Speaker Recognition System Auros for Security Systems and Forensic Voice Identification, Proceed., Internat. Conf. Crime Countermeas., Oxford, UK, 1–8.
  13. Calinski, T., Jassem, W. and Kaczmarck, Z. (1970) Investigation of Vowel Formant Frequencies as Personal Voice Characteristics by Means of Multivariate Analysis of Variance, in Speech Analysis and Synthesis (W. Jassem, Ed.), Warsaw, Poland, 2:7–40.
  14. Carbonell, J. R., Stevens, K. N., Williams, C. E. and Woods, B. (1965) Speaker Identification by a Matching-From-Samples Technique, J. Acoust. Soc. Amer. 40:1205–1206.
  15. Cheun, R. S. (1978) Feature Selection Using Adaptive Learning Network for Text-Independent Speaker Verification, J. Acoust. Soc. Amer. 64:S182.
  16. Clarke, F. R. and Becker, R. W. (1969) Comparison of Techniques for Discriminating Among Talkers, J. Speech Hear. Res. 12:747–761.
  17. Compton, A. J. (1963) Effects of Filtering and Vocal Duration Upon the Identification of Speakers Aurally, J. Acoust. Soc. Amer. 35:1748–1752.
  18. Das, S. K. and Mohn, W. S. (1969) Pattern Recognition in Speaker Verification, Proceed. Joint Comput. Conf., AFIPS Conf., Mondale, NY, 35:721–732.
  19. Das, S. K. and Mohn, W. S. (1971) A Scheme for Speech Processing in Automatic Speaker Verification, IEEE Trans. Audio Electroacoust. AU-19:32–43.
  20. Doddington, G. R. (1970) A Method of Speaker Verification, Unpublished Ph.D. Dissertation, University of Wisconsin.
  21. Doddington, G. R. (1980) Whither Speech Recognition? in Trends in Speech Recognition (W. Lea, Ed.), NY, Prentice-Hall, 556–561.
  22. Doddington, G. R., Hyrick, B. and Beek, B. (1974) Some Results on Speaker Verification Using Amplitude Spectra, J. Acoust. Soc. Amer. 55:S463.
  23. Doherty, E. T. (1976) An Evaluation of Selected Acoustic Parameters for Use in Speaker Identification, J. Phonetics 4:321–326.
  24. Doherty, E. T. and Hollien, H. (1978) Multiple Factor Speaker Identification of Normal and Distorted Speech, J. Phonetics 6:1–8.
  25. Edie, J. and Sebestyen, G. S. (1972) Voice Identification General Criteria, Report RADCTDR-62-278, Rome Air Develp. Ctr., Air Force Systems Command, Griffis AFB, NY.
  26. Endres, W., Bambach, W. and Flosser, G. (1971) Voice Spectrograms as a Function of Age, Voice Disguise and Voice Imitation, J. Acoust. Soc. Amer. 49:1842–1848.
  27. Everett, S. S. (1985) Automatic Speaker Recognition Using Vocoded Speech, IEEE ICASSP CH 2118:383–386.
  28. Feiz, W. and DeGeorge, M. (1985) A Speaker Verification System for Access Control, IEEE ICASSP, CH 2118:399–402.
  29. Floyd, W. (1964) Voice Identification Techniques, Report RADC-TDR-64-312, Rome Air Develp. Ctr., Air Force Systems Command, Griffis AFB, NY.
  30. Foodman, M. J. (1981) Experiments in Automatic Speaker Verification, Proceed., Carnahan Conf. Crime Countermeasures, Lexington, KY, May.
  31. Furui, S. (1974) An Analysis of Long-Term Variation of Feature Parameters of Speech and Its Application to Talker Recognition, Electronic Comm. Japan A57:880–887.
  32. Furui, S. (1978) Effects of Long-Term Spectral Variability on Speaker Recognition, J. Acoust. Soc. Amer. 64:S183.
  33. Goldstein, U. G. (1976) Speaker-Identifying Features Based on Formant Tracks, J. Acoust. Soc. Amer. 59:176–182.
  34. Gubrynowicz, R. (1973) Application of a Statistical Spectrum Analysis to Automatic Voice Identification, in Speech Analysis and Synthesis (W. Jassem, Ed.), Warsaw, Poland, 3:171–180.
  35. Hair, G. D. and Rekieta, T. W. (1973) Speaker Identification Research Final Report, Research, U.S. Dept. of Justice, LEAA, Washington, DC, 38–74.
  36. Hall, M. (1975) Spectrographic Analysis of Interspeaker and Intraspeaker Variabilities of Professional Mimicry, Unpublished MA Thesis, Michigan State University.
  37. Hargreaves, W. A. and Starkweather, J. A. (1963) Recognition of Speaker Identity, Lang., Speech 6:63–67.
  38. Hazen, B. M. (1972) Speaker Identification Using Spectrograms Made on Different Sound Spectrographs, Unpublished MA Thesis, State University of New York, Buffalo.
  39. Hazen, B. M. (1973) Effects of Differing Phonetic Contexts on Spectrographic Speaker Identification, J. Acoust. Soc. Amer. 54:650–660.
  40. Hecker, M. H. L., Stevens, K. N., von Bismarck, G. and Williams, C. E. (1968) Manifestations of Task-Induced Stress in the Acoustic Speech Signal, J. Acoust. Soc. Amer. 44:993–1001.
  41. Hennessey, J. J. (1970) An Analysis of Voiceprint Identification, Unpublished MA Thesis, Michigan State University.
  42. Hollien, H. (1974) The Peculiar Case of “Voiceprints,” J. Acoust. Soc. Amer. 56:210–213.
  43. Hollien, H. (1980) Vocal Indicators of Psychological Stress, in Forensic Psychology and Psychiatry (F. Wright, C. Bahn and R. Reiber, Eds.), New York Academy of Sciences, 47–72.
  44. Hollien, H. (1985) Natural Speech Vectors in Speaker Identification, Proceed., Speech Tech’ 85, New York, Media Dimensions Inc., 331–334.
  45. Hollien, H., Childers, D. G. and Doherty, E. T. (1977) Semi-Automatic Speaker Identification System (SAUSI), Proceed., IEEE, ICASSP 26:768–771.
  46. Hollien, H., Geifer, M. P. and Huntley, R. (1990) The Natural Speech Vector Concept in Speaker Identification, Neue Tend. Amgerwandten, Phonetik III, Hamburg, Helmut Buske, Verlag, 62:71–87.
  47. Hollien, H., Hicks, J. W., Jr. and Oliver, L. H. (1990) A Semiautomatic System for Speaker Identification, Neue Tend. Amgerwandten, Phonetik III, Hamburg, Helmut Buske, Verlag, 62:88–106.
  48. Hollien, H. and McGlone, R. E. (1976) An Evaluation of the “Voiceprint” Technique of Speaker Recognition, Proceed., Carnahan Conf. Crime Countermeasures, 30–45; reprinted in Nat. J. Crim. Def. 2:117–130, 1976 and in Course Handbook, Institute Contin. Legal Ed., Ann Arbor, Michigan, 391–404.
  49. Hollien, H. and Majewski, W. (1977) Speaker Identification by Long-Term Spectra Under Normal and Distorted Speech Conditions, J. Acoust. Soc. Amer. 62:975–980.
  50. Hollien, H., Majewski, W. and Hollien, P. A. (1975) Analysis of F0 as a Speaker Identification Technique, Eighth Internat. Cong. Phonetic Sci., Abstract of Papers, 337.
  51. Hunt, M. (1983) Further Experiments in Text-Independent Speaker Recognition Over Communications Channels, Proceed. ICASSP, Boston, 563–566.
  52. Ichikawa, A., Nakajima, A. and Nakata, K. (1979) Speaker Verification from Actual Telephone Voice, J. Acoust. Soc. Japan 35:63–69.
  53. Iles, M. (1972) Speaker Identification as a Function of Fundamental Frequency and Resonant Frequencies, Unpublished Ph.D. Dissertation, University of Florida.
  54. Jassem, W. (1968) Formant Frequencies as Cues to Speaker Discrimination, in Speech Analysis and Synthesis (W. Jassem, Ed.), Warsaw, Poland, 1:9–41.
  55. Jassem, W., Steffen-Batog, M. and Czajka, S. (1973) Statistical Characteristics of Short-Term Average of Distribution as Personal Voice Features, in Speech Analysis and Synthesis (W. Jassem, Ed.), Warsaw, Poland, 3:209–228.
  56. Jesorsky, P. (1977) Principles of Automatic Speaker Recognition, in Natural Lang. Comm. with Computers (L. Bolc, Ed.), 1–15.
  57. Johnson, C. C., Hollien, H. and Hicks, J. W., Jr. (1984) Speaker Identification Utilizing Selected Temporal Speech Features, J. Phonetics 12:319–327.
  58. Kashyap, R. L. (1975) Speaker Recognition from an Unknown Utterance and Speaker Speech Interaction, IEEE Trans. Acoust. Speech Sig. Process. ASSP-24:481–488.
  59. Kersta, L. G. (1962) Voiceprint Identification, Nature 196:1253–1257.
  60. Kosiel, U. (1973) Statistical Analysis of Speaker-Dependent Differences in the Long-Term Average Spectrum of Polish Speech, in Speech Analysis and Synthesis (W. Jassem, Ed.), Warsaw, Poland, 3:180–208.
  61. Ladefoged, P. and Broadbent, D. E. (1957) Information Conveyed by Vowels, J. Acoust. Soc. Amer. 29:98–104.
  62. LaRiviere, C. L. (1971) Some Acoustic and Perceptual Correlates of Speaker Identification, Unpublished Ph.D. Dissertation, University of Florida.
  63. LaRiviere, C. L. (1974) Speaker Identification for Turbulent Portions of Fricatives, Phonetica 29:98–104.
  64. LaRiviere, C. L. (1975) Contributions of Fundamental Frequency and Formant Frequencies to Speaker Identification, Phonetica 31:185–197.
  65. Li, K. P., Dammann, J. E. and Chapman, W. D. (1966) Experimental Studies in Speaker Verification Using an Adaptive System, J. Acoust. Soc. Amer. 40:966–978.
  66. Li, K. P. and Wrench, E. H., Jr. (1983) An Approach to Text-Independent Speaker Recognition with Short Utterances, Proceed. ICASSP, Boston, MA, 555–558.
  67. Luck, J. E. (1969) Automatic Speaker Verification Using Cepstral Measurements, J. Acoust. Soc. Amer. 46:1026–1032.
  68. Lummis, R. C. (1972a) Implementation of an On-Line Speaker Verification Scheme, J. Acoust. Soc. Amer. 52:S181.
  69. Lummis, R. C. (1972b) Speaker Verification: A Step Toward the ‘Checkless’ Society, Bell Laboratories Record 50:254–259.
  70. Lummis, R. C. (1973) Speaker Verification by Computer Using Speech Intensity for Temporal Registration, IEEE Trans. Audio Electroacoust. AU-21:50–59.
  71. Majewski, W. and Hollien, H. (1974) Euclidean Distance Between Long-Term Speech Spectra as a Criterion for Speaker Identification, Proceed. Speech Comm. Seminar-74, Stockholm, Sweden, 3:303–310.
  72. Makhoul, J. and Wolf, J. (1973) The Use of a Two-Pole Linear Prediction Model in Speech Recognition, Bolt, Beranek and Newman Report No. 2537, 1–21.
  73. Meeker, W. F. (1967) Speaker Authentication Techniques, Tech. Report ECOM-02526-F, U.S. Army Electronics Command, Ft. Monmouth, NJ.
  74. Meltzer, D. and Lehiste, I. (1972) Vowel and Speaker Identification in Natural and Synthetic Speech, J. Acoust. Soc. Amer. 51:S131.
  75. Ney, H. and Giercoff, R. (1982) Speaker Recognition Using a Feature Welding Technique, Proceed. ICASSP, Paris, 1645–1648.
  76. Obrecht, D. H. (1975) Fingerprints and Voiceprint Identification, Proceed., Eighth Internat. Cong. Phonetic Sci., Leeds, UK.
  77. Preusse, J. W. (1971) Word Recognition and Speaker Authentication Using Amplitude Independent and Time Independent Word Features, Tech. Report, ECOM-3439, U.S. Army Electronics Command, Ft. Monmouth, NJ.
  78. Pruzansky, S. (1963) Pattern Matching Procedure for Automatic Talker Recognition, J. Acoust. Soc. Amer. 35:354–358.
  79. Pruzansky, S. and Mathews, M. V. (1964) Talker-Recognition Procedure Based on Analysis of Variance, J. Acoust. Soc. Amer. 36:2041–2047.
  80. Ramishvili, G. S. (1965) Automatic Recognition of Speaking Persons, Report FTG-TT-65-1079, Air Force Systems Command, Wright-Patterson AFB.
  81. Ramishvili, G. S. (1966) Automatic Voice Recognition, Engng. Cybernetics, 5:84–90.
  82. Ramishvili, G. S. (1974) Experiments on Automatic Verification of Speakers, Proceed., Second Internat. Conf. Pattern Recognition, Copenhagen, 389–393.
  83. Rosenberg, A. E. (1973) Listener Performance in Speaker Verification Tasks, IEEE Trans. Audio Electroacoust. AU-21:221–225.
  84. Rosenberg, A. E. (1974) A Practical Implementation of an Automatic Speaker Verification System, Proceed., Eighth Internat. Cong. Acoustics, London, 1:268.
  85. Rosenberg, A. E. (1975) Evaluation of an Automatic Speaker Verification System Over Telephone Lines, J. Acoust. Soc. Amer. 57:S23.
  86. Rosenberg, A. E. (1976) Automatic Speech Verification: A Review, Proceed., IEEE 64:475–487.
  87. Rothman, H. B. (1975) Perceptual (Aural) and Spectrographic Investigation of Speaker Homogeneity, J. Acoust. Soc. Amer. 58:S107.
  88. Sambur, M. R. (1973) Speaker Recognition and Verification Using Linear Prediction Analysis, QPR No. 108, Massachusetts Institute of Technology, 261–268.
  89. Sambur, M. R. (1975) Selection of Acoustic Features for Speaker Identification, IEEE Trans. on Acoustics, Speech and Signal Process. ASSP-23:176–192.
  90. Sambur, M. R. (1976a) Speaker Recognition Using Orthogonal Linear Prediction, IEEE Trans. Acoust. Speech, Signal Process. ASSP-24:283–287.
  91. Sambur, M. R. (1976b) Text-Independent Speaker Recognition Using Orthogonal Linear Prediction, Proceed., IEEE ICASSP, Philadelphia, PA, 727–729.
  92. Scarr, R. W. A. (1971) Speech Recognition by Machine—Art or Science? Electronics and Power, 302–307.
  93. Schwartz, R., Roucos, S. and Berouti, M. (1982) The Application of Probability Density Estimation to Text-Independent Speaker Identification, Proceed., ICASSP, 1649–1652.
  94. Smith, J. E. (1962) Decision-Theoretic Speaker Recognizer, J. Acoust. Soc. Amer. 34:1988.
  95. Steffen-Batog, M., Jassem, W. and Gruszka-Koscielak, H. (1970) Statistical Distribution of Short-Term f0 Values as a Personal Voice Characteristic, in Speech Analysis and Synthesis (W. Jassem, Ed.), Warsaw, Poland, 2:197–208.
  96. Stevens, K. N. (1971) Sources of Inter- and Intra-Speaker Variability in the Acoustic Properties of Speech Sounds, Proceed., Seventh Inter. Cong. of Phonetic Sci., Montreal, 206–232.
  97. Stevens, K. N., Williams, C. E., Carbonell, J. R. and Woods, D. (1968) Speaker Authentication and Identification: A Comparison of Spectrographic and Auditory Presentation of Speech Materials, J. Acoust. Soc. Amer. 44:1596–1607.
  98. Tarnoczy, T. (1961) Uber Das Individuelle Sprach Spectrum, Proceed., Fourth Inter. Cong. Phonetic Sciences, 259–264.
  99. Tosi, O., Oyer, H., Lashbrook, W., Pedrey, C., Nichol, J. and Nash, W. (1972) Experiment on Voice Identification, J. Acoust. Soc. Amer. 51:2030–2043.
  100. Voiers, W. (1964) Perceptual Basis of Speaker Identity, J. Acoust. Soc. Amer. 36:1065–1073.
  101. Waldrop, M. M. (1988) A Landmark in Speech Recognition, Science 240:1615.
  102. Wolf, J. J. (1970) Simulation of the Measurement Phase of an Automatic Speaker Recognition System, J. Acoust. Soc. Amer. 47:S83.
  103. Wolf, J. J. (1972) Efficient Acoustic Parameters for Speaker Recognition, J. Acoust. Soc. Amer. 51:2044–2055.
  104. Wolf, J., Krasner, M., Karnofsky, K., Schwartz, R. and Roucos, S. (1983) Further Investigation of Probabilistic Methods For Text-Independent Speaker Identification, Proceed. ICASSP, 551–554.
  105. Young, M. A. and Campbell, R. A. (1967) Effects of Context on Talker Identification, J. Acoust. Soc. Amer. 42:1250–1254.
  106. Yang, M. C. K., Hollien, H. and Huntley, R. (1986) A Speaker Identification System for Field Use, Speech Tech’ 86, New York, Media Dimensions, 277–280.
  107. Zalewski, J., Majewski, W. and Hollien, H. (1975) Cross-Correlation Between Long-Term Speech Spectra as a Criterion for Speaker Identification, Acustica 34:20–24.

Copyright information

© Springer Science+Business Media New York 1990

Authors and Affiliations

  • Harry Hollien, University of Florida, Gainesville, USA
