Abstract
This paper reviews the traditional methods of studying the informative attributes of the voice signal. It proposes a method, a modification of the analysis–synthesis procedure, in which the dynamic spectrogram of the voice signal is edited and the signal is then restored in the time domain. Preliminary results of applying the method to speech analysis are presented.
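The abstract does not give the details of the procedure, so the following is only a generic illustration of the analysis–synthesis idea, not the author's implementation. It assumes a Hann-windowed short-time Fourier transform as the dynamic spectrogram, a simple band gain as the "edit", and weighted overlap-add for restoration in the time domain. The function names (`stft`, `istft`, `edit_spectrogram`) and all parameter values are hypothetical.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Analysis step: complex spectrogram of shape (frames, bins). Assumed Hann window."""
    window = np.hanning(n_fft + 1)[:-1]           # periodic Hann window
    x = np.pad(x, (n_fft, n_fft + hop))           # zero-pad so every sample is fully covered
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, length, n_fft=512, hop=128):
    """Synthesis step: weighted overlap-add back to a signal of `length` samples."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    n_frames = spec.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i]
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-12)                # guard against division by zero at the edges
    return out[n_fft:n_fft + length]              # drop the padding added in stft()

def edit_spectrogram(x, fs, f_lo, f_hi, gain=0.0, n_fft=512, hop=128):
    """Scale the band [f_lo, f_hi] Hz by `gain` in every frame, then resynthesize.

    This band gain is a stand-in for whatever edit is under study; it is an
    assumption of this sketch, not a description of the paper's editing rules.
    """
    spec = stft(x, n_fft, hop)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)    # center frequency of each bin in Hz
    band = (freqs >= f_lo) & (freqs <= f_hi)
    spec[:, band] *= gain
    return istft(spec, len(x), n_fft, hop)
```

With an unedited spectrogram, `istft(stft(x), len(x))` returns the input signal up to rounding error, so any audible difference after `edit_spectrogram` can be attributed to the edit itself. That property is what makes this kind of procedure usable for probing which spectral regions carry information.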
Cite this article
Kolokolov, A.S. Study of the Informative Attributes of Voice Signal by Spectrum Editing. Automation and Remote Control 65, 1338–1347 (2004). https://doi.org/10.1023/B:AURC.0000038734.70006.d8