Abstract
This paper presents theoretical and experimental results on the dynamics of the parameters of short correlation functions of speech signals. We investigate theoretically how the maxima of the correlation functions of quasi-periodic signals of a general form depend on the characteristics that define their degree of quasi-periodicity. From these dependences we construct estimates of the degree of quasi-periodicity, the length of quasi-periodic intervals, the main period, and related parameters. The estimates are then interpreted for speech signals, and we show that many important parameters used in speech technologies can be calculated from them. As an example of the effectiveness of correlation-function analysis, we give experimental results on segmenting isolated words into separate phonemes. Segmentation based on monitoring singularities in the dynamics of the parameters of short correlation functions proves to be stable and consistent with perception.
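The idea of tracking the maximum of a short correlation function can be illustrated with a minimal sketch. This is not the authors' estimator: the normalization, the frame length, the lag range, and the names `short_autocorr_peak` and `track_peaks` are all assumptions made for illustration. For each frame, the sketch finds the lag at which the normalized autocorrelation is largest (a rough stand-in for the main period) and the height of that maximum (a rough stand-in for the degree of quasi-periodicity).

```python
import math


def short_autocorr_peak(frame, min_lag, max_lag):
    """Return (lag, value) of the largest normalized short-time
    autocorrelation of `frame` over lags min_lag..max_lag.

    Assumes max_lag < len(frame). Hypothetical helper, for illustration only.
    """
    n = len(frame)
    best_lag, best_val = min_lag, 0.0
    for lag in range(min_lag, max_lag + 1):
        head = frame[: n - lag]  # samples x[i]
        tail = frame[lag:]       # samples x[i + lag]
        num = sum(a * b for a, b in zip(head, tail))
        # Normalize by the energies of the two overlapping segments,
        # so a perfectly periodic frame gives a value of 1 at its period.
        den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        val = num / den if den > 0.0 else 0.0
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag, best_val


def track_peaks(signal, frame_len, hop, min_lag, max_lag):
    """Apply short_autocorr_peak to successive frames of `signal`."""
    return [
        short_autocorr_peak(signal[start:start + frame_len], min_lag, max_lag)
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


# Synthetic test signal (an assumption, not speech data): 800 samples with a
# period of 50 samples followed by 800 samples with a period of 80 samples.
signal = [math.sin(2 * math.pi * i / 50) for i in range(800)]
signal += [math.sin(2 * math.pi * i / 80) for i in range(800)]

for lag, val in track_peaks(signal, frame_len=400, hop=400, min_lag=20, max_lag=90):
    print(lag, round(val, 3))
```

On this synthetic signal the peak lag moves from 50 to 80 samples at the boundary between the two halves. A jump of this kind in the tracked parameters is the sort of event a segmentation procedure could monitor; real speech would need overlapping frames and a treatment of unvoiced intervals, which this sketch omits.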
Author information
Additional information
Vyacheslav E. Antsiperov. Born in 1959. Graduated from the Moscow Institute of Physics and Technology in 1982. Received candidate’s degree in 1986. Senior Researcher at the Institute of Radio Engineering and Electronics, Russian Academy of Sciences. Scientific interests: information theory, recognition and identification theory, and computer modeling of biological aspects of human activities, including speech and image recognition and modeling of other functions of the central nervous system. Author of more than 30 papers.
Vladimir A. Morozov. Born in 1932. Graduated from the Moscow Institute of Energetics in 1956. Received candidate’s degree in 1964. Chief of the Statistical Radiophysics Department of the Institute of Radio Engineering and Electronics, Russian Academy of Sciences. Scientific interests: information theory, weak signal detection, signal processing, stochastic recognition, and speech recognition. Author of more than 80 papers.
Sergei A. Nikitov. Born in 1955. Graduated from the Moscow Institute of Physics and Technology in 1979. Received candidate’s degree in 1982 and doctoral degree in 1991. Professor, Principal Researcher, and Head of Laboratory at the Institute of Radio Engineering and Electronics, Russian Academy of Sciences. Corresponding member of the Russian Academy of Sciences since 2004. Scientific interests: informatics, including speech processing, magnetoelectronics, and nonlinear dynamics. Author of more than 100 papers.
About this article
Cite this article
Antsiperov, V.E., Morozov, V.A. & Nikitov, S.A. Segmentation of isolated words based on dynamics of parameters of short correlation functions. Pattern Recognit. Image Anal. 17, 527–538 (2007). https://doi.org/10.1134/S1054661807040116
DOI: https://doi.org/10.1134/S1054661807040116