A comparative study of several approaches to short-term frequency analysis of a speech signal

Kolokolov, A. S.; Lyubinskii, I. A.

doi:10.1134/S0005117915100100

Control in Social Economic Systems, Medicine, and Biology
Published: 21 October 2015

Volume 76, pages 1828–1833, (2015)

Automation and Remote Control

A. S. Kolokolov¹ &
I. A. Lyubinskii¹


Abstract

We study how the time-frequency representation of a speech signal depends on the chosen method of frequency analysis. We consider dynamic spectrograms obtained with sets of band-pass filters that differ in their parameters and in their placement along the frequency axis. We show that when the filter parameters are close to those of the filters of the auditory analyzer, information on vowels and consonants in the speech signal is distributed more uniformly along the frequency axis, and the spectral maxima related to the first and second formants of a vowel are more clearly expressed, which is very important for speech recognition.
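
As a rough illustration of what a filter set "close to the filters of an auditory analyzer" could look like, the sketch below places band-pass center frequencies uniformly on the critical-band (Bark) scale and assigns each one a critical bandwidth. It relies on the widely used Zwicker and Terhardt approximations for critical-band rate and critical bandwidth. This is an assumption made for illustration only: the abstract does not state which auditory model, frequency range, or number of filters the authors used, and the function names, the 100 to 5000 Hz range, and the count of 16 filters are all hypothetical.

```python
import math

def hz_to_bark(f_hz):
    """Critical-band rate in Bark (Zwicker-Terhardt approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def critical_bandwidth_hz(f_hz):
    """Critical bandwidth in Hz at frequency f_hz (Zwicker-Terhardt approximation)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69

def bark_spaced_centers(f_lo, f_hi, n_filters):
    """Center frequencies (Hz) equally spaced on the Bark scale.

    hz_to_bark is monotonically increasing, so each target Bark value
    is inverted by simple bisection. Requires n_filters >= 2.
    """
    z_lo, z_hi = hz_to_bark(f_lo), hz_to_bark(f_hi)
    centers = []
    for k in range(n_filters):
        target = z_lo + (z_hi - z_lo) * k / (n_filters - 1)
        lo, hi = f_lo, f_hi
        for _ in range(60):  # 60 halvings are far more than enough for Hz precision
            mid = 0.5 * (lo + hi)
            if hz_to_bark(mid) < target:
                lo = mid
            else:
                hi = mid
        centers.append(0.5 * (lo + hi))
    return centers

# Hypothetical configuration: 16 filters covering 100 Hz to 5 kHz.
centers = bark_spaced_centers(100.0, 5000.0, 16)
bandwidths = [critical_bandwidth_hz(f) for f in centers]
for fc, bw in zip(centers, bandwidths):
    print(f"center {fc:8.1f} Hz   bandwidth {bw:7.1f} Hz")
```

With this spacing the filters are narrow and densely packed at low frequencies and broaden toward high frequencies, in contrast to a uniform-bandwidth filter set with linearly spaced centers.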



Author information

Authors and Affiliations

Trapeznikov Institute of Control Sciences, Russian Academy of Sciences, Moscow, Russia
A. S. Kolokolov & I. A. Lyubinskii


Corresponding author

Correspondence to A. S. Kolokolov.

Additional information

Original Russian Text © A.S. Kolokolov, I.A. Lyubinskii, 2015, published in Avtomatika i Telemekhanika, 2015, No. 10, pp. 144–151.


About this article

Cite this article

Kolokolov, A.S., Lyubinskii, I.A. A comparative study of several approaches to short-term frequency analysis of a speech signal. Autom Remote Control 76, 1828–1833 (2015). https://doi.org/10.1134/S0005117915100100


Received: 28 October 2014
Published: 21 October 2015
Issue Date: October 2015
DOI: https://doi.org/10.1134/S0005117915100100
