A Novel AFM Signal Model for Parametric Representation of Speech Phonemes

  • Mohan Bansal
  • Pradip Sircar

Abstract

This paper presents a new multicomponent, multitone amplitude- and frequency-modulated (AFM) signal model for the parametric modelling of voiced and unvoiced speech phonemes. Because speech is a multicomponent non-stationary signal, the Fourier–Bessel expansion is used to separate it into its individual components. The amplitude envelope (AE) and instantaneous frequency (IF) of each separated component are extracted with the discrete energy separation algorithm, and the parameters are estimated by analysing the two functions separately: the AE function yields the amplitude-modulation parameters and the amplitude of the signal, whereas the IF function yields the frequency-modulation parameters and the carrier frequency. This technique is found to be efficient for accurate parameter estimation of speech phonemes. To illustrate model-based speech processing, the proposed model is applied to several speech signal processing applications.
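
The AE and IF extraction step described above can be illustrated with a minimal sketch. It assumes the DESA-2 variant of the discrete energy separation algorithm and a single, already separated component; the abstract does not state which variant the authors use, and the function names, the `eps` guard, and the test tone below are hypothetical choices made for illustration only.

```python
import numpy as np


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[x](n) = x(n)^2 - x(n-1) * x(n+1)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2(x, eps=1e-12):
    """Estimate amplitude envelope and instantaneous frequency (rad/sample)
    of a single-component signal with DESA-2 (assumed variant).

    For an input of N samples, the outputs cover samples 2 .. N-3.
    """
    x = np.asarray(x, dtype=float)
    y = x[2:] - x[:-2]                  # y(n) = x(n+1) - x(n-1), n = 1 .. N-2
    psi_x = teager_kaiser(x)[1:-1]      # energy of x, trimmed to n = 2 .. N-3
    psi_y = teager_kaiser(y)            # energy of y, n = 2 .. N-3
    psi_x = np.maximum(psi_x, eps)      # hypothetical guard against division by zero
    psi_y = np.maximum(psi_y, eps)
    cos_2w = np.clip(1.0 - psi_y / (2.0 * psi_x), -1.0, 1.0)
    inst_freq = 0.5 * np.arccos(cos_2w)         # IF estimate in rad/sample
    amp_env = 2.0 * psi_x / np.sqrt(psi_y)      # AE estimate
    return amp_env, inst_freq


# Hypothetical test tone: amplitude 1.5, 500 Hz, sampled at 8 kHz.
fs = 8000.0
n = np.arange(400)
tone = 1.5 * np.cos(2 * np.pi * 500.0 * n / fs + 0.3)

amp_env, inst_freq = desa2(tone)
freq_hz = inst_freq * fs / (2 * np.pi)  # convert rad/sample to Hz
```

For a pure sinusoid both estimates are constant and match the tone's amplitude and frequency. With this formulation the frequency estimate is limited to components below one quarter of the sampling rate, and a multicomponent signal would first need to be separated into components, which the paper does with the Fourier–Bessel expansion.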

Keywords

Amplitude and frequency modulation; Discrete energy separation algorithm; Fourier–Bessel expansion; Parametric model; Speech analysis/synthesis

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, India