From Volatility Modeling of Financial Time-Series to Stochastic Modeling and Enhancement of Speech Signals

Cohen, Israel

doi:10.1007/3-540-27489-8_5

Part of the book series: Signals and Communication Technology (SCT)

Abstract

Modeling speech signals in the short-time Fourier transform (STFT) domain is a fundamental problem in designing speech enhancement systems. This chapter introduces a novel modeling approach based on generalized autoregressive conditional heteroscedasticity (GARCH). GARCH is widely used for volatility modeling of financial time-series such as exchange rates and stock returns. GARCH models take into account the heavy-tailed distributions and volatility clustering that characterize financial time-series. Spectral analysis shows that speech signals in the STFT domain are also characterized by heavy-tailed distributions and volatility clustering. We demonstrate the application of GARCH modeling to speech enhancement, and show its advantage over the conventional decision-directed method.
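
As a rough illustration of the two properties named in the abstract, the sketch below simulates a generic real-valued GARCH(1,1) process with Gaussian innovations and measures its kurtosis (heavy tails) and the lag-1 autocorrelation of its squared samples (volatility clustering). It is not the chapter's speech model, which operates on STFT coefficients. The function names and the parameter values `omega`, `alpha`, and `beta` are assumptions chosen only for illustration.

```python
import math
import random


def simulate_garch11(n, omega=0.1, alpha=0.2, beta=0.7, seed=0):
    """Simulate n samples of a real-valued GARCH(1,1) process.

    Conditional variance recursion (illustrative parameters, not from the chapter):
        var_t = omega + alpha * x_{t-1}^2 + beta * var_{t-1}
    """
    rng = random.Random(seed)
    # Start from the unconditional variance, valid when alpha + beta < 1.
    var = omega / (1.0 - alpha - beta)
    samples = []
    for _ in range(n):
        # Draw the sample from a zero-mean Gaussian with the current conditional variance.
        x = math.sqrt(var) * rng.gauss(0.0, 1.0)
        samples.append(x)
        # Update the conditional variance for the next step.
        var = omega + alpha * x * x + beta * var
    return samples


def kurtosis(xs):
    """Sample kurtosis m4 / m2^2; a Gaussian gives about 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2)


def lag1_autocorr(xs):
    """Sample autocorrelation at lag 1."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[t] - mean) * (xs[t + 1] - mean) for t in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    x = simulate_garch11(20000)
    squared = [v * v for v in x]
    print("kurtosis:", kurtosis(x))
    print("lag-1 autocorrelation of squares:", lag1_autocorr(squared))
```

A kurtosis above 3 indicates tails heavier than Gaussian, and a positive autocorrelation of the squared samples indicates that large magnitudes tend to follow large magnitudes. Setting `alpha=0` and `beta=0` reduces the process to independent Gaussian samples, which gives a baseline for comparison.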

Author information

Authors and Affiliations

Technion — Israel Institute of Technology, Haifa, 32000, Israel
Israel Cohen

About this chapter

Cite this chapter

Cohen, I. (2005). From Volatility Modeling of Financial Time-Series to Stochastic Modeling and Enhancement of Speech Signals. In: Speech Enhancement. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27489-8_5

DOI: https://doi.org/10.1007/3-540-27489-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24039-6
Online ISBN: 978-3-540-27489-6
eBook Packages: Engineering, Engineering (R0)
