International Journal of Speech Technology, Volume 16, Issue 2, pp 171–179

The optimized wavelet filters for speech compression

  • A. Kumar
  • G. K. Singh
  • G. Rajesh
  • K. Ranjeet

Abstract

In this paper, optimized wavelet filters for speech compression are proposed. The filter coefficients are derived with different window techniques, such as the Kaiser and Blackman windows, via simple linear optimization. When the developed filters are used for speech compression, they give a better compression ratio and also yield good fidelity parameters compared to other wavelet filters. A comparative study of existing wavelet filters and the proposed wavelet filters is made in terms of compression ratio (CR), signal-to-noise ratio (SNR), peak signal-to-noise ratio (PSNR), and normalized root-mean-square error (NRMSE) at different thresholding levels. The simulation results included in this paper show the increased efficacy and improved performance of the proposed filters in speech signal processing.
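The evaluation loop described in the abstract (transform, threshold, reconstruct, measure) can be illustrated with a minimal sketch. This is not the authors' method: it assumes a one-level Haar wavelet in place of the proposed optimized filters, uses the fraction of nonzero coefficients as a stand-in for the Huffman-coded compression ratio, and uses common textbook definitions of SNR, PSNR, and NRMSE that may differ from those in the paper. All function names are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_dwt(signal):
    """One-level orthonormal Haar DWT of an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude does not exceed thr."""
    return [c if abs(c) > thr else 0.0 for c in coeffs]


def compress(signal, rel_thr):
    """Threshold the detail band at rel_thr times its largest magnitude.

    Returns the reconstruction and a CR proxy (samples / nonzero coefficients).
    Assumption: the paper's CR involves Huffman encoding, which is omitted here.
    """
    approx, detail = haar_dwt(signal)
    thr = rel_thr * max(abs(d) for d in detail)
    kept = hard_threshold(detail, thr)
    recon = haar_idwt(approx, kept)
    nonzero = sum(1 for c in approx + kept if c != 0.0)
    return recon, len(signal) / max(nonzero, 1)


def metrics(original, recon):
    """SNR and PSNR in dB, plus NRMSE, under assumed common definitions."""
    n = len(original)
    err = sum((x - y) ** 2 for x, y in zip(original, recon))
    energy = sum(x * x for x in original)
    mean = sum(original) / n
    spread = sum((x - mean) ** 2 for x in original)
    peak = max(abs(x) for x in original)
    if err == 0.0:
        return {"snr": math.inf, "psnr": math.inf, "nrmse": 0.0}
    return {
        "snr": 10.0 * math.log10(energy / err),
        "psnr": 10.0 * math.log10(peak * peak * n / err),
        "nrmse": math.sqrt(err / spread),
    }


# Synthetic two-tone test signal (a stand-in for a speech frame).
x = [math.sin(2 * math.pi * 5 * k / 256) + 0.3 * math.sin(2 * math.pi * 40 * k / 256)
     for k in range(256)]
for rel_thr in (0.1, 0.5, 0.9):
    recon, cr = compress(x, rel_thr)
    print(rel_thr, round(cr, 3), metrics(x, recon))
```

Raising the threshold zeroes more detail coefficients, so the CR proxy rises while SNR and PSNR fall and NRMSE grows. This is the trade-off the comparative study measures at different thresholding levels.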

Keywords

Optimized wavelet, Speech compression, Huffman encoding, Discrete wavelet transform (DWT)

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Indian Institute of Information Technology Design and Manufacturing, Jabalpur, India
  2. Indian Institute of Technology, Roorkee, India
