
New Encoding Algorithm for Distributed Speech Recognition Based on DTFS Transform

  • Azzedine Touazi
  • Mohamed Debyeche
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7340)

Abstract

This paper presents a new algorithm for efficient compression of the front-end feature parameters used in distributed speech recognition (DSR) systems. In the proposed method, the source encoder is based mainly on the discrete-time Fourier series (DTFS), using interpolation from Fourier coefficients combined with conventional vector quantization. The system achieves a compression bit rate as low as 4 kbps. The experiments were carried out on the TIDigits Aurora2 database [1]. The simulation results show good recognition performance, with no dramatic change compared with the ETSI STQ-AURORA standard front-end feature compression algorithm, which quantizes features at 4.4 kbps [2].
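As a rough illustration of the DTFS interpolation idea mentioned above, the following minimal sketch computes the DTFS coefficients of one real-valued feature trajectory, keeps only the lowest harmonics, and rebuilds the trajectory from them. It is an assumption-laden toy, not the authors' encoder: the function names, the block length, and the number of retained harmonics are hypothetical, and the vector quantization stage is omitted entirely.

```python
import cmath
import math


def dtfs_coefficients(x):
    """DTFS analysis: c[k] = (1/N) * sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples)) / n_samples
        for k in range(n_samples)
    ]


def dtfs_reconstruct(coeffs_low, n_samples):
    """DTFS synthesis from c[0..K] only, assuming the original sequence was real.

    For a real sequence c[N-k] is the conjugate of c[k], so each retained
    harmonic contributes twice its real part. The Nyquist term (k = N/2,
    even N only) has no conjugate partner and is counted once.
    """
    out = []
    for n in range(n_samples):
        value = coeffs_low[0].real
        for k in range(1, len(coeffs_low)):
            weight = 1.0 if 2 * k == n_samples else 2.0
            term = coeffs_low[k] * cmath.exp(2j * math.pi * k * n / n_samples)
            value += weight * term.real
        out.append(value)
    return out


# Hypothetical example: one feature tracked over a block of 8 frames.
block = [1.0 + 0.5 * math.cos(2 * math.pi * n / 8) for n in range(8)]
coeffs = dtfs_coefficients(block)

# Keep only c[0] and c[1]; in a real coder these would then be quantized.
approx = dtfs_reconstruct(coeffs[:2], len(block))
max_error = max(abs(a - b) for a, b in zip(block, approx))
```

In this toy case the trajectory contains a single harmonic, so two coefficients are enough to recover it. Real feature trajectories are not band-limited in this way, so discarding harmonics trades reconstruction accuracy for bit rate.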

Keywords

Distributed speech recognition; Vector quantization; Discrete time Fourier series; Aurora2 database

References

  1. Hirsch, H.G., Pearce, D.: The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions. In: 6th International Conference on Spoken Language Processing, ICSLP, China (October 2000)
  2. ETSI Standard Document: Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Front-end Feature Extraction Algorithm; Compression Algorithms, ETSI ES 201 108 V 1.1.3 (September 2003)
  3. Kiss, I., Kapanen, P.: Robust feature vector compression algorithm for distributed speech recognition. In: Eurospeech (1999)
  4. Zhu, Q., Alwan, A.: An efficient and scalable 2D-DCT based feature coding scheme for remote speech recognition. In: Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (2001)
  5. Garcia, J.E., Ortega, A., Miguel, A., Lleida, E.: Predictive vector quantization using the M-algorithm for distributed speech recognition. In: VI Jornadas en Tecnología del Habla and II Iberian SLTech Workshop, FALA (2010)
  6. So, S., Paliwal, K.K.: Quantization of speech features, source coding. In: Automatic Speech Recognition on Mobile Devices and over Communication Networks Advances (2008), http://www.springerlink.com/content/u1t465157615k202/
  7. Smith, J.O.: Mathematics of the discrete Fourier transform (DFT) with audio applications, Center for Computer Research in Music and Acoustics (CCRMA), Department of Music, Stanford University, https://ccrma.stanford.edu/~jos/mdft/ (viewed May 2011)
  8. Choy, E.L.T.: Waveform Interpolation Speech Coder at 4 kb/s, Department of Electrical and Computer Engineering, McGill University, Montreal, Canada (August 1998)
  9. Elfataoui, M., Mirchandani, G.: A Frequency-Domain Method for Generation of Discrete-Time Analytic Signals. IEEE Trans. on Signal Processing 54(9) (September 2006)
  10. Zhang, Y.: Acoustic model and pronunciation adaptation in automatic speech recognition. University of Miami (2006)
  11. Linde, Y., Buzo, A., Gray, R.M.: An algorithm for vector quantizer design. IEEE Trans. on Communications 28, 84–95 (1980)
  12. Pearlman, W.A., Gray, R.M.: Source coding of the discrete Fourier transform. IEEE Trans. on Information Theory IT-24(6), 683–692 (1978)
  13. Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Liu, X., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., Woodland, P.: The HTK Book, HTK Version 3.4, Cambridge University Engineering Department (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Azzedine Touazi (1)
  • Mohamed Debyeche (1)
  1. University of Sciences and Technology Houari Boumediene, Bab Ezzouar, Algeria
