Speech Endpoint Detection Based on Improvement Feature and S-Transform

  • Lu Xunbo
  • Zhu Chunli
  • Li Xin
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 924)


In low-SNR, non-stationary noise environments, the performance of traditional feature-based endpoint detection methods drops sharply. This paper proposes an improved speech endpoint detection algorithm based on the S-transform (ST). The ST combines the advantages of the short-time Fourier transform (STFT) and the wavelet transform (WT), and it allows more robust MFCC features to be extracted. The ST is combined with spectral subtraction: the speech is transformed into the joint time-frequency domain to obtain a cleaner speech signal. A two-parameter double-threshold method with a dynamic threshold-updating mechanism is then used to detect the endpoints of the noisy speech. In Matlab simulations, the improved algorithm is compared with two other algorithms. The experimental results show that it detects endpoints more accurately, with advantages in both detection rate and error rate.
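The double-threshold step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the features, the threshold values, or the update rule, so everything below is an assumption made for illustration: the sketch uses a single per-frame feature (for example, short-time energy) rather than the paper's two parameters, and it derives both thresholds once from a few leading frames assumed to contain only noise instead of updating them dynamically. The function names and scale factors are hypothetical.

```python
from statistics import mean


def estimate_thresholds(feature, noise_frames=5, high_scale=4.0, low_scale=2.0):
    """Derive (high, low) thresholds from leading frames assumed to be noise-only.

    The scale factors are illustrative assumptions, not values from the paper.
    """
    noise_floor = mean(feature[:noise_frames])
    return high_scale * noise_floor, low_scale * noise_floor


def double_threshold_endpoints(feature, high, low):
    """Return inclusive (start, end) frame-index pairs of detected segments.

    A segment is confirmed when the feature exceeds `high`; its boundaries are
    then extended in both directions while the feature stays above `low`.
    """
    segments = []
    i, n = 0, len(feature)
    while i < n:
        if feature[i] > high:
            # Walk backwards to the first frame still above the low threshold.
            start = i
            while start > 0 and feature[start - 1] > low:
                start -= 1
            # Walk forwards to the last frame still above the low threshold.
            end = i
            while end + 1 < n and feature[end + 1] > low:
                end += 1
            segments.append((start, end))
            i = end + 1
        else:
            i += 1
    return segments


# Toy per-frame energies: five noise frames, one speech burst, one weak blip.
energy = [1.0, 1.0, 1.0, 1.0, 1.0, 2.5, 5.0, 6.0, 3.0, 1.0, 1.0, 2.5, 1.0]
high, low = estimate_thresholds(energy)
print(double_threshold_endpoints(energy, high, low))  # [(5, 8)]
```

In this toy input the burst at frames 5 to 8 is detected because it crosses the high threshold, while the blip at frame 11 exceeds only the low threshold and is rejected. That rejection of weak, isolated excursions is the usual motivation for using two thresholds rather than one.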


Endpoint detection, S-transform, Spectral subtraction, MFCC, Uniform sub-band variance



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. School of Mechatronic Engineering and Automation, Shanghai University, Shanghai, China
