International Journal of Speech Technology, Volume 19, Issue 3, pp 433–448

Robust analysis for improvement of vowel onset point detection under noisy conditions

  • Partha Saha
  • Ujwala Baruah
  • R. H. Laskar
  • Songhita Mishra
  • Suman Paul Choudhury
  • Tushar Kanti Das

Abstract

A vowel onset point (VOP) is the instant at which the vowel region begins in a speech signal. VOPs serve as anchor points in the design of various speech-based systems, and several algorithms exist in the literature for locating vowels in continuous spoken utterances. The algorithm based on combined evidence derived from source excitation, spectral peaks and modulation spectrum is used as the baseline system for the present study. The baseline system performs satisfactorily on clean data. Under noisy conditions, however, its performance can be improved by additional pre-processing of the raw speech data and post-processing of the detected VOPs. In this paper we propose using speech enhancement techniques as a pre-processing module to remove noise from the speech data under different noisy conditions. The pre-processed speech data is then passed through the baseline system to detect the VOPs. The output of the baseline system was observed to contain several spurious VOPs. We therefore propose a post-processing module, based on the average signal-to-noise ratio and on information derived from the glottal closure instant, to remove them. The experiments were carried out on clean data, data with artificially injected noise, and data collected from practical noisy environments. The results suggest that the proposed system, with its pre-processing and post-processing modules, is robust and shows an improvement of 28–35 % over the existing baseline system by removing spurious VOPs under different noisy conditions.
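
The post-processing idea of discarding candidate VOPs on the basis of average signal-to-noise ratio can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the window length, the 6 dB threshold and the way the noise power is estimated are all assumptions made for illustration, and the glottal closure instant evidence described in the abstract is not modelled here.

```python
import math


def mean_power(samples):
    """Average power (mean of squared samples); 0.0 for an empty sequence."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def filter_vops_by_snr(signal, vops, noise_power, window=4, min_snr_db=6.0):
    """Keep candidate VOPs whose following window has enough average SNR.

    `window` and `min_snr_db` are hypothetical values chosen for this toy
    example, not parameters reported in the paper.
    """
    if noise_power <= 0.0:
        raise ValueError("noise_power must be positive")
    kept = []
    for vop in vops:
        # Average power of the samples that immediately follow the candidate.
        power = mean_power(signal[vop:vop + window])
        if power <= 0.0:
            continue  # a silent segment cannot be a vowel onset
        snr_db = 10.0 * math.log10(power / noise_power)
        if snr_db >= min_snr_db:
            kept.append(vop)
    return kept


# Toy signal: low-level noise, then a louder vowel-like burst from index 8.
signal = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1,
          1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
noise_power = mean_power(signal[:8])  # assumed: leading samples are noise only
candidates = [2, 8]                   # index 2 stands in for a spurious VOP
print(filter_vops_by_snr(signal, candidates, noise_power))
```

In this toy case the candidate at index 2 lies in the noise-only region, so its local SNR is close to 0 dB and it is dropped, while the candidate at index 8 is kept. A real system would work on framed speech at a known sampling rate and would need a proper noise estimate.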

Keywords

Vowel onset point (VOP); Excitation source; Spectral peak; Modulation spectrum; Glottal closure instant (GCI); Minimum mean square error (MMSE)

Notes

Acknowledgments

This work is supported by the project titled “Development of Speech based Multi-Level Person Authentication System”, funded by the Department of Information Technology (DIT), New Delhi, India.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Partha Saha (1)
  • Ujwala Baruah (2)
  • R. H. Laskar (1)
  • Songhita Mishra (1)
  • Suman Paul Choudhury (1)
  • Tushar Kanti Das (1)

  1. Department of ECE, NIT Silchar, India
  2. Department of CSE, NIT Silchar, India
