Abstract
This paper presents a novel system that estimates the distance of a target speaker under reverberant conditions by using statistical properties of the speech signal. The system extracts statistical features from both the cepstral and the envelope coefficients of a speaker recorded at different distances. Different spectral or monaural features are also analysed at distinct distances in different room environments, and their distance-dependent statistical properties are used in the feature extraction process. The resulting set of statistical parameters is used to train a Gaussian mixture model (GMM) pattern recognizer with the expectation-maximization (EM) algorithm for classification. The results indicate that system performance depends strongly on the reverberation time and on the robustness of the monaural features. At a signal-to-noise ratio of 0 dB (babble noise) and a reverberation time of 0.48 s, the proposed system shows a significant improvement over existing methods.
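The feature-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics are computed, so the choice of mean, standard deviation, skewness, and excess kurtosis here is an assumption, as are the function names `moment_features` and `utterance_vector`. The input is assumed to be a list of per-frame coefficient vectors (cepstral or envelope) that have already been computed elsewhere.

```python
import math
from statistics import fmean


def moment_features(track):
    """Summarise one coefficient track (one value per frame).

    Returns (mean, std, skewness, excess kurtosis). This set of
    statistics is an illustrative assumption, not the paper's list.
    """
    if len(track) < 2:
        raise ValueError("need at least two frames")
    mu = fmean(track)
    centered = [x - mu for x in track]
    var = fmean(c * c for c in centered)  # population variance
    std = math.sqrt(var)
    if std == 0.0:
        # A constant track has no shape; report zeros for the
        # higher-order statistics instead of dividing by zero.
        return (mu, 0.0, 0.0, 0.0)
    skew = fmean(c ** 3 for c in centered) / std ** 3
    kurt = fmean(c ** 4 for c in centered) / std ** 4 - 3.0
    return (mu, std, skew, kurt)


def utterance_vector(frames):
    """Flatten per-coefficient statistics into one feature vector.

    frames: list of frames, each a list of coefficients of equal length.
    """
    vector = []
    for track in zip(*frames):  # transpose: one track per coefficient
        vector.extend(moment_features(list(track)))
    return vector
```

Under these assumptions, each utterance yields one fixed-length vector regardless of its duration, which is the kind of input a per-distance GMM trained with EM could then model.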
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Venkatesan, R., Ganesh, A.B. (2019). Estimation of Distance of a Target Speech Source by Involving Monaural Features and Statistical Properties. In: Satapathy, S., Bhateja, V., Das, S. (eds) Smart Intelligent Computing and Applications. Smart Innovation, Systems and Technologies, vol 104. Springer, Singapore. https://doi.org/10.1007/978-981-13-1921-1_20
DOI: https://doi.org/10.1007/978-981-13-1921-1_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1920-4
Online ISBN: 978-981-13-1921-1
eBook Packages: Engineering, Engineering (R0)