Abstract
This paper studies the estimation of speaker age from vocal features. First, a large number of utterances from various speakers are collected for analysis. Second, vocal features, including prosodic and spectral features, are extracted and compared across age groups, and spectral energy ratios are proposed to classify speakers from different age groups. Third, an artificial neural network is used to model the age features: an age model is learned for segment-level classifiers, and the resulting age-distribution probabilities are used to generate effective features. Finally, age regression is implemented with a deep neural network. Experimental results show that the proposed model is effective and that the age features are robust to speaker variability.
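The abstract does not define the spectral energy ratios, so the following is only a minimal sketch of one plausible form: the share of a frame's spectral energy that lies below a cutoff frequency. The function names, the single cutoff `split_hz`, and the naive DFT are all assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum of a real-valued frame, via a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def band_energy_ratio(frame, sample_rate, split_hz):
    """Energy below split_hz divided by total energy.

    Hypothetical definition: the paper's actual bands are not given
    in the abstract.
    """
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total == 0.0:
        return 0.0  # silent frame: define the ratio as 0
    n = len(frame)
    low = sum(p for k, p in enumerate(spectrum)
              if k * sample_rate / n < split_hz)
    return low / total


if __name__ == "__main__":
    # Synthetic 100 Hz tone sampled at 1 kHz (illustrative input only).
    tone = [math.sin(2 * math.pi * 100 * t / 1000) for t in range(100)]
    print(band_energy_ratio(tone, sample_rate=1000, split_hz=250))
```

In practice such a ratio would be computed per frame with an FFT library and several bands, and the resulting values would serve as inputs to the segment-level classifiers.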
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Mei, G., Min, X. (2018). Automatic Age Estimation Based on Vocal Cues and Deep Neural Network. In: Qiao, F., Patnaik, S., Wang, J. (eds) Recent Developments in Mechatronics and Intelligent Robotics. ICMIR 2017. Advances in Intelligent Systems and Computing, vol 690. Springer, Cham. https://doi.org/10.1007/978-3-319-65978-7_32
DOI: https://doi.org/10.1007/978-3-319-65978-7_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65977-0
Online ISBN: 978-3-319-65978-7
eBook Packages: Engineering, Engineering (R0)