Speech signal reconstruction and noise removal using convolutional denoising autoencoders with deep neural networks
Real-world datasets come in many formats (audio, music, images, ...). In our case, the data consist of signals from various sources mixed together. These mixtures represent noisy audio from which meaningful features must be extracted, compressed, and analyzed so that the data can be presented in a standard form. The resulting data are then used for the Blind Source Separation (BSS) task. In this paper, we work with two types of autoencoders: convolutional and denoising. The novelty of our work lies in reconstructing the audio signal at the output of the neural network after extracting the meaningful features that carry the pure, informative content. Simulation results show strong performance, achieving 87% accuracy for the reconstructed signals, which will be integrated into an automated system for real-world applications.
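The abstract describes a convolutional denoising autoencoder that compresses noisy audio frames into a feature code and reconstructs a clean waveform at the output. The following is a minimal illustrative sketch of that architecture in Keras (named in the keywords); the layer sizes, filter widths, and frame length are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_conv_denoising_autoencoder(n_samples=1024):
    # Input: one fixed-length mono audio frame (hypothetical length).
    inp = layers.Input(shape=(n_samples, 1))
    # Encoder: strided pooling over Conv1D features compresses the
    # noisy frame into a compact feature code.
    x = layers.Conv1D(16, 9, activation="relu", padding="same")(inp)
    x = layers.MaxPooling1D(4, padding="same")(x)
    x = layers.Conv1D(8, 9, activation="relu", padding="same")(x)
    encoded = layers.MaxPooling1D(4, padding="same")(x)
    # Decoder: upsampling mirrors the encoder to reconstruct the
    # clean waveform from the feature code.
    x = layers.Conv1D(8, 9, activation="relu", padding="same")(encoded)
    x = layers.UpSampling1D(4)(x)
    x = layers.Conv1D(16, 9, activation="relu", padding="same")(x)
    x = layers.UpSampling1D(4)(x)
    out = layers.Conv1D(1, 9, activation="tanh", padding="same")(x)
    model = Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

# Denoising training pairs: noisy frames as input, clean frames as
# target (synthetic placeholder data here).
clean = np.random.uniform(-1, 1, (8, 1024, 1)).astype("float32")
noisy = (clean + 0.1 * np.random.randn(8, 1024, 1)).astype("float32")
model = build_conv_denoising_autoencoder()
model.fit(noisy, clean, epochs=1, batch_size=4, verbose=0)
denoised = model.predict(noisy, verbose=0)
```

Training on (noisy, clean) pairs rather than (clean, clean) is what makes this a *denoising* autoencoder: the network is forced to learn features that discard the noise while preserving the signal.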
Keywords: Denoising autoencoder · Convolutional autoencoder · BSS · Keras · Deep learning · Neural network
Drs. Reyes and Ventura wish to acknowledge the financial support of the Spanish Ministry of Economy and Competitiveness and the Regional Development Fund (Project TIN2017-83445-P).