Convolutional Recurrent Neural Networks and Acoustic Data Augmentation for Snore Detection

Vesperini, Fabio; Romeo, Luca; Principi, Emanuele; Monteriù, Andrea; Squartini, Stefano

doi:10.1007/978-981-13-8950-4_4


Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 151)


Abstract

In this paper, we propose an algorithm for snore sound detection based on convolutional recurrent neural networks (CRNN). The log Mel energy spectrum of the audio signal is extracted from overnight recordings and used as input to the CRNN, with the aim of detecting the precise onset and offset times of the sound events. The dataset used in the experiments is highly imbalanced toward the non-snore class. To address this, a data augmentation technique is introduced that generates new snore examples by simulating the target acoustic scenario. The application of CRNN together with acoustic data augmentation constitutes the main contribution of the work in the snore detection scenario. The performance of the algorithm has been assessed on the A3-Snore corpus, a dataset consisting of more than seven hours of recordings of two snorers and consistent environmental noise. Experimental results, expressed in terms of Average Precision (AP), show that the combination of CRNN and data augmentation in the raw data domain is effective, reaching an AP of up to 94.92% and improving on the results reported in the related literature.
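
To make the reported metric concrete, the following is a minimal sketch of how Average Precision can be computed from per-frame detector scores. It assumes the common step-wise definition (the sum over ranked frames of the recall gain multiplied by the precision at that rank); the abstract does not state which AP variant the authors used, and the function name, the 0/1 label encoding, and the toy scores are all hypothetical.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise AP: sum over ranked frames of (recall gain) * (precision).

    labels: 1 for a snore frame, 0 for a non-snore frame (assumed encoding).
    scores: detector output per frame; higher means more snore-like.
    Ties in scores are broken by input order (a simplification).
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("AP is undefined without positive frames")

    # Rank frames from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    ap = 0.0
    true_pos = 0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            # Each positive raises recall by 1/n_pos; weight by precision here.
            ap += (true_pos / rank) / n_pos
    return ap


# Toy, imbalanced example: 2 snore frames among 8 (made-up values).
labels = [0, 0, 1, 0, 1, 0, 0, 0]
scores = [0.10, 0.20, 0.90, 0.80, 0.70, 0.05, 0.30, 0.15]
print(f"AP = {average_precision(labels, scores):.4f}")  # AP = 0.8333
```

Because AP is driven by how the rare positive frames are ranked, it stays informative when the non-snore class dominates, which is the situation the abstract describes.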



Author information

Authors and Affiliations

Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche, 60131, Ancona, Italy
Fabio Vesperini, Luca Romeo, Emanuele Principi, Andrea Monteriù & Stefano Squartini


Corresponding author

Correspondence to Emanuele Principi.

Editor information

Editors and Affiliations

Department of Psychology, University of Campania Luigi Vanvitelli, Caserta, Italy
Anna Esposito
Tecnocampus, Mataró, Spain
Marcos Faundez-Zanuy
Department of Civil, Environment, Energy and Materials Engineering, Mediterranea University of Reggio Calabria, Reggio Calabria, Italy
Francesco Carlo Morabito
Dipartimento di Elettronica e Telecomunicazioni, Politecnico di Torino, Turin, Italy
Eros Pasero


About this chapter

Cite this chapter

Vesperini, F., Romeo, L., Principi, E., Monteriù, A., Squartini, S. (2020). Convolutional Recurrent Neural Networks and Acoustic Data Augmentation for Snore Detection. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Neural Approaches to Dynamics of Signal Exchanges. Smart Innovation, Systems and Technologies, vol 151. Springer, Singapore. https://doi.org/10.1007/978-981-13-8950-4_4


DOI: https://doi.org/10.1007/978-981-13-8950-4_4
Published: 19 September 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8949-8
Online ISBN: 978-981-13-8950-4
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
