QuadCOINS-Network: A Deep Learning Approach to Sound Source Localization

  • Conference paper
Complex, Intelligent and Software Intensive Systems (CISIS 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1194)

Abstract

With the advent of new generations of personal assistants integrated with voice-controlled devices (e.g., Apple Siri, Google Assistant, Amazon Alexa), the demand for efficient mechanisms to detect, localize and recognize the source of sound events is rising. As such, microphone-array-based devices using improved algorithms are of interest to the research community. In this context, the recent success of deep learning algorithms in various domains (e.g., computer vision, speech recognition) opens the door to their application to the Sound Event Localization and Detection (SELD) problem. Here, the challenge lies in effectively combining deep neural networks (DNNs) with embedded devices driving specific configurations of microphone arrays. In this work, we propose the QuadCOIN system: an embedded system that executes the algorithms needed to detect and localize a sound event in the surrounding space, and that exploits a specific arrangement of microphones to improve the precision of the estimated sound source position. Specifically, our system is composed of an embedded computing device coupled with four groups of microphones, each arranged as a small grid of four sensing elements (i.e., four microphone arrays). The embedded computing device collects the event localization estimates from the four groups of sensors and then determines the exact position of the sound source. To this end, each group of microphones runs a cutting-edge Convolutional Neural Network (CNN) that detects events of interest. The CNN has been trained using datasets generated through a framework developed in-house. As proof of the feasibility of the proposed system, we implemented it on low-cost hardware composed of a single-board computer (SBC) and four ST-BlueCOIN microphone arrays. Experiments carried out on the QuadCOIN system demonstrate its precision and accuracy in detecting sound events and localizing the corresponding sound sources.
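
The abstract does not specify how the embedded device combines the four per-array estimates, so the following is only an illustrative sketch under stated assumptions: each microphone array is taken to report a single azimuth (direction of arrival) in a shared 2D reference frame, the array positions are assumed known, and the fusion step is a plain least-squares intersection of the four bearing lines. The function name `locate_source`, the room geometry and the 2D simplification are hypothetical and are not taken from the paper.

```python
import math


def locate_source(array_positions, azimuths_rad):
    """Least-squares intersection of 2D bearing lines (illustrative assumption).

    array_positions: list of (x, y) positions of the microphone arrays.
    azimuths_rad: one azimuth per array, in radians, in the same frame.
    Returns the point minimizing the sum of squared perpendicular
    distances to all bearing lines.
    """
    a11 = a12 = a22 = 0.0
    b1 = b2 = 0.0
    for (px, py), theta in zip(array_positions, azimuths_rad):
        c, s = math.cos(theta), math.sin(theta)
        # Projector onto the normal of the bearing line: I - d d^T,
        # with d = (cos(theta), sin(theta)).
        m11, m12, m22 = s * s, -c * s, c * c
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearings are (nearly) parallel; position not observable")
    # Solve the 2x2 normal equations A x = b in closed form.
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return x, y


# Hypothetical layout: four arrays at the corners of a 4 m x 4 m room.
arrays = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
true_source = (1.0, 3.0)
# Noise-free azimuths toward the source, as an ideal estimator would report.
bearings = [math.atan2(true_source[1] - py, true_source[0] - px)
            for px, py in arrays]
estimate = locate_source(arrays, bearings)
```

With noise-free bearings the estimate coincides with the true source up to floating-point error; with noisy per-array estimates the same formula returns the point closest, in the least-squares sense, to all four bearing lines. Note that this model treats each bearing as an infinite line rather than a ray, so it does not check that the solution lies in front of each array.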

Notes

  1. https://github.com/sharathadavanne/seld-net

  2. https://www.st.com/en/evaluation-tools/steval-bcnkt01v1.html

Author information

Corresponding author

Correspondence to Simone Ciccia.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Ciccia, S., Scionti, A., Vitali, G., Terzo, O. (2021). QuadCOINS-Network: A Deep Learning Approach to Sound Source Localization. In: Barolli, L., Poniszewska-Maranda, A., Enokido, T. (eds) Complex, Intelligent and Software Intensive Systems. CISIS 2020. Advances in Intelligent Systems and Computing, vol 1194. Springer, Cham. https://doi.org/10.1007/978-3-030-50454-0_13
