Phased microphone array for sound source localization with deep learning

Ma, Wei; Liu, Xun

doi:10.1007/s42401-019-00026-w

Original Paper
Published: 14 May 2019

Aerospace Systems, Volume 2, pages 71–81 (2019)


Abstract

For sound source localization with phased microphone arrays, an algorithm that combines high computational efficiency with high precision remains a persistent pursuit. In this paper, a convolutional neural network (CNN), a kind of deep learning model, is applied in a preliminary study as a new algorithm. The only input to the CNN is the cross-spectral matrix, and its output is the source distribution. In terms of computing speed, a trained CNN is as fast as conventional beamforming and significantly faster than DAMAS, the best-known deconvolution algorithm. In terms of measurement accuracy, at high frequency the CNN can reconstruct the source locations with up to 100% test accuracy, although sidelobes may appear in some situations. In addition, the CNN has a spatial resolution nearly as good as that of DAMAS and better than that of conventional beamforming. The CNN test accuracy decreases as frequency decreases; however, in most incorrect samples, the CNN results are not far from the correct results. This encouraging result means that the CNN finds the source distribution directly from the cross-spectral matrix without being given the propagation function or the microphone positions in advance, and thus the CNN deserves to be further explored as a new algorithm.
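
As an illustration of the quantities named in the abstract, the following minimal Python sketch builds a cross-spectral matrix for one simulated monopole source and evaluates a conventional beamforming map from it. It is not the authors' implementation: the spiral array geometry, the 8 kHz frequency, the 9 × 9 scan grid, the noise-free single-source signal, the steering-vector normalization, and the two-channel (real, imaginary) encoding of the matrix are all assumptions made for this example, and every function name is hypothetical. The CNN itself is not shown, because the abstract does not describe its architecture.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air


def distance(a, b):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def steering_vector(mics, point, freq):
    """Free-field monopole propagation from one grid point to every microphone."""
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND  # acoustic wavenumber
    vec = []
    for mic in mics:
        r = distance(mic, point)
        vec.append(cmath.exp(-1j * k * r) / r)
    return vec


def cross_spectral_matrix(pressures):
    """C[i][j] = p_i * conj(p_j) for a single noise-free snapshot."""
    return [[pi * pj.conjugate() for pj in pressures] for pi in pressures]


def conventional_beamforming(csm, mics, grid, freq):
    """Evaluate b = w^H C w at each grid point, with w = g / ||g|| (one common choice)."""
    result = []
    for point in grid:
        g = steering_vector(mics, point, freq)
        norm_sq = sum(abs(x) ** 2 for x in g)
        acc = 0j
        for i, gi in enumerate(g):
            for j, gj in enumerate(g):
                acc += gi.conjugate() * csm[i][j] * gj
        result.append(acc.real / norm_sq)
    return result


def csm_to_real_channels(csm):
    """Split the complex matrix into real and imaginary planes (assumed CNN input layout)."""
    real = [[c.real for c in row] for row in csm]
    imag = [[c.imag for c in row] for row in csm]
    return [real, imag]


# Hypothetical 12-microphone spiral array in the plane z = 0.
mics = [
    (0.05 * (m + 1) * math.cos(2.4 * m), 0.05 * (m + 1) * math.sin(2.4 * m), 0.0)
    for m in range(12)
]

# Hypothetical 9 x 9 scan grid, 0.1 m spacing, 1 m in front of the array.
grid = [(-0.4 + 0.1 * ix, -0.4 + 0.1 * iy, 1.0) for iy in range(9) for ix in range(9)]

freq = 8000.0      # Hz; assumed "high frequency" case
source_index = 24  # grid point that holds the simulated source

pressures = steering_vector(mics, grid[source_index], freq)
csm = cross_spectral_matrix(pressures)
beam_map = conventional_beamforming(csm, mics, grid, freq)
peak_index = max(range(len(beam_map)), key=beam_map.__getitem__)
cnn_input = csm_to_real_channels(csm)

print("beamforming peak coincides with the source:", peak_index == source_index)
```

Note that the beamforming step needs the microphone positions and a propagation model to form the steering vectors, whereas the abstract reports that the trained CNN maps the cross-spectral matrix to the source distribution without being given either.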



Author information

Wei Ma and Xun Liu contributed equally to this work.

Authors and Affiliations

School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai, People’s Republic of China
Wei Ma
Shanghai KeyGo Technology Company Limited, Shanghai, People’s Republic of China
Xun Liu


Corresponding author

Correspondence to Wei Ma.


About this article

Cite this article

Ma, W., Liu, X. Phased microphone array for sound source localization with deep learning. AS 2, 71–81 (2019). https://doi.org/10.1007/s42401-019-00026-w


Received: 08 March 2019
Revised: 28 April 2019
Accepted: 03 May 2019
Published: 14 May 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s42401-019-00026-w
