Abstract
For sound source localization with phased microphone arrays, an algorithm that combines high computational efficiency with high precision remains a persistent goal. In this paper, a convolutional neural network (CNN), a kind of deep learning model, is applied as a new algorithm in a preliminary study. The only input to the CNN is the cross-spectral matrix, and its output is the source distribution. In terms of computing speed, a trained CNN is as fast as conventional beamforming and significantly faster than DAMAS, the best-known deconvolution algorithm. In terms of accuracy, at high frequency the CNN can reconstruct the source locations with up to 100% test accuracy, although sidelobes may appear in some situations. In addition, the CNN has a spatial resolution nearly as good as that of DAMAS and better than that of conventional beamforming. The CNN test accuracy decreases as frequency decreases; however, in most incorrect samples, the CNN results are not far from the correct results. These encouraging results indicate that a CNN can find the source distribution directly from the cross-spectral matrix without being given the propagation function or the microphone positions in advance, and thus the CNN deserves further exploration as a new algorithm.
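To make the input concrete, the following is a minimal sketch of how a cross-spectral matrix could be estimated from frequency-domain microphone snapshots and arranged as a real-valued array for a CNN. The function names, the array sizes, and the choice to stack real and imaginary parts as two channels are illustrative assumptions; the abstract does not state how the matrix is presented to the network.

```python
import numpy as np


def cross_spectral_matrix(snapshots):
    """Estimate the cross-spectral matrix at one frequency.

    snapshots: complex array of shape (n_snapshots, n_mics); each row holds
    the microphone spectra of one FFT block at the frequency of interest.
    """
    snapshots = np.asarray(snapshots, dtype=complex)
    n_snapshots = snapshots.shape[0]
    # C[i, j] = average over snapshots of p_i * conj(p_j)
    return snapshots.T @ snapshots.conj() / n_snapshots


def csm_to_cnn_input(csm):
    """Stack real and imaginary parts as two channels (an assumed layout)."""
    return np.stack([csm.real, csm.imag], axis=-1)


# Synthetic data standing in for measured microphone spectra (hypothetical sizes).
rng = np.random.default_rng(0)
n_mics, n_snapshots = 8, 200
snapshots = (rng.standard_normal((n_snapshots, n_mics))
             + 1j * rng.standard_normal((n_snapshots, n_mics)))

csm = cross_spectral_matrix(snapshots)
cnn_input = csm_to_cnn_input(csm)
```

The resulting matrix is Hermitian, so its diagonal holds the real-valued auto-spectra of the individual microphones. No propagation model or microphone coordinates enter this step, in line with the abstract's point that the network receives the matrix alone.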
Cite this article
Ma, W., Liu, X. Phased microphone array for sound source localization with deep learning. AS 2, 71–81 (2019). https://doi.org/10.1007/s42401-019-00026-w
DOI: https://doi.org/10.1007/s42401-019-00026-w