Phased microphone array for sound source localization with deep learning

  • Original Paper
  • Aerospace Systems

Abstract

For sound source localization with a phased microphone array, an algorithm that combines high computational efficiency with high precision has long been sought. In this paper, a convolutional neural network (CNN), a kind of deep learning model, is applied as a new algorithm in a preliminary study. The only input to the CNN is the cross-spectral matrix, and its output is the source distribution. In terms of computing speed, a trained CNN is as fast as conventional beamforming and significantly faster than DAMAS, the best-known deconvolution algorithm. In terms of measurement accuracy, at high frequency the CNN reconstructs the source locations with up to 100% test accuracy, although sidelobes may appear in some situations. In addition, the CNN has a spatial resolution nearly as good as that of DAMAS and better than that of conventional beamforming. The test accuracy of the CNN decreases as frequency decreases; however, in most incorrect samples the CNN results are not far from the correct results. This exciting result shows that a CNN can find the source distribution directly from the cross-spectral matrix, without being given the propagation function or the microphone positions in advance, and thus the CNN deserves further exploration as a new algorithm.
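To make the quantities named in the abstract concrete, the following is a minimal NumPy sketch of a cross-spectral matrix, a conventional beamforming map computed from it, and one possible way to present the matrix to a CNN. It is not the authors' implementation. The array geometry, scan grid, frequency, speed of sound, free-field monopole propagation model, unit-norm steering weights, real/imaginary channel stacking and one-hot target are all illustrative assumptions, and every function and variable name is hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air


def steering_vectors(mics, grid, freq):
    """Assumed free-field monopole propagation from grid points to microphones."""
    # r[g, m] is the distance from grid point g to microphone m
    r = np.linalg.norm(grid[:, None, :] - mics[None, :, :], axis=2)
    k = 2.0 * np.pi * freq / SPEED_OF_SOUND  # wavenumber
    return np.exp(-1j * k * r) / r


def cross_spectral_matrix(snapshots):
    """Average of p p^H over snapshots; snapshots has shape (n_snapshots, n_mics)."""
    return snapshots.T @ snapshots.conj() / snapshots.shape[0]


def conventional_beamforming(csm, steer):
    """Evaluate w^H C w at every grid point, using unit-norm steering weights."""
    w = steer / np.linalg.norm(steer, axis=1, keepdims=True)
    return np.real(np.einsum("gm,mn,gn->g", w.conj(), csm, w))


# Hypothetical setup: 16 randomly placed microphones in the plane z = 0
rng = np.random.default_rng(0)
n_mics = 16
mics = np.column_stack([rng.uniform(-0.5, 0.5, size=(n_mics, 2)), np.zeros(n_mics)])

# Hypothetical 11 x 11 scan grid in the plane z = 1 m
xs = np.linspace(-0.5, 0.5, 11)
gx, gy = np.meshgrid(xs, xs)
grid = np.column_stack([gx.ravel(), gy.ravel(), np.ones(gx.size)])

freq = 8000.0  # Hz, an arbitrary "high frequency" choice
steer = steering_vectors(mics, grid, freq)

# One noise-free synthetic source on a grid point, with random complex amplitudes
src_idx = 37
amps = rng.standard_normal(200) + 1j * rng.standard_normal(200)
snapshots = amps[:, None] * steer[src_idx][None, :]

csm = cross_spectral_matrix(snapshots)
bf_map = conventional_beamforming(csm, steer)

# Assumed CNN encoding: real and imaginary parts as two channels,
# with a one-hot source distribution over the scan grid as the target
cnn_input = np.stack([csm.real, csm.imag], axis=-1)
cnn_target = np.zeros(len(grid))
cnn_target[src_idx] = 1.0

print("beamforming peak index:", int(np.argmax(bf_map)), "true index:", src_idx)
print("CNN input shape:", cnn_input.shape, "target shape:", cnn_target.shape)
```

Note that the beamformer needs `steer`, which encodes both the propagation model and the microphone positions, whereas the assumed CNN input is built from `csm` alone. That difference is the point the abstract makes about the CNN not requiring this information in advance.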

Author information

Corresponding author

Correspondence to Wei Ma.

About this article

Cite this article

Ma, W., Liu, X. Phased microphone array for sound source localization with deep learning. AS 2, 71–81 (2019). https://doi.org/10.1007/s42401-019-00026-w

  • DOI: https://doi.org/10.1007/s42401-019-00026-w
