An Adaptive Non Reference Anchor Array Framework for Distant Speech Recognition

  • Conference paper
Advances in Multimedia Information Processing – PCM 2012 (PCM 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7674)

Abstract

Distant speech recognition over microphone arrays is challenging, especially in multi-source environments. In this paper, a non reference anchor array (NRA) framework for distant speech recognition is proposed. In addition to the primary array that captures the speech source of interest, the NRA framework uses a non reference anchor array to capture the interfering speech sources. The framework uses a linearly constrained minimum variance (LCMV) beamformer such that the signal coming from the look direction is preserved while rejecting correlated interferences coming from the same direction as the source of interest. The proposed method is evaluated in experiments on clean speech acquisition from distant microphones and on distant speech recognition using the TIMIT and MONC databases. The experimental results indicate a reasonable improvement over correlation, subspace, and standard minimum variance beamforming methods.
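To make the beamforming step in the abstract concrete, the following is a minimal sketch of the textbook narrowband LCMV solution, w = R^{-1} C (C^H R^{-1} C)^{-1} f, which minimises the output power w^H R w subject to the linear constraints C^H w = f. It is not the authors' NRA method: the uniform linear array geometry, the 1 kHz frequency, the synthetic covariance matrix, the diagonal loading factor, and the assumption that the interferer direction is already known (standing in for whatever the anchor array would supply) are all illustrative choices that do not come from the paper.

```python
import numpy as np

def steering_vector(num_mics, spacing_m, angle_deg, freq_hz, speed=343.0):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    delays = np.arange(num_mics) * spacing_m * np.sin(np.deg2rad(angle_deg)) / speed
    return np.exp(-2j * np.pi * freq_hz * delays)

def lcmv_weights(R, C, f, loading=1e-6):
    """Textbook LCMV weights: minimise w^H R w subject to C^H w = f."""
    M = R.shape[0]
    # Small diagonal loading keeps the solve well conditioned (illustrative value).
    R_loaded = R + loading * np.trace(R).real / M * np.eye(M)
    Rinv_C = np.linalg.solve(R_loaded, C)
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, f)

# Hypothetical setup: 8 microphones, 4 cm spacing, one frequency bin at 1 kHz.
num_mics, spacing_m, freq_hz = 8, 0.04, 1000.0
a_look = steering_vector(num_mics, spacing_m, 0.0, freq_hz)   # source of interest
a_intf = steering_vector(num_mics, spacing_m, 40.0, freq_hz)  # assumed interferer

# Synthetic spatial covariance: look-direction source, stronger interferer, white noise.
R = (np.outer(a_look, a_look.conj())
     + 4.0 * np.outer(a_intf, a_intf.conj())
     + 0.1 * np.eye(num_mics))

# Constraints: unit gain in the look direction, a null on the interferer.
C = np.column_stack([a_look, a_intf])
f = np.array([1.0, 0.0], dtype=complex)
w = lcmv_weights(R, C, f)

gain_look = abs(w.conj() @ a_look)  # magnitude response toward the look direction
gain_intf = abs(w.conj() @ a_intf)  # magnitude response toward the interferer
```

In this sketch the look-direction response stays at unity while the constrained interferer direction is nulled; with a single constraint column (C = a_look, f = 1) the same function reduces to the standard minimum variance beamformer that the abstract lists as a baseline.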


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shukla, A., Nathwani, K., Hegde, R.M. (2012). An Adaptive Non Reference Anchor Array Framework for Distant Speech Recognition. In: Lin, W., et al. Advances in Multimedia Information Processing – PCM 2012. PCM 2012. Lecture Notes in Computer Science, vol 7674. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34778-8_20

  • DOI: https://doi.org/10.1007/978-3-642-34778-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34777-1

  • Online ISBN: 978-3-642-34778-8

  • eBook Packages: Computer Science, Computer Science (R0)
