
Joint \(L1-L2\) Regularisation for Blind Speech Deconvolution

  • Conference paper
Advances in Multimedia Information Processing – PCM 2017 (PCM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10735)


Abstract

The purpose of blind speech deconvolution is to recover both the original speech source and the room impulse response (RIR) from the observed reverberant speech, which can benefit speech intelligibility and speech perception. However, the problem is ill-posed and usually requires additional knowledge to solve; prior information, such as the sparsity of the signal or of the acoustic channel, is therefore often exploited. In this paper, we propose a joint \(L1-L2\) regularisation based blind speech deconvolution method for a single-input single-output (SISO) acoustic system with a high level of reverberation, where both the sparsity of the early part and the density of the late part of the RIR are considered, by imposing an L1-norm constraint on the early part and an L2-norm constraint on the late part, respectively. By employing an alternating optimisation strategy, both the source signal and the early part of the RIR can be well reconstructed, while the late part of the RIR is suppressed.
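To make the joint regularisation concrete, the abstract can be read as describing an optimisation problem of roughly the following form; the paper's exact cost function, weights and variable splitting are not reproduced here, so the symbols below (\(\lambda_1\), \(\lambda_2\), the early/late split point) are assumptions for illustration only:

\[
\min_{s,\,h}\ \tfrac{1}{2}\,\| y - h \ast s \|_2^2 \;+\; \lambda_1 \| h_{\mathrm{early}} \|_1 \;+\; \lambda_2 \| h_{\mathrm{late}} \|_2^2 ,
\]

where \(y\) is the observed reverberant speech, \(s\) the source, \(h\) the RIR, and \(h_{\mathrm{early}}\), \(h_{\mathrm{late}}\) its early (sparse) and late (dense) parts. The alternating strategy fixes one variable and updates the other in turn: with \(h\) fixed, the problem in \(s\) is ordinary least squares; with \(s\) fixed, the problem in \(h\) is a convex \(L1\)/\(L2\)-regularised least-squares problem.

The sketch below shows one way such an alternating scheme could be implemented in Python, using an ISTA-style update for the RIR step. It is a minimal illustration under the assumptions stated above, not the authors' solver; the function names, parameter names and default values are hypothetical.

```python
import numpy as np
from scipy.linalg import toeplitz

def conv_matrix(v, n):
    """(len(v)+n-1) x n matrix C such that C @ x == np.convolve(v, x)."""
    m = len(v) + n - 1
    return toeplitz(np.r_[v, np.zeros(m - len(v))],
                    np.r_[v[0], np.zeros(n - 1)])

def blind_deconv_l1l2(y, Ls, Lh, Le, lam1=1e-2, lam2=1e-1,
                      n_outer=20, n_ista=100):
    """Alternating minimisation of
         0.5*||y - h*s||_2^2 + lam1*||h[:Le]||_1 + lam2*||h[Le:]||_2^2
    over the source s (length Ls) and the RIR h (length Lh), where
    y is the observed reverberant speech of length Ls + Lh - 1 and
    Le is the assumed length of the early (sparse) part of the RIR.
    Illustrative sketch only; the hyper-parameters are hypothetical."""
    h = np.zeros(Lh)
    h[0] = 1.0                           # start from an identity-like RIR
    late = np.zeros(Lh, dtype=bool)
    late[Le:] = True                     # mask selecting the late (dense) taps
    s = None

    for _ in range(n_outer):
        # s-step: with h fixed, the source estimate is a least-squares solution
        H = conv_matrix(h, Ls)
        s = np.linalg.lstsq(H, y, rcond=None)[0]

        # h-step: ISTA on the L1/L2-regularised least-squares problem in h
        S = conv_matrix(s, Lh)
        step = 1.0 / (np.linalg.norm(S, 2) ** 2 + 2.0 * lam2)
        for _ in range(n_ista):
            grad = S.T @ (S @ h - y) + 2.0 * lam2 * late * h
            z = h - step * grad
            # soft-threshold only the early (sparse) taps
            z[:Le] = np.sign(z[:Le]) * np.maximum(np.abs(z[:Le]) - step * lam1, 0.0)
            h = z
    return s, h
```

One caveat worth keeping in mind: the factorisation of \(y\) into \(h \ast s\) is identifiable only up to at least a scaling ambiguity, so in practice the estimates are usually normalised, e.g. by fixing \(\|h\|_2 = 1\), before comparison with ground truth.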




Acknowledgements

This work was conducted while J. Guan was visiting the Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey. It was supported in part by the International Exchange and Cooperation Foundation of Shenzhen City, China (No. GJHZ20150312114149569).

Author information

Corresponding author

Correspondence to Xuan Wang.



Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Guan, J., Wang, X., Xie, Z., Qi, S., Wang, W. (2018). Joint \(L1-L2\) Regularisation for Blind Speech Deconvolution. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10735. Springer, Cham. https://doi.org/10.1007/978-3-319-77380-3_80


  • DOI: https://doi.org/10.1007/978-3-319-77380-3_80


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77379-7

  • Online ISBN: 978-3-319-77380-3

  • eBook Packages: Computer Science, Computer Science (R0)
