Text-Independent Speaker Verification from Mixed Speech of Multiple Speakers via Using Pole Distribution of Speech Signals

  • Conference paper
Neural Information Processing (ICONIP 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11306)

Abstract

This paper presents a method for text-independent speaker verification from mixed speech of multiple speakers using the pole distribution of speech signals. The poles of a speech signal, derived from the all-pole speech production model, are obtained via a neural net called bagging CAN2 (competitive associative net 2), which learns an efficient piecewise linear approximation of a nonlinear function. We present an analysis showing that the poles of mixed speech are expected to consist of the poles farther from the zeros of the ARMA (autoregressive moving average) models of the constituent speeches. Through experiments using unmixed and mixed speech, we show that the distribution of the poles has two typical regions: one contains poles that change abruptly when the speech changes from unmixed to mixed, and the other contains poles that change continuously with the mixing weight, which we consider to support the analysis. In speaker verification experiments, we obtain the following properties of recall and precision as measures of verification performance: recall decreases abruptly when the speech changes from unmixed to mixed, while precision does not decrease much as the SNR (signal-to-noise ratio) decreases until it falls below 0 dB. Finally, we show the usefulness of the present method.
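
To make the notion of poles of an all-pole model concrete, the following is a minimal sketch, not the paper's method: it fits a single global linear predictor by ordinary least squares instead of the piecewise linear predictors learned by bagging CAN2. The function name `ar_poles`, the model order, and the synthetic AR(2) test signal are assumptions chosen purely for illustration.

```python
import numpy as np

def ar_poles(signal, order):
    """Fit x[t] ~ sum_k a[k] * x[t-k] by least squares and return the model poles.

    Illustrative assumption: one global linear predictor, not bagging CAN2.
    """
    x = np.asarray(signal, dtype=float)
    # Each row holds the `order` previous samples, most recent first.
    past = np.array([x[t - order:t][::-1] for t in range(order, len(x))])
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(past, target, rcond=None)
    # Poles are the roots of z^order - a[1] z^(order-1) - ... - a[order].
    return np.roots(np.concatenate(([1.0], -coeffs)))

# Synthetic check: an AR(2) signal whose true poles are 0.9 * exp(+/- j*pi/4).
rng = np.random.default_rng(0)
r, theta = 0.9, np.pi / 4
a1, a2 = 2 * r * np.cos(theta), -r ** 2
noise = rng.standard_normal(5000)
x = np.zeros(5000)
for t in range(2, len(x)):
    x[t] = a1 * x[t - 1] + a2 * x[t - 2] + noise[t]

poles = ar_poles(x, order=2)
print("pole magnitudes:", np.abs(poles))
print("pole angles (rad):", np.angle(poles))
```

The estimated poles should lie close to the true ones (magnitude about 0.9, angles about ±π/4). A pole distribution in the sense of the abstract would collect such poles over many speech frames, and the same routine could be applied to a weighted sum of two signals to observe how the estimated poles move with the mixing weight.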

Corresponding author

Correspondence to Shuichi Kurogi.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Tagomori, T., Matsuo, K., Kurogi, S. (2018). Text-Independent Speaker Verification from Mixed Speech of Multiple Speakers via Using Pole Distribution of Speech Signals. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11306. Springer, Cham. https://doi.org/10.1007/978-3-030-04224-0_37

  • DOI: https://doi.org/10.1007/978-3-030-04224-0_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04223-3

  • Online ISBN: 978-3-030-04224-0

  • eBook Packages: Computer Science, Computer Science (R0)
