An Adaptive BIC Approach for Robust Speaker Change Detection in Continuous Audio Streams

Žibert, Janez; Brodnik, Andrej; Mihelič, France

doi:10.1007/978-3-642-04208-9_30


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5729)

Included in the following conference series:

International Conference on Text, Speech and Dialogue


Abstract

In this paper we focus on audio segmentation. We present a novel method for robust and accurate detection of acoustic change points in continuous audio streams. The segmentation procedure was developed as part of an audio diarization system for broadcast news audio indexing. Our approach aims to remove the need for the pre-determined decision thresholds that standard segmentation procedures typically rely on to detect segment boundaries. Instead, the proposed segmentation aims to estimate decision thresholds directly from the audio data currently being processed, which reduces the need for additional threshold tuning on development data. It employs change-detection methods from two well-established audio segmentation approaches based on the Bayesian Information Criterion (BIC). Combining methods from both approaches enabled us to tune the boundary-detection thresholds adaptively from the data being processed. All three segmentation procedures are tested and compared on a broadcast news audio database, where the proposed segmentation procedure shows its potential.
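
As a rough illustration of the kind of BIC-based change test the abstract refers to, the sketch below scores every candidate split of a window with the textbook ΔBIC criterion for single-Gaussian models and picks the best one. It is not the adaptive procedure proposed in the paper, whose details are not given here. The function names, the one-dimensional synthetic features, the penalty weight `lam`, and the `margin` parameter are all assumptions made for the example. The fixed `lam` plays the role of the pre-determined threshold that the abstract says the proposed method tries to avoid.

```python
import math
import random


def log_variance(xs):
    """Log of the maximum-likelihood (biased) variance of a 1-D sample."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return math.log(max(var, 1e-12))  # floor avoids log(0) on constant data


def delta_bic(xs, t, lam=1.0):
    """Delta-BIC for splitting xs at index t, using 1-D Gaussian models.

    Positive values favour two separate Gaussians (a change at t) over one.
    lam is a hypothetical penalty weight; lam = 1.0 is the textbook penalty.
    """
    n = len(xs)
    left, right = xs[:t], xs[t:]
    gain = 0.5 * (n * log_variance(xs)
                  - len(left) * log_variance(left)
                  - len(right) * log_variance(right))
    # A 1-D Gaussian has 2 free parameters (mean, variance), so the split
    # adds 2 parameters: penalty = lam * 0.5 * 2 * log(n).
    penalty = lam * math.log(n)
    return gain - penalty


def best_change_point(xs, margin=10, lam=1.0):
    """Return (index, score) of the candidate split with the highest Delta-BIC."""
    candidates = range(margin, len(xs) - margin)
    scores = [(delta_bic(xs, t, lam), t) for t in candidates]
    score, t = max(scores)
    return t, score


# Synthetic stream: 200 samples from N(0, 1) followed by 200 from N(3, 1).
rng = random.Random(0)
stream = ([rng.gauss(0.0, 1.0) for _ in range(200)]
          + [rng.gauss(3.0, 1.0) for _ in range(200)])
t_hat, score = best_change_point(stream)
print(t_hat, round(score, 1))
```

A boundary would be accepted when the best score is positive. How often that decision is right depends on `lam`, which in this fixed-threshold form normally has to be tuned on development data.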



Author information

Authors and Affiliations

Primorska Institute of Natural Sciences and Technology, University of Primorska, Muzejski trg 2, Koper, SI, 6000, Slovenia
Janez Žibert & Andrej Brodnik
Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, Ljubljana, SI, 1000, Slovenia
France Mihelič


Editor information

Editors and Affiliations

University of West Bohemia at Pilsen, Czech Republic
Václav Matoušek
Department of Computer Science, University of West Bohemia in Pilsen, Univerzitni 8, 30614, Plzen, Czech Republic
Pavel Mautner


About this paper

Cite this paper

Žibert, J., Brodnik, A., Mihelič, F. (2009). An Adaptive BIC Approach for Robust Speaker Change Detection in Continuous Audio Streams. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2009. Lecture Notes in Computer Science, vol 5729. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04208-9_30


DOI: https://doi.org/10.1007/978-3-642-04208-9_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04207-2
Online ISBN: 978-3-642-04208-9
eBook Packages: Computer Science, Computer Science (R0)
