Challenges in Audio Processing of Terrorist-Related Data

Gauvain, Jodie; Lamel, Lori; Le, Viet Bac; Despres, Julien; Gauvain, Jean-Luc; Messaoudi, Abdel; Vieru, Bianca; Kheder, Waad Ben

doi:10.1007/978-3-030-05716-9_7

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11296)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Much information in multimedia data related to terrorist activity can be extracted from the audio content. Our work in ongoing projects aims to provide a complete description of the audio portion of multimedia documents. This information can be derived from diarization, classification of acoustic events, language and speaker segmentation and clustering, as well as automatic transcription of the speech portions. An important consideration is ensuring that the audio processing technologies are well suited to the types of data of interest to law enforcement agencies. While language identification and speech recognition may be considered 'mature technologies', our experience is that even state-of-the-art systems require customisation and enhancements to address the challenges of terrorist-related audio documents.

This work was partially financed by the Horizon 2020 project DANTE (Detecting and analysing terrorist-related online contents and financing activities) and by the French National Agency for Research as part of the SALSA project (Speech and Language technologies for Security Applications) under grant ANR-14-CE28-0021.

Author information

Authors and Affiliations

Vocapia Research, Orsay, France
Jodie Gauvain, Viet Bac Le, Julien Despres, Abdel Messaoudi & Bianca Vieru
CNRS-LIMSI, TLP, Orsay, France
Lori Lamel, Jean-Luc Gauvain & Waad Ben Kheder

Corresponding author

Correspondence to Jodie Gauvain.

Editor information

Editors and Affiliations

Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Ioannis Kompatsiaris
EURECOM, Sophia Antipolis, France
Benoit Huet
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Vasileios Mezaris
Dublin City University, Dublin, Ireland
Cathal Gurrin
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Stefanos Vrochidis

Cite this paper

Gauvain, J. et al. (2019). Challenges in Audio Processing of Terrorist-Related Data. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, WH., Vrochidis, S. (eds) MultiMedia Modeling. MMM 2019. Lecture Notes in Computer Science, vol 11296. Springer, Cham. https://doi.org/10.1007/978-3-030-05716-9_7

DOI: https://doi.org/10.1007/978-3-030-05716-9_7
Published: 11 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05715-2
Online ISBN: 978-3-030-05716-9
eBook Packages: Computer Science, Computer Science (R0)
