Unsupervised Segmentation of Meeting Configurations and Activities using Speech Activity Detection

Brdiczka, Oliver; Vaufreydaz, Dominique; Maisonnasse, Jérôme; Reignier, Patrick

doi:10.1007/0-387-34224-9_23

Part of the book series: IFIP International Federation for Information Processing (IFIPAICT, volume 204)

Included in the following conference series:

IFIP International Conference on Artificial Intelligence Applications and Innovations

Abstract

This paper addresses the problem of segmenting small group meetings in order to detect different group configurations and activities in an intelligent environment. Our approach takes as input the speech activity detected for each individual attending a meeting. The goal is to separate distinct distributions of speech activity observations, which correspond to distinct group configurations and activities. We propose an unsupervised method based on the Jeffrey divergence between histograms of speech activity observations. These histograms are generated from adjacent windows of variable size that are slid from the beginning to the end of a meeting recording. The peaks of the resulting Jeffrey divergence curves are detected using successive robust mean estimation. After a merging and filtering process, the retained peaks are used to select the best model, i.e. the best allocation of speech activity distributions for a given meeting recording. These distinct distributions can be interpreted as distinct segments of group configuration and activity. To evaluate the method, we recorded six small group meetings and measured the correspondence between the detected segments and the labeled group configurations and activities. The results are promising, particularly because the method is completely unsupervised.
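The divergence step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Jeffrey divergence in the symmetric, smoothed form J(p, q) = Σ p_i log(p_i / m_i) + q_i log(q_i / m_i) with m_i = (p_i + q_i) / 2, and it assumes each observation is a single integer code for who is speaking. The function names, the fixed window size, and the toy data are all hypothetical. The variable window sizes, robust mean estimation, peak merging, and model selection mentioned in the abstract are not covered.

```python
import math
from collections import Counter


def histogram(observations, symbols):
    """Normalized histogram of the observations over a fixed symbol set."""
    counts = Counter(observations)
    total = len(observations)
    return [counts[s] / total for s in symbols]


def jeffrey_divergence(p, q):
    """Symmetric divergence between two normalized histograms (assumed form)."""
    total = 0.0
    for pi, qi in zip(p, q):
        mi = (pi + qi) / 2.0
        # Terms with a zero bin contribute nothing (0 * log 0 is taken as 0).
        if pi > 0:
            total += pi * math.log(pi / mi)
        if qi > 0:
            total += qi * math.log(qi / mi)
    return total


def divergence_curve(observations, window):
    """Divergence between the two adjacent windows meeting at each position t."""
    symbols = sorted(set(observations))
    curve = []
    for t in range(window, len(observations) - window + 1):
        left = histogram(observations[t - window:t], symbols)
        right = histogram(observations[t:t + window], symbols)
        curve.append((t, jeffrey_divergence(left, right)))
    return curve


# Toy data (hypothetical): code 0 for the first 20 frames, code 1 afterwards.
obs = [0] * 20 + [1] * 20
curve = divergence_curve(obs, window=5)
peak_t, peak_value = max(curve, key=lambda tv: tv[1])
print(peak_t, peak_value)
```

On this toy sequence the curve peaks where the two windows contain entirely different observations, which is the kind of position the method would retain as a candidate segment boundary.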

Author information

Authors and Affiliations

INRIA Rhône-Alpes, 655 Av. de l’Europe, 38330, Montbonnot, France
Oliver Brdiczka, Dominique Vaufreydaz, Jérôme Maisonnasse & Patrick Reignier

Editor information

Editors and Affiliations

University of the Aegean, Greece
Ilias Maglogiannis
ICCS/NTUA, Greece
Kostas Karpouzis
University of Plymouth, UK
Max Bramer

About this paper

Cite this paper

Brdiczka, O., Vaufreydaz, D., Maisonnasse, J., Reignier, P. (2006). Unsupervised Segmentation of Meeting Configurations and Activities using Speech Activity Detection. In: Maglogiannis, I., Karpouzis, K., Bramer, M. (eds) Artificial Intelligence Applications and Innovations. AIAI 2006. IFIP International Federation for Information Processing, vol 204. Springer, Boston, MA. https://doi.org/10.1007/0-387-34224-9_23

DOI: https://doi.org/10.1007/0-387-34224-9_23
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-34223-8
Online ISBN: 978-0-387-34224-5
eBook Packages: Computer Science, Computer Science (R0)
