Meeting State Recognition from Visual and Aural Labels

Conference paper
Machine Learning for Multimodal Interaction (MLMI 2007)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 4892)


Abstract

In this paper we present a meeting state recognizer that combines multi-modal sensor data in a smart room. Our approach trains a statistical model on semantic cues generated by perceptual components, each of which produces these cues by processing the output of one or more sensors. The recognizer is designed to work with an arbitrary combination of multi-modal input sensors. We define a set of states representing both meeting and non-meeting situations, together with a set of features on which we base our classification; this lets us model situations such as a presentation or a break, which are important information for many applications. Because appropriate multi-modal corpora are currently very sparse, we have hand-annotated a set of meeting recordings to verify our statistical classification. We have also compared several statistical classification methods and validated the best-performing one on this hand-annotated corpus of real meeting data.
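To make the pipeline concrete, here is a minimal sketch of the training-and-classification step described above. The cue names (speech activity, people count, whiteboard use, projector state), the state labels, and the decision-tree classifier are all illustrative assumptions; the paper's actual feature set, state inventory, and choice of statistical model are not reproduced here.

```python
# Minimal sketch (not the authors' implementation): per-time-slice cues
# emitted by perceptual components are fed to a statistical classifier
# that labels the meeting state. All cue names, state labels, and the
# decision-tree model are illustrative assumptions.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical cues per time slice:
# [speech_activity, people_in_room, someone_at_whiteboard, projector_on]
X_train = [
    [1, 6, 1, 1],  # one speaker, audience seated, projector on
    [1, 6, 0, 0],  # several people talking, no projector
    [0, 6, 0, 0],  # people present but silent
    [1, 2, 0, 0],  # small side conversation
    [0, 0, 0, 0],  # empty room
]
# Hand-annotated meeting states for the same time slices.
y_train = ["presentation", "discussion", "break", "discussion", "no_meeting"]

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Classify a new time slice produced by the perceptual components.
print(clf.predict([[1, 5, 1, 1]]))  # e.g. ['presentation']
```

Because the interface is simply cue vectors in, state label out, any of the statistical classifiers compared in the paper could be swapped in without changing the surrounding pipeline.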

Editor information

Andrei Popescu-Belis, Steve Renals, Hervé Bourlard

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cuřín, J., Fleury, P., Kleindienst, J., Kessl, R. (2008). Meeting State Recognition from Visual and Aural Labels. In: Popescu-Belis, A., Renals, S., Bourlard, H. (eds) Machine Learning for Multimodal Interaction. MLMI 2007. Lecture Notes in Computer Science, vol 4892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78155-4_3

  • DOI: https://doi.org/10.1007/978-3-540-78155-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78154-7

  • Online ISBN: 978-3-540-78155-4

  • eBook Packages: Computer Science, Computer Science (R0)
