
Using Audio, Visual, and Lexical Features in a Multi-modal Virtual Meeting Director

  • Conference paper
Machine Learning for Multimodal Interaction (MLMI 2006)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 4299)

Abstract

Multi-modal recordings of meetings provide the basis for meeting browsing and for remote meetings. However, it is often neither useful to store nor to transmit all visual channels. In this work we show how a virtual meeting director selects one of seven possible video modes. We then present several audio, visual, and lexical features for such a virtual director. In an experimental section we evaluate the features, their influence on the camera selection, and the properties of the generated video stream. All chosen features allow real-time or near real-time processing and can therefore be applied not only to offline browsing, but also to a remote meeting assistant.




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Al-Hames, M., Hörnler, B., Scheuermann, C., Rigoll, G. (2006). Using Audio, Visual, and Lexical Features in a Multi-modal Virtual Meeting Director. In: Renals, S., Bengio, S., Fiscus, J.G. (eds) Machine Learning for Multimodal Interaction. MLMI 2006. Lecture Notes in Computer Science, vol 4299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11965152_6

  • DOI: https://doi.org/10.1007/11965152_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69267-6

  • Online ISBN: 978-3-540-69268-3

  • eBook Packages: Computer Science (R0)
