
A Multimedia System for Temporally Situated Perceptual Psycholinguistic Analysis

Published in: Multimedia Tools and Applications

Abstract

Perceptual analysis of video (analysis by unaided ear and eye) plays an important role in such disciplines as psychology, psycholinguistics, linguistics, anthropology, and neurology. In the specific domain of psycholinguistic analysis of gesture and speech, researchers micro-analyze videos of subjects using a high-quality video cassette recorder with frame-accurate digital freeze capability. Such analyses are labor-intensive and slow. We present a multimedia system for perceptual analysis of video data based on a multiple, dynamically linked representation model. The system components are linked through a time portal with a current time focus. The system provides mechanisms for analyzing overlapping hierarchical interpretations of the discourse, and integrates visual gesture analysis, speech analysis, video gaze analysis, and text transcription into a coordinated whole. The various interaction components facilitate accurate multi-point access to the data. While the system is currently used to analyze gesture, speech, and gaze in human discourse, it may be applied to any other field where careful analysis of temporal synchrony in video is important.
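The "multiple, dynamically linked representation" idea described above can be illustrated as a publish/subscribe pattern: every view (video frame, transcript, gesture annotation) subscribes to a shared current time focus, so seeking in any one component re-synchronizes all the others. The following is a minimal sketch; the class and method names (`TimeFocus`, `TranscriptView`, `on_time_changed`, etc.) are hypothetical and not taken from the paper.

```python
class TimeFocus:
    """Shared 'time portal': holds the current frame and notifies all
    attached views whenever the focus moves (the hypothetical analogue
    of the paper's current time focus)."""
    def __init__(self, fps=30.0):
        self.fps = fps
        self.frame = 0
        self._views = []

    def attach(self, view):
        self._views.append(view)

    def seek(self, frame):
        # Moving the focus in any component funnels through here,
        # keeping every linked representation synchronized.
        self.frame = frame
        for view in self._views:
            view.on_time_changed(frame, frame / self.fps)


class TranscriptView:
    """Maps timed text segments (start_s, end_s, text) onto the focus."""
    def __init__(self, segments):
        self.segments = segments
        self.current = None

    def on_time_changed(self, frame, seconds):
        self.current = next(
            (text for start, end, text in self.segments
             if start <= seconds < end),
            None)


class VideoView:
    """Stand-in for a frame-accurate video display."""
    def __init__(self):
        self.displayed_frame = None

    def on_time_changed(self, frame, seconds):
        self.displayed_frame = frame  # a real view would render this frame


focus = TimeFocus(fps=30.0)
transcript = TranscriptView([(0.0, 2.5, "so the bird flies out"),
                             (2.5, 4.0, "and he grabs it")])
video = VideoView()
focus.attach(transcript)
focus.attach(video)

focus.seek(90)  # frame 90 = 3.0 s at 30 fps
print(video.displayed_frame, transcript.current)
```

Seeking in any component updates the single `TimeFocus`, which is what makes the representations "dynamically linked" rather than independently scrolled; the actual system extends this with gesture, gaze, and hierarchical discourse annotations as further subscribers.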




Cite this article

Quek, F., Bryll, R., Kirbas, C. et al. A Multimedia System for Temporally Situated Perceptual Psycholinguistic Analysis. Multimedia Tools and Applications 18, 91–114 (2002). https://doi.org/10.1023/A:1016229624425
