Abstract
This paper reports initial research on supporting Visually Mediated Interaction (VMI) by developing person-specific and generic gesture models for the control of active cameras. We describe a time-delay variant of the Radial Basis Function (TDRBF) network and evaluate its performance on recognising simple pointing and waving hand gestures in image sequences. Experimental results show that such techniques achieve high levels of performance for this type of gesture recognition, both for particular individuals and across a set of individuals. Characteristic visual evidence can be automatically selected, depending on the task demands.
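As a rough illustration of the time-delay RBF idea mentioned in the abstract, the sketch below stacks a short window of consecutive frame vectors into a single input vector and classifies it with Gaussian hidden units. Everything in it is an assumption made for illustration and is not taken from the paper: the names `make_windows` and `TimeDelayRBF`, the single shared `width`, the use of one hidden unit per training window, and the pseudo-inverse solution for the output weights.

```python
import numpy as np


def make_windows(frames, window):
    """Stack `window` consecutive frame vectors into one time-delay input each."""
    frames = np.asarray(frames, dtype=float)
    n = frames.shape[0] - window + 1
    return np.stack([frames[i:i + window].ravel() for i in range(n)])


class TimeDelayRBF:
    """Hypothetical minimal RBF classifier over time-delay windows."""

    def __init__(self, width=1.0):
        # Shared Gaussian width for all hidden units (illustrative assumption).
        self.width = width

    def _hidden(self, X):
        # Squared Euclidean distance from every input to every centre.
        d2 = ((X[:, None, :] - self.centres[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def fit(self, X, labels, n_classes):
        X = np.asarray(X, dtype=float)
        # Assumption: one hidden unit centred on each training window.
        self.centres = X.copy()
        H = self._hidden(X)
        # One-hot targets; output weights solved in the least-squares sense.
        T = np.eye(n_classes)[np.asarray(labels)]
        self.weights = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return (self._hidden(X) @ self.weights).argmax(axis=1)
```

In this sketch each frame would be a feature vector extracted from one image, so a window of three frames with `d` features becomes a single `3 * d` input. A real system would also need to choose the window length and the widths from data, which is not attempted here.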
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Howell, A.J., Buxton, H. (1999). Gesture Recognition for Visually Mediated Interaction. In: Braffort, A., Gherbi, R., Gibet, S., Teil, D., Richardson, J. (eds) Gesture-Based Communication in Human-Computer Interaction. GW 1999. Lecture Notes in Computer Science, vol 1739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46616-9_13
DOI: https://doi.org/10.1007/3-540-46616-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66935-7
Online ISBN: 978-3-540-46616-1
eBook Packages: Springer Book Archive