Unsupervised Learning of Head Pose through Spike-Timing Dependent Plasticity

  • Conference paper
Perception in Multimodal Dialogue Systems (PIT 2008)

Abstract

We present a biologically inspired model for learning prototypical representations of head poses. The model employs populations of integrate-and-fire neurons and operates in the temporal domain. Times to first spike (latencies) are used to build a rank-order code, which is invariant to global changes in contrast and brightness. The model consists of three layers. In the first layer, populations of Gabor filters extract feature maps from the input image, and the filter activities are converted into spike latencies to determine their temporal spike order. In the second layer, intermediate-level neurons respond selectively to feature combinations that are statistically significant in the presented image dataset. Synaptic connectivity between layers 1 and 2 is adapted by spike-timing dependent plasticity (STDP). This mechanism realises an unsupervised Hebbian learning scheme that modifies synaptic weights according to the relative timing of pre- and postsynaptic spikes. The third layer employs a radial basis function (RBF) classifier to evaluate the neural responses from layer 2. Our quantitative results show that the network performs well in discriminating between 9 different input poses gathered from 200 subjects.
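
As a rough illustration of two mechanisms named in the abstract, the following Python sketch converts filter activities into spike latencies, derives a rank-order code from them, and applies a simplified STDP weight update. The latency mapping (`1 / activity`), the soft-bounded update rule, the learning rates `a_plus` and `a_minus`, and all function names are assumptions made for this example; the abstract does not give the authors' actual equations or parameters.

```python
import numpy as np


def activities_to_latencies(activities, eps=1e-9):
    """Map filter activities to spike latencies (stronger response, earlier spike).

    Assumed mapping for illustration only: latency = 1 / (activity + eps).
    Activities are expected to be non-negative.
    """
    a = np.asarray(activities, dtype=float)
    return 1.0 / (a + eps)


def rank_order(latencies):
    """Return the firing rank of each unit (0 = first unit to spike)."""
    order = np.argsort(latencies, kind="stable")
    ranks = np.empty_like(order)
    ranks[order] = np.arange(order.size)
    return ranks


def stdp_update(weights, pre_latencies, post_time, a_plus=0.05, a_minus=0.04):
    """Apply one simplified STDP step (hypothetical rule, not the paper's).

    Synapses whose presynaptic spike arrived at or before the postsynaptic
    spike are potentiated; all others are depressed. The factor w * (1 - w)
    is a soft bound that keeps weights inside [0, 1].
    """
    w = np.asarray(weights, dtype=float)
    pre = np.asarray(pre_latencies, dtype=float)
    causal = pre <= post_time
    dw = np.where(causal, a_plus, -a_minus) * w * (1.0 - w)
    return np.clip(w + dw, 0.0, 1.0)


if __name__ == "__main__":
    acts = np.array([0.2, 0.9, 0.5, 0.1])   # made-up filter activities
    lat = activities_to_latencies(acts)
    print("ranks:", rank_order(lat))

    # A positive gain and offset applied to the activities stands in for a
    # global contrast and brightness change; the rank order is unaffected.
    print("ranks after gain/offset:", rank_order(activities_to_latencies(3.0 * acts + 0.4)))

    # One STDP step for a neuron that fires after the two earliest inputs.
    print("weights:", stdp_update(np.full(4, 0.5), lat, post_time=3.0))
```

Because the rank order depends only on which activity is larger, any strictly increasing transform of the activities leaves the code unchanged. This is the sense in which the sketch mirrors the invariance claimed for the rank-order code.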

Editor information

Elisabeth André, Laila Dybkjær, Wolfgang Minker, Heiko Neumann, Roberto Pieraccini, Michael Weber

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Weidenbacher, U., Neumann, H. (2008). Unsupervised Learning of Head Pose through Spike-Timing Dependent Plasticity. In: André, E., Dybkjær, L., Minker, W., Neumann, H., Pieraccini, R., Weber, M. (eds) Perception in Multimodal Dialogue Systems. PIT 2008. Lecture Notes in Computer Science, vol 5078. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69369-7_14

  • DOI: https://doi.org/10.1007/978-3-540-69369-7_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69368-0

  • Online ISBN: 978-3-540-69369-7

  • eBook Packages: Computer Science, Computer Science (R0)
