Semi-Latent Dirichlet Allocation: A Hierarchical Model for Human Action Recognition

  • Conference paper
Human Motion – Understanding, Modeling, Capture and Animation (HuMo 2007)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4814)

Abstract

We propose a new method for human action recognition from video sequences using latent topic models. Video sequences are represented by a novel “bag-of-words” representation in which each frame corresponds to a “word”. The major difference between our model and previous latent topic models for recognition problems in computer vision is that our model is trained in a “semi-supervised” way. Our model has several advantages over similar models. First, training is much easier because the model parameters are decoupled. Second, it naturally solves the problem of choosing the appropriate number of latent topics. Third, it achieves much better performance by using the information provided by the class labels in the training set. We present action classification and irregularity detection results and show improvement over previous methods.

Editor information

Ahmed Elgammal, Bodo Rosenhahn, Reinhard Klette

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Y., Sabzmeydani, P., Mori, G. (2007). Semi-Latent Dirichlet Allocation: A Hierarchical Model for Human Action Recognition. In: Elgammal, A., Rosenhahn, B., Klette, R. (eds) Human Motion – Understanding, Modeling, Capture and Animation. HuMo 2007. Lecture Notes in Computer Science, vol 4814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75703-0_17

  • DOI: https://doi.org/10.1007/978-3-540-75703-0_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75702-3

  • Online ISBN: 978-3-540-75703-0

  • eBook Packages: Computer Science, Computer Science (R0)
