
Identifying Surprising Events in Videos Using Bayesian Topic Models

  • Avishai Hendel
  • Daphna Weinshall
  • Shmuel Peleg
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6494)

Abstract

Automatic processing of video data is essential in order to allow efficient access to large amounts of video content, a crucial point in such applications as video mining and surveillance. In this paper we focus on the problem of identifying interesting parts of the video. Specifically, we seek to identify atypical video events, which are the events a human user is usually looking for. To this end we employ the notion of Bayesian surprise, as defined in [1,2], in which an event is considered surprising if its occurrence leads to a large change in the probability of the world model. We propose to compute this abstract measure of surprise by first modeling a corpus of video events using the Latent Dirichlet Allocation model. Subsequently, we measure the change in the Dirichlet prior of the LDA model as a result of each video event’s occurrence. This change of the Dirichlet prior leads to a closed form expression for an event’s level of surprise, which can then be inferred directly from the observed data. We tested our algorithm on real video data, recorded by a camera observing an urban street intersection. The results demonstrate our ability to detect atypical events, such as a car making a U-turn or a person crossing an intersection diagonally.
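To illustrate the kind of quantity involved: when both the prior and the updated belief over topics are Dirichlet distributions, the surprise (the KL divergence between them) has a closed form in terms of the gamma and digamma functions (cf. [19]). The sketch below is illustrative only, not the authors' implementation; the function name and the choice of KL direction are our own assumptions.

```python
import numpy as np
from scipy.special import gammaln, psi  # log-gamma and digamma functions


def dirichlet_kl(alpha, beta):
    """Closed-form KL( Dir(alpha) || Dir(beta) ) between two Dirichlet densities.

    A large value of dirichlet_kl(posterior, prior) indicates that observing
    the event substantially changed the model -- i.e. a surprising event.
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    a0, b0 = alpha.sum(), beta.sum()
    return (gammaln(a0) - gammaln(alpha).sum()        # -log B(alpha)
            - gammaln(b0) + gammaln(beta).sum()       # +log B(beta)
            + ((alpha - beta) * (psi(alpha) - psi(a0))).sum())


# Identical distributions: zero surprise.
print(dirichlet_kl([1.0, 1.0, 1.0], [1.0, 1.0, 1.0]))  # → 0.0
# A posterior sharply shifted toward one topic: positive surprise.
print(dirichlet_kl([9.0, 1.0, 1.0], [1.0, 1.0, 1.0]))
```

In the paper's setting the second argument would be the corpus-level Dirichlet prior of the LDA model and the first its update after the candidate event; events are then ranked by this score.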

Keywords

Topic Model · Latent Dirichlet Allocation · Latent Topic · Surprising Event · Latent Dirichlet Allocation Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Itti, L., Baldi, P.: A principled approach to detecting surprising events in video. In: CVPR, vol. 1, pp. 631–637 (2005)
  2. Schmidhuber, J.: Driven by compression progress: A simple principle explains essential aspects of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, jokes. In: Pezzulo, G., Butz, M.V., Sigaud, O., Baldassarre, G. (eds.) Anticipatory Behavior in Adaptive Learning Systems. LNCS, vol. 5499, pp. 48–76. Springer, Heidelberg (2009)
  3. Boiman, O., Irani, M.: Detecting irregularities in images and in video. International Journal of Computer Vision 74, 17–31 (2007)
  4. Pritch, Y., Rav-Acha, A., Peleg, S.: Nonchronological video synopsis and indexing. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1971–1984 (2008)
  5. Ranganathan, A., Dellaert, F.: Bayesian surprise and landmark detection. In: ICRA, pp. 2017–2023 (2009)
  6. Hospedales, T., Gong, S., Xiang, T.: A Markov clustering topic model for mining behaviour in video. In: ICCV (2009)
  7. Wang, X., Ma, X., Grimson, E.: Unsupervised activity perception by hierarchical Bayesian models. In: CVPR (2007)
  8. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60, 91–110 (2004)
  9. Fei-Fei, L., Perona, P.: A Bayesian hierarchical model for learning natural scene categories. In: CVPR, vol. 2, pp. 524–531 (2005)
  10. Laptev, I., Lindeberg, T.: Local descriptors for spatio-temporal recognition. In: MacLean, W.J. (ed.) SCVMA 2004. LNCS, vol. 3667, pp. 91–103. Springer, Heidelberg (2006)
  11. Sivic, J., Russell, B.C., Efros, A.A., Zisserman, A., Freeman, W.T.: Discovering objects and their localization in images. In: ICCV, pp. 370–377 (2005)
  12. Niebles, J.C., Wang, H., Fei-Fei, L.: Unsupervised learning of human action categories using spatial-temporal words. International Journal of Computer Vision 79, 299–318 (2008)
  13. Pritch, Y., Ratovitch, S., Hendel, A., Peleg, S.: Clustered synopsis of surveillance video. In: AVSS, pp. 195–200 (2009)
  14. Sun, J., Zhang, W., Tang, X., Shum, H.-Y.: Background cut. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952, pp. 628–641. Springer, Heidelberg (2006)
  15. Sun, J., Wu, X., Yan, S., Cheong, L.F., Chua, T.S., Li, J.: Hierarchical spatio-temporal context modeling for action recognition. In: CVPR, pp. 2004–2011 (2009)
  16. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. In: NIPS, pp. 601–608 (2001)
  17. Hofmann, T.: Probabilistic latent semantic analysis. In: UAI, pp. 289–296 (1999)
  18. Jordan, M.I., Ghahramani, Z., Jaakkola, T., Saul, L.K.: An introduction to variational methods for graphical models. Machine Learning 37, 183–233 (1999)
  19. Penny, W.D.: Kullback-Leibler divergences of normal, gamma, Dirichlet and Wishart densities. Technical report, Wellcome Department of Cognitive Neurology (2001)
  20. Hughes, R., Huang, H., Zegeer, C., Cynecki, M.: Evaluation of automated pedestrian detection at signalized intersections (2001)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Avishai Hendel (1)
  • Daphna Weinshall (1)
  • Shmuel Peleg (1)
  1. School of Computer Science, The Hebrew University, Jerusalem, Israel
