Abstract
Advanced surveillance systems for behavior recognition in outdoor traffic scenes depend strongly on the particular configuration of the scenario. Scene-independent trajectory analysis techniques statistically infer semantics in locations where motion occurs, and such inferences are typically limited to abnormality. It is therefore worthwhile to design methods that automatically categorize more specific semantic regions. State-of-the-art approaches for unsupervised scene labeling exploit trajectory data to segment areas like sources, sinks, or waiting zones. Our method additionally incorporates scene-independent knowledge to assign more meaningful labels like crosswalks, sidewalks, or parking spaces. First, a spatiotemporal scene model is obtained from trajectory analysis. Subsequently, a so-called GI-MRF inference process reinforces spatial coherence and incorporates taxonomy-guided smoothness constraints. Our method achieves automatic and effective labeling of conceptual regions in urban scenarios, and is robust to tracking errors. Experimental validation on five surveillance databases has been conducted to assess the generality and accuracy of the segmentations. The resulting scene models are used for model-based behavior analysis.
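To make the idea of taxonomy-guided smoothness more concrete, the following is a minimal sketch of label smoothing on a grid MRF. It is not the GI-MRF procedure itself, whose details are not given in this abstract: the inference here uses iterated conditional modes (ICM) purely for simplicity, and the label set, the taxonomy distance matrix `D`, the weight `lam`, and the function name `icm_smooth` are all assumptions chosen for illustration.

```python
import numpy as np

def icm_smooth(unary, pairwise, n_iters=10):
    """Greedy MRF labeling by iterated conditional modes (ICM).

    unary:    (H, W, K) array, cost of giving label k to cell (y, x)
              (in the paper's setting this would come from trajectory
              evidence; here it is just an input).
    pairwise: (K, K) symmetric array, cost of two neighboring cells
              carrying labels i and j.
    """
    H, W, K = unary.shape
    labels = unary.argmin(axis=2)          # start from the best unary label
    for _ in range(n_iters):
        changed = False
        for y in range(H):
            for x in range(W):
                cost = unary[y, x].astype(float)   # fresh copy per cell
                # add the smoothness cost against each 4-connected neighbor
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W:
                        cost = cost + pairwise[:, labels[ny, nx]]
                best = int(cost.argmin())
                if best != labels[y, x]:
                    labels[y, x] = best
                    changed = True
        if not changed:                    # converged to a local minimum
            break
    return labels

# Hypothetical labels: 0 = road, 1 = crosswalk, 2 = sidewalk.
# D is an assumed taxonomy distance: labels that are closer in the
# taxonomy are cheaper to place next to each other.
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
lam = 0.5                                  # assumed smoothness weight
pairwise = lam * D

# Toy 5x5 scene: every cell prefers "road", except a noisy center cell
# that weakly prefers "sidewalk" (for example, due to a tracking error).
unary = np.ones((5, 5, 3))
unary[:, :, 0] = 0.0
unary[2, 2] = [0.5, 1.0, 0.0]

labels = icm_smooth(unary, pairwise)
print(labels)
```

In this toy scene the isolated center label is overruled by its four neighbors, which illustrates how a smoothness term can absorb sporadic tracking errors. ICM only finds a local minimum; a belief-propagation or graph-cut solver would be a natural replacement in a more serious implementation.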
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fernández, C., Gonzàlez, J., Roca, X. (2010). Automatic Learning of Background Semantics in Generic Surveilled Scenes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds) Computer Vision – ECCV 2010. ECCV 2010. Lecture Notes in Computer Science, vol 6312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15552-9_49
DOI: https://doi.org/10.1007/978-3-642-15552-9_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15551-2
Online ISBN: 978-3-642-15552-9
eBook Packages: Computer Science (R0)