Abstract
This chapter introduces some basic methods for analyzing groups of people in surveillance settings. Modeling groups has recently become a very active research topic in video surveillance. Our approach belongs to the recently established field of social signaling: it embeds notions from social psychology into computer vision techniques, offering a novel research perspective for the video surveillance community. In particular, we present methods to discover and track groups of people and to infer the focus of attention of each person, that is, to estimate the portion of a scene that people frequently observe. Each method is evaluated in an experimental section on real scenarios, which gives a clear idea of its performance and potential.
Copyright information
© 2012 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Bazzani, L., Cristani, M., Paggetti, G., Tosato, D., Menegaz, G., Murino, V. (2012). Analyzing Groups: A Social Signaling Perspective. In: Shan, C., Porikli, F., Xiang, T., Gong, S. (eds) Video Analytics for Business Intelligence. Studies in Computational Intelligence, vol 409. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28598-1_9
DOI: https://doi.org/10.1007/978-3-642-28598-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28597-4
Online ISBN: 978-3-642-28598-1
eBook Packages: Engineering, Engineering (R0)