Beyond the Static Camera: Issues and Trends in Active Vision

  • Murad Al Haj
  • Carles Fernández
  • Zhanwu Xiong
  • Ivan Huerta
  • Jordi Gonzàlez
  • Xavier Roca


Maximizing both the area coverage and the resolution per target is highly desirable in many applications of computer vision. However, with a limited number of cameras viewing a scene, the two objectives conflict. This chapter is dedicated to active vision systems, which try to achieve a trade-off between these two aims, and examines the use of high-level reasoning in such scenarios. The chapter starts by introducing different approaches to active camera configurations. Next, a single active camera system to track a moving object is developed, offering the reader a first-hand understanding of the issues involved. Another section discusses practical considerations in building an active vision platform, taking as an example a multi-camera system developed for a European project. The last section reflects on future trends in using semantic factors to drive smartly coordinated active systems.


Root Mean Square Deviation · Extended Kalman Filter · Active Vision · World Coordinate System · Traveling Salesperson Problem
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work has been supported by the European Project FP6 HERMES IST-027110. The authors wish to thank the rest of the partners in the HERMES consortium, namely AVL at Oxford University, BiWi at ETH Zurich, CVMT at Aalborg University and IAKS at Universität Karlsruhe. Also, the authors acknowledge the support of the Spanish Research Programs Consolider-Ingenio 2010: MIPRCV (CSD200700018); Avanza I+D ViCoMo (TSI-020400-2009-133); CENIT-IMAGENIO 2010 SEGUR@; along with the Spanish projects TIN2009-14501-C02-01 and TIN2009-14501-C02-02. Moreover, Murad Al Haj acknowledges the support from the Generalitat de Catalunya through an AGAUR FI predoctoral grant (IUE/2658/2007).



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Murad Al Haj (1)
  • Carles Fernández (1)
  • Zhanwu Xiong (2)
  • Ivan Huerta (1)
  • Jordi Gonzàlez (2)
  • Xavier Roca (2)

  1. Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Spain
  2. Computer Vision Center and Departament de Ciències de la Computació, Universitat Autònoma de Barcelona, Bellaterra, Spain
