Video Streaming with Interactive Pan/Tilt/Zoom

Part of the book series: Signals and Communication Technology (SCT)

Abstract

High-spatial-resolution video lets a viewer interactively watch an arbitrary region-of-interest (RoI), panning, tilting, and zooming during playback. This chapter presents spatial-random-access-enabled video compression, which encodes the content so that arbitrary RoIs corresponding to different zoom factors can be extracted from the compressed bit-stream. The chapter also covers RoI trajectory prediction, which allows relevant content to be pre-fetched in a streaming scenario. The more accurate the prediction, the lower the percentage of missing pixels. RoI prediction techniques can perform better by adapting to the video content in addition to extrapolating previous moves of the input device. Finally, the chapter presents a streaming system that employs application-layer peer-to-peer (P2P) multicast while still allowing users to freely choose individual RoIs. The P2P overlay adapts on the fly to exploit commonalities among the peers' RoIs. This enables peers to relay data to each other in real time, drastically reducing the bandwidth required from dedicated servers.
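To make the link between prediction accuracy and missing pixels concrete, the following is a minimal sketch, not the chapter's method. It assumes the video is split into fixed-size square tiles that can be fetched independently, predicts the RoI position by linearly extrapolating the last two positions of the input device, and then measures what fraction of the RoI actually displayed falls outside the pre-fetched tiles. The function names, the tile size, and the tiling itself are hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels
Tile = Tuple[int, int]            # (column, row) index of a tile


def predict_roi_origin(history: List[Tuple[int, int]], lookahead: int) -> Tuple[int, int]:
    """Linearly extrapolate the RoI top-left corner from its last two positions."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return x1 + (x1 - x0) * lookahead, y1 + (y1 - y0) * lookahead


def tiles_covering(roi: Rect, tile: int) -> Set[Tile]:
    """Return the indices of all tiles that overlap the given RoI."""
    x, y, w, h = roi
    return {
        (tx, ty)
        for tx in range(x // tile, (x + w - 1) // tile + 1)
        for ty in range(y // tile, (y + h - 1) // tile + 1)
    }


def missing_pixel_fraction(actual: Rect, fetched: Set[Tile], tile: int) -> float:
    """Fraction of the displayed RoI that lies in tiles which were not pre-fetched."""
    x, y, w, h = actual
    missing = 0
    for tx, ty in tiles_covering(actual, tile) - fetched:
        # Size of the overlap between the missing tile and the displayed RoI.
        overlap_w = min(x + w, (tx + 1) * tile) - max(x, tx * tile)
        overlap_h = min(y + h, (ty + 1) * tile) - max(y, ty * tile)
        missing += overlap_w * overlap_h
    return missing / (w * h)


if __name__ == "__main__":
    TILE = 32                      # hypothetical tile size in pixels
    history = [(0, 0), (16, 0)]    # last two RoI positions from the input device
    px, py = predict_roi_origin(history, lookahead=2)
    fetched = tiles_covering((px, py, 64, 64), TILE)
    # The user moved further right than predicted, so part of the RoI is missing.
    print(missing_pixel_fraction((80, 0, 64, 64), fetched, TILE))
```

In this sketch a prediction that matches the actual RoI yields a missing fraction of zero, while an overshoot by the user leaves a strip of the RoI uncovered. A content-aware predictor, as the abstract suggests, would replace `predict_roi_origin` with something informed by the video itself.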

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Mavlankar, A., Girod, B. (2010). Video Streaming with Interactive Pan/Tilt/Zoom. In: Mrak, M., Grgic, M., Kunt, M. (eds) High-Quality Visual Experience. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12802-8_19

  • DOI: https://doi.org/10.1007/978-3-642-12802-8_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12801-1

  • Online ISBN: 978-3-642-12802-8

  • eBook Packages: Engineering, Engineering (R0)
