
Multimedia Tools and Applications, Volume 22, Issue 3, pp 263–302

Saying What it Means: Semi-Automated (News) Media Annotation

  • Frank Nack
  • Wolfgang Putz
Article

Abstract

This paper considers the automated and semi-automated annotation of audiovisual media in a new type of production framework, A4SM (Authoring System for Syntactic, Semantic and Semiotic Modelling). We present the architecture of the framework and describe a prototypical camera, a handheld device for basic semantic annotation, and an editing suite, demonstrating how video material can be annotated in real time and how this information can be used not only for retrieval but also during the different phases of the production process itself. We then outline the underlying XML-Schema-based content description structures of A4SM and discuss the pros and cons of our approach of evolving semantic networks as the basis for audiovisual content description.
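The abstract describes annotations accumulating into an evolving semantic network as material is shot and edited. The sketch below illustrates what one node in such a network might look like; all class names, fields, and values here are hypothetical, not the actual A4SM or MPEG-7 structures.

```python
# Illustrative sketch only (assumed structures, not the authors' A4SM schema):
# a minimal node in an evolving semantic network describing one video shot,
# of the kind that could be captured in real time during news production.
from dataclasses import dataclass, field

@dataclass
class AnnotationNode:
    """One node in a semantic network describing a segment of footage."""
    node_id: str
    media_ref: str                                     # time-coded reference into the footage
    descriptors: dict = field(default_factory=dict)    # e.g. {"event": "press conference"}
    relations: list = field(default_factory=list)      # (relation, target node id) edges

    def relate(self, relation: str, target: "AnnotationNode") -> None:
        """Add a typed edge to another node; the network grows as annotation proceeds."""
        self.relations.append((relation, target.node_id))

# A reporter's handheld device might record annotations like these:
shot = AnnotationNode("shot-17", "tape1#00:02:10-00:02:35",
                      {"event": "press conference", "speaker": "minister"})
intro = AnnotationNode("shot-16", "tape1#00:01:50-00:02:10", {"type": "establishing"})
intro.relate("precedes", shot)
```

Because each node carries both descriptors and typed edges, later production phases (e.g. editing) can traverse the network rather than re-analyse the raw footage, which is the reuse the abstract alludes to.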

Keywords: media production, media annotation, news production, MPEG-7, XML-Schema, news production tools



Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Frank Nack (CWI, Amsterdam, The Netherlands)
  • Wolfgang Putz (FHG-IPSI, Darmstadt, Germany)
