Mutual Spotting Retrieval between Speech and Video Image Using Self-Organized Network Databases

  • Takashi Endo
  • Jian Xin Zhang
  • Masakyuki Nakazawa
  • Ryuichi Oka
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1554)

Abstract

Video codec technologies such as MPEG, together with improved microprocessor performance, make it possible to set up environments in which large volumes of video images can be stored. The ability to search and retrieve stored video is therefore becoming more important. This paper proposes a technique for mutual spotting retrieval between speech and video images, in which either speech or video is used as a query to retrieve the other. The technique uses a network that organizes itself incrementally and represents redundant structures in degenerate form, which makes for efficient searches. Expressed in network form, the capacity of a database can be reduced by about one half for speech and by about three fourths for video. Applied to a database containing six hours' worth of speech and video, the technique performed a search from video to speech in 0.5 seconds per frame.
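The paper's self-organized network is not specified in this abstract, but the core idea of storing redundant symbol sequences (e.g., vector-quantized speech or video codes) in degenerate form can be sketched with a simple, assumed stand-in: build a trie over the symbol sequences, then merge structurally identical subtrees so that shared substructure is stored only once. All names below (`build_trie`, `count_merged`, the example sequences) are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: a trie over symbol sequences, followed by
# bottom-up merging of identical subtrees (a DAWG-like "degenerate" form).
# This is NOT the paper's algorithm, just the general compression idea.

def build_trie(sequences):
    """Insert each symbol sequence into a trie of nested dicts."""
    root = {}
    for seq in sequences:
        node = root
        for sym in seq:
            node = node.setdefault(sym, {})
    return root

def count_nodes(node):
    """Number of nodes in the plain (uncompressed) trie, root included."""
    return 1 + sum(count_nodes(child) for child in node.values())

def count_merged(node, memo=None):
    """Merge identical subtrees by canonical signature.

    Returns (signature id of this node, table of unique subtrees).
    len(table) is the node count of the degenerate network.
    """
    if memo is None:
        memo = {}
    # Two nodes merge iff they have the same outgoing symbols leading to
    # already-merged identical subtrees.
    sig = frozenset((sym, count_merged(child, memo)[0])
                    for sym, child in node.items())
    memo.setdefault(sig, len(memo))
    return memo[sig], memo

# Hypothetical VQ-symbol sequences sharing a long common tail "bcde":
seqs = ["abcde", "xbcde", "ybcde"]
trie = build_trie(seqs)
plain = count_nodes(trie)          # every sequence stored separately
_, table = count_merged(trie)
merged = len(table)                # shared tails stored once
```

With these three sequences the merged network needs far fewer nodes than the plain trie, mirroring the abstract's observation that the network form roughly halves the database capacity needed for speech.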

Keywords

Video image, vector quantization, input symbol, video query, reachable node

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Takashi Endo (1)
  • Jian Xin Zhang (2)
  • Masakyuki Nakazawa (1)
  • Ryuichi Oka (1)
  1. Real World Computing Partnership, Ibaraki, Japan
  2. Mediadrive Co., Ltd., Saitama, Japan