Fast Rotation-Invariant Video Caption Detection Based on Visual Rhythm

  • Felipe Braunger Valio
  • Helio Pedrini
  • Neucimar Jeronimo Leite
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7042)


Text detection in images has been studied and improved for decades. Many works extend existing methods to video analysis; however, few of them create or adapt approaches that exploit inherent characteristics of videos, such as temporal information. This work proposes a very fast method for identifying video frames that contain text through a special data structure called visual rhythm. The method detects video captions robustly with respect to font style, color intensity, and text orientation. A data set was built for our experiments to compare and evaluate the effectiveness of the proposed method.
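
The abstract does not say how the visual rhythm is built. As a minimal sketch, assuming the common construction in which each frame contributes one column of pixels sampled along a fixed path, the example below samples the main diagonal of every frame and stacks the samples side by side. The function name `visual_rhythm`, the parameter `n_samples`, and the choice of the diagonal path are illustrative assumptions, not the authors' implementation; the paper itself may sample along a different path.

```python
import numpy as np


def visual_rhythm(frames, n_samples=None):
    """Build a visual rhythm image from a grayscale video.

    frames: array of shape (T, H, W), one grayscale frame per time step.
    Returns an array of shape (n_samples, T): column t holds the pixels
    sampled along the main diagonal of frame t (assumed sampling path).
    """
    frames = np.asarray(frames)
    t, h, w = frames.shape
    n = n_samples or min(h, w)
    # Evenly spaced points from the top-left to the bottom-right corner.
    rows = np.linspace(0, h - 1, n).round().astype(int)
    cols = np.linspace(0, w - 1, n).round().astype(int)
    # Paired integer indexing picks (rows[i], cols[i]) in every frame,
    # giving shape (T, n); transpose so that time runs along the x axis.
    return frames[:, rows, cols].T
```

In an image built this way, a pixel that keeps the same value over consecutive frames, as a static caption would, shows up as a horizontal streak. This is the kind of temporal regularity a detector working on the visual rhythm could look for.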


Video Sequence, Video Frame, Video Caption, Text Orientation, Caption Localization
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Felipe Braunger Valio (1)
  • Helio Pedrini (1)
  • Neucimar Jeronimo Leite (1)
  1. Institute of Computing, University of Campinas, Campinas, Brazil
