Character Segmentation and Recognition

  • Tong Lu
  • Shivakumara Palaiahnakote
  • Chew Lim Tan
  • Wenyin Liu
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)


This chapter presents methods for segmenting characters from text lines and for recognizing video characters. Segmenting characters from text lines detected in video is harder than segmenting characters from scanned document images, because video has low resolution and complex backgrounds. The chapter first presents a method for word segmentation based on the combination of Fourier and moments. The segmented words are then split into characters using top and bottom profile features of the words. The chapter also presents a method that does not require words for character segmentation. Instead, it segments characters directly from text lines by exploring gradient vector flow (GVF) to identify the space between words. Further, the chapter introduces a recognition method that does not use an OCR engine. The method proposes structural features based on eight-directional sectors, which facilitate character recognition by calculating a representative for each character class.
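
As a rough illustration of the top and bottom profile idea mentioned above, the following sketch computes, for each column of a binarized word image, the row of the first and last text pixel, and flags columns with no text pixels as candidate cut positions. It is a minimal sketch under stated assumptions, not the chapter's method: the abstract does not define the profile features, so the function names `top_bottom_profiles` and `gap_columns`, the binary input convention, and the empty-column rule are all hypothetical.

```python
import numpy as np

def top_bottom_profiles(word):
    """Return per-column top and bottom profiles of a binary word image.

    Assumption: `word` is a 2D array where nonzero values are text pixels.
    For a column without text, top is set to the image height and bottom
    to -1, so that bottom < top marks the column as empty.
    """
    mask = np.asarray(word) > 0
    height = mask.shape[0]
    has_ink = mask.any(axis=0)
    # argmax gives the index of the first True value in each column.
    top = np.where(has_ink, mask.argmax(axis=0), height)
    # Flip the rows to find the last True value in each column.
    bottom = np.where(has_ink, height - 1 - mask[::-1].argmax(axis=0), -1)
    return top, bottom

def gap_columns(word):
    """Columns with no text pixels, treated here as candidate cut positions."""
    top, bottom = top_bottom_profiles(word)
    return [int(c) for c in np.flatnonzero(bottom < top)]

# Toy 3x3 "word": two strokes separated by one empty column.
toy = np.array([[1, 0, 0],
                [1, 0, 1],
                [0, 0, 1]])
top, bottom = top_bottom_profiles(toy)
gaps = gap_columns(toy)
```

On real video text, touching characters and background noise mean that empty columns alone are unlikely to be sufficient, which is presumably why the chapter relies on richer profile features.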


Keywords: Character Recognition; Text Line; Optical Character Recognition; Word Segmentation; Character Segmentation



Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  • Tong Lu (1)
  • Shivakumara Palaiahnakote (2)
  • Chew Lim Tan (3)
  • Wenyin Liu (4)
  1. Department of Computer Science and Technology, Nanjing University, Nanjing, China
  2. Faculty of CSIT, University of Malaya, Kuala Lumpur, Malaysia
  3. National University of Singapore, Singapore, Singapore
  4. Multimedia Software Engineering Research Center, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
