Post-processing of Video Text Detection

Video Text Detection

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

This chapter introduces text binarization and recognition as post-processing steps for text detection. Binarization separates the text from the background of a detected text block. As an example, the chapter presents a fusion method that combines wavelet and gradient bands of each text line and then applies k-means clustering to every row and column to binarize the image. Because video and natural scene images suffer from low resolution and complex backgrounds, it is hard to develop binarization methods that preserve character shapes without losing text pixels. The chapter therefore also presents a method for character shape reconstruction based on the ring radius transform. The method assigns a radius value to each pixel in the edge image of the input character, computed from the distance to the nearest white pixel. Medial axis pixels are then found by scanning horizontally and vertically and selecting the maximum radius values between the strokes. These medial axis values help fill the gaps between end points while preserving the shape of the character.
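
The radius and medial axis steps described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names `ring_radius_transform` and `medial_axis_candidates` are hypothetical, the radius is assumed to be the Euclidean distance to the nearest white pixel, and "between the strokes" is simplified to "between two consecutive white pixels on the same row or column". The gap-filling step is not covered.

```python
import numpy as np


def ring_radius_transform(edge):
    """Distance from every pixel to the nearest white (nonzero) pixel.

    Assumption: Euclidean distance, computed by brute force for clarity.
    """
    edge = np.asarray(edge).astype(bool)
    ys, xs = np.nonzero(edge)
    if ys.size == 0:
        raise ValueError("edge image contains no white pixels")
    rows, cols = np.indices(edge.shape)
    # Squared distance from each pixel (first two axes) to each white pixel (last axis).
    d2 = (rows[..., None] - ys) ** 2 + (cols[..., None] - xs) ** 2
    return np.sqrt(d2.min(axis=-1))


def _mark_peaks(line_edge, line_radius, mark):
    """Mark the maximum-radius pixels in each run between two white pixels."""
    white = np.flatnonzero(line_edge)
    for a, b in zip(white[:-1], white[1:]):
        if b - a > 1:  # at least one non-white pixel between the pair
            seg = line_radius[a + 1:b]
            mark[a + 1:b] |= seg == seg.max()


def medial_axis_candidates(radius, edge):
    """Scan rows, then columns, and collect maximum-radius pixels."""
    edge = np.asarray(edge).astype(bool)
    out = np.zeros(edge.shape, dtype=bool)
    for r in range(edge.shape[0]):
        _mark_peaks(edge[r], radius[r], out[r])  # horizontal scan
    for c in range(edge.shape[1]):
        _mark_peaks(edge[:, c], radius[:, c], out[:, c])  # vertical scan
    return out


# Toy edge image: two vertical edge lines, standing in for the sides of a stroke.
edge = np.zeros((5, 7), dtype=np.uint8)
edge[:, 1] = 1
edge[:, 5] = 1

radius = ring_radius_transform(edge)
axis = medial_axis_candidates(radius, edge)
print(np.argwhere(axis)[:, 1])  # column index of every marked pixel
```

On this toy input every marked pixel lies in column 3, midway between the two edge lines. The brute-force distance computation grows with the product of the pixel count and the white-pixel count, so it only suits small character images.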

Copyright information

© 2014 Springer-Verlag London

About this chapter

Cite this chapter

Lu, T., Palaiahnakote, S., Tan, C.L., Liu, W. (2014). Post-processing of Video Text Detection. In: Video Text Detection. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-6515-6_5

  • DOI: https://doi.org/10.1007/978-1-4471-6515-6_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-6514-9

  • Online ISBN: 978-1-4471-6515-6

  • eBook Packages: Computer Science (R0)
