Text Extraction Using Sparse Representation over Learning Dictionaries

  • Thanh-Ha Do
  • Thi Minh Huyen Nguyen
  • K. C. Santosh
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1037)


This paper presents a new approach to text detection using sparse representation over learned dictionaries. More specifically, the K-SVD algorithm is used to construct two dictionaries, one for the background and one for the text. Text detection is then performed by comparing the reconstruction errors of each image patch over the two dictionaries. Results on the ICDAR dataset show that the proposed method is competitive with state-of-the-art methods.
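
The patch-level decision described above can be illustrated with a minimal sketch. It is not the authors' implementation: the K-SVD training step is omitted, the two dictionaries are tiny hand-made stand-ins (`TEXT_ATOMS`, `BACKGROUND_ATOMS`) rather than learned ones, and plain matching pursuit is assumed as the sparse coder because the abstract does not name one. All function names, the 2x2 patch size, and the `sparsity` parameter are hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean norm, as dictionary atoms usually are."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def matching_pursuit_residual(patch, atoms, sparsity):
    """Greedy matching pursuit: residual left after using at most `sparsity` atoms."""
    residual = list(patch)
    for _ in range(sparsity):
        # Pick the unit-norm atom most correlated with the current residual.
        best_atom, best_corr = None, 0.0
        for atom in atoms:
            corr = sum(r * a for r, a in zip(residual, atom))
            if abs(corr) > abs(best_corr):
                best_atom, best_corr = atom, corr
        if best_atom is None:  # residual is orthogonal to every atom
            break
        # Remove the projection of the residual onto the chosen atom.
        residual = [r - best_corr * a for r, a in zip(residual, best_atom)]
    return residual


def reconstruction_error(patch, atoms, sparsity=2):
    """Euclidean norm of what the dictionary fails to explain."""
    residual = matching_pursuit_residual(patch, atoms, sparsity)
    return math.sqrt(sum(r * r for r in residual))


def classify_patch(patch, text_atoms, background_atoms, sparsity=2):
    """Label a patch by the dictionary that reconstructs it with smaller error."""
    err_text = reconstruction_error(patch, text_atoms, sparsity)
    err_background = reconstruction_error(patch, background_atoms, sparsity)
    # Ties go to "background" (an arbitrary choice made for this sketch).
    return "text" if err_text < err_background else "background"


# Toy dictionaries over flattened 2x2 patches (assumed, NOT learned by K-SVD).
TEXT_ATOMS = [normalize([1, -1, 1, -1]), normalize([1, 1, -1, -1])]  # edge-like
BACKGROUND_ATOMS = [normalize([1, 1, 1, 1])]  # flat region

if __name__ == "__main__":
    print(classify_patch([2, -2, 2, -2], TEXT_ATOMS, BACKGROUND_ATOMS))
    print(classify_patch([3, 3, 3, 3], TEXT_ATOMS, BACKGROUND_ATOMS))
```

In this toy setting, a patch with strong vertical edges is represented exactly by the edge-like dictionary and is labeled as text, while a flat patch is represented exactly by the flat atom and is labeled as background. With learned dictionaries the errors would rarely be zero, so only their relative size matters.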


Text extraction, Sparse representation, Learning dictionary



This research is funded by the Vietnam National University, Hanoi (VNU) under project number QG.18.04.



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Thanh-Ha Do (1)
  • Thi Minh Huyen Nguyen (1)
  • K. C. Santosh (2)
  1. Department of Informatics, VNU University of Science, Hanoi, Vietnam
  2. Department of Computer Science, University of South Dakota, Vermillion, USA
