Fast RT-LoG operator for scene text detection


This paper proposes a new real-time Laplacian of Gaussian (RT-LoG) operator for scene text detection. The method exploits the distribution of the Gaussian kernel in the spatial/scale-space domains and decomposes the kernel with the box filtering method. Two levels of optimization are given. The first level, within the spatial domain, is obtained by box mutualization. The second level, within the spatial/scale-space domains, is performed using a mixed method for box selection. The proposed RT-LoG operator is evaluated on the ICDAR2017 RRC-MLT dataset in terms of robustness and processing time. The results are compared with the state-of-the-art real-time operators for scene text detection. The proposed operator achieves the top performance, with the best trade-off between robustness and processing time. It can support approximately 30 frames per second (FPS) up to the Quad-HD resolution on a regular CPU architecture with low latency. In addition, the proposed operator can support the full pipeline for scene text detection. Our system is competitive with the most accurate systems in the literature while differing from them by two orders of magnitude in terms of processing resources.
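The box filtering mentioned in the abstract can be illustrated with a generic integral-image (summed-area table) box filter. This is a minimal sketch of the general technique only, not the authors' RT-LoG operator. The function names `integral_image`, `box_sum`, and `box_filter`, and the border handling by window clipping, are assumptions made for illustration.

```python
import numpy as np


def integral_image(img):
    """Summed-area table with a leading row and column of zeros."""
    sat = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    sat[1:, 1:] = np.cumsum(np.cumsum(img, axis=0, dtype=np.float64), axis=1)
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] from four table lookups."""
    return sat[bottom, right] - sat[top, right] - sat[bottom, left] + sat[top, left]


def box_filter(img, radius):
    """Mean over a (2*radius+1) x (2*radius+1) window, clipped at the borders.

    Hypothetical helper: the per-pixel cost does not depend on `radius`,
    which is the property that makes box decompositions of a kernel cheap.
    """
    h, w = img.shape
    sat = integral_image(img)
    ys, xs = np.arange(h), np.arange(w)
    top = np.clip(ys - radius, 0, h)
    bottom = np.clip(ys + radius + 1, 0, h)
    left = np.clip(xs - radius, 0, w)
    right = np.clip(xs + radius + 1, 0, w)
    # Four lookups per pixel, vectorized over rows and columns.
    total = (sat[bottom][:, right] - sat[top][:, right]
             - sat[bottom][:, left] + sat[top][:, left])
    area = (bottom - top)[:, None] * (right - left)[None, :]
    return total / area


if __name__ == "__main__":
    demo = np.arange(12, dtype=np.float64).reshape(3, 4)
    print(box_filter(demo, 1))
```

Once the table is built, each box sum costs four lookups whatever the box size. A Gaussian-like kernel approximated by a small set of boxes can therefore be evaluated at a cost that depends on the number of boxes rather than on the kernel width.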




  1. In practice, \(k \in ]1, \sqrt{2}]\).

  2. For simplicity, we consider the 1D case.

  3. Single precision.



Author information



Corresponding author

Correspondence to Cong Nguyen Dinh.


Cite this article

Nguyen Dinh, C., Delalandre, M., Conte, D. et al. Fast RT-LoG operator for scene text detection. J Real-Time Image Proc 18, 19–36 (2021).



Keywords

  • Scene text detection
  • RT-LoG
  • Stroke model
  • Box filter
  • Box selection
  • Real-time
  • Predictability