Cursive Scene Text Analysis by Deep Convolutional Linear Pyramids

  • Saad Bin Ahmed
  • Saeeda Naz
  • Muhammad Imran Razzak (email author)
  • Rubiyah Yusof
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11301)


Camera-captured images can be investigated from many angles. The emphasis of research generally depends on the regions of interest: the focus may be on color segmentation, object detection, or scene text analysis. Image analysis, visibility analysis, and layout analysis come easily to humans, as human behaviour suggests, but the same tasks are challenging for machines. Learning machines learn from the properties of the samples they are given. Numerous approaches to scene text extraction and recognition have been proposed in recent years, and efforts to improve accuracy are ongoing. Convolutional approaches have given reasonable results on non-cursive text in natural images. The work presented in this manuscript exploits the strength of linear pyramids by treating each pyramid as a feature of the given sample. Each pyramid image is processed through various empirically selected kernels. Performance was evaluated on Arabic text from each image pyramid of the EASTR-42k dataset. An error rate of 0.17% is reported for Arabic scene text recognition.
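The idea of passing each pyramid image through a set of kernels can be illustrated with a minimal sketch. It assumes a conventional multi-scale pyramid built by 2x2 average downsampling, which may differ from the authors' linear pyramid construction. The function names (`downsample`, `build_pyramid`, `convolve2d_valid`, `pyramid_features`) and the example kernels are hypothetical and chosen only for illustration; the abstract does not specify the kernels, the number of levels, or the network that consumes the features.

```python
import numpy as np

def downsample(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = image.shape
    h2, w2 = h - h % 2, w - w % 2  # drop an odd trailing row/column
    blocks = image[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2)
    return blocks.mean(axis=(1, 3))

def build_pyramid(image, levels):
    """Return [full resolution, 1/2 resolution, 1/4 resolution, ...]."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def convolve2d_valid(image, kernel):
    """2-D convolution without padding (output shrinks by kernel size - 1)."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    # Flip the kernel so this is a true convolution, not a correlation.
    return np.einsum("ijkl,kl->ij", windows, kernel[::-1, ::-1])

def pyramid_features(image, kernels, levels=3):
    """Apply every kernel to every pyramid level; one feature map each."""
    return [[convolve2d_valid(level, k) for k in kernels]
            for level in build_pyramid(image, levels)]

# Illustrative kernels only (assumed, not taken from the paper).
mean_kernel = np.ones((3, 3)) / 9.0
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

image = np.random.default_rng(0).random((16, 16))  # stand-in for a text crop
features = pyramid_features(image, [mean_kernel, sobel_x], levels=3)
for level, maps in enumerate(features):
    print(level, [m.shape for m in maps])
```

In this sketch every pyramid level yields one feature map per kernel, so a single sample is described at several scales. The smallest level must remain at least as large as the kernel, otherwise the unpadded convolution has no valid output.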


Keywords: Linear pyramids, Kernels, Feature extraction, Arabic scene text



The authors would like to thank the Ministry of Education Malaysia and Universiti Teknologi Malaysia for funding this research project.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Saad Bin Ahmed (1, 2)
  • Saeeda Naz (4)
  • Muhammad Imran Razzak (3), email author
  • Rubiyah Yusof (2)

  1. King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
  2. Malaysia Japan International Institute of Technology (MJIIT), Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
  3. University of Technology, Sydney, Australia
  4. Higher Education Department, Government Post Graduate College No. 01, Abbottabad, Pakistan
