A HMM-Based Arabic/Latin Handwritten/Printed Identification System

Cheikh Rouhou, Ahmed; Abdelhedi, Zeineb; Kessentini, Yousri

doi:10.1007/978-3-319-52941-7_30

Conference paper
First Online: 23 February 2017

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 552)

Abstract

In document analysis and recognition, script identification is an important preprocessing step in the design of multi-script OCR systems. In this paper, we propose a novel HMM-based identification system that recognizes, in a single decision level, both the writing type (handwritten or machine-printed) and the script (Arabic or Latin) of the input image. The proposed system is based on Histogram of Oriented Gradients (HOG) features, which have demonstrated interesting properties for script characterization. Experiments conducted on word and line images collected from public databases show promising results.
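The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea: orientation histograms are computed over a window sliding along the text image, and a single decision picks whichever class model scores the resulting sequence highest. Everything specific here is an assumption and not the authors' method: the window width, step, and bin count, the reduction of each frame to its dominant bin, the discrete-emission HMMs, and the function names `hog_frame`, `sliding_frames`, `forward_loglik`, and `identify`.

```python
import math

def hog_frame(window, n_bins=8):
    """Normalized gradient-orientation histogram of one window (rows of gray levels)."""
    hist = [0.0] * n_bins
    for y in range(1, len(window) - 1):
        for x in range(1, len(window[0]) - 1):
            gx = window[y][x + 1] - window[y][x - 1]   # central differences
            gy = window[y + 1][x] - window[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)  # signed orientation
            hist[int(angle / (2 * math.pi) * n_bins) % n_bins] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist

def sliding_frames(image, width=4, step=2, n_bins=8):
    """One histogram per window position, scanning the image along its columns."""
    return [hog_frame([row[x0:x0 + width] for row in image], n_bins)
            for x0 in range(0, len(image[0]) - width + 1, step)]

def log_sum_exp(values):
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_loglik(symbols, log_start, log_trans, log_emit):
    """Log-likelihood of a symbol sequence under a discrete HMM (forward algorithm)."""
    states = range(len(log_start))
    alpha = [log_start[s] + log_emit[s][symbols[0]] for s in states]
    for sym in symbols[1:]:
        alpha = [log_sum_exp([alpha[p] + log_trans[p][s] for p in states])
                 + log_emit[s][sym] for s in states]
    return log_sum_exp(alpha)

def identify(frames, models):
    """Single-level decision: return the class whose HMM scores the frames highest.

    `models` maps a class label (e.g. "Arabic handwritten") to a hypothetical
    (log_start, log_trans, log_emit) triple. Each frame is reduced to the index
    of its dominant orientation bin, which is a simplification for illustration.
    """
    if not frames:
        raise ValueError("image is narrower than the sliding window")
    symbols = [max(range(len(f)), key=f.__getitem__) for f in frames]
    return max(models, key=lambda label: forward_loglik(symbols, *models[label]))
```

In such a setup, one model would be trained for each of the four classes named in the abstract (Arabic handwritten, Arabic printed, Latin handwritten, Latin printed), so that writing type and script are decided jointly rather than in two cascaded stages.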

Author information

Authors and Affiliations

MIRACL Laboratory, Digital Research Center of Sfax, University of Sfax, Sfax, Tunisia
Ahmed Cheikh Rouhou, Zeineb Abdelhedi & Yousri Kessentini
LITIS Laboratory EA 4108, St Etienne du Rouvray, France
Yousri Kessentini

Corresponding author

Correspondence to Ahmed Cheikh Rouhou.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs (MIR Labs), Auburn, Washington, USA
Ajith Abraham
Hassan 1st University, Settat, Morocco
Abdelkrim Haqiq
ENIS, University of Sfax, Sfax, Tunisia
Adel M. Alimi
Technopolis Rabat-Shore Rocade, International University of Rabat, Sala el Jadida, Morocco
Ghita Mezzour
Inst Applied Dept of Electronics, Taffala, University of Sousse, Sousse, Tunisia
Nizar Rokbani
Fakulti Teknologi Maklumat dan Komunikas, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Malaysia
Azah Kamilah Muda

Cite this paper

Cheikh Rouhou, A., Abdelhedi, Z., Kessentini, Y. (2017). A HMM-Based Arabic/Latin Handwritten/Printed Identification System. In: Abraham, A., Haqiq, A., Alimi, A., Mezzour, G., Rokbani, N., Muda, A. (eds) Proceedings of the 16th International Conference on Hybrid Intelligent Systems (HIS 2016). HIS 2016. Advances in Intelligent Systems and Computing, vol 552. Springer, Cham. https://doi.org/10.1007/978-3-319-52941-7_30

DOI: https://doi.org/10.1007/978-3-319-52941-7_30
Published: 23 February 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52940-0
Online ISBN: 978-3-319-52941-7
eBook Packages: Engineering, Engineering (R0)
