
A Novel Hybrid Optical Character Recognition Approach for Digitizing Text in Forms

  • Roland Graef
  • Mazen M. N. Morsy
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11491)

Abstract

The large number of document-based processes has contributed considerably to the need for automated systems that can appropriately digitize text in documents such as forms. For example, the text in scanned administrative forms is not accessible without an adequate conversion from pixels to editable text. Against this background, many organizations tap the potential of Optical Character Recognition (OCR), which can support the digitization of text in documents. However, there is still a lack of integrated OCR approaches that consider both handwritten and machine-printed text, both of which are of major importance when digitizing text in forms. To address this problem, we propose a new hybrid OCR approach that recognizes handwritten and machine-printed text with neural networks from an integrated perspective. We demonstrate the practical applicability of our approach by successfully applying it to publicly available forms. Finally, we evaluate our novel hybrid approach in comparison with existing state-of-the-art approaches.

Keywords

Forms; Optical Character Recognition; Long Short-Term Memory Networks

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Ulm, Ulm, Germany
  2. German University in Cairo, Cairo, Egypt
