OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10072)

Abstract

Optical character recognition (OCR), a classic machine learning challenge, has been a longstanding topic in a variety of applications in the healthcare, education, insurance, and legal industries, where it is used to convert different types of electronic documents, such as scanned documents, digital images, and PDF files, into fully editable and searchable text data. The rapid generation of digital images on a daily basis makes OCR an imperative and foundational tool for data analysis. With the help of OCR systems, we have been able to save a considerable amount of effort in creating, processing, and saving electronic documents, and in adapting them to different purposes. A number of OCR platforms are now available which, aside from lending theoretical contributions to other practical fields, have demonstrated successful applications in real-world problems. In this work, several qualitative and quantitative experimental evaluations have been performed using four well-known OCR services: Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. We analyze the accuracy and reliability of the OCR packages using a dataset of 1227 images from 15 different categories. Furthermore, we review state-of-the-art OCR applications in healthcare informatics. The present evaluation is expected to advance OCR research, provide new insights and considerations to the research area, and help researchers determine which service is best suited to accurate and efficient optical character recognition.

Acknowledgement

The authors of the paper wish to thank Anne Nikolai at Marshfield Clinic Research Foundation for her valuable contributions in manuscript preparation. We also thank two anonymous reviewers for their useful comments on the manuscript.

Author information

Corresponding authors

Correspondence to Ahmad P. Tafti or Peggy Peissig.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Tafti, A.P., Baghaie, A., Assefi, M., Arabnia, H.R., Yu, Z., Peissig, P. (2016). OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2016. Lecture Notes in Computer Science, vol 10072. Springer, Cham. https://doi.org/10.1007/978-3-319-50835-1_66

  • DOI: https://doi.org/10.1007/978-3-319-50835-1_66

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50834-4

  • Online ISBN: 978-3-319-50835-1

  • eBook Packages: Computer Science, Computer Science (R0)
