OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym

Tafti, Ahmad P.; Baghaie, Ahmadreza; Assefi, Mehdi; Arabnia, Hamid R.; Yu, Zeyun; Peissig, Peggy

doi:10.1007/978-3-319-50835-1_66

Conference paper
First Online: 10 December 2016

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10072)

Abstract

Optical character recognition (OCR), a classic machine learning challenge, has been a longstanding topic in a variety of applications in the healthcare, education, insurance, and legal industries, where it converts different types of electronic documents, such as scanned documents, digital images, and PDF files, into fully editable and searchable text data. The rapid generation of digital images on a daily basis makes OCR an imperative and foundational tool for data analysis. With the help of OCR systems, we have been able to save a reasonable amount of effort in creating, processing, and saving electronic documents, and in adapting them to different purposes. A range of OCR platforms is now available which, aside from lending theoretical contributions to other practical fields, have demonstrated successful applications in real-world problems. In this work, several qualitative and quantitative experimental evaluations have been performed using four well-known OCR services: Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. We analyze the accuracy and reliability of the OCR packages using a dataset of 1227 images from 15 different categories. Furthermore, we review state-of-the-art OCR applications in healthcare informatics. The present evaluation is expected to advance OCR research, providing new insights and considerations to the research area, and to assist researchers in determining which service is ideal for accurate and efficient optical character recognition.

Acknowledgement

The authors of the paper wish to thank Anne Nikolai at Marshfield Clinic Research Foundation for her valuable contributions in manuscript preparation. We also thank two anonymous reviewers for their useful comments on the manuscript.

Author information

Authors and Affiliations

Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, 54449, USA
Ahmad P. Tafti & Peggy Peissig
Department of Electrical Engineering, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, 53211, USA
Ahmadreza Baghaie
Department of Computer Science, University of Georgia, Athens, Georgia, 30602, USA
Mehdi Assefi & Hamid R. Arabnia
Department of Computer Science, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, 53211, USA
Zeyun Yu

Corresponding authors

Correspondence to Ahmad P. Tafti or Peggy Peissig.

Editor information

Editors and Affiliations

University of Nevada, Reno, Nevada, USA
George Bebis
NASA Ames Research Center, Moffett Field, California, USA
Richard Boyle
Lawrence Berkeley National Laboratory, Berkeley, California, USA
Bahram Parvin
Desert Research Institute, Reno, Nevada, USA
Darko Koracin
The Australian National University, O'Malley, Aust Capital Terr, Australia
Fatih Porikli
Pilot AI Labs, Redwood City, California, USA
Sandra Skaff
University of Florida, Gainesville, Florida, USA
Alireza Entezari
Google Inc., Mountain View, California, USA
Jianyuan Min
Osaka University, Osaka, Japan
Daisuke Iwai
The MOVES Institute, Monterey, California, USA
Amela Sadagic
University of Arizona, Tucson, Arizona, USA
Carlos Scheidegger
Université Paris-Sud, Orsay, France
Tobias Isenberg

About this paper

Cite this paper

Tafti, A.P., Baghaie, A., Assefi, M., Arabnia, H.R., Yu, Z., Peissig, P. (2016). OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2016. Lecture Notes in Computer Science, vol 10072. Springer, Cham. https://doi.org/10.1007/978-3-319-50835-1_66

DOI: https://doi.org/10.1007/978-3-319-50835-1_66
Published: 10 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50834-4
Online ISBN: 978-3-319-50835-1
eBook Packages: Computer Science, Computer Science (R0)
