OCR as a Service: An Experimental Evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10072)

Abstract

Optical character recognition (OCR), a classic machine learning challenge, has long been studied for applications in the healthcare, education, insurance, and legal industries, where it converts electronic documents such as scanned documents, digital images, and PDF files into fully editable and searchable text. The rapid daily generation of digital images makes OCR an essential, foundational tool for data analysis. OCR systems save a reasonable amount of effort in creating, processing, and storing electronic documents and in adapting them to different purposes. A range of OCR platforms is now available which, aside from lending theoretical contributions to other practical fields, have demonstrated successful applications in real-world problems. In this work, we perform several qualitative and quantitative experimental evaluations of four well-known OCR services: Google Docs OCR, Tesseract, ABBYY FineReader, and Transym. We analyze the accuracy and reliability of the OCR packages using a dataset of 1227 images from 15 different categories. Furthermore, we review state-of-the-art OCR applications in healthcare informatics. This evaluation is expected to advance OCR research by providing new insights and considerations for the research area, and to help researchers determine which service is best suited to accurate and efficient optical character recognition.

Keywords

Cataract 

Notes

Acknowledgement

The authors of the paper wish to thank Anne Nikolai at Marshfield Clinic Research Foundation for her valuable contributions in manuscript preparation. We also thank two anonymous reviewers for their useful comments on the manuscript.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, USA
  2. Department of Electrical Engineering, University of Wisconsin-Milwaukee, Milwaukee, USA
  3. Department of Computer Science, University of Georgia, Athens, USA
  4. Department of Computer Science, University of Wisconsin-Milwaukee, Milwaukee, USA