OCR Using Computer Vision and Machine Learning

Part of the book series: Studies in Computational Intelligence ((SCI,volume 907))

Abstract

Optical character recognition (OCR) is a well-researched area that has seen many advancements. Many state-of-the-art algorithms can be used for OCR, but extracting text from images that contain tables while preserving the table structure remains a challenging task. Here, we present an efficient and highly scalable parallel architecture that segments input images containing tabular data, with or without borders, into cells and reconstructs the data while preserving the tabular format. The resulting performance improvement can ease the tedious task of digitizing tabular data in bulk. The same architecture can also be used for regular OCR applications to improve performance when the volume of data is large.
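The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea, assuming the image has already been segmented into cells tagged with row and column indices. The names `recognize_cell` and `reconstruct_table` are hypothetical, the recognizer is a stand-in for a real OCR call, and the thread pool is an illustrative choice rather than the chapter's actual design.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_cell(cell):
    """Hypothetical stand-in for running OCR on one cropped table cell.

    Here a cell is (row, col, raw_text); a real pipeline would pass an
    image crop to an OCR engine instead of stripping a string.
    """
    row, col, raw_text = cell
    return row, col, raw_text.strip()


def reconstruct_table(cells, n_rows, n_cols, max_workers=4):
    """Recognize cells in parallel and place each result at its grid position."""
    table = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each cell is independent, so the work can be spread across workers.
        for row, col, text in pool.map(recognize_cell, cells):
            table[row][col] = text
    return table


# Assumed input: four segmented cells from a 2 x 2 table.
cells = [(0, 0, " Item "), (0, 1, " Qty "), (1, 0, " Bolt "), (1, 1, " 40 ")]
table = reconstruct_table(cells, n_rows=2, n_cols=2)
for row in table:
    print(" | ".join(row))
```

Because every cell carries its own row and column index, the table layout is preserved regardless of the order in which workers finish.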

Corresponding author

Correspondence to Ashish Ranjan.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ranjan, A., Behera, V.N.J., Reza, M. (2021). OCR Using Computer Vision and Machine Learning. In: Das, S., Das, S., Dey, N., Hassanien, AE. (eds) Machine Learning Algorithms for Industrial Applications. Studies in Computational Intelligence, vol 907. Springer, Cham. https://doi.org/10.1007/978-3-030-50641-4_6
