Abstract
Optical character recognition (OCR) is a well-researched area that has seen many advancements. Many state-of-the-art algorithms are available for OCR, but extracting text from images that contain tables while preserving the table structure remains a challenging task. Here, we present an efficient and highly scalable parallel architecture that segments input images containing tabular data, with or without borders, into cells and reconstructs the data while preserving the tabular format. The resulting performance improvement can ease the tedious task of digitizing tabular data in bulk. The same architecture can also improve the performance of regular OCR applications when the data volume is very large.
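The idea of recognizing segmented cells in parallel and then restoring the tabular layout can be illustrated with a minimal sketch. This is not the chapter's architecture: `recognize_cell` and `reconstruct_table` are hypothetical names, cell images are stood in for by plain strings, and the "recognition" step merely strips whitespace where a real OCR call would go.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_cell(cell_image):
    """Hypothetical stand-in for an OCR call on one cropped cell image.

    In this sketch a "cell image" is just a string and "recognition"
    only strips surrounding whitespace.
    """
    return cell_image.strip()


def reconstruct_table(cell_grid, max_workers=4):
    """Recognize every cell in parallel and return text in the same row/column layout."""
    # Flatten the grid, remembering each cell's (row, column) position.
    positions = [(r, c) for r, row in enumerate(cell_grid) for c in range(len(row))]
    cells = [cell_grid[r][c] for r, c in positions]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, regardless of completion order.
        texts = list(pool.map(recognize_cell, cells))

    # Rebuild the table with the original shape.
    table = [[None] * len(row) for row in cell_grid]
    for (r, c), text in zip(positions, texts):
        table[r][c] = text
    return table


if __name__ == "__main__":
    grid = [[" Name ", " Qty "], [" Bolt ", " 12 "]]
    for row in reconstruct_table(grid):
        print(" | ".join(row))
```

Because each cell is independent, the work scales by adding workers. A thread pool is used here only to keep the sketch short; a process pool may suit CPU-bound recognition better.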
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Ranjan, A., Behera, V.N.J., Reza, M. (2021). OCR Using Computer Vision and Machine Learning. In: Das, S., Das, S., Dey, N., Hassanien, AE. (eds) Machine Learning Algorithms for Industrial Applications. Studies in Computational Intelligence, vol 907. Springer, Cham. https://doi.org/10.1007/978-3-030-50641-4_6
DOI: https://doi.org/10.1007/978-3-030-50641-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50640-7
Online ISBN: 978-3-030-50641-4
eBook Packages: Intelligent Technologies and Robotics (R0)