Reading Industrial Inspection Sheets by Inferring Visual Relations

Rahul, Rohit; Chowdhury, Arindam; Animesh; Mittal, Samarth; Vig, Lovekesh

doi:10.1007/978-3-030-21074-8_13

Conference paper
First Online: 19 June 2019

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11367)

Abstract

Faults in heavy factory equipment have traditionally been recorded on hand-marked inspection sheets, on which a machine engineer manually marks the faulty regions on a paper outline of the machine. Over the years, millions of such inspection sheets have been recorded, yet the data within them has remained inaccessible. With industries going digital and recognizing the potential value of fault data for machine health monitoring, there is increased impetus towards digitizing these hand-marked inspection records. To this end, we propose a novel visual pipeline that combines state-of-the-art deep learning models with domain knowledge and low-level vision techniques, followed by inference of visual relationships. Our framework is robust to the presence of both static and non-static background in the document, variability in the machine template diagrams, the unstructured shape of the graphical objects to be identified, and variability in the strokes of handwritten text. The proposed pipeline incorporates a classifier based on capsule and spatial transformer networks for accurate text reading and a customized CTPN [15] network for text detection, in addition to hybrid techniques for arrow detection and dialogue cloud removal. We tested our approach on a real-world dataset of 50 inspection sheets for large containers and boilers. The results are visually appealing, and the pipeline achieved an accuracy of 87.1% for text detection and 94.6% for text reading.
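The abstract does not spell out how visual relationships are inferred, so the following is only an illustrative sketch, not the authors' method. It assumes that an arrow tip is available as an (x, y) point and that machine zones are available as polygons, and it uses a standard ray-casting point-in-polygon test (the reference list includes Shimrat's algorithm for this problem) to decide which zone an arrow points into. The names `point_in_polygon`, `zone_for_arrow_tip`, and the example `zones` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: cast a horizontal ray to the right of `point`
    and toggle `inside` at every polygon edge the ray crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zone_for_arrow_tip(tip: Point, zones: Dict[str, List[Point]]) -> Optional[str]:
    """Return the name of the first zone containing the arrow tip, or None."""
    for name, polygon in zones.items():
        if point_in_polygon(tip, polygon):
            return name
    return None


# Hypothetical machine zones, given as polygons in image coordinates.
zones = {
    "zone_a": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "zone_b": [(10, 0), (20, 0), (20, 10), (10, 10)],
}

print(zone_for_arrow_tip((5, 5), zones))
print(zone_for_arrow_tip((25, 5), zones))
```

In a pipeline of the kind described, a lookup like this would sit after arrow and text detection, linking each piece of recognized text to a zone through the arrow that connects them. Points lying exactly on a zone boundary are not treated specially in this sketch.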

References

1. Agin, G.J.: Computer vision systems for industrial inspection and assembly. Computer 5, 11–20 (1980)
2. Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
3. Deng, L.: The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag. 29(6), 141–142 (2012)
4. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
5. Golnabi, H., Asadpour, A.: Design and application of industrial machine vision systems. Robot. Comput.-Integr. Manuf. 23(6), 630–637 (2007)
6. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
7. Jaderberg, M., Simonyan, K., Zisserman, A., et al.: Spatial transformer networks. In: Advances in Neural Information Processing Systems, pp. 2017–2025 (2015)
8. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
9. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
10. Marinai, S.: Introduction to document analysis and recognition. In: Marinai, S., Fujisawa, H. (eds.) Machine Learning in Document Analysis and Recognition. Studies in Computational Intelligence, vol. 90, pp. 1–20. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-76280-5_1
11. Ramakrishna, P., et al.: An AR inspection framework: feasibility study with multiple AR devices. In: 2016 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct), pp. 221–226, September 2016. https://doi.org/10.1109/ISMAR-Adjunct.2016.0080
12. Sabour, S., Frosst, N., Hinton, G.E.: Dynamic routing between capsules. In: Advances in Neural Information Processing Systems, pp. 3859–3869 (2017)
13. Seddati, O., Dupont, S., Mahmoudi, S.: Deepsketch: deep convolutional neural networks for sketch recognition and similarity search. In: 2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI), pp. 1–6. IEEE (2015)
14. Shimrat, M.: Algorithm 112: position of point relative to polygon. Commun. ACM 5(8), 434 (1962)
15. Tian, Z., Huang, W., He, T., He, P., Qiao, Y.: Detecting text in natural image with connectionist text proposal network. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 56–72. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_4
16. Yoo, J.C., Han, T.H.: Fast normalized cross-correlation. Circuits Syst. Signal Process. 28(6), 819–843 (2009). https://doi.org/10.1007/s00034-009-9130-7
17. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53

Author information

Authors and Affiliations

TCS Research, Delhi, India
Rohit Rahul, Arindam Chowdhury, Animesh & Lovekesh Vig
BITS Pilani, Goa Campus, Pilani, India
Samarth Mittal

Corresponding author

Correspondence to Rohit Rahul.

Editor information

Editors and Affiliations

School of Computer Science, University of Adelaide, Adelaide, Australia
Gustavo Carneiro
Data61, Commonwealth Scientific and Industrial Research Organization, Canberra, Australia
Shaodi You

About this paper

Cite this paper

Rahul, R., Chowdhury, A., Animesh, Mittal, S., Vig, L. (2019). Reading Industrial Inspection Sheets by Inferring Visual Relations. In: Carneiro, G., You, S. (eds) Computer Vision – ACCV 2018 Workshops. ACCV 2018. Lecture Notes in Computer Science, vol 11367. Springer, Cham. https://doi.org/10.1007/978-3-030-21074-8_13

DOI: https://doi.org/10.1007/978-3-030-21074-8_13
Published: 19 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21073-1
Online ISBN: 978-3-030-21074-8
eBook Packages: Computer Science, Computer Science (R0)
