Abstract
In this paper, we describe a system that extracts textual information from images of structured documents. In particular, the model and the algorithms we describe are used to process forms in which the information fields cannot be located by their position on the page alone, but can be identified once the corresponding instruction fields have been located. The proposed model is based on attributed relational graphs; form registration and the location of information fields are performed by algorithms based on the hypothesize-and-verify paradigm. Instruction fields are located holistically, using connectionist models.
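To make the hypothesize-and-verify idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a form model stored as a small attributed relational graph whose nodes are fields (a label plus a box centre) and whose edge attribute is the displacement between two fields. The names `Node`, `relation`, `verify`, and `register`, the pure-translation assumption, the uniqueness of labels in the model, and the tolerance `tol` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Node:
    """A field in the graph: a label attribute plus the centre of its box."""
    label: str
    x: float
    y: float


def relation(a, b):
    """Edge attribute (assumed here): displacement from node a to node b."""
    return (b.x - a.x, b.y - a.y)


def verify(model, detected, anchor_m, anchor_d, tol):
    """Verify one hypothesis: anchor_m in the model maps to anchor_d in the image.

    Returns the model labels whose relation to the anchor is reproduced,
    within tol, by some detected node carrying the same label.
    """
    support = {anchor_m.label: anchor_d}
    for m in model:
        if m is anchor_m:
            continue
        want = relation(anchor_m, m)
        for d in detected:
            if d.label != m.label:
                continue
            got = relation(anchor_d, d)
            if abs(got[0] - want[0]) <= tol and abs(got[1] - want[1]) <= tol:
                support[m.label] = d
                break
    return support


def register(model, detected, tol=5.0):
    """Hypothesize every label-compatible anchor pair; keep the best-supported one."""
    best = {}
    for m, d in product(model, detected):
        if m.label != d.label:
            continue
        support = verify(model, detected, m, d, tol)
        if len(support) > len(best):
            best = support
    return best


# Hypothetical form model and detections (shifted copy plus one spurious field).
model = [Node("name", 10, 10), Node("date", 100, 10), Node("sign", 10, 200)]
detected = [
    Node("date", 300, 300),
    Node("name", 17, 7),
    Node("date", 107, 7),
    Node("sign", 17, 197),
]
matches = register(model, detected)
```

In this sketch each hypothesis is a single anchor correspondence, and verification counts how many other model fields appear where the graph's relations predict them. A real system would also need to handle rotation, scale, and inexact attribute matching, which this example deliberately leaves out.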
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cesarini, F., Gori, M., Marinai, S., Soda, G. (1995). Data extraction from form images. In: Revell, N., Tjoa, A.M. (eds) Database and Expert Systems Applications. DEXA 1995. Lecture Notes in Computer Science, vol 978. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0049141
DOI: https://doi.org/10.1007/BFb0049141
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60303-0
Online ISBN: 978-3-540-44790-0
eBook Packages: Springer Book Archive