Abstract
We present a statistical method for classifying forms whose physical structure may vary from one writer to another. Automatic segmentation extracts the physical structure of each form, which is described by its set of main rectangular blocks. During the learning phase, blocks are matched within each class; the number of occurrences of each block is counted, and statistical block attributes are computed. During the identification phase, we handle block instability by introducing a block penalty coefficient that modifies the classical expression of the Mahalanobis distance. The penalty coefficient of a block depends on its occurrence probability. Experimental results on several form types are given.
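The abstract does not give the exact form of the modified distance, so the following is only a minimal sketch of the general idea, not the authors' formula. It assumes a diagonal covariance per block, uses the block occurrence probability directly as the penalty weight, and charges a fixed cost for a missing block. The names `BlockModel`, `penalized_distance`, `classify`, and `missing_cost` are hypothetical, and the matching between observed and learned blocks is assumed to have been done already.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence

Attrs = Optional[Sequence[float]]  # attribute vector of a matched block, or None


@dataclass
class BlockModel:
    """Statistics learned for one block of a form class (hypothetical layout)."""
    mean: Sequence[float]  # mean attribute vector, e.g. (x, y, width, height)
    var: Sequence[float]   # per-attribute variance (diagonal covariance assumed)
    occurrence: float      # fraction of training forms that contained this block


def penalized_distance(model: List[BlockModel], observed: List[Attrs],
                       missing_cost: float = 9.0) -> float:
    """Sum of squared Mahalanobis terms, each weighted by block occurrence.

    observed[i] is the attribute vector matched to model[i], or None when the
    block was not found. The weighting scheme is an assumption of this sketch.
    """
    total = 0.0
    for block, attrs in zip(model, observed):
        if attrs is None:
            # A missing block costs more the more reliably it appeared in training.
            total += block.occurrence * missing_cost
            continue
        # Squared Mahalanobis distance under a diagonal covariance matrix.
        d2 = sum((x - m) ** 2 / v
                 for x, m, v in zip(attrs, block.mean, block.var))
        total += block.occurrence * d2
    return total


def classify(models: Dict[str, List[BlockModel]],
             matches: Dict[str, List[Attrs]]) -> str:
    """Return the class whose learned blocks are closest to the observed form."""
    return min(models,
               key=lambda name: penalized_distance(models[name], matches[name]))


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    class_a = [
        BlockModel((10.0, 20.0, 100.0, 30.0), (4.0, 4.0, 25.0, 9.0), 1.0),
        BlockModel((10.0, 80.0, 60.0, 20.0), (4.0, 4.0, 25.0, 9.0), 0.4),
    ]
    class_b = [BlockModel((50.0, 20.0, 40.0, 40.0), (4.0, 4.0, 4.0, 4.0), 1.0)]
    seen = (11.0, 21.0, 102.0, 31.0)
    print(classify({"A": class_a, "B": class_b},
                   {"A": [seen, None], "B": [seen]}))
```

With this weighting, a block that appears in every training form contributes its full distance, while a block that a writer often omits or merges contributes little whether it is present or absent, which is one plausible way to make identification tolerant of unstable blocks.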
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kebairi, S., Taconet, B., Zahour, A., Ramdane, S. (1999). A Statistical Method for an Automatic Detection of Form Types. In: Lee, SW., Nakano, Y. (eds) Document Analysis Systems: Theory and Practice. DAS 1998. Lecture Notes in Computer Science, vol 1655. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48172-9_8
DOI: https://doi.org/10.1007/3-540-48172-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66507-6
Online ISBN: 978-3-540-48172-0
eBook Packages: Springer Book Archive