Abstract
Layout analysis is the process of extracting a hierarchical structure that describes the layout of a page. In the document processing system WISDOM++, layout analysis is performed in two steps: first, the global analysis determines possible areas containing paragraphs, sections, columns, figures and tables; second, the local analysis groups together blocks that possibly fall within the same area. The result of the local analysis strongly depends on the quality of the global analysis. In this paper we investigate the possibility of supporting the user in correcting the results of the global analysis. The user is allowed to correct these results, and rules for layout correction are then learned from the sequence of user actions. Experimental results on a set of multi-page documents are reported.
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Malerba, D., Esposito, F., Altamura, O. (2002). Adaptive Layout Analysis of Document Images. In: Hacid, MS., Raś, Z.W., Zighed, D.A., Kodratoff, Y. (eds) Foundations of Intelligent Systems. ISMIS 2002. Lecture Notes in Computer Science, vol 2366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48050-1_56
DOI: https://doi.org/10.1007/3-540-48050-1_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43785-7
Online ISBN: 978-3-540-48050-1
eBook Packages: Springer Book Archive