A Graph Based Approach for Heterogeneous Document Segmentation

Zirari, Fattah; Mammass, Driss; Ennaji, Abdellatif; Nicolas, Stephane

doi:10.1007/978-3-642-31254-0_48

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7340)

Included in the following conference series:

International Conference on Image and Signal Processing

Abstract

In document image processing, text/graphics separation is a major step that conditions the performance of recognition and indexing systems. It involves identifying and separating the graphical and textual components of a document image, so approaches that address this problem effectively are needed. This paper presents a method for separating textual and non-textual components in document images using graph-based modeling and structural analysis. The method is fast and efficient, and it adequately separates the graphical and textual areas of a document. Examples obtained on technical documents and magazines, drawn from databases accepted by the community, validate the approach.
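To make the idea of graph-based modeling followed by structural analysis more concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not detail. It assumes that connected components have already been extracted as bounding boxes `(x0, y0, x1, y1)`, links boxes that lie close together, and labels each linked group with a simple structural rule. The function names and the thresholds `max_gap`, `min_members`, and `max_height_ratio` are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def box_gap(a, b):
    """Chebyshev gap between two boxes (x0, y0, x1, y1); 0 if they touch or overlap."""
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return max(dx, dy)


def build_graph(boxes, max_gap):
    """Nodes are box indices; an edge joins two boxes whose gap is at most max_gap."""
    graph = defaultdict(set)
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_gap(boxes[i], boxes[j]) <= max_gap:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def graph_groups(n, graph):
    """Connected components of the proximity graph, found by depth-first search."""
    seen, out = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbour in graph.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        out.append(sorted(group))
    return out


def label_components(boxes, max_gap=5, min_members=2, max_height_ratio=1.5):
    """Label every box 'text' or 'non-text' (illustrative rule, assumed thresholds).

    A group counts as text when it has at least min_members boxes and the
    tallest box is at most max_height_ratio times the shortest one.
    """
    graph = build_graph(boxes, max_gap)
    labels = {}
    for group in graph_groups(len(boxes), graph):
        heights = [boxes[i][3] - boxes[i][1] for i in group]
        regular = max(heights) <= max_height_ratio * min(heights)
        kind = "text" if len(group) >= min_members and regular else "non-text"
        for i in group:
            labels[i] = kind
    return labels
```

Called on three small, evenly sized boxes in a row plus one large isolated box, `label_components` marks the row as text and the large box as non-text. A real system would need richer structural features than height regularity, and the pairwise loop is quadratic in the number of components, so this sketch only suits small inputs.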

Author information

Authors and Affiliations

Laboratory IRF-SIC, Ibn Zohr University, Agadir, Morocco
Fattah Zirari & Driss Mammass
Laboratory LITIS, University of Rouen, Rouen, France
Fattah Zirari, Abdellatif Ennaji & Stephane Nicolas

Editor information

Editors and Affiliations

ENSICAEN, CNRS, GREYC Image Team, Université de Caen Basse-Normandie, 6 Boulevard Maréchal Juin, F-14050, Caen Cedex, France
Abderrahim Elmoataz
Faculté des Sciences, Université Ibn Zohr, Agadir, Morocco
Driss Mammass
GREYC UMR CNRS 6072, ENSICAEN, Université de Caen Basse-Normandie, 14050, Caen, France
Olivier Lezoray
Département de Mathématiques et d’informatique, Université du Québec à Trois-Rivières, C.P. 500, G9A 5H7, Trois-Rivières, Québec, Canada
Fathallah Nouboud
Faculté des Sciences, Université Mohammed V-Agdal, 4, avenue Ibn Battouta, B.P. 1014, Rabat, Morocco
Driss Aboutajdine

Cite this paper

Zirari, F., Mammass, D., Ennaji, A., Nicolas, S. (2012). A Graph Based Approach for Heterogeneous Document Segmentation. In: Elmoataz, A., Mammass, D., Lezoray, O., Nouboud, F., Aboutajdine, D. (eds) Image and Signal Processing. ICISP 2012. Lecture Notes in Computer Science, vol 7340. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31254-0_48

DOI: https://doi.org/10.1007/978-3-642-31254-0_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31253-3
Online ISBN: 978-3-642-31254-0
eBook Packages: Computer Science, Computer Science (R0)
